Message ID | 20220223184812.2413004-1-grive@u256.net |
---|---|
State | Accepted |
Commit | f77dbc1eb2da2523625cd36922c6fccfcb3f3eb7 |
Headers | show |
Series | [ovs-dev] ofproto: Use xlate map for uuid lookups | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | warning | apply and check: warning |
ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
Bleep bloop. Greetings Gaetan Rivet, I am a robot and I have tried out your patch. Thanks for your contribution. I encountered some error that I wasn't expecting. See the details below. checkpatch: WARNING: Unexpected sign-offs from developers who are not authors or co-authors or committers: Yunjian Wang <wangyunjian@huawei.com> Lines checked: 166, Warnings: 1, Errors: 0 Please check this out. If you feel there has been an error, please email aconole@redhat.com Thanks, 0-day Robot
I enabled ASan on MSVC (https://devblogs.microsoft.com/cppblog/addresssanitizer-asan-for-windows-wi th-msvc/) and this patch alleviates the issues found by the CI and my local testing. Tested-by: Alin-Gabriel Serdean <aserdean@ovn.org> Acked-by: Alin-Gabriel Serdean <aserdean@ovn.org> -----Original Message----- From: dev <ovs-dev-bounces@openvswitch.org> On Behalf Of Gaetan Rivet Sent: Wednesday, February 23, 2022 8:48 PM To: ovs-dev@openvswitch.org Subject: [ovs-dev] [PATCH] ofproto: Use xlate map for uuid lookups The ofproto map 'all_ofproto_dpifs_by_uuid' does not support concurrent accesses. It is however read by upcall handler threads and written by the main thread at the same time. Additionally, handler threads will change the ams_seq while an ofproto is being destroyed, triggering crashes with the following backtrace: (gdb) bt hmap_next (hmap.h:398) seq_wake_waiters (seq.c:326) seq_change_protected (seq.c:134) seq_change (seq.c:144) ofproto_dpif_send_async_msg (ofproto_dpif.c:263) process_upcall (ofproto_dpif_upcall.c:1782) recv_upcalls (ofproto_dpif_upcall.c:1026) udpif_upcall_handler (ofproto/ofproto_dpif_upcall.c:945) ovsthread_wrapper (ovs_thread.c:734) To solve both issues, remove the 'all_ofproto_dpifs_by_uuid'. Instead, another map already storing ofprotos in xlate can be used. During an ofproto destruction, its reference is removed from the current xlate xcfg. Such change is committed only after all threads have quiesced at least once during xlate_txn_commit(). This wait ensures that the removal is seen by all threads, rendering impossible for a thread to still hold a reference while the destruction proceeds. Furthermore, the xlate maps are copied during updates instead of being written in place. It is thus correct to read xcfg->xbridges while inserting or removing from new_xcfg->xbridges. Finally, now that ofproto_dpifs lookups are done through xcfg->xbridges, it is important to use a high level of entropy. As it used the ofproto pointer hashed, fewer bits were random compared to the uuid key used in 'all_ofproto_dpifs_by_uuid'. To solve this, use the ofproto uuid as the key in xbridges as well, improving entropy. Fixes: fcb9579be3c7 ("ofproto: Add 'ofproto_uuid' and 'ofp_in_port' to user action cookie.") Suggested-by: Adrian Moreno <amorenoz@redhat.com> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Signed-off-by: Gaetan Rivet <grive@u256.net> ---
On 2/23/22 19:48, Gaetan Rivet wrote: > The ofproto map 'all_ofproto_dpifs_by_uuid' does not support > concurrent accesses. It is however read by upcall handler threads > and written by the main thread at the same time. > > Additionally, handler threads will change the ams_seq while > an ofproto is being destroyed, triggering crashes with the > following backtrace: > > (gdb) bt > hmap_next (hmap.h:398) > seq_wake_waiters (seq.c:326) > seq_change_protected (seq.c:134) > seq_change (seq.c:144) > ofproto_dpif_send_async_msg (ofproto_dpif.c:263) > process_upcall (ofproto_dpif_upcall.c:1782) > recv_upcalls (ofproto_dpif_upcall.c:1026) > udpif_upcall_handler (ofproto/ofproto_dpif_upcall.c:945) > ovsthread_wrapper (ovs_thread.c:734) > > To solve both issues, remove the 'all_ofproto_dpifs_by_uuid'. > Instead, another map already storing ofprotos in xlate can be used. > > During an ofproto destruction, its reference is removed from the current > xlate xcfg. Such change is committed only after all threads have quiesced > at least once during xlate_txn_commit(). This wait ensures that the > removal is seen by all threads, rendering impossible for a thread to > still hold a reference while the destruction proceeds. > > Furthermore, the xlate maps are copied during updates instead of > being written in place. It is thus correct to read xcfg->xbridges while > inserting or removing from new_xcfg->xbridges. > > Finally, now that ofproto_dpifs lookups are done through xcfg->xbridges, > it is important to use a high level of entropy. As it used the ofproto pointer > hashed, fewer bits were random compared to the uuid key used in > 'all_ofproto_dpifs_by_uuid'. To solve this, use the ofproto uuid as the key > in xbridges as well, improving entropy. > > Fixes: fcb9579be3c7 ("ofproto: Add 'ofproto_uuid' and 'ofp_in_port' to user action cookie.") > Suggested-by: Adrian Moreno <amorenoz@redhat.com> > Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> > Signed-off-by: Gaetan Rivet <grive@u256.net> > --- > > Following the discussion on the fix > https://patchwork.ozlabs.org/project/openvswitch/patch/1638530715-44436-1-git-send-email-wangyunjian@huawei.com/ > > I tested it with Peng's ofproto refcount patch in tree: > https://patchwork.ozlabs.org/project/openvswitch/patch/20220219032607.15757-1-hepeng.0320@bytedance.com/ > > CI result: https://github.com/grivet/ovs/actions/runs/1889083195 > > ofproto/ofproto-dpif-xlate.c | 21 +++++++++++++++++++-- > ofproto/ofproto-dpif-xlate.h | 1 + > ofproto/ofproto-dpif.c | 19 +------------------ > 3 files changed, 21 insertions(+), 20 deletions(-) > > diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c > index 578cbfe58..4a7388071 100644 > --- a/ofproto/ofproto-dpif-xlate.c > +++ b/ofproto/ofproto-dpif-xlate.c > @@ -865,7 +865,7 @@ xlate_xbridge_init(struct xlate_cfg *xcfg, struct xbridge *xbridge) > ovs_list_init(&xbridge->xbundles); > hmap_init(&xbridge->xports); > hmap_insert(&xcfg->xbridges, &xbridge->hmap_node, > - hash_pointer(xbridge->ofproto, 0)); > + uuid_hash(&xbridge->ofproto->uuid)); > } > > static void > @@ -1639,7 +1639,7 @@ xbridge_lookup(struct xlate_cfg *xcfg, const struct ofproto_dpif *ofproto) > > xbridges = &xcfg->xbridges; > > - HMAP_FOR_EACH_IN_BUCKET (xbridge, hmap_node, hash_pointer(ofproto, 0), > + HMAP_FOR_EACH_IN_BUCKET (xbridge, hmap_node, uuid_hash(&ofproto->uuid), > xbridges) { > if (xbridge->ofproto == ofproto) { > return xbridge; > @@ -1661,6 +1661,23 @@ xbridge_lookup_by_uuid(struct xlate_cfg *xcfg, const struct uuid *uuid) > return NULL; > } > > +struct ofproto_dpif * > +xlate_ofproto_lookup(const struct uuid *uuid) > +{ > + struct xlate_cfg *xcfg = ovsrcu_get(struct xlate_cfg *, &xcfgp); > + struct xbridge *xbridge; > + > + if (!xcfg) { > + return NULL; > + } > + > + xbridge = xbridge_lookup_by_uuid(xcfg, uuid); > + if (xbridge != NULL) { > + return xbridge->ofproto; > + } > + return NULL; > +} > + > static struct xbundle * > xbundle_lookup(struct xlate_cfg *xcfg, const struct ofbundle *ofbundle) > { > diff --git a/ofproto/ofproto-dpif-xlate.h b/ofproto/ofproto-dpif-xlate.h > index 851088d79..2ba90e999 100644 > --- a/ofproto/ofproto-dpif-xlate.h > +++ b/ofproto/ofproto-dpif-xlate.h > @@ -176,6 +176,7 @@ void xlate_ofproto_set(struct ofproto_dpif *, const char *name, struct dpif *, > bool forward_bpdu, bool has_in_band, > const struct dpif_backer_support *support); > void xlate_remove_ofproto(struct ofproto_dpif *); > +struct ofproto_dpif *xlate_ofproto_lookup(const struct uuid *uuid); > > void xlate_bundle_set(struct ofproto_dpif *, struct ofbundle *, > const char *name, enum port_vlan_mode, > diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c > index 8143dd965..7b4a1b3d8 100644 > --- a/ofproto/ofproto-dpif.c > +++ b/ofproto/ofproto-dpif.c > @@ -215,10 +215,6 @@ struct shash all_dpif_backers = SHASH_INITIALIZER(&all_dpif_backers); > static struct hmap all_ofproto_dpifs_by_name = > HMAP_INITIALIZER(&all_ofproto_dpifs_by_name); > > -/* All existing ofproto_dpif instances, indexed by ->uuid. */ > -static struct hmap all_ofproto_dpifs_by_uuid = > - HMAP_INITIALIZER(&all_ofproto_dpifs_by_uuid); > - > static bool ofproto_use_tnl_push_pop = true; > static void ofproto_unixctl_init(void); > static void ct_zone_config_init(struct dpif_backer *backer); > @@ -1720,9 +1716,6 @@ construct(struct ofproto *ofproto_) > hmap_insert(&all_ofproto_dpifs_by_name, > &ofproto->all_ofproto_dpifs_by_name_node, > hash_string(ofproto->up.name, 0)); > - hmap_insert(&all_ofproto_dpifs_by_uuid, > - &ofproto->all_ofproto_dpifs_by_uuid_node, > - uuid_hash(&ofproto->uuid)); > memset(&ofproto->stats, 0, sizeof ofproto->stats); > > ofproto_init_tables(ofproto_, N_TABLES); > @@ -1820,8 +1813,6 @@ destruct(struct ofproto *ofproto_, bool del) > > hmap_remove(&all_ofproto_dpifs_by_name, > &ofproto->all_ofproto_dpifs_by_name_node); > - hmap_remove(&all_ofproto_dpifs_by_uuid, > - &ofproto->all_ofproto_dpifs_by_uuid_node); > > OFPROTO_FOR_EACH_TABLE (table, &ofproto->up) { > CLS_FOR_EACH (rule, up.cr, &table->cls) { > @@ -5818,15 +5809,7 @@ ofproto_dpif_lookup_by_name(const char *name) > struct ofproto_dpif * > ofproto_dpif_lookup_by_uuid(const struct uuid *uuid) > { > - struct ofproto_dpif *ofproto; > - > - HMAP_FOR_EACH_WITH_HASH (ofproto, all_ofproto_dpifs_by_uuid_node, > - uuid_hash(uuid), &all_ofproto_dpifs_by_uuid) { > - if (uuid_equals(&ofproto->uuid, uuid)) { > - return ofproto; > - } > - } > - return NULL; > + return xlate_ofproto_lookup(uuid); > } > > static void Hi Gaëtan. Thanks for working on the patch. It perferctly reflects what we've discussed. Acked-by: Adrian Moreno <amorenoz@redhat.com>
On 2/23/22 19:48, Gaetan Rivet wrote: > The ofproto map 'all_ofproto_dpifs_by_uuid' does not support > concurrent accesses. It is however read by upcall handler threads > and written by the main thread at the same time. > > Additionally, handler threads will change the ams_seq while > an ofproto is being destroyed, triggering crashes with the > following backtrace: > > (gdb) bt > hmap_next (hmap.h:398) > seq_wake_waiters (seq.c:326) > seq_change_protected (seq.c:134) > seq_change (seq.c:144) > ofproto_dpif_send_async_msg (ofproto_dpif.c:263) > process_upcall (ofproto_dpif_upcall.c:1782) > recv_upcalls (ofproto_dpif_upcall.c:1026) > udpif_upcall_handler (ofproto/ofproto_dpif_upcall.c:945) > ovsthread_wrapper (ovs_thread.c:734) > > To solve both issues, remove the 'all_ofproto_dpifs_by_uuid'. > Instead, another map already storing ofprotos in xlate can be used. > > During an ofproto destruction, its reference is removed from the current > xlate xcfg. Such change is committed only after all threads have quiesced > at least once during xlate_txn_commit(). This wait ensures that the > removal is seen by all threads, rendering impossible for a thread to > still hold a reference while the destruction proceeds. > > Furthermore, the xlate maps are copied during updates instead of > being written in place. It is thus correct to read xcfg->xbridges while > inserting or removing from new_xcfg->xbridges. > > Finally, now that ofproto_dpifs lookups are done through xcfg->xbridges, > it is important to use a high level of entropy. As it used the ofproto pointer > hashed, fewer bits were random compared to the uuid key used in > 'all_ofproto_dpifs_by_uuid'. To solve this, use the ofproto uuid as the key > in xbridges as well, improving entropy. > > Fixes: fcb9579be3c7 ("ofproto: Add 'ofproto_uuid' and 'ofp_in_port' to user action cookie.") > Suggested-by: Adrian Moreno <amorenoz@redhat.com> > Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> > Signed-off-by: Gaetan Rivet <grive@u256.net> > --- Thanks, everyone! Applied and backported down to 2.13. Best regards, Ilya Maximets.
diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c index 578cbfe58..4a7388071 100644 --- a/ofproto/ofproto-dpif-xlate.c +++ b/ofproto/ofproto-dpif-xlate.c @@ -865,7 +865,7 @@ xlate_xbridge_init(struct xlate_cfg *xcfg, struct xbridge *xbridge) ovs_list_init(&xbridge->xbundles); hmap_init(&xbridge->xports); hmap_insert(&xcfg->xbridges, &xbridge->hmap_node, - hash_pointer(xbridge->ofproto, 0)); + uuid_hash(&xbridge->ofproto->uuid)); } static void @@ -1639,7 +1639,7 @@ xbridge_lookup(struct xlate_cfg *xcfg, const struct ofproto_dpif *ofproto) xbridges = &xcfg->xbridges; - HMAP_FOR_EACH_IN_BUCKET (xbridge, hmap_node, hash_pointer(ofproto, 0), + HMAP_FOR_EACH_IN_BUCKET (xbridge, hmap_node, uuid_hash(&ofproto->uuid), xbridges) { if (xbridge->ofproto == ofproto) { return xbridge; @@ -1661,6 +1661,23 @@ xbridge_lookup_by_uuid(struct xlate_cfg *xcfg, const struct uuid *uuid) return NULL; } +struct ofproto_dpif * +xlate_ofproto_lookup(const struct uuid *uuid) +{ + struct xlate_cfg *xcfg = ovsrcu_get(struct xlate_cfg *, &xcfgp); + struct xbridge *xbridge; + + if (!xcfg) { + return NULL; + } + + xbridge = xbridge_lookup_by_uuid(xcfg, uuid); + if (xbridge != NULL) { + return xbridge->ofproto; + } + return NULL; +} + static struct xbundle * xbundle_lookup(struct xlate_cfg *xcfg, const struct ofbundle *ofbundle) { diff --git a/ofproto/ofproto-dpif-xlate.h b/ofproto/ofproto-dpif-xlate.h index 851088d79..2ba90e999 100644 --- a/ofproto/ofproto-dpif-xlate.h +++ b/ofproto/ofproto-dpif-xlate.h @@ -176,6 +176,7 @@ void xlate_ofproto_set(struct ofproto_dpif *, const char *name, struct dpif *, bool forward_bpdu, bool has_in_band, const struct dpif_backer_support *support); void xlate_remove_ofproto(struct ofproto_dpif *); +struct ofproto_dpif *xlate_ofproto_lookup(const struct uuid *uuid); void xlate_bundle_set(struct ofproto_dpif *, struct ofbundle *, const char *name, enum port_vlan_mode, diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c index 8143dd965..7b4a1b3d8 100644 --- a/ofproto/ofproto-dpif.c +++ b/ofproto/ofproto-dpif.c @@ -215,10 +215,6 @@ struct shash all_dpif_backers = SHASH_INITIALIZER(&all_dpif_backers); static struct hmap all_ofproto_dpifs_by_name = HMAP_INITIALIZER(&all_ofproto_dpifs_by_name); -/* All existing ofproto_dpif instances, indexed by ->uuid. */ -static struct hmap all_ofproto_dpifs_by_uuid = - HMAP_INITIALIZER(&all_ofproto_dpifs_by_uuid); - static bool ofproto_use_tnl_push_pop = true; static void ofproto_unixctl_init(void); static void ct_zone_config_init(struct dpif_backer *backer); @@ -1720,9 +1716,6 @@ construct(struct ofproto *ofproto_) hmap_insert(&all_ofproto_dpifs_by_name, &ofproto->all_ofproto_dpifs_by_name_node, hash_string(ofproto->up.name, 0)); - hmap_insert(&all_ofproto_dpifs_by_uuid, - &ofproto->all_ofproto_dpifs_by_uuid_node, - uuid_hash(&ofproto->uuid)); memset(&ofproto->stats, 0, sizeof ofproto->stats); ofproto_init_tables(ofproto_, N_TABLES); @@ -1820,8 +1813,6 @@ destruct(struct ofproto *ofproto_, bool del) hmap_remove(&all_ofproto_dpifs_by_name, &ofproto->all_ofproto_dpifs_by_name_node); - hmap_remove(&all_ofproto_dpifs_by_uuid, - &ofproto->all_ofproto_dpifs_by_uuid_node); OFPROTO_FOR_EACH_TABLE (table, &ofproto->up) { CLS_FOR_EACH (rule, up.cr, &table->cls) { @@ -5818,15 +5809,7 @@ ofproto_dpif_lookup_by_name(const char *name) struct ofproto_dpif * ofproto_dpif_lookup_by_uuid(const struct uuid *uuid) { - struct ofproto_dpif *ofproto; - - HMAP_FOR_EACH_WITH_HASH (ofproto, all_ofproto_dpifs_by_uuid_node, - uuid_hash(uuid), &all_ofproto_dpifs_by_uuid) { - if (uuid_equals(&ofproto->uuid, uuid)) { - return ofproto; - } - } - return NULL; + return xlate_ofproto_lookup(uuid); } static void