Message ID | 4099DE2E54AFAD489356C6C9161D53339735B3E8@dggeml522-mbs.china.huawei.com |
---|---|
State | Superseded |
Headers | show |
Series | [ovs-dev,v2] ovs rcu: update rcu pointer first | expand |
On Tue, Jun 02, 2020 at 07:27:59AM +0000, Linhaifeng wrote: > We should update rcu pointer first then use ovsrcu_postpone to free > otherwise maybe cause use-after-free. > e.g.,reader indicates momentary quiescent and access old pointer after > writer postpone free old pointer and before setting new pointer. > > Signed-off-by: Linhaifeng <haifeng.lin@huawei.com> I don't see how that's possible, since the writer hasn't quiesced.
Hi Ben, If my understanding is correct, the writer could not be a rcu thread because it does not need report holding or not holding pointers. So old memory will be freed after all rcu thread report quiesce. Best Regards, Wei Yanqin > -----Original Message----- > From: Ben Pfaff <blp@ovn.org> > Sent: Wednesday, June 3, 2020 1:28 AM > To: Linhaifeng <haifeng.lin@huawei.com> > Cc: Yanqin Wei <Yanqin.Wei@arm.com>; dev@openvswitch.org; nd > <nd@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; chenchanghu > <chenchanghu@huawei.com>; Lichunhe <lichunhe@huawei.com> > Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first > > On Tue, Jun 02, 2020 at 07:27:59AM +0000, Linhaifeng wrote: > > We should update rcu pointer first then use ovsrcu_postpone to free > > otherwise maybe cause use-after-free. > > e.g.,reader indicates momentary quiescent and access old pointer after > > writer postpone free old pointer and before setting new pointer. > > > > Signed-off-by: Linhaifeng <haifeng.lin@huawei.com> > > I don't see how that's possible, since the writer hasn't quiesced.
This is not how RCU works in OVS. Every thread is by default considered active. They rarely quiesce except implicitly inside poll_block(). Please read the large comment at the top of ovs-rcu.h. Is your patch based on actual bugs that you have found, or is it just some kind of precaution? If it is the latter, then it is not needed. On Tue, Jun 02, 2020 at 11:22:57PM +0000, Yanqin Wei wrote: > Hi Ben, > > If my understanding is correct, the writer could not be a rcu thread because it does not need report holding or not holding pointers. > So old memory will be freed after all rcu thread report quiesce. > > Best Regards, > Wei Yanqin > > > -----Original Message----- > > From: Ben Pfaff <blp@ovn.org> > > Sent: Wednesday, June 3, 2020 1:28 AM > > To: Linhaifeng <haifeng.lin@huawei.com> > > Cc: Yanqin Wei <Yanqin.Wei@arm.com>; dev@openvswitch.org; nd > > <nd@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; chenchanghu > > <chenchanghu@huawei.com>; Lichunhe <lichunhe@huawei.com> > > Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first > > > > On Tue, Jun 02, 2020 at 07:27:59AM +0000, Linhaifeng wrote: > > > We should update rcu pointer first then use ovsrcu_postpone to free > > > otherwise maybe cause use-after-free. > > > e.g.,reader indicates momentary quiescent and access old pointer after > > > writer postpone free old pointer and before setting new pointer. > > > > > > Signed-off-by: Linhaifeng <haifeng.lin@huawei.com> > > > > I don't see how that's possible, since the writer hasn't quiesced.
Hi Ben, This patch is from Linhai, but I have the same concern about this. I will read ovs-rcu comments and feedback. Thanks for your time. Best Regards, Wei Yanqin > -----Original Message----- > From: Ben Pfaff <blp@ovn.org> > Sent: Wednesday, June 3, 2020 8:35 AM > To: Yanqin Wei <Yanqin.Wei@arm.com> > Cc: Linhaifeng <haifeng.lin@huawei.com>; dev@openvswitch.org; nd > <nd@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; chenchanghu > <chenchanghu@huawei.com>; Lichunhe <lichunhe@huawei.com> > Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first > > This is not how RCU works in OVS. Every thread is by default considered active. > They rarely quiesce except implicitly inside poll_block(). > Please read the large comment at the top of ovs-rcu.h. > > Is your patch based on actual bugs that you have found, or is it just some kind > of precaution? If it is the latter, then it is not needed. > > On Tue, Jun 02, 2020 at 11:22:57PM +0000, Yanqin Wei wrote: > > Hi Ben, > > > > If my understanding is correct, the writer could not be a rcu thread because it > does not need report holding or not holding pointers. > > So old memory will be freed after all rcu thread report quiesce. > > > > Best Regards, > > Wei Yanqin > > > > > -----Original Message----- > > > From: Ben Pfaff <blp@ovn.org> > > > Sent: Wednesday, June 3, 2020 1:28 AM > > > To: Linhaifeng <haifeng.lin@huawei.com> > > > Cc: Yanqin Wei <Yanqin.Wei@arm.com>; dev@openvswitch.org; nd > > > <nd@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; > > > chenchanghu <chenchanghu@huawei.com>; Lichunhe > <lichunhe@huawei.com> > > > Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first > > > > > > On Tue, Jun 02, 2020 at 07:27:59AM +0000, Linhaifeng wrote: > > > > We should update rcu pointer first then use ovsrcu_postpone to > > > > free otherwise maybe cause use-after-free. > > > > e.g.,reader indicates momentary quiescent and access old pointer > > > > after writer postpone free old pointer and before setting new pointer. > > > > > > > > Signed-off-by: Linhaifeng <haifeng.lin@huawei.com> > > > > > > I don't see how that's possible, since the writer hasn't quiesced.
Oh, I apologize that I made a mistake about the author. I appreciate feedback from anyone. On Wed, Jun 03, 2020 at 12:37:27AM +0000, Yanqin Wei wrote: > Hi Ben, > > This patch is from Linhai, but I have the same concern about this. I will read ovs-rcu comments and feedback. > Thanks for your time. > > Best Regards, > Wei Yanqin > > > -----Original Message----- > > From: Ben Pfaff <blp@ovn.org> > > Sent: Wednesday, June 3, 2020 8:35 AM > > To: Yanqin Wei <Yanqin.Wei@arm.com> > > Cc: Linhaifeng <haifeng.lin@huawei.com>; dev@openvswitch.org; nd > > <nd@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; chenchanghu > > <chenchanghu@huawei.com>; Lichunhe <lichunhe@huawei.com> > > Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first > > > > This is not how RCU works in OVS. Every thread is by default considered active. > > They rarely quiesce except implicitly inside poll_block(). > > Please read the large comment at the top of ovs-rcu.h. > > > > Is your patch based on actual bugs that you have found, or is it just some kind > > of precaution? If it is the latter, then it is not needed. > > > > On Tue, Jun 02, 2020 at 11:22:57PM +0000, Yanqin Wei wrote: > > > Hi Ben, > > > > > > If my understanding is correct, the writer could not be a rcu thread because it > > does not need report holding or not holding pointers. > > > So old memory will be freed after all rcu thread report quiesce. > > > > > > Best Regards, > > > Wei Yanqin > > > > > > > -----Original Message----- > > > > From: Ben Pfaff <blp@ovn.org> > > > > Sent: Wednesday, June 3, 2020 1:28 AM > > > > To: Linhaifeng <haifeng.lin@huawei.com> > > > > Cc: Yanqin Wei <Yanqin.Wei@arm.com>; dev@openvswitch.org; nd > > > > <nd@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; > > > > chenchanghu <chenchanghu@huawei.com>; Lichunhe > > <lichunhe@huawei.com> > > > > Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first > > > > > > > > On Tue, Jun 02, 2020 at 07:27:59AM +0000, Linhaifeng wrote: > > > > > We should update rcu pointer first then use ovsrcu_postpone to > > > > > free otherwise maybe cause use-after-free. > > > > > e.g.,reader indicates momentary quiescent and access old pointer > > > > > after writer postpone free old pointer and before setting new pointer. > > > > > > > > > > Signed-off-by: Linhaifeng <haifeng.lin@huawei.com> > > > > > > > > I don't see how that's possible, since the writer hasn't quiesced.
-----Original Message----- From: Ben Pfaff [mailto:blp@ovn.org] Sent: Wednesday, June 3, 2020 1:28 AM To: Linhaifeng <haifeng.lin@huawei.com> Cc: Yanqin Wei <Yanqin.Wei@arm.com>; dev@openvswitch.org; nd <nd@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; chenchanghu <chenchanghu@huawei.com>; Lichunhe <lichunhe@huawei.com> Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first On Tue, Jun 02, 2020 at 07:27:59AM +0000, Linhaifeng wrote: > We should update rcu pointer first then use ovsrcu_postpone to free > otherwise maybe cause use-after-free. > e.g.,reader indicates momentary quiescent and access old pointer after > writer postpone free old pointer and before setting new pointer. > > Signed-off-by: Linhaifeng <haifeng.lin@huawei.com> I don't see how that's possible, since the writer hasn't quiesced. I think the logic is as follow, Could you help me find out where is incorrect? 1.1 -> 1.2 -> 3.1 -> 3.2 -> 2.1 -> 2.2 -> 2.3 -> 2.1 -> 1.3 -> 1.4 -> 3.3 -> 2.2(use after free) wirter: 1.1 use postone to free old pointer 1.2 flush cbsets to flushed_cbsets 1.3 update new pointer 1.4 quiesced Read: 2.1. read pointer 2.2. use pointer 2.3. quiesced Rcu: 3.1 pop flushed_cbsets 3.2 ovsrcu_synchronize 3.3 call all cb to free
-----Original Message----- From: Yanqin Wei [mailto:Yanqin.Wei@arm.com] Sent: Wednesday, June 3, 2020 7:23 AM To: Ben Pfaff <blp@ovn.org>; Linhaifeng <haifeng.lin@huawei.com> Cc: dev@openvswitch.org; nd <nd@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; chenchanghu <chenchanghu@huawei.com>; Lichunhe <lichunhe@huawei.com> Subject: RE: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first Hi Ben, If my understanding is correct, the writer could not be a rcu thread because it does not need report holding or not holding pointers. So old memory will be freed after all rcu thread report quiesce. The write also is rcu thread. If not first update pointer the reader can also get the old pointer after call ovsrcu_quiesced. Best Regards, Wei Yanqin > -----Original Message----- > From: Ben Pfaff <blp@ovn.org> > Sent: Wednesday, June 3, 2020 1:28 AM > To: Linhaifeng <haifeng.lin@huawei.com> > Cc: Yanqin Wei <Yanqin.Wei@arm.com>; dev@openvswitch.org; nd > <nd@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; chenchanghu > <chenchanghu@huawei.com>; Lichunhe <lichunhe@huawei.com> > Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first > > On Tue, Jun 02, 2020 at 07:27:59AM +0000, Linhaifeng wrote: > > We should update rcu pointer first then use ovsrcu_postpone to free > > otherwise maybe cause use-after-free. > > e.g.,reader indicates momentary quiescent and access old pointer > > after writer postpone free old pointer and before setting new pointer. > > > > Signed-off-by: Linhaifeng <haifeng.lin@huawei.com> > > I don't see how that's possible, since the writer hasn't quiesced.
> -----Original Message----- > From: Ben Pfaff [mailto:blp@ovn.org] > Sent: Wednesday, June 3, 2020 8:35 AM > To: Yanqin Wei <Yanqin.Wei@arm.com> > Cc: Linhaifeng <haifeng.lin@huawei.com>; dev@openvswitch.org; nd > <nd@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; chenchanghu > <chenchanghu@huawei.com>; Lichunhe <lichunhe@huawei.com> > Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first > > This is not how RCU works in OVS. Every thread is by default considered > active. They rarely quiesce except implicitly inside poll_block(). > Please read the large comment at the top of ovs-rcu.h. > > Is your patch based on actual bugs that you have found, or is it just some kind > of precaution? If it is the latter, then it is not needed. > Is an actual bug for old version bug it's also suitable for the other codes in ovs. Here is the debug info: linux-mNuKFc:/Images/linhf/830/Euler_compile_env # gdb -p `pidof ovs-vswitchd` GNU gdb (GDB) Red Hat Enterprise Linux 8.2-3.h2 Copyright (C) 2018 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "aarch64-Huawei-linux-gnu". Type "show configuration" for configuration details. For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>. Find the GDB manual and other documentation resources online at: <http://www.gnu.org/software/gdb/documentation/>. For help, type "help". Type "apropos word" to search for commands related to "word". Attaching to process 102706 [New LWP 109133] [New LWP 109134] [New LWP 109297] [New LWP 109298] [New LWP 109299] [New LWP 109300] [New LWP 109303] [New LWP 109304] [New LWP 109308] [New LWP 109309] [New LWP 109310] [New LWP 109311] [New LWP 109522] [New LWP 109523] [New LWP 109603] [New LWP 109615] [New LWP 109619] [New LWP 109655] [New LWP 109673] [New LWP 109794] [New LWP 109795] [New LWP 113953] [New LWP 114362] [New LWP 114364] [New LWP 114368] [New LWP 114370] [New LWP 114373] [New LWP 114377] [New LWP 115594] [New LWP 115595] [New LWP 115596] [New LWP 115597] [New LWP 115598] [New LWP 115600] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib64/libthread_db.so.1". 0x0000ffff879981ac in poll () from /lib64/libc.so.6 Missing separate debuginfos, use: debuginfo-install glib2-2.54.2-2.h1.aarch64 glibc-2.28-9.h17.aarch64 keyutils-libs-1.5.8-3.aarch64 krb5-libs-1.15.1-34.h2.aarch64 libcgroup-0.41-15.h3.aarch64 libcom_err-1.44.3-1.h4.aarch64 libgcc-7.3.0-20190804.h18.aarch64 libselinux-2.5-12.aarch64 numactl-libs-2.0.9-7.h1.aarch64 openssl-libs-1.0.2k-16.h6.aarch64 pcre-8.32-17.h9.aarch64 uvpkmc-1.0.1-807.aarch64 zlib-1.2.7-17.aarch64 (gdb) b dpcls_destroy_subtable Breakpoint 1 at 0x508bcc: file lib/dpif-netdev.c, line 6919. (gdb) b ovsrcu_call_postponed Breakpoint 2 at 0x5b7d34: file lib/ovs-rcu.c, line 336. (gdb) c Continuing. [Switching to Thread 0xffff83b97860 (LWP 109304)] Thread 9 "urcu2" hit Breakpoint 2, ovsrcu_call_postponed () at lib/ovs-rcu.c:336 warning: Source file is more recent than executable. 336 { (gdb) n 339 int wait_del = 0; (gdb) 340 while(wait_del); (gdb) set wait_del = 1 (gdb) c Continuing. [Switching to Thread 0xffff51748860 (LWP 115598)] Thread 34 "revalidator19" hit Breakpoint 1, dpcls_destroy_subtable (cls=0xffff1c00a420, subtable=0xffff3c009250) at lib/dpif-netdev.c:6919 6919 int wait_get = 0; (gdb) n 6920 VLOG_DBG("Destroying subtable %p for in_port %d", subtable, cls->in_port); (gdb) 6921 pvector_remove(&cls->subtables, subtable); (gdb) 6922 cmap_remove(&cls->subtables_map, &subtable->cmap_node, (gdb) set wait_get = 1 (gdb) n 6924 cmap_destroy(&subtable->rules); (gdb) p subtable->rules $1 = {impl = {p = 0xffff30008940}} (gdb) s cmap_destroy (cmap=0xffff3c009258) at lib/cmap.c:288 288 if (cmap) { (gdb) n 289 struct cmap_impl *impl = cmap_get_impl(cmap); (gdb) 290 if (impl != &empty_cmap) { (gdb) 291 ovsrcu_postpone(free_cacheline, impl); (gdb) s ovsrcu_postpone__ (function=0x6029b0 <free_cacheline>, aux=0xffff30008940) at lib/ovs-rcu.c:315 315 struct ovsrcu_perthread *perthread = ovsrcu_perthread_get(); (gdb) n 318 int size = ARRAY_SIZE(cbset->cbs); (gdb) 319 cbset = perthread->cbset; (gdb) 320 if (!cbset) { (gdb) 325 cb = &cbset->cbs[cbset->n_cbs++]; (gdb) 326 cb->function = function; (gdb) 327 cb->aux = aux; (gdb) 329 if (cbset->n_cbs >= size) { (gdb) set size = cbset->n_cbs (gdb) n 330 ovsrcu_flush_cbset(perthread); (gdb) s ovsrcu_flush_cbset (perthread=0xffff30001210) at lib/ovs-rcu.c:397 397 ovsrcu_flush_cbset__(perthread, false); (gdb) s ovsrcu_flush_cbset__ (perthread=0xffff30001210, protected=false) at lib/ovs-rcu.c:380 380 struct ovsrcu_cbset *cbset = perthread->cbset; (gdb) n 382 if (cbset) { (gdb) 383 guarded_list_push_back(&flushed_cbsets, &cbset->list_node, SIZE_MAX); (gdb) 384 perthread->cbset = NULL; (gdb) 386 if (protected) { (gdb) 389 seq_change(flushed_cbsets_seq); (gdb) 392 } (gdb) ovsrcu_flush_cbset (perthread=0xffff30001210) at lib/ovs-rcu.c:398 398 } (gdb) ovsrcu_postpone__ (function=0x6029b0 <free_cacheline>, aux=0xffff30008940) at lib/ovs-rcu.c:332 332 } (gdb) cmap_destroy (cmap=0xffff3c009258) at lib/cmap.c:294 294 } (gdb) dpcls_destroy_subtable (cls=0xffff1c00a420, subtable=0xffff3c009250) at lib/dpif-netdev.c:6925 6925 while(wait_get); (gdb) p wait_get $2 = 1 (gdb) thir 9 Undefined command: "thir". Try "help". (gdb) thr 9 [Switching to thread 9 (Thread 0xffff83b97860 (LWP 109304))] #0 ovsrcu_call_postponed () at lib/ovs-rcu.c:340 340 while(wait_del); (gdb) set wait_del = 0 (gdb) n 342 guarded_list_pop_all(&flushed_cbsets, &cbsets); (gdb) 343 if (ovs_list_is_empty(&cbsets)) { (gdb) pcbsets cbsets $173 = {list_node = {prev = 0xfffefc041660, next = 0xffff83b96f98}, cbs = {{function = 0x6029b0 <free_cacheline>, aux = 0xffff3c0094c0}, {function = 0x6029b0 <free_cacheline>, aux = 0xffff30008940}, {function = 0x508b9c <subtable_free>, aux = 0xffff3c009c10}, {function = 0xffff87947448 <free>, aux = 0xffff30001390}, {function = 0x4f9b74 <dp_netdev_flow_free>, aux = 0xffff34009b80}, {function = 0x0, aux = 0x0}, {function = 0x0, aux = 0x0}, {function = 0x90, aux = 0x74}, {function = 0x0, aux = 0xabc7c8 <ovsrcu_threads>}, {function = 0x0, aux = 0x0}, {function = 0xffffffff, aux = 0x0}, {function = 0x0, aux = 0x0}, {function = 0x0, aux = 0x2263e3}, {function = 0x0, aux = 0x3275637275}, {function = 0x0, aux = 0x21}, {function = 0xffff3000ea30, aux = 0xffff300013f0}}, n_cbs = 2} (gdb) prcus ovsrcu_threads $174 = {list_node = {prev = 0xabc7c8 <ovsrcu_threads>, next = 0xffff30001390}, mutex = {lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 2, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 16 times>, "\002", '\000' <repeats 30 times>, __align = 0}, where = 0x840bc0 "<unlocked>"}, seqno = 3301106, cbset = 0x0, name = "pmd14\000\000\000\000\000\000\000\000\000\000"} $175 = {list_node = {prev = 0xffff38003970, next = 0xffff30001210}, mutex = {lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 2, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 16 times>, "\002", '\000' <repeats 30 times>, __align = 0}, where = 0x840bc0 "<unlocked>"}, seqno = 3297445, cbset = 0x0, name = "urcu2\000\000\000\000\000\000\000\000\000\000"} $176 = {list_node = {prev = 0xffff30001390, next = 0xabc7c8 <ovsrcu_threads>}, mutex = {lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 2, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 16 times>, "\002", '\000' <repeats 30 times>, __align = 0}, where = 0x840bc0 "<unlocked>"}, seqno = 3300573, cbset = 0x0, name = "revalidator19\000\000"} (gdb) n 347 ovsrcu_synchronize(); (gdb) s ovsrcu_synchronize () at lib/ovs-rcu.c:225 225 { (gdb) n 226 unsigned int warning_threshold = INIT_WARNING_THRESHOLD_MS; (gdb) 227 unsigned int block_report_frequent = BLOCK_REPORT_FREQUENT_MS; (gdb) 230 int wait_round = 0; (gdb) 232 if (single_threaded()) { (gdb) 236 target_seqno = seq_read(global_seqno); (gdb) 237 ovsrcu_quiesce_start(); (gdb) p target_seqno $177 = 3301107 (gdb) thr 23 //切换到pmd一直next,直到执行ovsrcu_try_quiesce (gdb) 4975 if (!ovsrcu_try_quiesce()) { (gdb) 4976 emc_cache_slow_sweep(&pmd->flow_cache); (gdb) 4980 if (OVS_UNLIKELY(pmd->pmd_xinfo.flow_table_on_dp != flow_table_on_dp)) { (gdb) prcus ovsrcu_threads $194 = {list_node = {prev = 0xabc7c8 <ovsrcu_threads>, next = 0xffff30001210}, mutex = {lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 2, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 16 times>, "\002", '\000' <repeats 30 times>, __align = 0}, where = 0x840bc0 "<unlocked>"}, seqno = 3301109, cbset = 0x0, name = "pmd14\000\000\000\000\000\000\000\000\000\000"} $195 = {list_node = {prev = 0xffff38003970, next = 0xabc7c8 <ovsrcu_threads>}, mutex = {lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 2, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 16 times>, "\002", '\000' <repeats 30 times>, __align = 0}, where = 0x840bc0 "<unlocked>"}, seqno = 3300573, cbset = 0x0, name = "revalidator19\000\000"} (gdb) b util.c:261 if (p == 0xffff30008940) Breakpoint 3 at 0x6029bc: file lib/util.c, line 261. (gdb) b dpif-netdev.c:7129 if (subtable->rules->impl->p == 0xffff30008940) Breakpoint 4 at 0x509444: file lib/dpif-netdev.c, line 7129. (gdb) thr 34 [Switching to thread 34 (Thread 0xffff51748860 (LWP 115598))] (gdb) set wait_get = 0 (gdb) c Continuing. [Switching to Thread 0xffff43ffe860 (LWP 113953)] Thread 23 "pmd14" hit Breakpoint 4, dpcls_lookup (cls=0xffff1c00a420, keys=0xffff43ffd3c0, rules=0xffff43ffcce0, cnt=1, num_lookups_p=0xffff43ffcd84) at lib/dpif-netdev.c:7129 7129 while(wait_free); (gdb) p wait_free $200 = 0 (gdb) set wait_free = 1 (gdb) c Continuing. [Switching to Thread 0xffff43ffe860 (LWP 113953)] Thread 23 "pmd14" hit Breakpoint 4, dpcls_lookup (cls=0xffff1c00a420, keys=0xffff43ffd3c0, rules=0xffff43ffcce0, cnt=1, num_lookups_p=0xffff43ffcd84) at lib/dpif-netdev.c:7129 7129 while(wait_free); (gdb) c Continuing. Thread 23 "pmd14" hit Breakpoint 4, dpcls_lookup (cls=0xffff1c00a420, keys=0xffff43ffd3c0, rules=0xffff43ffcce0, cnt=1, num_lookups_p=0xffff43ffcd84) at lib/dpif-netdev.c:7129 7129 while(wait_free); (gdb) c Continuing. Thread 23 "pmd14" hit Breakpoint 4, dpcls_lookup (cls=0xffff1c00a420, keys=0xffff43ffd3c0, rules=0xffff43ffcce0, cnt=1, num_lookups_p=0xffff43ffcd84) at lib/dpif-netdev.c:7129 7129 while(wait_free); (gdb) c Continuing. [Switching to Thread 0xffff83b97860 (LWP 109304)] Thread 9 "urcu2" hit Breakpoint 3, free_cacheline (p=0xffff30008940) at lib/util.c:261 261 free(p); (gdb) c (gdb) c Continuing. [Switching to Thread 0xffff43ffe860 (LWP 113953)] Thread 23 "pmd14" hit Breakpoint 4, dpcls_lookup (cls=0xffff1c00a420, keys=0xffff43ffd3c0, rules=0xffff43ffcce0, cnt=1, num_lookups_p=0xffff43ffcd84) at lib/dpif-netdev.c:7129 7129 while(wait_free); (gdb) set wait_free = 0 (gdb) c Continuing. Thread 23 "pmd14" received signal SIGSEGV, Segmentation fault. 0x00000000004e1e50 in read_counter (bucket_=0xffff3003bfc0) at lib/cmap.c:333 333 atomic_read_explicit(&bucket->counter, &counter, memory_order_acquire); (gdb) bt #0 0x00000000004e1e50 in read_counter (bucket_=0xffff3003bfc0) at lib/cmap.c:333 #1 0x00000000004e1e9c in read_even_counter (bucket=0xffff3003bfc0) at lib/cmap.c:344 #2 0x00000000004e2270 in cmap_find_batch (cmap=0xffff3c009258, map=1, hashes=0xffff43ffcb18, nodes=0xffff43ffcb98) at lib/cmap.c:459 #3 0x0000000000509468 in dpcls_lookup (cls=0xffff1c00a420, keys=0xffff43ffd3c0, rules=0xffff43ffcce0, cnt=1, num_lookups_p=0xffff43ffcd84) at lib/dpif-netdev.c:7131 #4 0x00000000004fabdc in dp_netdev_lookup_flow (classifiers=0xffff5237b0c8, key=0xffff43ffd3c0, lookup_num_p=0xffff43ffcd84) at lib/dpif-netdev.c:2111 #5 0x000000000050748c in fast_path_processing (pmd=0xffff51f4b000, packets_=0xffff43ffddb8, keys=0xffff43ffd3c0, batches=0xffff43ffd270, n_batches=0xffff43ffd6c0, in_port=11, now=862863019) at lib/dpif-netdev.c:6027 #6 0x0000000000507bb0 in dp_netdev_input__ (pmd=0xffff51f4b000, packets=0xffff43ffddb8, md_is_valid=false, port_no=11) at lib/dpif-netdev.c:6138 #7 0x0000000000507fa0 in dp_netdev_input (pmd=0xffff51f4b000, packets=0xffff43ffddb8, port_no=11) at lib/dpif-netdev.c:6222 #8 0x000000000067d130 in dp_uevs_forward (evs_ctx=0xffff43ffdd60, port_no=11, md_is_valid=false) at lib/evs/evs-dpdk.c:2717 #9 0x00000000005035e4 in evs_netdev_process_rxq_port (enable_blc=0, port_args=0xffff43ffdd40, pmd_args=0xffff43ffdd88) at lib/evs/evs-dpdk-inline.h:162 #10 evs_forward_loop (lacp_only_mode=false, ts=0xffff43ffdd30, port_args=0xffff43ffdd40, pmd_args=0xffff43ffdd88, poll_list=0xffff38009030, poll_cnt=24) at lib/dpif-netdev.c:4788 #11 pmd_thread_main (f_=0xffff51f4b000) at lib/dpif-netdev.c:4905 #12 0x00000000005bb0fc in ovsthread_wrapper (aux_=0x282d2280) at lib/ovs-thread.c:715 #13 0x0000ffff87f5c8bc in start_thread () from /lib64/libpthread.so.0 #14 0x0000ffff879a1e7c in thread_start () from /lib64/libc.so.6 > On Tue, Jun 02, 2020 at 11:22:57PM +0000, Yanqin Wei wrote: > > Hi Ben, > > > > If my understanding is correct, the writer could not be a rcu thread because it > does not need report holding or not holding pointers. > > So old memory will be freed after all rcu thread report quiesce. > > > > Best Regards, > > Wei Yanqin > > > > > -----Original Message----- > > > From: Ben Pfaff <blp@ovn.org> > > > Sent: Wednesday, June 3, 2020 1:28 AM > > > To: Linhaifeng <haifeng.lin@huawei.com> > > > Cc: Yanqin Wei <Yanqin.Wei@arm.com>; dev@openvswitch.org; nd > > > <nd@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; > > > chenchanghu <chenchanghu@huawei.com>; Lichunhe > <lichunhe@huawei.com> > > > Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first > > > > > > On Tue, Jun 02, 2020 at 07:27:59AM +0000, Linhaifeng wrote: > > > > We should update rcu pointer first then use ovsrcu_postpone to > > > > free otherwise maybe cause use-after-free. > > > > e.g.,reader indicates momentary quiescent and access old pointer > > > > after writer postpone free old pointer and before setting new pointer. > > > > > > > > Signed-off-by: Linhaifeng <haifeng.lin@huawei.com> > > > > > > I don't see how that's possible, since the writer hasn't quiesced.
On Wed, Jun 03, 2020 at 01:22:52AM +0000, Linhaifeng wrote: > > > -----Original Message----- > From: Ben Pfaff [mailto:blp@ovn.org] > Sent: Wednesday, June 3, 2020 1:28 AM > To: Linhaifeng <haifeng.lin@huawei.com> > Cc: Yanqin Wei <Yanqin.Wei@arm.com>; dev@openvswitch.org; nd <nd@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; chenchanghu <chenchanghu@huawei.com>; Lichunhe <lichunhe@huawei.com> > Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first > > On Tue, Jun 02, 2020 at 07:27:59AM +0000, Linhaifeng wrote: > > We should update rcu pointer first then use ovsrcu_postpone to free > > otherwise maybe cause use-after-free. > > e.g.,reader indicates momentary quiescent and access old pointer after > > writer postpone free old pointer and before setting new pointer. > > > > Signed-off-by: Linhaifeng <haifeng.lin@huawei.com> > > I don't see how that's possible, since the writer hasn't quiesced. > > I think the logic is as follow, Could you help me find out where is incorrect? > > 1.1 -> 1.2 -> 3.1 -> 3.2 -> 2.1 -> 2.2 -> 2.3 -> 2.1 -> 1.3 -> 1.4 -> 3.3 -> 2.2(use after free) > > wirter: > 1.1 use postone to free old pointer > 1.2 flush cbsets to flushed_cbsets > 1.3 update new pointer > 1.4 quiesced > > Read: > 2.1. read pointer > 2.2. use pointer > 2.3. quiesced > > Rcu: > 3.1 pop flushed_cbsets > 3.2 ovsrcu_synchronize > 3.3 call all cb to free So you're saying this: 1.1 use postone to free old pointer (A) 1.2 flush cbsets to flushed_cbsets 3.1 pop flushed_cbsets 3.2 ovsrcu_synchronize 2.1. read pointer (A) 2.2. use pointer (A) 2.3. quiesced 2.1. read pointer (A) 1.3 update new pointer (B) 1.4 quiesced 3.3 call all cb to free (A) 2.2. use pointer (A) Wow, you are absolutely right. This had never occurred to me. Thank you! I'll review your patch.
> -----Original Message----- > From: Ben Pfaff [mailto:blp@ovn.org] > Sent: Wednesday, June 3, 2020 1:26 PM > To: Linhaifeng <haifeng.lin@huawei.com> > Cc: Yanqin Wei <Yanqin.Wei@arm.com>; dev@openvswitch.org; nd > <nd@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; chenchanghu > <chenchanghu@huawei.com>; Lichunhe <lichunhe@huawei.com> > Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first > > On Wed, Jun 03, 2020 at 01:22:52AM +0000, Linhaifeng wrote: > > > > > > -----Original Message----- > > From: Ben Pfaff [mailto:blp@ovn.org] > > Sent: Wednesday, June 3, 2020 1:28 AM > > To: Linhaifeng <haifeng.lin@huawei.com> > > Cc: Yanqin Wei <Yanqin.Wei@arm.com>; dev@openvswitch.org; nd > > <nd@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; chenchanghu > > <chenchanghu@huawei.com>; Lichunhe <lichunhe@huawei.com> > > Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first > > > > On Tue, Jun 02, 2020 at 07:27:59AM +0000, Linhaifeng wrote: > > > We should update rcu pointer first then use ovsrcu_postpone to free > > > otherwise maybe cause use-after-free. > > > e.g.,reader indicates momentary quiescent and access old pointer > > > after writer postpone free old pointer and before setting new pointer. > > > > > > Signed-off-by: Linhaifeng <haifeng.lin@huawei.com> > > > > I don't see how that's possible, since the writer hasn't quiesced. > > > > I think the logic is as follow, Could you help me find out where is incorrect? > > > > 1.1 -> 1.2 -> 3.1 -> 3.2 -> 2.1 -> 2.2 -> 2.3 -> 2.1 -> 1.3 -> 1.4 -> > > 3.3 -> 2.2(use after free) > > > > wirter: > > 1.1 use postone to free old pointer > > 1.2 flush cbsets to flushed_cbsets > > 1.3 update new pointer > > 1.4 quiesced > > > > Read: > > 2.1. read pointer > > 2.2. use pointer > > 2.3. quiesced > > > > Rcu: > > 3.1 pop flushed_cbsets > > 3.2 ovsrcu_synchronize > > 3.3 call all cb to free > > So you're saying this: > > 1.1 use postone to free old pointer (A) > 1.2 flush cbsets to flushed_cbsets > > 3.1 pop flushed_cbsets > 3.2 ovsrcu_synchronize > > 2.1. read pointer (A) > 2.2. use pointer (A) > 2.3. quiesced > > 2.1. read pointer (A) > > 1.3 update new pointer (B) > 1.4 quiesced > > 3.3 call all cb to free (A) > > 2.2. use pointer (A) > > Wow, you are absolutely right. This had never occurred to me. Thank you! > I'll review your patch. Yes, it's really hard to happen. If it happened it's also hard to find the reason so I suggest it can be a rule for using rcu.
Hi Ben, I read the comments and codes about ovs-rcu. Each thread is added into ovsrcu_threads in the thread start routine. And most of threads quiesce in the poll_block/ xsleep/ ovsrcu_quiesce/ ovsrcu_quiesce_start+ ovsrcu_quiesce_end. Only one thread "afsync_thread" may not be in the ovsrcu_threads list, because it invokes ovsrcu_quiesce_start but does not call ovsrcu_quiesce_end. But this thread does not update/free rcu memory, so no writer is in the quiescent state. I am sorry to make an incorrect comment for Haifeng's patch. He has a correct description in original patch. Please help to correct. https://patchwork.ozlabs.org/project/openvswitch/patch/4099DE2E54AFAD489356C6C9161D53339735AB4A@dggeml522-mbs.china.huawei.com/ Best Regards, Wei Yanqin > -----Original Message----- > From: Ben Pfaff <blp@ovn.org> > Sent: Wednesday, June 3, 2020 8:48 AM > To: Yanqin Wei <Yanqin.Wei@arm.com> > Cc: Linhaifeng <haifeng.lin@huawei.com>; dev@openvswitch.org; nd > <nd@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; chenchanghu > <chenchanghu@huawei.com>; Lichunhe <lichunhe@huawei.com> > Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first > > Oh, I apologize that I made a mistake about the author. > > I appreciate feedback from anyone. > > On Wed, Jun 03, 2020 at 12:37:27AM +0000, Yanqin Wei wrote: > > Hi Ben, > > > > This patch is from Linhai, but I have the same concern about this. I will read > ovs-rcu comments and feedback. > > Thanks for your time. > > > > Best Regards, > > Wei Yanqin > > > > > -----Original Message----- > > > From: Ben Pfaff <blp@ovn.org> > > > Sent: Wednesday, June 3, 2020 8:35 AM > > > To: Yanqin Wei <Yanqin.Wei@arm.com> > > > Cc: Linhaifeng <haifeng.lin@huawei.com>; dev@openvswitch.org; nd > > > <nd@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; > > > chenchanghu <chenchanghu@huawei.com>; Lichunhe > <lichunhe@huawei.com> > > > Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first > > > > > > This is not how RCU works in OVS. Every thread is by default considered > active. > > > They rarely quiesce except implicitly inside poll_block(). > > > Please read the large comment at the top of ovs-rcu.h. > > > > > > Is your patch based on actual bugs that you have found, or is it > > > just some kind of precaution? If it is the latter, then it is not needed. > > > > > > On Tue, Jun 02, 2020 at 11:22:57PM +0000, Yanqin Wei wrote: > > > > Hi Ben, > > > > > > > > If my understanding is correct, the writer could not be a rcu > > > > thread because it > > > does not need report holding or not holding pointers. > > > > So old memory will be freed after all rcu thread report quiesce. > > > > > > > > Best Regards, > > > > Wei Yanqin > > > > > > > > > -----Original Message----- > > > > > From: Ben Pfaff <blp@ovn.org> > > > > > Sent: Wednesday, June 3, 2020 1:28 AM > > > > > To: Linhaifeng <haifeng.lin@huawei.com> > > > > > Cc: Yanqin Wei <Yanqin.Wei@arm.com>; dev@openvswitch.org; nd > > > > > <nd@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; > > > > > chenchanghu <chenchanghu@huawei.com>; Lichunhe > > > <lichunhe@huawei.com> > > > > > Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer > > > > > first > > > > > > > > > > On Tue, Jun 02, 2020 at 07:27:59AM +0000, Linhaifeng wrote: > > > > > > We should update rcu pointer first then use ovsrcu_postpone to > > > > > > free otherwise maybe cause use-after-free. > > > > > > e.g.,reader indicates momentary quiescent and access old > > > > > > pointer after writer postpone free old pointer and before setting new > pointer. > > > > > > > > > > > > Signed-off-by: Linhaifeng <haifeng.lin@huawei.com> > > > > > > > > > > I don't see how that's possible, since the writer hasn't quiesced.
On 6/3/20 9:04 AM, Linhaifeng wrote: > > >> -----Original Message----- >> From: Ben Pfaff [mailto:blp@ovn.org] >> Sent: Wednesday, June 3, 2020 1:26 PM >> To: Linhaifeng <haifeng.lin@huawei.com> >> Cc: Yanqin Wei <Yanqin.Wei@arm.com>; dev@openvswitch.org; nd >> <nd@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; chenchanghu >> <chenchanghu@huawei.com>; Lichunhe <lichunhe@huawei.com> >> Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first >> >> On Wed, Jun 03, 2020 at 01:22:52AM +0000, Linhaifeng wrote: >>> >>> >>> -----Original Message----- >>> From: Ben Pfaff [mailto:blp@ovn.org] >>> Sent: Wednesday, June 3, 2020 1:28 AM >>> To: Linhaifeng <haifeng.lin@huawei.com> >>> Cc: Yanqin Wei <Yanqin.Wei@arm.com>; dev@openvswitch.org; nd >>> <nd@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; chenchanghu >>> <chenchanghu@huawei.com>; Lichunhe <lichunhe@huawei.com> >>> Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first >>> >>> On Tue, Jun 02, 2020 at 07:27:59AM +0000, Linhaifeng wrote: >>>> We should update rcu pointer first then use ovsrcu_postpone to free >>>> otherwise maybe cause use-after-free. >>>> e.g.,reader indicates momentary quiescent and access old pointer >>>> after writer postpone free old pointer and before setting new pointer. >>>> >>>> Signed-off-by: Linhaifeng <haifeng.lin@huawei.com> >>> >>> I don't see how that's possible, since the writer hasn't quiesced. >>> >>> I think the logic is as follow, Could you help me find out where is incorrect? >>> >>> 1.1 -> 1.2 -> 3.1 -> 3.2 -> 2.1 -> 2.2 -> 2.3 -> 2.1 -> 1.3 -> 1.4 -> >>> 3.3 -> 2.2(use after free) >>> >>> wirter: >>> 1.1 use postone to free old pointer >>> 1.2 flush cbsets to flushed_cbsets >>> 1.3 update new pointer >>> 1.4 quiesced >>> >>> Read: >>> 2.1. read pointer >>> 2.2. use pointer >>> 2.3. quiesced >>> >>> Rcu: >>> 3.1 pop flushed_cbsets >>> 3.2 ovsrcu_synchronize >>> 3.3 call all cb to free >> >> So you're saying this: >> >> 1.1 use postone to free old pointer (A) >> 1.2 flush cbsets to flushed_cbsets >> >> 3.1 pop flushed_cbsets >> 3.2 ovsrcu_synchronize >> >> 2.1. read pointer (A) >> 2.2. use pointer (A) >> 2.3. quiesced >> >> 2.1. read pointer (A) >> >> 1.3 update new pointer (B) >> 1.4 quiesced >> >> 3.3 call all cb to free (A) >> >> 2.2. use pointer (A) >> >> Wow, you are absolutely right. This had never occurred to me. Thank you! >> I'll review your patch. > > Yes, it's really hard to happen. If it happened it's also hard to find the reason so I suggest it can be a rule for using rcu. I agree that there is an issue here, but I think that we should not force users to call ovsrcu_set() before ovsrcu_postpone(). Current users doesn't do anything illegal since pointer must not be freed before the next grace period from their point of view. For me it looks like the main issue is existence of point 1.2, i.e. flushing cbsets while writer is not quiesced yet. And we need to fix this inside rcu library itself. For example, we could avoid flushing inside ovsrcu_postpone() by making cbs[16] a dynamically allocated array and using x2nrealloc instead of flushing. Thoughts? Regarding the patch itself: > diff --git a/lib/ovs-rcu.h b/lib/ovs-rcu.h > index ecc4c9201..98c238aea 100644 > --- a/lib/ovs-rcu.h > +++ b/lib/ovs-rcu.h > @@ -118,10 +118,10 @@ > * void > * change_flow(struct flow *new_flow) > * { > + * struct flow *old_flow = ovsrcu_get_protected(struct flow *, &flowp) ovsrcu_get_protected() can not be used outside of the critical section. > * ovs_mutex_lock(&mutex); > - * ovsrcu_postpone(free, > - * ovsrcu_get_protected(struct flow *, &flowp)); > * ovsrcu_set(&flowp, new_flow); > + * ovsrcu_postpone(free, old_flow); > * ovs_mutex_unlock(&mutex); > * } > Best regards, Ilya Maximets.
> -----Original Message----- > From: Ilya Maximets [mailto:i.maximets@ovn.org] > Sent: Wednesday, June 3, 2020 6:50 PM > To: Linhaifeng <haifeng.lin@huawei.com>; Ben Pfaff <blp@ovn.org> > Cc: i.maximets@ovn.org; dev@openvswitch.org; Lilijun (Jerry) > <jerry.lilijun@huawei.com>; Lichunhe <lichunhe@huawei.com>; nd > <nd@arm.com>; chenchanghu <chenchanghu@huawei.com> > Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first > > On 6/3/20 9:04 AM, Linhaifeng wrote: > > > > > >> -----Original Message----- > >> From: Ben Pfaff [mailto:blp@ovn.org] > >> Sent: Wednesday, June 3, 2020 1:26 PM > >> To: Linhaifeng <haifeng.lin@huawei.com> > >> Cc: Yanqin Wei <Yanqin.Wei@arm.com>; dev@openvswitch.org; nd > >> <nd@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; chenchanghu > >> <chenchanghu@huawei.com>; Lichunhe <lichunhe@huawei.com> > >> Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first > >> > >> On Wed, Jun 03, 2020 at 01:22:52AM +0000, Linhaifeng wrote: > >>> > >>> > >>> -----Original Message----- > >>> From: Ben Pfaff [mailto:blp@ovn.org] > >>> Sent: Wednesday, June 3, 2020 1:28 AM > >>> To: Linhaifeng <haifeng.lin@huawei.com> > >>> Cc: Yanqin Wei <Yanqin.Wei@arm.com>; dev@openvswitch.org; nd > >>> <nd@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; > >>> chenchanghu <chenchanghu@huawei.com>; Lichunhe > <lichunhe@huawei.com> > >>> Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first > >>> > >>> On Tue, Jun 02, 2020 at 07:27:59AM +0000, Linhaifeng wrote: > >>>> We should update rcu pointer first then use ovsrcu_postpone to free > >>>> otherwise maybe cause use-after-free. > >>>> e.g.,reader indicates momentary quiescent and access old pointer > >>>> after writer postpone free old pointer and before setting new pointer. > >>>> > >>>> Signed-off-by: Linhaifeng <haifeng.lin@huawei.com> > >>> > >>> I don't see how that's possible, since the writer hasn't quiesced. > >>> > >>> I think the logic is as follow, Could you help me find out where is incorrect? > >>> > >>> 1.1 -> 1.2 -> 3.1 -> 3.2 -> 2.1 -> 2.2 -> 2.3 -> 2.1 -> 1.3 -> 1.4 > >>> -> > >>> 3.3 -> 2.2(use after free) > >>> > >>> wirter: > >>> 1.1 use postone to free old pointer > >>> 1.2 flush cbsets to flushed_cbsets > >>> 1.3 update new pointer > >>> 1.4 quiesced > >>> > >>> Read: > >>> 2.1. read pointer > >>> 2.2. use pointer > >>> 2.3. quiesced > >>> > >>> Rcu: > >>> 3.1 pop flushed_cbsets > >>> 3.2 ovsrcu_synchronize > >>> 3.3 call all cb to free > >> > >> So you're saying this: > >> > >> 1.1 use postone to free old pointer (A) > >> 1.2 flush cbsets to flushed_cbsets > >> > >> 3.1 pop flushed_cbsets > >> 3.2 ovsrcu_synchronize > >> > >> 2.1. read pointer (A) > >> 2.2. use pointer (A) > >> 2.3. quiesced > >> > >> 2.1. read pointer (A) > >> > >> 1.3 update new pointer (B) > >> 1.4 quiesced > >> > >> 3.3 call all cb to free (A) > >> > >> 2.2. use pointer (A) > >> > >> Wow, you are absolutely right. This had never occurred to me. Thank > you! > >> I'll review your patch. > > > > Yes, it's really hard to happen. If it happened it's also hard to find the reason > so I suggest it can be a rule for using rcu. > > I agree that there is an issue here, but I think that we should not force users to > call ovsrcu_set() before ovsrcu_postpone(). Current users doesn't do > anything illegal since pointer must not be freed before the next grace period > from their point of view. > > For me it looks like the main issue is existence of point 1.2, i.e. flushing cbsets > while writer is not quiesced yet. And we need to fix this inside rcu library itself. > For example, we could avoid flushing inside > ovsrcu_postpone() by making cbs[16] a dynamically allocated array and using > x2nrealloc instead of flushing. > > Thoughts? > Hi, Ilya Maximets May be this is a good idea therefor the users not need to think about call ovsrcu_set() first or ovsrcu_postpone(). How about you think, ben? May be you can send a patch to modify the ovsrcu_postpone() not to flush cbsets to replace of my patches. . > > Regarding the patch itself: > > > diff --git a/lib/ovs-rcu.h b/lib/ovs-rcu.h index ecc4c9201..98c238aea > > 100644 > > --- a/lib/ovs-rcu.h > > +++ b/lib/ovs-rcu.h > > @@ -118,10 +118,10 @@ > > * void > > * change_flow(struct flow *new_flow) > > * { > > + * struct flow *old_flow = ovsrcu_get_protected(struct flow *, > &flowp) > > ovsrcu_get_protected() can not be used outside of the critical section. > > > * ovs_mutex_lock(&mutex); > > - * ovsrcu_postpone(free, > > - * ovsrcu_get_protected(struct flow *, > &flowp)); > > * ovsrcu_set(&flowp, new_flow); > > + * ovsrcu_postpone(free, old_flow); > > * ovs_mutex_unlock(&mutex); > > * } > > > Best regards, Ilya Maximets.
On 6/3/20 1:08 PM, Linhaifeng wrote: > > >> -----Original Message----- >> From: Ilya Maximets [mailto:i.maximets@ovn.org] >> Sent: Wednesday, June 3, 2020 6:50 PM >> To: Linhaifeng <haifeng.lin@huawei.com>; Ben Pfaff <blp@ovn.org> >> Cc: i.maximets@ovn.org; dev@openvswitch.org; Lilijun (Jerry) >> <jerry.lilijun@huawei.com>; Lichunhe <lichunhe@huawei.com>; nd >> <nd@arm.com>; chenchanghu <chenchanghu@huawei.com> >> Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first >> >> On 6/3/20 9:04 AM, Linhaifeng wrote: >>> >>> >>>> -----Original Message----- >>>> From: Ben Pfaff [mailto:blp@ovn.org] >>>> Sent: Wednesday, June 3, 2020 1:26 PM >>>> To: Linhaifeng <haifeng.lin@huawei.com> >>>> Cc: Yanqin Wei <Yanqin.Wei@arm.com>; dev@openvswitch.org; nd >>>> <nd@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; chenchanghu >>>> <chenchanghu@huawei.com>; Lichunhe <lichunhe@huawei.com> >>>> Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first >>>> >>>> On Wed, Jun 03, 2020 at 01:22:52AM +0000, Linhaifeng wrote: >>>>> >>>>> >>>>> -----Original Message----- >>>>> From: Ben Pfaff [mailto:blp@ovn.org] >>>>> Sent: Wednesday, June 3, 2020 1:28 AM >>>>> To: Linhaifeng <haifeng.lin@huawei.com> >>>>> Cc: Yanqin Wei <Yanqin.Wei@arm.com>; dev@openvswitch.org; nd >>>>> <nd@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; >>>>> chenchanghu <chenchanghu@huawei.com>; Lichunhe >> <lichunhe@huawei.com> >>>>> Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first >>>>> >>>>> On Tue, Jun 02, 2020 at 07:27:59AM +0000, Linhaifeng wrote: >>>>>> We should update rcu pointer first then use ovsrcu_postpone to free >>>>>> otherwise maybe cause use-after-free. >>>>>> e.g.,reader indicates momentary quiescent and access old pointer >>>>>> after writer postpone free old pointer and before setting new pointer. >>>>>> >>>>>> Signed-off-by: Linhaifeng <haifeng.lin@huawei.com> >>>>> >>>>> I don't see how that's possible, since the writer hasn't quiesced. >>>>> >>>>> I think the logic is as follow, Could you help me find out where is incorrect? >>>>> >>>>> 1.1 -> 1.2 -> 3.1 -> 3.2 -> 2.1 -> 2.2 -> 2.3 -> 2.1 -> 1.3 -> 1.4 >>>>> -> >>>>> 3.3 -> 2.2(use after free) >>>>> >>>>> wirter: >>>>> 1.1 use postone to free old pointer >>>>> 1.2 flush cbsets to flushed_cbsets >>>>> 1.3 update new pointer >>>>> 1.4 quiesced >>>>> >>>>> Read: >>>>> 2.1. read pointer >>>>> 2.2. use pointer >>>>> 2.3. quiesced >>>>> >>>>> Rcu: >>>>> 3.1 pop flushed_cbsets >>>>> 3.2 ovsrcu_synchronize >>>>> 3.3 call all cb to free >>>> >>>> So you're saying this: >>>> >>>> 1.1 use postone to free old pointer (A) >>>> 1.2 flush cbsets to flushed_cbsets >>>> >>>> 3.1 pop flushed_cbsets >>>> 3.2 ovsrcu_synchronize >>>> >>>> 2.1. read pointer (A) >>>> 2.2. use pointer (A) >>>> 2.3. quiesced >>>> >>>> 2.1. read pointer (A) >>>> >>>> 1.3 update new pointer (B) >>>> 1.4 quiesced >>>> >>>> 3.3 call all cb to free (A) >>>> >>>> 2.2. use pointer (A) >>>> >>>> Wow, you are absolutely right. This had never occurred to me. Thank >> you! >>>> I'll review your patch. >>> >>> Yes, it's really hard to happen. If it happened it's also hard to find the reason >> so I suggest it can be a rule for using rcu. >> >> I agree that there is an issue here, but I think that we should not force users to >> call ovsrcu_set() before ovsrcu_postpone(). Current users doesn't do >> anything illegal since pointer must not be freed before the next grace period >> from their point of view. >> >> For me it looks like the main issue is existence of point 1.2, i.e. flushing cbsets >> while writer is not quiesced yet. And we need to fix this inside rcu library itself. >> For example, we could avoid flushing inside >> ovsrcu_postpone() by making cbs[16] a dynamically allocated array and using >> x2nrealloc instead of flushing. >> >> Thoughts? >> > Hi, Ilya Maximets > > May be this is a good idea therefor the users not need to think about call ovsrcu_set() first or ovsrcu_postpone(). > How about you think, ben? May be you can send a patch to modify the ovsrcu_postpone() not to flush cbsets to > replace of my patches. The change could look like this: diff --git a/lib/ovs-rcu.c b/lib/ovs-rcu.c index ebc8120f0..cde1e925b 100644 --- a/lib/ovs-rcu.c +++ b/lib/ovs-rcu.c @@ -30,6 +30,8 @@ VLOG_DEFINE_THIS_MODULE(ovs_rcu); +#define MIN_CBS 16 + struct ovsrcu_cb { void (*function)(void *aux); void *aux; @@ -37,7 +39,8 @@ struct ovsrcu_cb { struct ovsrcu_cbset { struct ovs_list list_node; - struct ovsrcu_cb cbs[16]; + struct ovsrcu_cb *cbs; + size_t n_allocated; int n_cbs; }; @@ -310,16 +313,19 @@ ovsrcu_postpone__(void (*function)(void *aux), void *aux) cbset = perthread->cbset; if (!cbset) { cbset = perthread->cbset = xmalloc(sizeof *perthread->cbset); + cbset->cbs = xmalloc(MIN_CBS * sizeof *cbset->cbs); + cbset->n_allocated = MIN_CBS; cbset->n_cbs = 0; } + if (cbset->n_cbs == cbset->n_allocated) { + cbset->cbs = x2nrealloc(cbset->cbs, &cbset->n_allocated, + sizeof *cbset->cbs); + } + cb = &cbset->cbs[cbset->n_cbs++]; cb->function = function; cb->aux = aux; - - if (cbset->n_cbs >= ARRAY_SIZE(cbset->cbs)) { - ovsrcu_flush_cbset(perthread); - } } static bool @@ -341,6 +347,7 @@ ovsrcu_call_postponed(void) for (cb = cbset->cbs; cb < &cbset->cbs[cbset->n_cbs]; cb++) { cb->function(cb->aux); } + free(cbset->cbs); free(cbset); } --- I could submit a formal patch, if agreed. Ben, what do you think?
On 6/3/20 7:26 AM, Ben Pfaff wrote: > On Wed, Jun 03, 2020 at 01:22:52AM +0000, Linhaifeng wrote: >> >> >> -----Original Message----- >> From: Ben Pfaff [mailto:blp@ovn.org] >> Sent: Wednesday, June 3, 2020 1:28 AM >> To: Linhaifeng <haifeng.lin@huawei.com> >> Cc: Yanqin Wei <Yanqin.Wei@arm.com>; dev@openvswitch.org; nd <nd@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; chenchanghu <chenchanghu@huawei.com>; Lichunhe <lichunhe@huawei.com> >> Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first >> >> On Tue, Jun 02, 2020 at 07:27:59AM +0000, Linhaifeng wrote: >>> We should update rcu pointer first then use ovsrcu_postpone to free >>> otherwise maybe cause use-after-free. >>> e.g.,reader indicates momentary quiescent and access old pointer after >>> writer postpone free old pointer and before setting new pointer. >>> >>> Signed-off-by: Linhaifeng <haifeng.lin@huawei.com> >> >> I don't see how that's possible, since the writer hasn't quiesced. >> >> I think the logic is as follow, Could you help me find out where is incorrect? >> >> 1.1 -> 1.2 -> 3.1 -> 3.2 -> 2.1 -> 2.2 -> 2.3 -> 2.1 -> 1.3 -> 1.4 -> 3.3 -> 2.2(use after free) >> >> wirter: >> 1.1 use postone to free old pointer >> 1.2 flush cbsets to flushed_cbsets >> 1.3 update new pointer >> 1.4 quiesced >> >> Read: >> 2.1. read pointer >> 2.2. use pointer >> 2.3. quiesced >> >> Rcu: >> 3.1 pop flushed_cbsets >> 3.2 ovsrcu_synchronize >> 3.3 call all cb to free > > So you're saying this: > > 1.1 use postone to free old pointer (A) > 1.2 flush cbsets to flushed_cbsets > > 3.1 pop flushed_cbsets > 3.2 ovsrcu_synchronize > > 2.1. read pointer (A) > 2.2. use pointer (A) > 2.3. quiesced > > 2.1. read pointer (A) IIUC, 'Read' thread should enter quiescent state here at least one more time in order to trigger the issue (or, maybe, some other thread could advance global_seqno before step 2.3 and after reading target_seqno by RCU thread on step 3.2). Otherwise, target_seqno will be equal to thread->seqno of the 'Read' thread and RCU thread will wait for the next quiescent state before calling callbacks. > > 1.3 update new pointer (B) > 1.4 quiesced > > 3.3 call all cb to free (A) > > 2.2. use pointer (A) > > Wow, you are absolutely right. This had never occurred to me. Thank > you! I'll review your patch. Sharing my more detailed diagram of the execution flow that should trigger the issue here: ------------------ ------------------ ------------------- Thread 1 Thread 2 RCU Thread ------------------ ------------------ ------------------- pointer = A ovsrcu_quiesce(): thread->seqno = 30 global_seqno = 31 quiesced read pointer A postpone(free(A)): flush cbset pop flushed_cbsets ovsrcu_synchronize: target_seqno = 31 ovsrcu_quiesce(): thread->seqno = 31 global_seqno = 32 quiesced read pointer A use pointer A ovsrcu_quiesce(): thread->seqno = 32 global_seqno = 33 quiesced read pointer A pointer = B ovsrcu_quiesce(): thread->seqno = 33 global_seqno = 34 quiesced target_seqno exceeded by all threads call cbs to free A use pointer A (use after free) ----------------------------------------------------------- Best regards, Ilya Maximets.
> -----Original Message----- > From: Ilya Maximets [mailto:i.maximets@ovn.org] > Sent: Thursday, June 4, 2020 12:01 AM > To: Ben Pfaff <blp@ovn.org>; Linhaifeng <haifeng.lin@huawei.com> > Cc: dev@openvswitch.org; Lilijun (Jerry) <jerry.lilijun@huawei.com>; Lichunhe > <lichunhe@huawei.com>; nd <nd@arm.com>; chenchanghu > <chenchanghu@huawei.com>; i.maximets@ovn.org > Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first > > On 6/3/20 7:26 AM, Ben Pfaff wrote: > > On Wed, Jun 03, 2020 at 01:22:52AM +0000, Linhaifeng wrote: > >> > >> > >> -----Original Message----- > >> From: Ben Pfaff [mailto:blp@ovn.org] > >> Sent: Wednesday, June 3, 2020 1:28 AM > >> To: Linhaifeng <haifeng.lin@huawei.com> > >> Cc: Yanqin Wei <Yanqin.Wei@arm.com>; dev@openvswitch.org; nd > >> <nd@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; chenchanghu > >> <chenchanghu@huawei.com>; Lichunhe <lichunhe@huawei.com> > >> Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first > >> > >> On Tue, Jun 02, 2020 at 07:27:59AM +0000, Linhaifeng wrote: > >>> We should update rcu pointer first then use ovsrcu_postpone to free > >>> otherwise maybe cause use-after-free. > >>> e.g.,reader indicates momentary quiescent and access old pointer > >>> after writer postpone free old pointer and before setting new pointer. > >>> > >>> Signed-off-by: Linhaifeng <haifeng.lin@huawei.com> > >> > >> I don't see how that's possible, since the writer hasn't quiesced. > >> > >> I think the logic is as follow, Could you help me find out where is incorrect? > >> > >> 1.1 -> 1.2 -> 3.1 -> 3.2 -> 2.1 -> 2.2 -> 2.3 -> 2.1 -> 1.3 -> 1.4 -> > >> 3.3 -> 2.2(use after free) > >> > >> wirter: > >> 1.1 use postone to free old pointer > >> 1.2 flush cbsets to flushed_cbsets > >> 1.3 update new pointer > >> 1.4 quiesced > >> > >> Read: > >> 2.1. read pointer > >> 2.2. use pointer > >> 2.3. quiesced > >> > >> Rcu: > >> 3.1 pop flushed_cbsets > >> 3.2 ovsrcu_synchronize > >> 3.3 call all cb to free > > > > So you're saying this: > > > > 1.1 use postone to free old pointer (A) > > 1.2 flush cbsets to flushed_cbsets > > > > 3.1 pop flushed_cbsets > > 3.2 ovsrcu_synchronize > > > > 2.1. read pointer (A) > > 2.2. use pointer (A) > > 2.3. quiesced > > > > 2.1. read pointer (A) > > IIUC, 'Read' thread should enter quiescent state here at least one more time in > order to trigger the issue (or, maybe, some other thread could advance > global_seqno before step 2.3 and after reading target_seqno by RCU thread on > step 3.2). Otherwise, target_seqno will be equal to thread->seqno of the > 'Read' thread and RCU thread will wait for the next quiescent state before > calling callbacks. > Yes, you are right. The reader's thread->seqno should bigger than global_seqno so it maybe have run one more times before someone's thread->seqno bigger than global_seqno. > > > > 1.3 update new pointer (B) > > 1.4 quiesced > > > > 3.3 call all cb to free (A) > > > > 2.2. use pointer (A) > > > > Wow, you are absolutely right. This had never occurred to me. Thank > > you! I'll review your patch. > > Sharing my more detailed diagram of the execution flow that should trigger the > issue here: > > ------------------ ------------------ ------------------- > Thread 1 Thread 2 RCU Thread > ------------------ ------------------ ------------------- > pointer = A > > ovsrcu_quiesce(): > thread->seqno = 30 > global_seqno = 31 > quiesced > > read pointer A > postpone(free(A)): > flush cbset > pop flushed_cbsets > ovsrcu_synchronize: > target_seqno = 31 > ovsrcu_quiesce(): > thread->seqno = 31 > global_seqno = 32 > quiesced > > read pointer A > use pointer A > > ovsrcu_quiesce(): > thread->seqno = 32 > global_seqno = 33 > quiesced > > read pointer A > pointer = B > > ovsrcu_quiesce(): > thread->seqno = 33 > global_seqno = 34 > quiesced > > target_seqno exceeded > by all threads > call cbs to free A > use pointer A > (use after free) > ----------------------------------------------------------- > > Best regards, Ilya Maximets.
Hi, Ben Will this patch apply ? > -----Original Message----- > From: Ilya Maximets [mailto:i.maximets@ovn.org] > Sent: Wednesday, June 3, 2020 10:33 PM > To: Linhaifeng <haifeng.lin@huawei.com>; Ilya Maximets > <i.maximets@ovn.org>; Ben Pfaff <blp@ovn.org> > Cc: dev@openvswitch.org; Lilijun (Jerry) <jerry.lilijun@huawei.com>; Lichunhe > <lichunhe@huawei.com>; nd <nd@arm.com>; chenchanghu > <chenchanghu@huawei.com> > Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first > > The change could look like this: > > diff --git a/lib/ovs-rcu.c b/lib/ovs-rcu.c index ebc8120f0..cde1e925b 100644 > --- a/lib/ovs-rcu.c > +++ b/lib/ovs-rcu.c > @@ -30,6 +30,8 @@ > > VLOG_DEFINE_THIS_MODULE(ovs_rcu); > > +#define MIN_CBS 16 > + > struct ovsrcu_cb { > void (*function)(void *aux); > void *aux; > @@ -37,7 +39,8 @@ struct ovsrcu_cb { > > struct ovsrcu_cbset { > struct ovs_list list_node; > - struct ovsrcu_cb cbs[16]; > + struct ovsrcu_cb *cbs; > + size_t n_allocated; > int n_cbs; > }; > > @@ -310,16 +313,19 @@ ovsrcu_postpone__(void (*function)(void *aux), void > *aux) > cbset = perthread->cbset; > if (!cbset) { > cbset = perthread->cbset = xmalloc(sizeof *perthread->cbset); > + cbset->cbs = xmalloc(MIN_CBS * sizeof *cbset->cbs); > + cbset->n_allocated = MIN_CBS; > cbset->n_cbs = 0; > } > > + if (cbset->n_cbs == cbset->n_allocated) { > + cbset->cbs = x2nrealloc(cbset->cbs, &cbset->n_allocated, > + sizeof *cbset->cbs); > + } > + > cb = &cbset->cbs[cbset->n_cbs++]; > cb->function = function; > cb->aux = aux; > - > - if (cbset->n_cbs >= ARRAY_SIZE(cbset->cbs)) { > - ovsrcu_flush_cbset(perthread); > - } > } > > static bool > @@ -341,6 +347,7 @@ ovsrcu_call_postponed(void) > for (cb = cbset->cbs; cb < &cbset->cbs[cbset->n_cbs]; cb++) { > cb->function(cb->aux); > } > + free(cbset->cbs); > free(cbset); > } > > --- > > > I could submit a formal patch, if agreed. > > Ben, what do you think?
On Wed, Jun 03, 2020 at 04:33:28PM +0200, Ilya Maximets wrote: > On 6/3/20 1:08 PM, Linhaifeng wrote: > > > > > >> -----Original Message----- > >> From: Ilya Maximets [mailto:i.maximets@ovn.org] > >> Sent: Wednesday, June 3, 2020 6:50 PM > >> To: Linhaifeng <haifeng.lin@huawei.com>; Ben Pfaff <blp@ovn.org> > >> Cc: i.maximets@ovn.org; dev@openvswitch.org; Lilijun (Jerry) > >> <jerry.lilijun@huawei.com>; Lichunhe <lichunhe@huawei.com>; nd > >> <nd@arm.com>; chenchanghu <chenchanghu@huawei.com> > >> Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first > >> > >> On 6/3/20 9:04 AM, Linhaifeng wrote: > >>> > >>> > >>>> -----Original Message----- > >>>> From: Ben Pfaff [mailto:blp@ovn.org] > >>>> Sent: Wednesday, June 3, 2020 1:26 PM > >>>> To: Linhaifeng <haifeng.lin@huawei.com> > >>>> Cc: Yanqin Wei <Yanqin.Wei@arm.com>; dev@openvswitch.org; nd > >>>> <nd@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; chenchanghu > >>>> <chenchanghu@huawei.com>; Lichunhe <lichunhe@huawei.com> > >>>> Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first > >>>> > >>>> On Wed, Jun 03, 2020 at 01:22:52AM +0000, Linhaifeng wrote: > >>>>> > >>>>> > >>>>> -----Original Message----- > >>>>> From: Ben Pfaff [mailto:blp@ovn.org] > >>>>> Sent: Wednesday, June 3, 2020 1:28 AM > >>>>> To: Linhaifeng <haifeng.lin@huawei.com> > >>>>> Cc: Yanqin Wei <Yanqin.Wei@arm.com>; dev@openvswitch.org; nd > >>>>> <nd@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; > >>>>> chenchanghu <chenchanghu@huawei.com>; Lichunhe > >> <lichunhe@huawei.com> > >>>>> Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first > >>>>> > >>>>> On Tue, Jun 02, 2020 at 07:27:59AM +0000, Linhaifeng wrote: > >>>>>> We should update rcu pointer first then use ovsrcu_postpone to free > >>>>>> otherwise maybe cause use-after-free. > >>>>>> e.g.,reader indicates momentary quiescent and access old pointer > >>>>>> after writer postpone free old pointer and before setting new pointer. > >>>>>> > >>>>>> Signed-off-by: Linhaifeng <haifeng.lin@huawei.com> > >>>>> > >>>>> I don't see how that's possible, since the writer hasn't quiesced. > >>>>> > >>>>> I think the logic is as follow, Could you help me find out where is incorrect? > >>>>> > >>>>> 1.1 -> 1.2 -> 3.1 -> 3.2 -> 2.1 -> 2.2 -> 2.3 -> 2.1 -> 1.3 -> 1.4 > >>>>> -> > >>>>> 3.3 -> 2.2(use after free) > >>>>> > >>>>> wirter: > >>>>> 1.1 use postone to free old pointer > >>>>> 1.2 flush cbsets to flushed_cbsets > >>>>> 1.3 update new pointer > >>>>> 1.4 quiesced > >>>>> > >>>>> Read: > >>>>> 2.1. read pointer > >>>>> 2.2. use pointer > >>>>> 2.3. quiesced > >>>>> > >>>>> Rcu: > >>>>> 3.1 pop flushed_cbsets > >>>>> 3.2 ovsrcu_synchronize > >>>>> 3.3 call all cb to free > >>>> > >>>> So you're saying this: > >>>> > >>>> 1.1 use postone to free old pointer (A) > >>>> 1.2 flush cbsets to flushed_cbsets > >>>> > >>>> 3.1 pop flushed_cbsets > >>>> 3.2 ovsrcu_synchronize > >>>> > >>>> 2.1. read pointer (A) > >>>> 2.2. use pointer (A) > >>>> 2.3. quiesced > >>>> > >>>> 2.1. read pointer (A) > >>>> > >>>> 1.3 update new pointer (B) > >>>> 1.4 quiesced > >>>> > >>>> 3.3 call all cb to free (A) > >>>> > >>>> 2.2. use pointer (A) > >>>> > >>>> Wow, you are absolutely right. This had never occurred to me. Thank > >> you! > >>>> I'll review your patch. > >>> > >>> Yes, it's really hard to happen. If it happened it's also hard to find the reason > >> so I suggest it can be a rule for using rcu. > >> > >> I agree that there is an issue here, but I think that we should not force users to > >> call ovsrcu_set() before ovsrcu_postpone(). Current users doesn't do > >> anything illegal since pointer must not be freed before the next grace period > >> from their point of view. > >> > >> For me it looks like the main issue is existence of point 1.2, i.e. flushing cbsets > >> while writer is not quiesced yet. And we need to fix this inside rcu library itself. > >> For example, we could avoid flushing inside > >> ovsrcu_postpone() by making cbs[16] a dynamically allocated array and using > >> x2nrealloc instead of flushing. > >> > >> Thoughts? > >> > > Hi, Ilya Maximets > > > > May be this is a good idea therefor the users not need to think about call ovsrcu_set() first or ovsrcu_postpone(). > > How about you think, ben? May be you can send a patch to modify the ovsrcu_postpone() not to flush cbsets to > > replace of my patches. > > The change could look like this: > > diff --git a/lib/ovs-rcu.c b/lib/ovs-rcu.c > index ebc8120f0..cde1e925b 100644 > --- a/lib/ovs-rcu.c > +++ b/lib/ovs-rcu.c > @@ -30,6 +30,8 @@ > > VLOG_DEFINE_THIS_MODULE(ovs_rcu); > > +#define MIN_CBS 16 > + > struct ovsrcu_cb { > void (*function)(void *aux); > void *aux; > @@ -37,7 +39,8 @@ struct ovsrcu_cb { > > struct ovsrcu_cbset { > struct ovs_list list_node; > - struct ovsrcu_cb cbs[16]; > + struct ovsrcu_cb *cbs; > + size_t n_allocated; > int n_cbs; > }; > > @@ -310,16 +313,19 @@ ovsrcu_postpone__(void (*function)(void *aux), void *aux) > cbset = perthread->cbset; > if (!cbset) { > cbset = perthread->cbset = xmalloc(sizeof *perthread->cbset); > + cbset->cbs = xmalloc(MIN_CBS * sizeof *cbset->cbs); > + cbset->n_allocated = MIN_CBS; > cbset->n_cbs = 0; > } > > + if (cbset->n_cbs == cbset->n_allocated) { > + cbset->cbs = x2nrealloc(cbset->cbs, &cbset->n_allocated, > + sizeof *cbset->cbs); > + } > + > cb = &cbset->cbs[cbset->n_cbs++]; > cb->function = function; > cb->aux = aux; > - > - if (cbset->n_cbs >= ARRAY_SIZE(cbset->cbs)) { > - ovsrcu_flush_cbset(perthread); > - } > } > > static bool > @@ -341,6 +347,7 @@ ovsrcu_call_postponed(void) > for (cb = cbset->cbs; cb < &cbset->cbs[cbset->n_cbs]; cb++) { > cb->function(cb->aux); > } > + free(cbset->cbs); > free(cbset); > } > > --- > > > I could submit a formal patch, if agreed. > > Ben, what do you think? I was pretty resigned to the idea that we'd have to change the callers and the expectations. This seems better and the code looks fine to me. Thank you so much for thinking of this approach! This approach also has the significant advantage that it should be easy to backport. I was worried that we would either have to laboriously backport and check lots of old versions, or just declare that old versions wouldn't get fixed. As long as you tested this: Acked-by: Ben Pfaff <blp@ovn.org>
On Wed, Jun 10, 2020 at 11:40:14AM -0700, Ben Pfaff wrote: > On Wed, Jun 03, 2020 at 04:33:28PM +0200, Ilya Maximets wrote: > > On 6/3/20 1:08 PM, Linhaifeng wrote: > > > > > > > > >> -----Original Message----- > > >> From: Ilya Maximets [mailto:i.maximets@ovn.org] > > >> Sent: Wednesday, June 3, 2020 6:50 PM > > >> To: Linhaifeng <haifeng.lin@huawei.com>; Ben Pfaff <blp@ovn.org> > > >> Cc: i.maximets@ovn.org; dev@openvswitch.org; Lilijun (Jerry) > > >> <jerry.lilijun@huawei.com>; Lichunhe <lichunhe@huawei.com>; nd > > >> <nd@arm.com>; chenchanghu <chenchanghu@huawei.com> > > >> Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first > > >> > > >> On 6/3/20 9:04 AM, Linhaifeng wrote: > > >>> > > >>> > > >>>> -----Original Message----- > > >>>> From: Ben Pfaff [mailto:blp@ovn.org] > > >>>> Sent: Wednesday, June 3, 2020 1:26 PM > > >>>> To: Linhaifeng <haifeng.lin@huawei.com> > > >>>> Cc: Yanqin Wei <Yanqin.Wei@arm.com>; dev@openvswitch.org; nd > > >>>> <nd@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; chenchanghu > > >>>> <chenchanghu@huawei.com>; Lichunhe <lichunhe@huawei.com> > > >>>> Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first > > >>>> > > >>>> On Wed, Jun 03, 2020 at 01:22:52AM +0000, Linhaifeng wrote: > > >>>>> > > >>>>> > > >>>>> -----Original Message----- > > >>>>> From: Ben Pfaff [mailto:blp@ovn.org] > > >>>>> Sent: Wednesday, June 3, 2020 1:28 AM > > >>>>> To: Linhaifeng <haifeng.lin@huawei.com> > > >>>>> Cc: Yanqin Wei <Yanqin.Wei@arm.com>; dev@openvswitch.org; nd > > >>>>> <nd@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; > > >>>>> chenchanghu <chenchanghu@huawei.com>; Lichunhe > > >> <lichunhe@huawei.com> > > >>>>> Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first > > >>>>> > > >>>>> On Tue, Jun 02, 2020 at 07:27:59AM +0000, Linhaifeng wrote: > > >>>>>> We should update rcu pointer first then use ovsrcu_postpone to free > > >>>>>> otherwise maybe cause use-after-free. > > >>>>>> e.g.,reader indicates momentary quiescent and access old pointer > > >>>>>> after writer postpone free old pointer and before setting new pointer. > > >>>>>> > > >>>>>> Signed-off-by: Linhaifeng <haifeng.lin@huawei.com> > > >>>>> > > >>>>> I don't see how that's possible, since the writer hasn't quiesced. > > >>>>> > > >>>>> I think the logic is as follow, Could you help me find out where is incorrect? > > >>>>> > > >>>>> 1.1 -> 1.2 -> 3.1 -> 3.2 -> 2.1 -> 2.2 -> 2.3 -> 2.1 -> 1.3 -> 1.4 > > >>>>> -> > > >>>>> 3.3 -> 2.2(use after free) > > >>>>> > > >>>>> wirter: > > >>>>> 1.1 use postone to free old pointer > > >>>>> 1.2 flush cbsets to flushed_cbsets > > >>>>> 1.3 update new pointer > > >>>>> 1.4 quiesced > > >>>>> > > >>>>> Read: > > >>>>> 2.1. read pointer > > >>>>> 2.2. use pointer > > >>>>> 2.3. quiesced > > >>>>> > > >>>>> Rcu: > > >>>>> 3.1 pop flushed_cbsets > > >>>>> 3.2 ovsrcu_synchronize > > >>>>> 3.3 call all cb to free > > >>>> > > >>>> So you're saying this: > > >>>> > > >>>> 1.1 use postone to free old pointer (A) > > >>>> 1.2 flush cbsets to flushed_cbsets > > >>>> > > >>>> 3.1 pop flushed_cbsets > > >>>> 3.2 ovsrcu_synchronize > > >>>> > > >>>> 2.1. read pointer (A) > > >>>> 2.2. use pointer (A) > > >>>> 2.3. quiesced > > >>>> > > >>>> 2.1. read pointer (A) > > >>>> > > >>>> 1.3 update new pointer (B) > > >>>> 1.4 quiesced > > >>>> > > >>>> 3.3 call all cb to free (A) > > >>>> > > >>>> 2.2. use pointer (A) > > >>>> > > >>>> Wow, you are absolutely right. This had never occurred to me. Thank > > >> you! > > >>>> I'll review your patch. > > >>> > > >>> Yes, it's really hard to happen. If it happened it's also hard to find the reason > > >> so I suggest it can be a rule for using rcu. > > >> > > >> I agree that there is an issue here, but I think that we should not force users to > > >> call ovsrcu_set() before ovsrcu_postpone(). Current users doesn't do > > >> anything illegal since pointer must not be freed before the next grace period > > >> from their point of view. > > >> > > >> For me it looks like the main issue is existence of point 1.2, i.e. flushing cbsets > > >> while writer is not quiesced yet. And we need to fix this inside rcu library itself. > > >> For example, we could avoid flushing inside > > >> ovsrcu_postpone() by making cbs[16] a dynamically allocated array and using > > >> x2nrealloc instead of flushing. > > >> > > >> Thoughts? > > >> > > > Hi, Ilya Maximets > > > > > > May be this is a good idea therefor the users not need to think about call ovsrcu_set() first or ovsrcu_postpone(). > > > How about you think, ben? May be you can send a patch to modify the ovsrcu_postpone() not to flush cbsets to > > > replace of my patches. > > > > The change could look like this: > > > > diff --git a/lib/ovs-rcu.c b/lib/ovs-rcu.c > > index ebc8120f0..cde1e925b 100644 > > --- a/lib/ovs-rcu.c > > +++ b/lib/ovs-rcu.c > > @@ -30,6 +30,8 @@ > > > > VLOG_DEFINE_THIS_MODULE(ovs_rcu); > > > > +#define MIN_CBS 16 > > + > > struct ovsrcu_cb { > > void (*function)(void *aux); > > void *aux; > > @@ -37,7 +39,8 @@ struct ovsrcu_cb { > > > > struct ovsrcu_cbset { > > struct ovs_list list_node; > > - struct ovsrcu_cb cbs[16]; > > + struct ovsrcu_cb *cbs; > > + size_t n_allocated; > > int n_cbs; > > }; > > > > @@ -310,16 +313,19 @@ ovsrcu_postpone__(void (*function)(void *aux), void *aux) > > cbset = perthread->cbset; > > if (!cbset) { > > cbset = perthread->cbset = xmalloc(sizeof *perthread->cbset); > > + cbset->cbs = xmalloc(MIN_CBS * sizeof *cbset->cbs); > > + cbset->n_allocated = MIN_CBS; > > cbset->n_cbs = 0; > > } > > > > + if (cbset->n_cbs == cbset->n_allocated) { > > + cbset->cbs = x2nrealloc(cbset->cbs, &cbset->n_allocated, > > + sizeof *cbset->cbs); > > + } > > + > > cb = &cbset->cbs[cbset->n_cbs++]; > > cb->function = function; > > cb->aux = aux; > > - > > - if (cbset->n_cbs >= ARRAY_SIZE(cbset->cbs)) { > > - ovsrcu_flush_cbset(perthread); > > - } > > } > > > > static bool > > @@ -341,6 +347,7 @@ ovsrcu_call_postponed(void) > > for (cb = cbset->cbs; cb < &cbset->cbs[cbset->n_cbs]; cb++) { > > cb->function(cb->aux); > > } > > + free(cbset->cbs); > > free(cbset); > > } > > > > --- > > > > > > I could submit a formal patch, if agreed. > > > > Ben, what do you think? > > I was pretty resigned to the idea that we'd have to change the callers > and the expectations. This seems better and the code looks fine to me. > Thank you so much for thinking of this approach! > > This approach also has the significant advantage that it should be easy > to backport. I was worried that we would either have to laboriously > backport and check lots of old versions, or just declare that old > versions wouldn't get fixed. > > As long as you tested this: > > Acked-by: Ben Pfaff <blp@ovn.org> Oh, and please credit Linhaifeng for reporting the problem. This is subtle and we've had years to notice and Linhaifeng is the first one to actually do that. Thanks, Ben.
On 6/10/20 8:41 PM, Ben Pfaff wrote: > On Wed, Jun 10, 2020 at 11:40:14AM -0700, Ben Pfaff wrote: >> On Wed, Jun 03, 2020 at 04:33:28PM +0200, Ilya Maximets wrote: >>> On 6/3/20 1:08 PM, Linhaifeng wrote: >>>> >>>> >>>>> -----Original Message----- >>>>> From: Ilya Maximets [mailto:i.maximets@ovn.org] >>>>> Sent: Wednesday, June 3, 2020 6:50 PM >>>>> To: Linhaifeng <haifeng.lin@huawei.com>; Ben Pfaff <blp@ovn.org> >>>>> Cc: i.maximets@ovn.org; dev@openvswitch.org; Lilijun (Jerry) >>>>> <jerry.lilijun@huawei.com>; Lichunhe <lichunhe@huawei.com>; nd >>>>> <nd@arm.com>; chenchanghu <chenchanghu@huawei.com> >>>>> Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first >>>>> >>>>> On 6/3/20 9:04 AM, Linhaifeng wrote: >>>>>> >>>>>> >>>>>>> -----Original Message----- >>>>>>> From: Ben Pfaff [mailto:blp@ovn.org] >>>>>>> Sent: Wednesday, June 3, 2020 1:26 PM >>>>>>> To: Linhaifeng <haifeng.lin@huawei.com> >>>>>>> Cc: Yanqin Wei <Yanqin.Wei@arm.com>; dev@openvswitch.org; nd >>>>>>> <nd@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; chenchanghu >>>>>>> <chenchanghu@huawei.com>; Lichunhe <lichunhe@huawei.com> >>>>>>> Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first >>>>>>> >>>>>>> On Wed, Jun 03, 2020 at 01:22:52AM +0000, Linhaifeng wrote: >>>>>>>> >>>>>>>> >>>>>>>> -----Original Message----- >>>>>>>> From: Ben Pfaff [mailto:blp@ovn.org] >>>>>>>> Sent: Wednesday, June 3, 2020 1:28 AM >>>>>>>> To: Linhaifeng <haifeng.lin@huawei.com> >>>>>>>> Cc: Yanqin Wei <Yanqin.Wei@arm.com>; dev@openvswitch.org; nd >>>>>>>> <nd@arm.com>; Lilijun (Jerry) <jerry.lilijun@huawei.com>; >>>>>>>> chenchanghu <chenchanghu@huawei.com>; Lichunhe >>>>> <lichunhe@huawei.com> >>>>>>>> Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first >>>>>>>> >>>>>>>> On Tue, Jun 02, 2020 at 07:27:59AM +0000, Linhaifeng wrote: >>>>>>>>> We should update rcu pointer first then use ovsrcu_postpone to free >>>>>>>>> otherwise maybe cause use-after-free. >>>>>>>>> e.g.,reader indicates momentary quiescent and access old pointer >>>>>>>>> after writer postpone free old pointer and before setting new pointer. >>>>>>>>> >>>>>>>>> Signed-off-by: Linhaifeng <haifeng.lin@huawei.com> >>>>>>>> >>>>>>>> I don't see how that's possible, since the writer hasn't quiesced. >>>>>>>> >>>>>>>> I think the logic is as follow, Could you help me find out where is incorrect? >>>>>>>> >>>>>>>> 1.1 -> 1.2 -> 3.1 -> 3.2 -> 2.1 -> 2.2 -> 2.3 -> 2.1 -> 1.3 -> 1.4 >>>>>>>> -> >>>>>>>> 3.3 -> 2.2(use after free) >>>>>>>> >>>>>>>> wirter: >>>>>>>> 1.1 use postone to free old pointer >>>>>>>> 1.2 flush cbsets to flushed_cbsets >>>>>>>> 1.3 update new pointer >>>>>>>> 1.4 quiesced >>>>>>>> >>>>>>>> Read: >>>>>>>> 2.1. read pointer >>>>>>>> 2.2. use pointer >>>>>>>> 2.3. quiesced >>>>>>>> >>>>>>>> Rcu: >>>>>>>> 3.1 pop flushed_cbsets >>>>>>>> 3.2 ovsrcu_synchronize >>>>>>>> 3.3 call all cb to free >>>>>>> >>>>>>> So you're saying this: >>>>>>> >>>>>>> 1.1 use postone to free old pointer (A) >>>>>>> 1.2 flush cbsets to flushed_cbsets >>>>>>> >>>>>>> 3.1 pop flushed_cbsets >>>>>>> 3.2 ovsrcu_synchronize >>>>>>> >>>>>>> 2.1. read pointer (A) >>>>>>> 2.2. use pointer (A) >>>>>>> 2.3. quiesced >>>>>>> >>>>>>> 2.1. read pointer (A) >>>>>>> >>>>>>> 1.3 update new pointer (B) >>>>>>> 1.4 quiesced >>>>>>> >>>>>>> 3.3 call all cb to free (A) >>>>>>> >>>>>>> 2.2. use pointer (A) >>>>>>> >>>>>>> Wow, you are absolutely right. This had never occurred to me. Thank >>>>> you! >>>>>>> I'll review your patch. >>>>>> >>>>>> Yes, it's really hard to happen. If it happened it's also hard to find the reason >>>>> so I suggest it can be a rule for using rcu. >>>>> >>>>> I agree that there is an issue here, but I think that we should not force users to >>>>> call ovsrcu_set() before ovsrcu_postpone(). Current users doesn't do >>>>> anything illegal since pointer must not be freed before the next grace period >>>>> from their point of view. >>>>> >>>>> For me it looks like the main issue is existence of point 1.2, i.e. flushing cbsets >>>>> while writer is not quiesced yet. And we need to fix this inside rcu library itself. >>>>> For example, we could avoid flushing inside >>>>> ovsrcu_postpone() by making cbs[16] a dynamically allocated array and using >>>>> x2nrealloc instead of flushing. >>>>> >>>>> Thoughts? >>>>> >>>> Hi, Ilya Maximets >>>> >>>> May be this is a good idea therefor the users not need to think about call ovsrcu_set() first or ovsrcu_postpone(). >>>> How about you think, ben? May be you can send a patch to modify the ovsrcu_postpone() not to flush cbsets to >>>> replace of my patches. >>> >>> The change could look like this: >>> >>> diff --git a/lib/ovs-rcu.c b/lib/ovs-rcu.c >>> index ebc8120f0..cde1e925b 100644 >>> --- a/lib/ovs-rcu.c >>> +++ b/lib/ovs-rcu.c >>> @@ -30,6 +30,8 @@ >>> >>> VLOG_DEFINE_THIS_MODULE(ovs_rcu); >>> >>> +#define MIN_CBS 16 >>> + >>> struct ovsrcu_cb { >>> void (*function)(void *aux); >>> void *aux; >>> @@ -37,7 +39,8 @@ struct ovsrcu_cb { >>> >>> struct ovsrcu_cbset { >>> struct ovs_list list_node; >>> - struct ovsrcu_cb cbs[16]; >>> + struct ovsrcu_cb *cbs; >>> + size_t n_allocated; >>> int n_cbs; >>> }; >>> >>> @@ -310,16 +313,19 @@ ovsrcu_postpone__(void (*function)(void *aux), void *aux) >>> cbset = perthread->cbset; >>> if (!cbset) { >>> cbset = perthread->cbset = xmalloc(sizeof *perthread->cbset); >>> + cbset->cbs = xmalloc(MIN_CBS * sizeof *cbset->cbs); >>> + cbset->n_allocated = MIN_CBS; >>> cbset->n_cbs = 0; >>> } >>> >>> + if (cbset->n_cbs == cbset->n_allocated) { >>> + cbset->cbs = x2nrealloc(cbset->cbs, &cbset->n_allocated, >>> + sizeof *cbset->cbs); >>> + } >>> + >>> cb = &cbset->cbs[cbset->n_cbs++]; >>> cb->function = function; >>> cb->aux = aux; >>> - >>> - if (cbset->n_cbs >= ARRAY_SIZE(cbset->cbs)) { >>> - ovsrcu_flush_cbset(perthread); >>> - } >>> } >>> >>> static bool >>> @@ -341,6 +347,7 @@ ovsrcu_call_postponed(void) >>> for (cb = cbset->cbs; cb < &cbset->cbs[cbset->n_cbs]; cb++) { >>> cb->function(cb->aux); >>> } >>> + free(cbset->cbs); >>> free(cbset); >>> } >>> >>> --- >>> >>> >>> I could submit a formal patch, if agreed. >>> >>> Ben, what do you think? >> >> I was pretty resigned to the idea that we'd have to change the callers >> and the expectations. This seems better and the code looks fine to me. >> Thank you so much for thinking of this approach! >> >> This approach also has the significant advantage that it should be easy >> to backport. I was worried that we would either have to laboriously >> backport and check lots of old versions, or just declare that old >> versions wouldn't get fixed. >> >> As long as you tested this: >> >> Acked-by: Ben Pfaff <blp@ovn.org> > > Oh, and please credit Linhaifeng for reporting the problem. This is > subtle and we've had years to notice and Linhaifeng is the first one to > actually do that. Sure. Thanks for review! I'll send patch formally, give it one more test run and apply. Best regards, Ilya Maximets.
diff --git a/lib/classifier.c b/lib/classifier.c index f2c3497c2..6bff76e07 100644 --- a/lib/classifier.c +++ b/lib/classifier.c @@ -249,11 +249,11 @@ cls_rule_set_conjunctions(struct cls_rule *cr, unsigned int old_n = old ? old->n : 0; if (old_n != n || (n && memcmp(old_conj, conj, n * sizeof *conj))) { + ovsrcu_set(&match->conj_set, + cls_conjunction_set_alloc(match, conj, n)); if (old) { ovsrcu_postpone(free, old); } - ovsrcu_set(&match->conj_set, - cls_conjunction_set_alloc(match, conj, n)); } } diff --git a/lib/ovs-rcu.h b/lib/ovs-rcu.h index ecc4c9201..98c238aea 100644 --- a/lib/ovs-rcu.h +++ b/lib/ovs-rcu.h @@ -118,10 +118,10 @@ * void * change_flow(struct flow *new_flow) * { + * struct flow *old_flow = ovsrcu_get_protected(struct flow *, &flowp) * ovs_mutex_lock(&mutex); - * ovsrcu_postpone(free, - * ovsrcu_get_protected(struct flow *, &flowp)); * ovsrcu_set(&flowp, new_flow); + * ovsrcu_postpone(free, old_flow); * ovs_mutex_unlock(&mutex); * } * diff --git a/lib/pvector.c b/lib/pvector.c index cc527fdc4..aa8c6cb24 100644 --- a/lib/pvector.c +++ b/lib/pvector.c @@ -67,10 +67,11 @@ pvector_init(struct pvector *pvec) void pvector_destroy(struct pvector *pvec) { + struct pvector_impl *old = pvector_impl_get(pvec); free(pvec->temp); pvec->temp = NULL; - ovsrcu_postpone(free, pvector_impl_get(pvec)); ovsrcu_set(&pvec->impl, NULL); /* Poison. */ + ovsrcu_postpone(free, old); } /* Iterators for callers that need the 'index' afterward. */ @@ -205,11 +206,11 @@ pvector_change_priority(struct pvector *pvec, void *ptr, int priority) /* Make the modified pvector available for iteration. */ void pvector_publish__(struct pvector *pvec) { - struct pvector_impl *temp = pvec->temp; - + struct pvector_impl *new = pvec->temp; + struct pvector_impl *old = ovsrcu_get_protected(struct pvector_impl *, + &pvec->impl); pvec->temp = NULL; - pvector_impl_sort(temp); /* Also removes gaps. */ - ovsrcu_postpone(free, ovsrcu_get_protected(struct pvector_impl *, - &pvec->impl)); - ovsrcu_set(&pvec->impl, temp); + pvector_impl_sort(new); /* Also removes gaps. */ + ovsrcu_set(&pvec->impl, new); + ovsrcu_postpone(free, old); } diff --git a/ofproto/ofproto-dpif-mirror.c b/ofproto/ofproto-dpif-mirror.c index 343b75f0e..343100c08 100644 --- a/ofproto/ofproto-dpif-mirror.c +++ b/ofproto/ofproto-dpif-mirror.c @@ -276,9 +276,9 @@ mirror_set(struct mbridge *mbridge, void *aux, const char *name, hmapx_destroy(&dsts_map); if (vlans || src_vlans) { + unsigned long *new_vlans = vlan_bitmap_clone(src_vlans); + ovsrcu_set(&mirror->vlans, new_vlans); ovsrcu_postpone(free, vlans); - vlans = vlan_bitmap_clone(src_vlans); - ovsrcu_set(&mirror->vlans, vlans); } mirror->out = out; diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c index 5e08ef10d..be6dafb78 100644 --- a/ofproto/ofproto-dpif-upcall.c +++ b/ofproto/ofproto-dpif-upcall.c @@ -1658,11 +1658,10 @@ ukey_set_actions(struct udpif_key *ukey, const struct ofpbuf *actions) struct ofpbuf *old_actions = ovsrcu_get_protected(struct ofpbuf *, &ukey->actions); + ovsrcu_set(&ukey->actions, ofpbuf_clone(actions)); if (old_actions) { ovsrcu_postpone(ofpbuf_delete, old_actions); } - - ovsrcu_set(&ukey->actions, ofpbuf_clone(actions)); } static struct udpif_key *
We should update rcu pointer first then use ovsrcu_postpone to free otherwise maybe cause use-after-free. e.g.,reader indicates momentary quiescent and access old pointer after writer postpone free old pointer and before setting new pointer. Signed-off-by: Linhaifeng <haifeng.lin@huawei.com> --- lib/classifier.c | 4 ++-- lib/ovs-rcu.h | 4 ++-- lib/pvector.c | 15 ++++++++------- ofproto/ofproto-dpif-mirror.c | 4 ++-- ofproto/ofproto-dpif-upcall.c | 3 +-- 5 files changed, 15 insertions(+), 15 deletions(-)