Message ID | 1551838610-66580-1-git-send-email-hzhou8@ebay.com |
---|---|
State | Superseded |
Headers | show |
Series | [ovs-dev,1/2] ovsdb-idl: Fix memory leak of ovsdb_idl_delete_row. | expand |
On Tue, Mar 5, 2019 at 6:17 PM Han Zhou <zhouhan@gmail.com> wrote: > > From: Han Zhou <hzhou8@ebay.com> > > When a row is deleted, if it is referenced by another row, the > function ovsdb_idl_row_reparse_backrefs() is called but in that > function it doesn't destroy the row, because parameter destroy_dsts > is false when ovsdb_idl_row_reparse_backrefs() calls > ovsdb_idl_row_clear_arcs(). > > Signed-off-by: Han Zhou <hzhou8@ebay.com> > --- > lib/ovsdb-idl.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c > index 8cfb201..49fd45c 100644 > --- a/lib/ovsdb-idl.c > +++ b/lib/ovsdb-idl.c > @@ -3208,6 +3208,7 @@ ovsdb_idl_delete_row(struct ovsdb_idl_row *row) > ovsdb_idl_row_destroy(row); > } else { > ovsdb_idl_row_reparse_backrefs(row); > + ovsdb_idl_row_destroy(row); > } > } > > -- > 2.1.0 > This patch fails many tests, e.g. set, simple3 idl-compound-index-with-ref, initially populated - C It causes SIGSEGV when processing update notification for deletions when there are references. E.g. column C1 in row R1 in table T1 refers to C2 in R2 in T2. When R1 is deleted, both R1 and R2 will be deleted and sent in the same update notification. The IDL processes R2 deletion first. With this patch, it is destroyed after calling ovsdb_idl_row_reparse_backrefs(row). However, when processing R1 deletion, the function ovsdb_idl_delete_row() calls ovsdb_idl_row_clear_arcs(destroy=true) to free the orphan rows. Somehow R1 still has arc pointing to R2, and so it is calling ovsdb_idl_row_destroy() again for R2, which causes SIGSEGV, with below backtrace. (gdb) bt #0 0x000000000043fdd3 in hmap_remove (node=0x16bce30, hmap=0x16b7d88) at ../include/openvswitch/hmap.h:287 #1 ovsdb_idl_row_destroy (row=0x16bce30) at ../lib/ovsdb-idl.c:3104 #2 0x00000000004400d6 in ovsdb_idl_row_clear_arcs (row=row@entry=0x16bcf10, destroy_dsts=destroy_dsts@entry=true) at ../lib/ovsdb-idl.c:3041 #3 0x000000000044028e in ovsdb_idl_delete_row (row=row@entry=0x16bcf10) at ../lib/ovsdb-idl.c:3205 #4 0x0000000000441392 in ovsdb_idl_process_update2 (json_row=0x16c2370, operation=0x469d7e "delete", uuid=0x7ffff0219980, table=0x16b7c90) at ../lib/ovsdb-idl.c:2442 #5 ovsdb_idl_db_parse_update__ (method=(unknown: 23866864), table_updates=<optimized out>, db=<optimized out>) at ../lib/ovsdb-idl.c:2302 #6 ovsdb_idl_db_parse_update (db=db@entry=0x16b7348, table_updates=0x16bd1a0, method=method@entry=OVSDB_IDL_MM_MONITOR_COND_SINCE) at ../lib/ovsdb-idl.c:2354 #7 0x0000000000441b6e in ovsdb_idl_db_parse_update_rpc (db=db@entry=0x16b7348, msg=msg@entry=0x16bd290) at ../lib/ovsdb-idl.c:2185 #8 0x0000000000441f0c in ovsdb_idl_db_parse_update_rpc (msg=0x16bd290, db=0x16b7348) at ../lib/ovsdb-idl.c:900 #9 ovsdb_idl_process_msg (msg=0x16bd290, idl=0x16b7290) at ../lib/ovsdb-idl.c:820 #10 ovsdb_idl_run (idl=0x16b7290) at ../lib/ovsdb-idl.c:899 #11 0x000000000044682c in ovsdb_idl_txn_commit_block (txn=txn@entry=0x16bb600) at ../lib/ovsdb-idl.c:4285 #12 0x0000000000407f8a in do_idl_compound_index_with_ref (ctx=<optimized out>) at ../tests/test-ovsdb.c:2804 #13 0x0000000000430a84 in ovs_cmdl_run_command__ (ctx=ctx@entry=0x7ffff0219b80, commands=commands@entry=0x69a800 <all_commands>, read_only=read_only@entry=false) at ../lib/command-line.c:223 #14 0x0000000000431047 in ovs_cmdl_run_command (ctx=ctx@entry=0x7ffff0219b80, commands=commands@entry=0x69a800 <all_commands>) at ../lib/command-line.c:254 #15 0x000000000040530c in main (argc=<optimized out>, argv=0x7ffff0219c98) at ../tests/test-ovsdb.c:76 My original understanding was that ovsdb_idl_row_reparse_backrefs(row) is to cut the link between R1 and R2 when R2 is being deleted. However, it turns out in this function it unparse R1, clear arcs from R1 to R2 without destroying R2 (i.e. ovsdb_idl_row_clear_arcs(destroy=false)), and then it parse R1 again, which adds the arcs back. So it seems doesn't cut any arcs, and I am totally confused about the purpose of this function ovsdb_idl_row_reparse_backrefs() if it doesn't change anything at all. And still, it seems the memory leak would happen if R2 (the one being referenced) is deleted but R1 is never deleted, because the only chance that R2 gets destroyed is when R1 is deleted, though ovsdb_idl_row_clear_arcs(destroy=true). Did I misunderstand anything? Could someone who is familiar with this code help explain a little bit? Thanks, Han
On Thu, Mar 7, 2019 at 7:45 PM Han Zhou <zhouhan@gmail.com> wrote: > > On Tue, Mar 5, 2019 at 6:17 PM Han Zhou <zhouhan@gmail.com> wrote: > > > > From: Han Zhou <hzhou8@ebay.com> > > > > When a row is deleted, if it is referenced by another row, the > > function ovsdb_idl_row_reparse_backrefs() is called but in that > > function it doesn't destroy the row, because parameter destroy_dsts > > is false when ovsdb_idl_row_reparse_backrefs() calls > > ovsdb_idl_row_clear_arcs(). > > > > Signed-off-by: Han Zhou <hzhou8@ebay.com> > > --- > > lib/ovsdb-idl.c | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c > > index 8cfb201..49fd45c 100644 > > --- a/lib/ovsdb-idl.c > > +++ b/lib/ovsdb-idl.c > > @@ -3208,6 +3208,7 @@ ovsdb_idl_delete_row(struct ovsdb_idl_row *row) > > ovsdb_idl_row_destroy(row); > > } else { > > ovsdb_idl_row_reparse_backrefs(row); > > + ovsdb_idl_row_destroy(row); > > } > > } > > > > -- > > 2.1.0 > > > > This patch fails many tests, e.g. set, simple3 > idl-compound-index-with-ref, initially populated - C > > It causes SIGSEGV when processing update notification for deletions > when there are references. E.g. column C1 in row R1 in table T1 refers > to C2 in R2 in T2. When R1 is deleted, both R1 and R2 will be deleted > and sent in the same update notification. The IDL processes R2 > deletion first. With this patch, it is destroyed after calling > ovsdb_idl_row_reparse_backrefs(row). However, when processing R1 > deletion, the function ovsdb_idl_delete_row() calls > ovsdb_idl_row_clear_arcs(destroy=true) to free the orphan rows. > Somehow R1 still has arc pointing to R2, and so it is calling > ovsdb_idl_row_destroy() again for R2, which causes SIGSEGV, with below > backtrace. > > (gdb) bt > #0 0x000000000043fdd3 in hmap_remove (node=0x16bce30, hmap=0x16b7d88) > at ../include/openvswitch/hmap.h:287 > #1 ovsdb_idl_row_destroy (row=0x16bce30) at ../lib/ovsdb-idl.c:3104 > #2 0x00000000004400d6 in ovsdb_idl_row_clear_arcs > (row=row@entry=0x16bcf10, destroy_dsts=destroy_dsts@entry=true) at > ../lib/ovsdb-idl.c:3041 > #3 0x000000000044028e in ovsdb_idl_delete_row > (row=row@entry=0x16bcf10) at ../lib/ovsdb-idl.c:3205 > #4 0x0000000000441392 in ovsdb_idl_process_update2 > (json_row=0x16c2370, operation=0x469d7e "delete", uuid=0x7ffff0219980, > table=0x16b7c90) at ../lib/ovsdb-idl.c:2442 > #5 ovsdb_idl_db_parse_update__ (method=(unknown: 23866864), > table_updates=<optimized out>, db=<optimized out>) at > ../lib/ovsdb-idl.c:2302 > #6 ovsdb_idl_db_parse_update (db=db@entry=0x16b7348, > table_updates=0x16bd1a0, > method=method@entry=OVSDB_IDL_MM_MONITOR_COND_SINCE) at > ../lib/ovsdb-idl.c:2354 > #7 0x0000000000441b6e in ovsdb_idl_db_parse_update_rpc > (db=db@entry=0x16b7348, msg=msg@entry=0x16bd290) at > ../lib/ovsdb-idl.c:2185 > #8 0x0000000000441f0c in ovsdb_idl_db_parse_update_rpc > (msg=0x16bd290, db=0x16b7348) at ../lib/ovsdb-idl.c:900 > #9 ovsdb_idl_process_msg (msg=0x16bd290, idl=0x16b7290) at > ../lib/ovsdb-idl.c:820 > #10 ovsdb_idl_run (idl=0x16b7290) at ../lib/ovsdb-idl.c:899 > #11 0x000000000044682c in ovsdb_idl_txn_commit_block > (txn=txn@entry=0x16bb600) at ../lib/ovsdb-idl.c:4285 > #12 0x0000000000407f8a in do_idl_compound_index_with_ref > (ctx=<optimized out>) at ../tests/test-ovsdb.c:2804 > #13 0x0000000000430a84 in ovs_cmdl_run_command__ > (ctx=ctx@entry=0x7ffff0219b80, commands=commands@entry=0x69a800 > <all_commands>, read_only=read_only@entry=false) > at ../lib/command-line.c:223 > #14 0x0000000000431047 in ovs_cmdl_run_command > (ctx=ctx@entry=0x7ffff0219b80, commands=commands@entry=0x69a800 > <all_commands>) at ../lib/command-line.c:254 > #15 0x000000000040530c in main (argc=<optimized out>, > argv=0x7ffff0219c98) at ../tests/test-ovsdb.c:76 > > My original understanding was that ovsdb_idl_row_reparse_backrefs(row) > is to cut the link between R1 and R2 when R2 is being deleted. > However, it turns out in this function it unparse R1, clear arcs from > R1 to R2 without destroying R2 (i.e. > ovsdb_idl_row_clear_arcs(destroy=false)), and then it parse R1 again, > which adds the arcs back. So it seems doesn't cut any arcs, and I am > totally confused about the purpose of this function > ovsdb_idl_row_reparse_backrefs() if it doesn't change anything at all. > > And still, it seems the memory leak would happen if R2 (the one being > referenced) is deleted but R1 is never deleted, because the only > chance that R2 gets destroyed is when R1 is deleted, though > ovsdb_idl_row_clear_arcs(destroy=true). > > Did I misunderstand anything? Could someone who is familiar with this > code help explain a little bit? > I think I understand it now. There is no memory leak here, but the logic is a tricky. The purpose of ovsdb_idl_row_reparse_backrefs() is not to cut the arc between R1 and R2, but to make sure R2 disappears from R1's C structure. The actual destroy of R2 is deferred to when R1 is deleted. If R1 -> R2 is a weak reference and R2 is deleted without deleting R1, then there will be a "modify" operation for R1, involved in the same update notification, which will trigger destroying R2 in ovsdb_idl_modify_row_by_diff()->ovsdb_idl_row_clear_arcs(destroy=true). So, please ignore this patch.
On Thu, Mar 07, 2019 at 11:37:57PM -0800, Han Zhou wrote: > On Thu, Mar 7, 2019 at 7:45 PM Han Zhou <zhouhan@gmail.com> wrote: > > > > On Tue, Mar 5, 2019 at 6:17 PM Han Zhou <zhouhan@gmail.com> wrote: > > > > > > From: Han Zhou <hzhou8@ebay.com> > > > > > > When a row is deleted, if it is referenced by another row, the > > > function ovsdb_idl_row_reparse_backrefs() is called but in that > > > function it doesn't destroy the row, because parameter destroy_dsts > > > is false when ovsdb_idl_row_reparse_backrefs() calls > > > ovsdb_idl_row_clear_arcs(). > > > > > > Signed-off-by: Han Zhou <hzhou8@ebay.com> > > > --- > > > lib/ovsdb-idl.c | 1 + > > > 1 file changed, 1 insertion(+) > > > > > > diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c > > > index 8cfb201..49fd45c 100644 > > > --- a/lib/ovsdb-idl.c > > > +++ b/lib/ovsdb-idl.c > > > @@ -3208,6 +3208,7 @@ ovsdb_idl_delete_row(struct ovsdb_idl_row *row) > > > ovsdb_idl_row_destroy(row); > > > } else { > > > ovsdb_idl_row_reparse_backrefs(row); > > > + ovsdb_idl_row_destroy(row); > > > } > > > } > > > > > > -- > > > 2.1.0 > > > > > > > This patch fails many tests, e.g. set, simple3 > > idl-compound-index-with-ref, initially populated - C > > > > It causes SIGSEGV when processing update notification for deletions > > when there are references. E.g. column C1 in row R1 in table T1 refers > > to C2 in R2 in T2. When R1 is deleted, both R1 and R2 will be deleted > > and sent in the same update notification. The IDL processes R2 > > deletion first. With this patch, it is destroyed after calling > > ovsdb_idl_row_reparse_backrefs(row). However, when processing R1 > > deletion, the function ovsdb_idl_delete_row() calls > > ovsdb_idl_row_clear_arcs(destroy=true) to free the orphan rows. > > Somehow R1 still has arc pointing to R2, and so it is calling > > ovsdb_idl_row_destroy() again for R2, which causes SIGSEGV, with below > > backtrace. > > > > (gdb) bt > > #0 0x000000000043fdd3 in hmap_remove (node=0x16bce30, hmap=0x16b7d88) > > at ../include/openvswitch/hmap.h:287 > > #1 ovsdb_idl_row_destroy (row=0x16bce30) at ../lib/ovsdb-idl.c:3104 > > #2 0x00000000004400d6 in ovsdb_idl_row_clear_arcs > > (row=row@entry=0x16bcf10, destroy_dsts=destroy_dsts@entry=true) at > > ../lib/ovsdb-idl.c:3041 > > #3 0x000000000044028e in ovsdb_idl_delete_row > > (row=row@entry=0x16bcf10) at ../lib/ovsdb-idl.c:3205 > > #4 0x0000000000441392 in ovsdb_idl_process_update2 > > (json_row=0x16c2370, operation=0x469d7e "delete", uuid=0x7ffff0219980, > > table=0x16b7c90) at ../lib/ovsdb-idl.c:2442 > > #5 ovsdb_idl_db_parse_update__ (method=(unknown: 23866864), > > table_updates=<optimized out>, db=<optimized out>) at > > ../lib/ovsdb-idl.c:2302 > > #6 ovsdb_idl_db_parse_update (db=db@entry=0x16b7348, > > table_updates=0x16bd1a0, > > method=method@entry=OVSDB_IDL_MM_MONITOR_COND_SINCE) at > > ../lib/ovsdb-idl.c:2354 > > #7 0x0000000000441b6e in ovsdb_idl_db_parse_update_rpc > > (db=db@entry=0x16b7348, msg=msg@entry=0x16bd290) at > > ../lib/ovsdb-idl.c:2185 > > #8 0x0000000000441f0c in ovsdb_idl_db_parse_update_rpc > > (msg=0x16bd290, db=0x16b7348) at ../lib/ovsdb-idl.c:900 > > #9 ovsdb_idl_process_msg (msg=0x16bd290, idl=0x16b7290) at > > ../lib/ovsdb-idl.c:820 > > #10 ovsdb_idl_run (idl=0x16b7290) at ../lib/ovsdb-idl.c:899 > > #11 0x000000000044682c in ovsdb_idl_txn_commit_block > > (txn=txn@entry=0x16bb600) at ../lib/ovsdb-idl.c:4285 > > #12 0x0000000000407f8a in do_idl_compound_index_with_ref > > (ctx=<optimized out>) at ../tests/test-ovsdb.c:2804 > > #13 0x0000000000430a84 in ovs_cmdl_run_command__ > > (ctx=ctx@entry=0x7ffff0219b80, commands=commands@entry=0x69a800 > > <all_commands>, read_only=read_only@entry=false) > > at ../lib/command-line.c:223 > > #14 0x0000000000431047 in ovs_cmdl_run_command > > (ctx=ctx@entry=0x7ffff0219b80, commands=commands@entry=0x69a800 > > <all_commands>) at ../lib/command-line.c:254 > > #15 0x000000000040530c in main (argc=<optimized out>, > > argv=0x7ffff0219c98) at ../tests/test-ovsdb.c:76 > > > > My original understanding was that ovsdb_idl_row_reparse_backrefs(row) > > is to cut the link between R1 and R2 when R2 is being deleted. > > However, it turns out in this function it unparse R1, clear arcs from > > R1 to R2 without destroying R2 (i.e. > > ovsdb_idl_row_clear_arcs(destroy=false)), and then it parse R1 again, > > which adds the arcs back. So it seems doesn't cut any arcs, and I am > > totally confused about the purpose of this function > > ovsdb_idl_row_reparse_backrefs() if it doesn't change anything at all. > > > > And still, it seems the memory leak would happen if R2 (the one being > > referenced) is deleted but R1 is never deleted, because the only > > chance that R2 gets destroyed is when R1 is deleted, though > > ovsdb_idl_row_clear_arcs(destroy=true). > > > > Did I misunderstand anything? Could someone who is familiar with this > > code help explain a little bit? > > > I think I understand it now. There is no memory leak here, but the > logic is a tricky. The purpose of ovsdb_idl_row_reparse_backrefs() is > not to cut the arc between R1 and R2, but to make sure R2 disappears > from R1's C structure. The actual destroy of R2 is deferred to when R1 > is deleted. If R1 -> R2 is a weak reference and R2 is deleted without > deleting R1, then there will be a "modify" operation for R1, involved > in the same update notification, which will trigger destroying R2 in > ovsdb_idl_modify_row_by_diff()->ovsdb_idl_row_clear_arcs(destroy=true). > So, please ignore this patch. This *is* tricky. It is trying to implement a general-purpose graph in a user-friendly way in C. If you spot a better way to document it, or to make it simpler, then please do propose it. There might be an easier way to do some of this. I wrote it very quickly back in late 2009 and early 2010. It's proved useful, but the implementation is subtle, which makes it hard to extend and to be confident that it is correct.
diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c index 8cfb201..49fd45c 100644 --- a/lib/ovsdb-idl.c +++ b/lib/ovsdb-idl.c @@ -3208,6 +3208,7 @@ ovsdb_idl_delete_row(struct ovsdb_idl_row *row) ovsdb_idl_row_destroy(row); } else { ovsdb_idl_row_reparse_backrefs(row); + ovsdb_idl_row_destroy(row); } }