Message ID | 1563390304-15546-1-git-send-email-dceara@redhat.com |
---|---|
State | Accepted |
Headers | show |
Series | [ovs-dev] ovsdb-idl: Mark row "parsed" in ovsdb_idl_txn_write__ | expand |
On Wed, Jul 17, 2019 at 09:05:04PM +0200, Dumitru Ceara wrote: > Once a column is set in a row using ovsdb_idl_txn_write__ we also mark > the row "parsed". Otherwise the memory allocated by > sbrec_<table>_parse_<col>() will never be freed. After marking the row > "parsed", the ovsdb_idl_txn_disassemble function will properly free the > newly allocated memory for the column (ovsdb_idl_row_unparse). Wow, good catch. I bet that's been there forever. The OVSDB IDL code is too complicated. It's too hard to spot the issues. I wish I saw a way to make it simpler.
Hmm, the problem of "parsed" flag is that it identifies "all" columns of certain row have been parsed, however there are CLI tools which modify only individual colums by calling ovsdb_idl_txn_write_() function. In this case and in case parsed flag would be set in ovsdb_idl_txn_write() function then unparsing procedure would be called also for columns which were not parsed. The problem could be overcome by having individual flag for each column. This has been addresed in pending pull request. Apparent mail has been sent to dev list, but obviosly has been somehow overlooked. br, damijan On Thu, 18 Jul 2019, 17:52 Ben Pfaff, <blp@ovn.org> wrote: > On Wed, Jul 17, 2019 at 09:05:04PM +0200, Dumitru Ceara wrote: > > Once a column is set in a row using ovsdb_idl_txn_write__ we also mark > > the row "parsed". Otherwise the memory allocated by > > sbrec_<table>_parse_<col>() will never be freed. After marking the row > > "parsed", the ovsdb_idl_txn_disassemble function will properly free the > > newly allocated memory for the column (ovsdb_idl_row_unparse). > > Wow, good catch. I bet that's been there forever. > > The OVSDB IDL code is too complicated. It's too hard to spot the > issues. I wish I saw a way to make it simpler. > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev >
I guess I missed it. Will you please point it out to me, for example in the list archive? On Thu, Jul 18, 2019 at 06:58:34PM +0200, Damijan Skvarc wrote: > Hmm, the problem of "parsed" flag is that it identifies "all" columns of > certain row have been parsed, however there are CLI tools which modify only > individual colums by calling ovsdb_idl_txn_write_() function. In this case > and in case parsed flag would be set in ovsdb_idl_txn_write() function then > unparsing procedure would be called also for columns which were not parsed. > The problem could be overcome by having individual flag for each column. > This has been addresed in pending pull request. Apparent mail has been sent > to dev list, but obviosly has been somehow overlooked. > br, damijan > > On Thu, 18 Jul 2019, 17:52 Ben Pfaff, <blp@ovn.org> wrote: > > > On Wed, Jul 17, 2019 at 09:05:04PM +0200, Dumitru Ceara wrote: > > > Once a column is set in a row using ovsdb_idl_txn_write__ we also mark > > > the row "parsed". Otherwise the memory allocated by > > > sbrec_<table>_parse_<col>() will never be freed. After marking the row > > > "parsed", the ovsdb_idl_txn_disassemble function will properly free the > > > newly allocated memory for the column (ovsdb_idl_row_unparse). > > > > Wow, good catch. I bet that's been there forever. > > > > The OVSDB IDL code is too complicated. It's too hard to spot the > > issues. I wish I saw a way to make it simpler. > > _______________________________________________ > > dev mailing list > > dev@openvswitch.org > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >
On Thu, Jul 18, 2019 at 08:52:13AM -0700, Ben Pfaff wrote: > On Wed, Jul 17, 2019 at 09:05:04PM +0200, Dumitru Ceara wrote: > > Once a column is set in a row using ovsdb_idl_txn_write__ we also mark > > the row "parsed". Otherwise the memory allocated by > > sbrec_<table>_parse_<col>() will never be freed. After marking the row > > "parsed", the ovsdb_idl_txn_disassemble function will properly free the > > newly allocated memory for the column (ovsdb_idl_row_unparse). > > Wow, good catch. I bet that's been there forever. > > The OVSDB IDL code is too complicated. It's too hard to spot the > issues. I wish I saw a way to make it simpler. Oh, I forgot to say: I applied this to master.
https://mail.openvswitch.org/pipermail/ovs-dev/2019-July/360285.htmlhttps://mail.openvswitch.org/pipermail/ovs-dev/2019-July/360285.html On Thu, 18 Jul 2019, 20:12 Ben Pfaff, <blp@ovn.org> wrote: > I guess I missed it. Will you please point it out to me, for example in > the list archive? > > On Thu, Jul 18, 2019 at 06:58:34PM +0200, Damijan Skvarc wrote: > > Hmm, the problem of "parsed" flag is that it identifies "all" columns of > > certain row have been parsed, however there are CLI tools which modify > only > > individual colums by calling ovsdb_idl_txn_write_() function. In this > case > > and in case parsed flag would be set in ovsdb_idl_txn_write() function > then > > unparsing procedure would be called also for columns which were not > parsed. > > The problem could be overcome by having individual flag for each column. > > This has been addresed in pending pull request. Apparent mail has been > sent > > to dev list, but obviosly has been somehow overlooked. > > br, damijan > > > > On Thu, 18 Jul 2019, 17:52 Ben Pfaff, <blp@ovn.org> wrote: > > > > > On Wed, Jul 17, 2019 at 09:05:04PM +0200, Dumitru Ceara wrote: > > > > Once a column is set in a row using ovsdb_idl_txn_write__ we also > mark > > > > the row "parsed". Otherwise the memory allocated by > > > > sbrec_<table>_parse_<col>() will never be freed. After marking the > row > > > > "parsed", the ovsdb_idl_txn_disassemble function will properly free > the > > > > newly allocated memory for the column (ovsdb_idl_row_unparse). > > > > > > Wow, good catch. I bet that's been there forever. > > > > > > The OVSDB IDL code is too complicated. It's too hard to spot the > > > issues. I wish I saw a way to make it simpler. > > > _______________________________________________ > > > dev mailing list > > > dev@openvswitch.org > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > > >
Hi Damijan, Ben, Damijan, sorry, I didn't realize you had already reported and fixed the problem in your pending pull request. I wouldn't have sent my patch otherwise. Regarding a single parsed field instead of parsed bits per column I had the impression that it's ok if unparsing is done for all columns. Right now, whenever we set a column we call ovsdb_idl_txn_write__() which unconditionally executes "(column->unparse)(row)" even if there was no old value set before parsing the new datum. As we zero out rows when we allocate them I assumed that this unparsing is safe. What do you guys think? Thanks, Dumitru On Fri, Jul 19, 2019 at 4:50 AM Damijan Skvarc <damjan.skvarc@gmail.com> wrote: > > https://mail.openvswitch.org/pipermail/ovs-dev/2019-July/360285.htmlhttps://mail.openvswitch.org/pipermail/ovs-dev/2019-July/360285.html > > On Thu, 18 Jul 2019, 20:12 Ben Pfaff, <blp@ovn.org> wrote: >> >> I guess I missed it. Will you please point it out to me, for example in >> the list archive? >> >> On Thu, Jul 18, 2019 at 06:58:34PM +0200, Damijan Skvarc wrote: >> > Hmm, the problem of "parsed" flag is that it identifies "all" columns of >> > certain row have been parsed, however there are CLI tools which modify only >> > individual colums by calling ovsdb_idl_txn_write_() function. In this case >> > and in case parsed flag would be set in ovsdb_idl_txn_write() function then >> > unparsing procedure would be called also for columns which were not parsed. >> > The problem could be overcome by having individual flag for each column. >> > This has been addresed in pending pull request. Apparent mail has been sent >> > to dev list, but obviosly has been somehow overlooked. >> > br, damijan >> > >> > On Thu, 18 Jul 2019, 17:52 Ben Pfaff, <blp@ovn.org> wrote: >> > >> > > On Wed, Jul 17, 2019 at 09:05:04PM +0200, Dumitru Ceara wrote: >> > > > Once a column is set in a row using ovsdb_idl_txn_write__ we also mark >> > > > the row "parsed". Otherwise the memory allocated by >> > > > sbrec_<table>_parse_<col>() will never be freed. After marking the row >> > > > "parsed", the ovsdb_idl_txn_disassemble function will properly free the >> > > > newly allocated memory for the column (ovsdb_idl_row_unparse). >> > > >> > > Wow, good catch. I bet that's been there forever. >> > > >> > > The OVSDB IDL code is too complicated. It's too hard to spot the >> > > issues. I wish I saw a way to make it simpler. >> > > _______________________________________________ >> > > dev mailing list >> > > dev@openvswitch.org >> > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev >> > >
Hi Dumitru, The problem is that ovsdb_idl entity is part of library and can be used by different application, where each application instantiates its own parse/unparse callback functions. Library itself does not know how these parse/unparse functions are implemented thus it is not reliable to call unparse() function without calling apparent parse() function first. This case happens in case the application modifies one column, while common parse flag insinuates unparse() function of some another column is called. The most problematic case are parse/unparse pairs where parse() function allocates memory and its unparse() counterpart deallocates it. Common parse flag would in this case cause some memory would be freed despite it has not been allocated or even worse, some memory could be deallocated multiple times. I strongly believe this would lead soon or later to the application crash. However, since I am just a self-invited visitor on this project (just to gain/discover some practices on github platform) you should not rely on my opinion. Ben looks to be a moderator/architect of this project and he should advice how to precede with this issue. regards, Damijan On Fri, Jul 19, 2019 at 10:12 AM Dumitru Ceara <dceara@redhat.com> wrote: > Hi Damijan, Ben, > > Damijan, sorry, I didn't realize you had already reported and fixed > the problem in your pending pull request. I wouldn't have sent my > patch otherwise. > > Regarding a single parsed field instead of parsed bits per column I > had the impression that it's ok if unparsing is done for all columns. > Right now, whenever we set a column we call ovsdb_idl_txn_write__() > which unconditionally executes "(column->unparse)(row)" even if there > was no old value set before parsing the new datum. As we zero out rows > when we allocate them I assumed that this unparsing is safe. > > What do you guys think? > > Thanks, > Dumitru > > On Fri, Jul 19, 2019 at 4:50 AM Damijan Skvarc <damjan.skvarc@gmail.com> > wrote: > > > > > https://mail.openvswitch.org/pipermail/ovs-dev/2019-July/360285.htmlhttps://mail.openvswitch.org/pipermail/ovs-dev/2019-July/360285.html > > > > On Thu, 18 Jul 2019, 20:12 Ben Pfaff, <blp@ovn.org> wrote: > >> > >> I guess I missed it. Will you please point it out to me, for example in > >> the list archive? > >> > >> On Thu, Jul 18, 2019 at 06:58:34PM +0200, Damijan Skvarc wrote: > >> > Hmm, the problem of "parsed" flag is that it identifies "all" columns > of > >> > certain row have been parsed, however there are CLI tools which > modify only > >> > individual colums by calling ovsdb_idl_txn_write_() function. In this > case > >> > and in case parsed flag would be set in ovsdb_idl_txn_write() > function then > >> > unparsing procedure would be called also for columns which were not > parsed. > >> > The problem could be overcome by having individual flag for each > column. > >> > This has been addresed in pending pull request. Apparent mail has > been sent > >> > to dev list, but obviosly has been somehow overlooked. > >> > br, damijan > >> > > >> > On Thu, 18 Jul 2019, 17:52 Ben Pfaff, <blp@ovn.org> wrote: > >> > > >> > > On Wed, Jul 17, 2019 at 09:05:04PM +0200, Dumitru Ceara wrote: > >> > > > Once a column is set in a row using ovsdb_idl_txn_write__ we also > mark > >> > > > the row "parsed". Otherwise the memory allocated by > >> > > > sbrec_<table>_parse_<col>() will never be freed. After marking > the row > >> > > > "parsed", the ovsdb_idl_txn_disassemble function will properly > free the > >> > > > newly allocated memory for the column (ovsdb_idl_row_unparse). > >> > > > >> > > Wow, good catch. I bet that's been there forever. > >> > > > >> > > The OVSDB IDL code is too complicated. It's too hard to spot the > >> > > issues. I wish I saw a way to make it simpler. > >> > > _______________________________________________ > >> > > dev mailing list > >> > > dev@openvswitch.org > >> > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >> > > >
Hi Ben, If we keep this fix as is can we please have it backported all the way to and including branch-2.10? Right now it's there in master and branch 2.12. ovn-nbctl daemon mode is available since branch-2.10 and might be affected by this leak as it creates new entries in the database. Thanks, Dumitru On Mon, Jul 22, 2019 at 9:58 AM Damijan Skvarc <damjan.skvarc@gmail.com> wrote: > > Hi Dumitru, > > The problem is that ovsdb_idl entity is part of library and can be used by different application, where each application > instantiates its own parse/unparse callback functions. Library itself does not know how these parse/unparse functions are > implemented thus it is not reliable to call unparse() function without calling apparent parse() function first. This case > happens in case the application modifies one column, while common parse flag insinuates unparse() > function of some another column is called. The most problematic case are parse/unparse pairs where parse() > function allocates memory and its unparse() counterpart deallocates it. Common parse flag would in this case cause some > memory would be freed despite it has not been allocated or even worse, some memory could be deallocated multiple times. > I strongly believe this would lead soon or later to the application crash. > > However, since I am just a self-invited visitor on this project (just to gain/discover some practices on github platform) you should > not rely on my opinion. Ben looks to be a moderator/architect of this project and he should advice how to precede with this issue. > > regards, > Damijan > > > On Fri, Jul 19, 2019 at 10:12 AM Dumitru Ceara <dceara@redhat.com> wrote: >> >> Hi Damijan, Ben, >> >> Damijan, sorry, I didn't realize you had already reported and fixed >> the problem in your pending pull request. I wouldn't have sent my >> patch otherwise. >> >> Regarding a single parsed field instead of parsed bits per column I >> had the impression that it's ok if unparsing is done for all columns. >> Right now, whenever we set a column we call ovsdb_idl_txn_write__() >> which unconditionally executes "(column->unparse)(row)" even if there >> was no old value set before parsing the new datum. As we zero out rows >> when we allocate them I assumed that this unparsing is safe. >> >> What do you guys think? >> >> Thanks, >> Dumitru >> >> On Fri, Jul 19, 2019 at 4:50 AM Damijan Skvarc <damjan.skvarc@gmail.com> wrote: >> > >> > https://mail.openvswitch.org/pipermail/ovs-dev/2019-July/360285.htmlhttps://mail.openvswitch.org/pipermail/ovs-dev/2019-July/360285.html >> > >> > On Thu, 18 Jul 2019, 20:12 Ben Pfaff, <blp@ovn.org> wrote: >> >> >> >> I guess I missed it. Will you please point it out to me, for example in >> >> the list archive? >> >> >> >> On Thu, Jul 18, 2019 at 06:58:34PM +0200, Damijan Skvarc wrote: >> >> > Hmm, the problem of "parsed" flag is that it identifies "all" columns of >> >> > certain row have been parsed, however there are CLI tools which modify only >> >> > individual colums by calling ovsdb_idl_txn_write_() function. In this case >> >> > and in case parsed flag would be set in ovsdb_idl_txn_write() function then >> >> > unparsing procedure would be called also for columns which were not parsed. >> >> > The problem could be overcome by having individual flag for each column. >> >> > This has been addresed in pending pull request. Apparent mail has been sent >> >> > to dev list, but obviosly has been somehow overlooked. >> >> > br, damijan >> >> > >> >> > On Thu, 18 Jul 2019, 17:52 Ben Pfaff, <blp@ovn.org> wrote: >> >> > >> >> > > On Wed, Jul 17, 2019 at 09:05:04PM +0200, Dumitru Ceara wrote: >> >> > > > Once a column is set in a row using ovsdb_idl_txn_write__ we also mark >> >> > > > the row "parsed". Otherwise the memory allocated by >> >> > > > sbrec_<table>_parse_<col>() will never be freed. After marking the row >> >> > > > "parsed", the ovsdb_idl_txn_disassemble function will properly free the >> >> > > > newly allocated memory for the column (ovsdb_idl_row_unparse). >> >> > > >> >> > > Wow, good catch. I bet that's been there forever. >> >> > > >> >> > > The OVSDB IDL code is too complicated. It's too hard to spot the >> >> > > issues. I wish I saw a way to make it simpler. >> >> > > _______________________________________________ >> >> > > dev mailing list >> >> > > dev@openvswitch.org >> >> > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev >> >> > >
diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c index 5c61096..1a6a4af 100644 --- a/lib/ovsdb-idl.c +++ b/lib/ovsdb-idl.c @@ -4487,6 +4487,7 @@ ovsdb_idl_txn_write__(const struct ovsdb_idl_row *row_, } (column->unparse)(row); (column->parse)(row, &row->new_datum[column_idx]); + row->parsed = true; if (!index_row) { ovsdb_idl_add_to_indexes(row); }
Once a column is set in a row using ovsdb_idl_txn_write__ we also mark the row "parsed". Otherwise the memory allocated by sbrec_<table>_parse_<col>() will never be freed. After marking the row "parsed", the ovsdb_idl_txn_disassemble function will properly free the newly allocated memory for the column (ovsdb_idl_row_unparse). The problem is present only for rows that are inserted by the running application because rows that are loaded from the database will always have row->parsed == true. One way to detect the leak is to start northd with valgrind: valgrind --tool=memcheck --leak-check=yes ./ovn-northd Then create a logical switch and bind a logical port to it: ovn-nbctl ls-add ls1 ovn-nbctl lsp-add ls1 ls1-vm1 The valgrind report: ==9270== Memcheck, a memory error detector ==9270== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al. ==9270== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info ==9270== Command: ./ovn-northd ==9270== <snip> ==9270== ==9270== 8 bytes in 1 blocks are definitely lost in loss record 30 of 292 ==9270== at 0x4C29BC3: malloc (vg_replace_malloc.c:299) ==9270== by 0x4D31EF: xmalloc (util.c:138) ==9270== by 0x45CB8E: sbrec_multicast_group_parse_ports (ovn-sb-idl.c:18141) ==9270== by 0x4BB12D: ovsdb_idl_txn_write__ (ovsdb-idl.c:4489) ==9270== by 0x4BB1B5: ovsdb_idl_txn_write (ovsdb-idl.c:4527) ==9270== by 0x45D167: sbrec_multicast_group_set_ports (ovn-sb-idl.c:18561) ==9270== by 0x40F416: ovn_multicast_update_sbrec (ovn-northd.c:2947) ==9270== by 0x41FC55: build_lflows (ovn-northd.c:7981) ==9270== by 0x421830: ovnnb_db_run (ovn-northd.c:8522) ==9270== by 0x422B2D: ovn_db_run (ovn-northd.c:9089) ==9270== by 0x423909: main (ovn-northd.c:9409) ==9270== ==9270== 157 (32 direct, 125 indirect) bytes in 1 blocks are definitely lost in loss record 199 of 292 ==9270== at 0x4C29BC3: malloc (vg_replace_malloc.c:299) ==9270== by 0x4D31EF: xmalloc (util.c:138) ==9270== by 0x471E3D: resize (hmap.c:100) ==9270== by 0x4720C8: hmap_expand_at (hmap.c:175) ==9270== by 0x4C74F1: hmap_insert_at (hmap.h:277) ==9270== by 0x4C825A: smap_add__ (smap.c:392) ==9270== by 0x4C7783: smap_add (smap.c:55) ==9270== by 0x451054: sbrec_datapath_binding_parse_external_ids (ovn-sb-idl.c:7181) ==9270== by 0x4BB12D: ovsdb_idl_txn_write__ (ovsdb-idl.c:4489) ==9270== by 0x4BB1B5: ovsdb_idl_txn_write (ovsdb-idl.c:4527) ==9270== by 0x451436: sbrec_datapath_binding_set_external_ids (ovn-sb-idl.c:7444) ==9270== by 0x4090F1: ovn_datapath_update_external_ids (ovn-northd.c:817) ==9270== <snip> ==9270== ==9270== LEAK SUMMARY: ==9270== definitely lost: 1,322 bytes in 47 blocks ==9270== indirectly lost: 4,653 bytes in 240 blocks ==9270== possibly lost: 0 bytes in 0 blocks ==9270== still reachable: 254,004 bytes in 7,780 blocks ==9270== suppressed: 0 bytes in 0 blocks ==9270== Reachable blocks (those to which a pointer was found) are not shown. ==9270== To see them, rerun with: --leak-check=full --show-leak-kinds=all ==9270== ==9270== For counts of detected and suppressed errors, rerun with: -v ==9270== ERROR SUMMARY: 9 errors from 9 contexts (suppressed: 0 from 0) Signed-off-by: Dumitru Ceara <dceara@redhat.com> --- lib/ovsdb-idl.c | 1 + 1 file changed, 1 insertion(+)