Message ID | 20170624210152.8491-4-lrichard@redhat.com |
---|---|
State | Superseded |
Headers | show |
On Sat, Jun 24, 2017 at 05:01:51PM -0400, Lance Richardson wrote: > This patch adds support for the creation of multicolumn indexes > in the C IDL to enable for efficient search and retrieval of database > rows by key. > > Signed-off-by: Esteban Rodriguez Betancourt <estebarb@hpe.com> > Co-authored-by: Lance Richardson <lrichard@redhat.com> > Signed-off-by: Lance Richardson <lrichard@redhat.com> > --- > v5: - Coding style fixes (checkpatch.py) > - Fixed memory leak (missing ovsdb_datum_destroy() in > ovsdb_idl_index_destroy_row__()). > - Some polishing of comment and log message text. Thanks for reviving this series. I don't understand ovsdb_idl_index_read(). It is almost the same as ovsdb_idl_read(). It looks like ovsdb_idl_read() could be implemented as a wrapper around it, but I'm also not sure why ovsdb_idl_read() can't be used directly. Also, I don't understand its comment about "index_set functions", since there are no functions with index_set in their names. ovsdb_idl_index_write_() is mostly copy-paste of a part of ovsdb_idl_txn_write__(), so to avoid code duplication it would be best to factor that code out of ovsdb_idl_txn_write__() and call it from both places. I don't understand the behavior of ovsdb_idl_index_find() and ovsdb_idl_index_forward_to() for a null 'value' argument. Is there some idiomatic usage where the null and nonnull behaviors work out nicely? The row_sync behavior is confusing. I remember it slightly from the previous iteration but I don't remember being convinced it was the best way to do things. Thanks, Ben.
> From: "Ben Pfaff" <blp@ovn.org> > To: "Lance Richardson" <lrichard@redhat.com> > Cc: dev@openvswitch.org, estebarb@hpe.com, "javier albornoz" <javier.albornoz@hpe.com>, "jorge sauma" > <jorge.sauma@hpe.com>, "arnoldo lutz guevara" <arnoldo.lutz.guevara@hpe.com> > Sent: Tuesday, 11 July, 2017 5:05:32 PM > Subject: Re: [ovs-dev] [PATCH v5 3/4] ovsdb-idl: idl compound indexes implementation > > On Sat, Jun 24, 2017 at 05:01:51PM -0400, Lance Richardson wrote: > > This patch adds support for the creation of multicolumn indexes > > in the C IDL to enable for efficient search and retrieval of database > > rows by key. > > > > Signed-off-by: Esteban Rodriguez Betancourt <estebarb@hpe.com> > > Co-authored-by: Lance Richardson <lrichard@redhat.com> > > Signed-off-by: Lance Richardson <lrichard@redhat.com> > > --- > > v5: - Coding style fixes (checkpatch.py) > > - Fixed memory leak (missing ovsdb_datum_destroy() in > > ovsdb_idl_index_destroy_row__()). > > - Some polishing of comment and log message text. > > Thanks for reviving this series. > > I don't understand ovsdb_idl_index_read(). It is almost the same as > ovsdb_idl_read(). It looks like ovsdb_idl_read() could be implemented > as a wrapper around it, but I'm also not sure why ovsdb_idl_read() can't > be used directly. Also, I don't understand its comment about "index_set > functions", since there are no functions with index_set in their names. > I would guess that the original reason had to do with the fact that ovsdb_idl_index_read() needs to deal with "real" rows in the db replica as well as external row instances created by ovsdb_idl_index_init_row() to contain the key values being searched for. Looking at the code a little more closely, I see that ovsdb_idl_index_init_row() sets row->table, so the key row structure would not be considered "synthetic" according to ovsdb_idl_row_is_synthetic(). So: I think you are correct, we should be able to use ovsdb_idl_read() and eliminate ovsdb_idl_index_read(). > ovsdb_idl_index_write_() is mostly copy-paste of a part of > ovsdb_idl_txn_write__(), so to avoid code duplication it would be best > to factor that code out of ovsdb_idl_txn_write__() and call it from both > places. > Hmm, I just noticed that this is v5, v6 seems to be missing in patchwork. The v6 version of this function is a bit more streamlined, and avoids calling ovsdb_datum_destroy() so that a row structure containing a key with a string type can avoid having to xstrdup() key strings: void ovsdb_idl_index_write_(struct ovsdb_idl_row *const_row, const struct ovsdb_idl_column *column, struct ovsdb_datum *datum, const struct ovsdb_idl_table_class *class) { struct ovsdb_idl_row *row = CONST_CAST(struct ovsdb_idl_row *, const_row); size_t column_idx = column - class->columns; if (bitmap_is_set(row->written, column_idx)) { free(row->new[column_idx].values); free(row->new[column_idx].keys); } else { bitmap_set1(row->written, column_idx); } row->new[column_idx] = *datum; (column->unparse)(row); (column->parse)(row, &row->new[column_idx]); } So there's a little less commonality between the two. > I don't understand the behavior of ovsdb_idl_index_find() and > ovsdb_idl_index_forward_to() for a null 'value' argument. Is there some > idiomatic usage where the null and nonnull behaviors work out nicely? > I don't know what the intended use was, but there doesn't currently seem to be a use of the null 'value' case (generated code always passes &data->header_ for this parameter, which could be null if 'data' is null, but that seems pretty fragile if intentional). I'll see if the special-case handling of null values can be eliminated. > The row_sync behavior is confusing. I remember it slightly from the > previous iteration but I don't remember being convinced it was the best > way to do things. > Hmm, that does seem strange/confusing. I will have to spend more time studying that bit of conde. > Thanks, > > Ben. >
----- Original Message ----- > From: "Ben Pfaff" <blp@ovn.org> > To: "Lance Richardson" <lrichard@redhat.com> > Cc: dev@openvswitch.org, estebarb@hpe.com, "javier albornoz" <javier.albornoz@hpe.com>, "jorge sauma" > <jorge.sauma@hpe.com>, "arnoldo lutz guevara" <arnoldo.lutz.guevara@hpe.com> > Sent: Tuesday, 11 July, 2017 5:05:32 PM > Subject: Re: [ovs-dev] [PATCH v5 3/4] ovsdb-idl: idl compound indexes implementation > > On Sat, Jun 24, 2017 at 05:01:51PM -0400, Lance Richardson wrote: > > This patch adds support for the creation of multicolumn indexes > > in the C IDL to enable for efficient search and retrieval of database > > rows by key. > > > > Signed-off-by: Esteban Rodriguez Betancourt <estebarb@hpe.com> > > Co-authored-by: Lance Richardson <lrichard@redhat.com> > > Signed-off-by: Lance Richardson <lrichard@redhat.com> > > --- > > v5: - Coding style fixes (checkpatch.py) > > - Fixed memory leak (missing ovsdb_datum_destroy() in > > ovsdb_idl_index_destroy_row__()). > > - Some polishing of comment and log message text. > > Thanks for reviving this series. > > I don't understand ovsdb_idl_index_read(). It is almost the same as > ovsdb_idl_read(). It looks like ovsdb_idl_read() could be implemented > as a wrapper around it, but I'm also not sure why ovsdb_idl_read() can't > be used directly. Also, I don't understand its comment about "index_set > functions", since there are no functions with index_set in their names. Hi Ben, I neglected to respond to the comment about "index_set functions" in my previous response, these are automatically generated for the IDL by the next patch. I had deleted that comment in v6 of this series (which is in the ml archive but not patchwork for some reason), it didn't seem useful to me. > > ovsdb_idl_index_write_() is mostly copy-paste of a part of > ovsdb_idl_txn_write__(), so to avoid code duplication it would be best > to factor that code out of ovsdb_idl_txn_write__() and call it from both > places. > > I don't understand the behavior of ovsdb_idl_index_find() and > ovsdb_idl_index_forward_to() for a null 'value' argument. Is there some > idiomatic usage where the null and nonnull behaviors work out nicely? > > The row_sync behavior is confusing. I remember it slightly from the > previous iteration but I don't remember being convinced it was the best > way to do things. > The "row_sync" stuff seems to be a matter of using additional keys (row uuid values first, then memory addresses) when inserting or deleting a row in order to handle duplicate keys. This still seems a bit strange, but after thinking about it a bit I believe it is a reasonable approach; these extra keys aren't used when searching, so we will always get the first entry in the list when there are multiple rows with the same key, and when inserting/deleting we still get O(log N) performance when there are many entries with identical keys by searching on the UUID value. But "row_sync" isn't at all useful in figuring out the above, nor are any of the comments in this area. I also wonder if the comparison on memory address is needed... is there any real possibility that two rows in a table will have the same UUID? Do you think reworking the "row_sync" name and adding better comments would be an acceptable way forward here? Thanks, Lance > Thanks, > > Ben. >
Hello, The UUID comparison was added to guarantee that always rows with the same keys are sorted in the same way (between executions of a command, for example). Before that we used the pointer comparison (which gives us some randomness in sorting) but we kept it to be really really sure that we are deleting/inserting the correct row. The row sync effectively was intended for finding the correct row to delete when there are duplicated "index keys" and there are deletions/updates/inserts (note that the updates are really handled as a delete-and-reinsert). We copied ovsdb_idl_index_read from ovsdb_idl_read because we were concerned of what would happen if somebody access the indexes when a transaction is being made. If the index read the new values instead of the old one then the rows could be lost/wrongly sorted, etc. It is the same story with ovsdb_idl_index_write_() and index_set: we wanted the indexes to behave correctly during a transaction with new changes. ovsdb_idl_index_{find,forward_to} are used inside the autogenerated functions for iterating over the indexes. We thought that it would be nice to be able to say things like: OVSREC_EXAMPLE_FOR_EACH_RANGE(row, cursor, start, end) /* [start, end] */ OVSREC_EXAMPLE_FOR_EACH_RANGE(row, cursor, NULL, end) /* [0, end] */ OVSREC_EXAMPLE_FOR_EACH_RANGE(row, cursor, start, NULL) /* [start, "+infty"] */ OVSREC_EXAMPLE_FOR_EACH_RANGE(row, cursor, NULL, NULL) /* everything */ Regards, Esteban -----Original Message----- From: Lance Richardson [mailto:lrichard@redhat.com] Sent: jueves, 13 de julio de 2017 14:29 To: Ben Pfaff <blp@ovn.org> Cc: dev@openvswitch.org; Rodriguez Betancourt, Esteban <estebarb@hpe.com>; Albornoz, Javier <javier.albornoz@hpe.com>; jorge sauma <jorge.sauma@hpe.com>; Lutz, Arnoldo <arnoldo.lutz.guevara@hpe.com> Subject: Re: [ovs-dev] [PATCH v5 3/4] ovsdb-idl: idl compound indexes implementation ----- Original Message ----- > From: "Ben Pfaff" <blp@ovn.org> > To: "Lance Richardson" <lrichard@redhat.com> > Cc: dev@openvswitch.org, estebarb@hpe.com, "javier albornoz" <javier.albornoz@hpe.com>, "jorge sauma" > <jorge.sauma@hpe.com>, "arnoldo lutz guevara" > <arnoldo.lutz.guevara@hpe.com> > Sent: Tuesday, 11 July, 2017 5:05:32 PM > Subject: Re: [ovs-dev] [PATCH v5 3/4] ovsdb-idl: idl compound indexes > implementation > > On Sat, Jun 24, 2017 at 05:01:51PM -0400, Lance Richardson wrote: > > This patch adds support for the creation of multicolumn indexes in > > the C IDL to enable for efficient search and retrieval of database > > rows by key. > > > > Signed-off-by: Esteban Rodriguez Betancourt <estebarb@hpe.com> > > Co-authored-by: Lance Richardson <lrichard@redhat.com> > > Signed-off-by: Lance Richardson <lrichard@redhat.com> > > --- > > v5: - Coding style fixes (checkpatch.py) > > - Fixed memory leak (missing ovsdb_datum_destroy() in > > ovsdb_idl_index_destroy_row__()). > > - Some polishing of comment and log message text. > > Thanks for reviving this series. > > I don't understand ovsdb_idl_index_read(). It is almost the same as > ovsdb_idl_read(). It looks like ovsdb_idl_read() could be implemented > as a wrapper around it, but I'm also not sure why ovsdb_idl_read() > can't be used directly. Also, I don't understand its comment about > "index_set functions", since there are no functions with index_set in their names. Hi Ben, I neglected to respond to the comment about "index_set functions" in my previous response, these are automatically generated for the IDL by the next patch. I had deleted that comment in v6 of this series (which is in the ml archive but not patchwork for some reason), it didn't seem useful to me. > > ovsdb_idl_index_write_() is mostly copy-paste of a part of > ovsdb_idl_txn_write__(), so to avoid code duplication it would be best > to factor that code out of ovsdb_idl_txn_write__() and call it from > both places. > > I don't understand the behavior of ovsdb_idl_index_find() and > ovsdb_idl_index_forward_to() for a null 'value' argument. Is there > some idiomatic usage where the null and nonnull behaviors work out nicely? > > The row_sync behavior is confusing. I remember it slightly from the > previous iteration but I don't remember being convinced it was the > best way to do things. > The "row_sync" stuff seems to be a matter of using additional keys (row uuid values first, then memory addresses) when inserting or deleting a row in order to handle duplicate keys. This still seems a bit strange, but after thinking about it a bit I believe it is a reasonable approach; these extra keys aren't used when searching, so we will always get the first entry in the list when there are multiple rows with the same key, and when inserting/deleting we still get O(log N) performance when there are many entries with identical keys by searching on the UUID value. But "row_sync" isn't at all useful in figuring out the above, nor are any of the comments in this area. I also wonder if the comparison on memory address is needed... is there any real possibility that two rows in a table will have the same UUID? Do you think reworking the "row_sync" name and adding better comments would be an acceptable way forward here? Thanks, Lance > Thanks, > > Ben. >
> From: "Rodriguez Betancourt, Esteban" <estebarb@hpe.com> > To: "Lance Richardson" <lrichard@redhat.com>, "Ben Pfaff" <blp@ovn.org> > Cc: dev@openvswitch.org, "Javier Albornoz" <javier.albornoz@hpe.com>, "Arnoldo Lutz" <arnoldo.lutz.guevara@hpe.com> > Sent: Thursday, 13 July, 2017 5:22:42 PM > Subject: RE: [ovs-dev] [PATCH v5 3/4] ovsdb-idl: idl compound indexes implementation > > Hello, > The UUID comparison was added to guarantee that always rows with the same > keys are > sorted in the same way (between executions of a command, for example). Before > that we > used the pointer comparison (which gives us some randomness in sorting) but > we kept it > to be really really sure that we are deleting/inserting the correct row. > > The row sync effectively was intended for finding the correct row to delete > when there are duplicated "index keys" and there are > deletions/updates/inserts (note that the updates are really handled as a > delete-and-reinsert). Hi Esteban, Thanks, I think all of that makes sense. I do wonder, though, since UUIDs should be unique whether it would make just as much to assert that the addresses are equal if the UUIDs are equal. > > We copied ovsdb_idl_index_read from ovsdb_idl_read because we were concerned > of what would happen if somebody > access the indexes when a transaction is being made. If the index read the > new values instead of the old one then > the rows could be lost/wrongly sorted, etc. It is the same story with > ovsdb_idl_index_write_() and index_set: we > wanted the indexes to behave correctly during a transaction with new changes. > Comparing the v4 version of ovsdb_idl_index_read() against the current ovsdb_idl_read(), the only difference between the two that I can find is that ovsdb_idl_index_read() takes the table class as a function parameter while ovsdb_idl_read() expects row->table to be non-NULL and takes the table class from row->table->class. Since ovsdb_idl_index_init_row() initializes row->table appropriately, ovsdb_idl_read() should produce exactly the same result as ovsdb_idl_index_read(), whether a transaction on that row is in progress or not. So it seems to me we should be able to get rid of ovsdb_idl_index_read(). > ovsdb_idl_index_{find,forward_to} are used inside the autogenerated functions > for iterating over the indexes. > We thought that it would be nice to be able to say things like: > > OVSREC_EXAMPLE_FOR_EACH_RANGE(row, cursor, start, end) /* [start, end] */ > OVSREC_EXAMPLE_FOR_EACH_RANGE(row, cursor, NULL, end) /* [0, end] */ > OVSREC_EXAMPLE_FOR_EACH_RANGE(row, cursor, start, NULL) /* [start, "+infty"] > */ > OVSREC_EXAMPLE_FOR_EACH_RANGE(row, cursor, NULL, NULL) /* everything */ > > Regards, > Esteban Thanks, that makes sense. I think it would be good to add tests for these cases. Thanks again! Lance > > -----Original Message----- > From: Lance Richardson [mailto:lrichard@redhat.com] > Sent: jueves, 13 de julio de 2017 14:29 > To: Ben Pfaff <blp@ovn.org> > Cc: dev@openvswitch.org; Rodriguez Betancourt, Esteban <estebarb@hpe.com>; > Albornoz, Javier <javier.albornoz@hpe.com>; jorge sauma > <jorge.sauma@hpe.com>; Lutz, Arnoldo <arnoldo.lutz.guevara@hpe.com> > Subject: Re: [ovs-dev] [PATCH v5 3/4] ovsdb-idl: idl compound indexes > implementation > > > > ----- Original Message ----- > > From: "Ben Pfaff" <blp@ovn.org> > > To: "Lance Richardson" <lrichard@redhat.com> > > Cc: dev@openvswitch.org, estebarb@hpe.com, "javier albornoz" > > <javier.albornoz@hpe.com>, "jorge sauma" > > <jorge.sauma@hpe.com>, "arnoldo lutz guevara" > > <arnoldo.lutz.guevara@hpe.com> > > Sent: Tuesday, 11 July, 2017 5:05:32 PM > > Subject: Re: [ovs-dev] [PATCH v5 3/4] ovsdb-idl: idl compound indexes > > implementation > > > > On Sat, Jun 24, 2017 at 05:01:51PM -0400, Lance Richardson wrote: > > > This patch adds support for the creation of multicolumn indexes in > > > the C IDL to enable for efficient search and retrieval of database > > > rows by key. > > > > > > Signed-off-by: Esteban Rodriguez Betancourt <estebarb@hpe.com> > > > Co-authored-by: Lance Richardson <lrichard@redhat.com> > > > Signed-off-by: Lance Richardson <lrichard@redhat.com> > > > --- > > > v5: - Coding style fixes (checkpatch.py) > > > - Fixed memory leak (missing ovsdb_datum_destroy() in > > > ovsdb_idl_index_destroy_row__()). > > > - Some polishing of comment and log message text. > > > > Thanks for reviving this series. > > > > I don't understand ovsdb_idl_index_read(). It is almost the same as > > ovsdb_idl_read(). It looks like ovsdb_idl_read() could be implemented > > as a wrapper around it, but I'm also not sure why ovsdb_idl_read() > > can't be used directly. Also, I don't understand its comment about > > "index_set functions", since there are no functions with index_set in their > > names. > > Hi Ben, > > I neglected to respond to the comment about "index_set functions" in my > previous response, these are automatically generated for the IDL by the next > patch. I had deleted that comment in v6 of this series (which is in the ml > archive but not patchwork for some reason), it didn't seem useful to me. > > > > > ovsdb_idl_index_write_() is mostly copy-paste of a part of > > ovsdb_idl_txn_write__(), so to avoid code duplication it would be best > > to factor that code out of ovsdb_idl_txn_write__() and call it from > > both places. > > > > I don't understand the behavior of ovsdb_idl_index_find() and > > ovsdb_idl_index_forward_to() for a null 'value' argument. Is there > > some idiomatic usage where the null and nonnull behaviors work out nicely? > > > > The row_sync behavior is confusing. I remember it slightly from the > > previous iteration but I don't remember being convinced it was the > > best way to do things. > > > > The "row_sync" stuff seems to be a matter of using additional keys (row uuid > values first, then memory addresses) when inserting or deleting a row in > order to handle duplicate keys. > > This still seems a bit strange, but after thinking about it a bit I believe > it is a reasonable approach; these extra keys aren't used when searching, so > we will always get the first entry in the list when there are multiple rows > with the same key, and when inserting/deleting we still get O(log N) > performance when there are many entries with identical keys by searching on > the UUID value. > > But "row_sync" isn't at all useful in figuring out the above, nor are any of > the comments in this area. > > I also wonder if the comparison on memory address is needed... is there any > real possibility that two rows in a table will have the same UUID? > > Do you think reworking the "row_sync" name and adding better comments would > be an acceptable way forward here? > > Thanks, > > Lance > > > > Thanks, > > > > Ben. > > >
I think that is ok to remove that last pointer check in the generic comparison function. UUID is more than enough. Regards, Esteban > -----Original Message----- > From: Lance Richardson [mailto:lrichard@redhat.com] > Sent: jueves, 13 de julio de 2017 16:48 > To: Rodriguez Betancourt, Esteban <estebarb@hpe.com> > Cc: Ben Pfaff <blp@ovn.org>; dev@openvswitch.org; Albornoz, Javier > <javier.albornoz@hpe.com>; Lutz, Arnoldo > <arnoldo.lutz.guevara@hpe.com> > Subject: Re: [ovs-dev] [PATCH v5 3/4] ovsdb-idl: idl compound indexes > implementation > > > From: "Rodriguez Betancourt, Esteban" <estebarb@hpe.com> > > To: "Lance Richardson" <lrichard@redhat.com>, "Ben Pfaff" > > <blp@ovn.org> > > Cc: dev@openvswitch.org, "Javier Albornoz" <javier.albornoz@hpe.com>, > > "Arnoldo Lutz" <arnoldo.lutz.guevara@hpe.com> > > Sent: Thursday, 13 July, 2017 5:22:42 PM > > Subject: RE: [ovs-dev] [PATCH v5 3/4] ovsdb-idl: idl compound indexes > > implementation > > > > Hello, > > The UUID comparison was added to guarantee that always rows with the > > same keys are sorted in the same way (between executions of a > command, > > for example). Before that we used the pointer comparison (which gives > > us some randomness in sorting) but we kept it to be really really > > sure that we are deleting/inserting the correct row. > > > > The row sync effectively was intended for finding the correct row to > > delete when there are duplicated "index keys" and there are > > deletions/updates/inserts (note that the updates are really handled as > > a delete-and-reinsert). > > Hi Esteban, > > Thanks, I think all of that makes sense. I do wonder, though, since UUIDs > should be unique whether it would make just as much to assert that the > addresses are equal if the UUIDs are equal. > > > > > We copied ovsdb_idl_index_read from ovsdb_idl_read because we were > > concerned of what would happen if somebody access the indexes when a > > transaction is being made. If the index read the new values instead of > > the old one then the rows could be lost/wrongly sorted, etc. It is the > > same story with > > ovsdb_idl_index_write_() and index_set: we wanted the indexes to > > behave correctly during a transaction with new changes. > > > > Comparing the v4 version of ovsdb_idl_index_read() against the current > ovsdb_idl_read(), the only difference between the two that I can find is that > ovsdb_idl_index_read() takes the table class as a function parameter while > ovsdb_idl_read() expects row->table to be non-NULL and takes the table > class from row->table->class. Since ovsdb_idl_index_init_row() initializes > row->table appropriately, ovsdb_idl_read() should produce exactly the same > result as ovsdb_idl_index_read(), whether a transaction on that row is in > progress or not. So it seems to me we should be able to get rid of > ovsdb_idl_index_read(). > > > ovsdb_idl_index_{find,forward_to} are used inside the autogenerated > > functions for iterating over the indexes. > > We thought that it would be nice to be able to say things like: > > > > OVSREC_EXAMPLE_FOR_EACH_RANGE(row, cursor, start, end) /* [start, > end] > > */ OVSREC_EXAMPLE_FOR_EACH_RANGE(row, cursor, NULL, end) /* [0, > end] > > */ OVSREC_EXAMPLE_FOR_EACH_RANGE(row, cursor, start, NULL) /* > [start, > > "+infty"] */ OVSREC_EXAMPLE_FOR_EACH_RANGE(row, cursor, NULL, > NULL) /* > > everything */ > > > > Regards, > > Esteban > > Thanks, that makes sense. I think it would be good to add tests for these > cases. > > Thanks again! > > Lance > > > > > -----Original Message----- > > From: Lance Richardson [mailto:lrichard@redhat.com] > > Sent: jueves, 13 de julio de 2017 14:29 > > To: Ben Pfaff <blp@ovn.org> > > Cc: dev@openvswitch.org; Rodriguez Betancourt, Esteban > > <estebarb@hpe.com>; Albornoz, Javier <javier.albornoz@hpe.com>; jorge > > sauma <jorge.sauma@hpe.com>; Lutz, Arnoldo > > <arnoldo.lutz.guevara@hpe.com> > > Subject: Re: [ovs-dev] [PATCH v5 3/4] ovsdb-idl: idl compound indexes > > implementation > > > > > > > > ----- Original Message ----- > > > From: "Ben Pfaff" <blp@ovn.org> > > > To: "Lance Richardson" <lrichard@redhat.com> > > > Cc: dev@openvswitch.org, estebarb@hpe.com, "javier albornoz" > > > <javier.albornoz@hpe.com>, "jorge sauma" > > > <jorge.sauma@hpe.com>, "arnoldo lutz guevara" > > > <arnoldo.lutz.guevara@hpe.com> > > > Sent: Tuesday, 11 July, 2017 5:05:32 PM > > > Subject: Re: [ovs-dev] [PATCH v5 3/4] ovsdb-idl: idl compound > > > indexes implementation > > > > > > On Sat, Jun 24, 2017 at 05:01:51PM -0400, Lance Richardson wrote: > > > > This patch adds support for the creation of multicolumn indexes in > > > > the C IDL to enable for efficient search and retrieval of database > > > > rows by key. > > > > > > > > Signed-off-by: Esteban Rodriguez Betancourt <estebarb@hpe.com> > > > > Co-authored-by: Lance Richardson <lrichard@redhat.com> > > > > Signed-off-by: Lance Richardson <lrichard@redhat.com> > > > > --- > > > > v5: - Coding style fixes (checkpatch.py) > > > > - Fixed memory leak (missing ovsdb_datum_destroy() in > > > > ovsdb_idl_index_destroy_row__()). > > > > - Some polishing of comment and log message text. > > > > > > Thanks for reviving this series. > > > > > > I don't understand ovsdb_idl_index_read(). It is almost the same as > > > ovsdb_idl_read(). It looks like ovsdb_idl_read() could be > > > implemented as a wrapper around it, but I'm also not sure why > > > ovsdb_idl_read() can't be used directly. Also, I don't understand > > > its comment about "index_set functions", since there are no > > > functions with index_set in their names. > > > > Hi Ben, > > > > I neglected to respond to the comment about "index_set functions" in > > my previous response, these are automatically generated for the IDL by > > the next patch. I had deleted that comment in v6 of this series (which > > is in the ml archive but not patchwork for some reason), it didn't seem > useful to me. > > > > > > > > ovsdb_idl_index_write_() is mostly copy-paste of a part of > > > ovsdb_idl_txn_write__(), so to avoid code duplication it would be > > > best to factor that code out of ovsdb_idl_txn_write__() and call it > > > from both places. > > > > > > I don't understand the behavior of ovsdb_idl_index_find() and > > > ovsdb_idl_index_forward_to() for a null 'value' argument. Is there > > > some idiomatic usage where the null and nonnull behaviors work out > nicely? > > > > > > The row_sync behavior is confusing. I remember it slightly from the > > > previous iteration but I don't remember being convinced it was the > > > best way to do things. > > > > > > > The "row_sync" stuff seems to be a matter of using additional keys > > (row uuid values first, then memory addresses) when inserting or > > deleting a row in order to handle duplicate keys. > > > > This still seems a bit strange, but after thinking about it a bit I > > believe it is a reasonable approach; these extra keys aren't used when > > searching, so we will always get the first entry in the list when > > there are multiple rows with the same key, and when inserting/deleting > > we still get O(log N) performance when there are many entries with > > identical keys by searching on the UUID value. > > > > But "row_sync" isn't at all useful in figuring out the above, nor are > > any of the comments in this area. > > > > I also wonder if the comparison on memory address is needed... is > > there any real possibility that two rows in a table will have the same UUID? > > > > Do you think reworking the "row_sync" name and adding better comments > > would be an acceptable way forward here? > > > > Thanks, > > > > Lance > > > > > > > Thanks, > > > > > > Ben. > > > > >
> From: "Rodriguez Betancourt, Esteban" <estebarb@hpe.com> > To: "Lance Richardson" <lrichard@redhat.com> > Cc: "Ben Pfaff" <blp@ovn.org>, dev@openvswitch.org, "Javier Albornoz" <javier.albornoz@hpe.com>, "Arnoldo Lutz" > <arnoldo.lutz.guevara@hpe.com> > Sent: Friday, 14 July, 2017 5:27:44 PM > Subject: RE: [ovs-dev] [PATCH v5 3/4] ovsdb-idl: idl compound indexes implementation > > I think that is ok to remove that last pointer check in the generic > comparison function. UUID is more than enough. > Regards, > Esteban > Thanks, Esteban. I think this will be a very useful facility, many thanks to you and the others who have worked on it. I've posted a v7 of the patch series a little while ago, in patchwork here (removing the pointer check in the comparison function will be a v8 item): https://patchwork.ozlabs.org/project/openvswitch/list/?submitter=67850 Regards, Lance
diff --git a/lib/ovsdb-idl-provider.h b/lib/ovsdb-idl-provider.h index 8cfbb95..48a732d 100644 --- a/lib/ovsdb-idl-provider.h +++ b/lib/ovsdb-idl-provider.h @@ -1,4 +1,5 @@ /* Copyright (c) 2009, 2010, 2011, 2012, 2016 Nicira, Inc. + * Copyright (C) 2016 Hewlett Packard Enterprise Development LP * * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License. @@ -111,6 +112,7 @@ struct ovsdb_idl_table { struct hmap rows; /* Contains "struct ovsdb_idl_row"s. */ struct ovsdb_idl *idl; /* Containing idl. */ unsigned int change_seqno[OVSDB_IDL_CHANGE_MAX]; + struct shash indexes; /* Contains "struct ovsdb_idl_index"s */ struct ovs_list track_list; /* Tracked rows (ovsdb_idl_row.track_node). */ struct ovsdb_idl_condition condition; bool cond_changed; @@ -122,6 +124,33 @@ struct ovsdb_idl_class { size_t n_tables; }; +/* + * Contains the configuration per column of the index + */ +struct ovsdb_idl_index_column { + const struct ovsdb_idl_column *column; + column_comparator *comparer; + int sorting_order; +}; + +/* + * Defines a IDL compound index + */ +struct ovsdb_idl_index { + struct skiplist *skiplist; /* Skiplist with pointer to + * rows*/ + struct ovsdb_idl_index_column *columns; /* Columns configuration */ + size_t n_columns; /* Number of columns in index */ + size_t alloc_columns; /* Size allocated memory for + * columns, comparers and + * sorting order */ + bool row_sync; /* Determines if the replica + * is modifying its content or + * not */ + const struct ovsdb_idl_table *table; /* Table that owns this index */ + const char *index_name; /* The name of this index */ +}; + struct ovsdb_idl_row *ovsdb_idl_get_row_arc( struct ovsdb_idl_row *src, const struct ovsdb_idl_table_class *dst_table, diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c index 893143c..6b51826 100644 --- a/lib/ovsdb-idl.c +++ b/lib/ovsdb-idl.c @@ -1,4 +1,5 @@ /* Copyright (c) 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017 Nicira, Inc. + * Copyright (C) 2016 Hewlett Packard Enterprise Development LP * * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License. @@ -37,8 +38,10 @@ #include "ovsdb-parser.h" #include "poll-loop.h" #include "openvswitch/shash.h" +#include "skiplist.h" #include "sset.h" #include "util.h" +#include "uuid.h" #include "openvswitch/vlog.h" VLOG_DEFINE_THIS_MODULE(ovsdb_idl); @@ -228,6 +231,14 @@ ovsdb_idl_table_from_class(const struct ovsdb_idl *, static bool ovsdb_idl_track_is_set(struct ovsdb_idl_table *table); static void ovsdb_idl_send_cond_change(struct ovsdb_idl *idl); +static struct ovsdb_idl_index *ovsdb_idl_create_index_(const struct + ovsdb_idl_table *table, + size_t allocated_cols); +static void + ovsdb_idl_destroy_indexes(struct ovsdb_idl_table *table); +static void ovsdb_idl_add_to_indexes(const struct ovsdb_idl_row *); +static void ovsdb_idl_remove_from_indexes(const struct ovsdb_idl_row *); + /* Creates and returns a connection to database 'remote', which should be in a * form acceptable to jsonrpc_session_open(). The connection will maintain an * in-memory replica of the remote database whose schema is described by @@ -274,6 +285,7 @@ ovsdb_idl_create(const char *remote, const struct ovsdb_idl_class *class, memset(table->modes, default_mode, tc->n_columns); table->need_table = false; shash_init(&table->columns); + shash_init(&table->indexes); for (j = 0; j < tc->n_columns; j++) { const struct ovsdb_idl_column *column = &tc->columns[j]; @@ -329,6 +341,7 @@ ovsdb_idl_destroy(struct ovsdb_idl *idl) for (i = 0; i < idl->class->n_tables; i++) { struct ovsdb_idl_table *table = &idl->tables[i]; ovsdb_idl_condition_destroy(&table->condition); + ovsdb_idl_destroy_indexes(table); shash_destroy(&table->columns); hmap_destroy(&table->rows); free(table->modes); @@ -364,6 +377,7 @@ ovsdb_idl_clear(struct ovsdb_idl *idl) struct ovsdb_idl_arc *arc, *next_arc; if (!ovsdb_idl_row_is_orphan(row)) { + ovsdb_idl_remove_from_indexes(row); ovsdb_idl_row_unparse(row); } LIST_FOR_EACH_SAFE (arc, next_arc, src_node, &row->src_arcs) { @@ -2027,6 +2041,449 @@ ovsdb_idl_row_unparse(struct ovsdb_idl_row *row) } } +/* + * The OVSDB-IDL Compound Indexes feature allows to create custom + * indexes over several columns in the IDL. + * With those index, the developer can retrieve rows matching a search + * criteria, or show the rows sorted in a specific way. + */ + +/* Reads and returns the value of 'column' within 'row'. If the value + * was set with index_set functions then the modified value is returned. + * + * The caller must not modify or free the returned value. + * + */ +static const struct ovsdb_datum * +ovsdb_idl_index_read(const struct ovsdb_idl_row *row, + const struct ovsdb_idl_column *column, + const struct ovsdb_idl_table_class *class) +{ + size_t column_idx; + + column_idx = column - class->columns; + + ovs_assert(row->new != NULL); + ovs_assert(column_idx < class->n_columns); + + if (row->written && bitmap_is_set(row->written, column_idx)) { + return &row->new[column_idx]; + } else if (row->old) { + return &row->old[column_idx]; + } else { + return ovsdb_datum_default(&column->type); + } +} + +/* + * Creates a new index, that is attached to the given idl and table. + * The index has the given name. + * All the indexes must be created before the first ovsdb_idl_run is + * executed. + */ +struct ovsdb_idl_index * +ovsdb_idl_create_index(struct ovsdb_idl *idl, + const struct ovsdb_idl_table_class *tc, + const char *index_name) +{ + struct ovsdb_idl_index *index; + size_t i; + + for (i = 0; i < idl->class->n_tables; i++) { + struct ovsdb_idl_table *table = &idl->tables[i]; + + if (table->class == tc) { + index = ovsdb_idl_create_index_(table, 1); + if (!shash_add_once(&table->indexes, index_name, index)) { + VLOG_ERR("Duplicate index name '%s' in table %s", + index_name, table->class->name); + return NULL; + } + index->index_name = index_name; + return index; + } + } + OVS_NOT_REACHED(); + return NULL; +} + +/* + * Generic comparator that can compare each index, using the custom + * configuration (an struct ovsdb_idl_index) passed to it. + * Not intended for direct usage. + */ +static int +ovsdb_idl_index_generic_comparer(const void *a, + const void *b, const void *conf) +{ + const struct ovsdb_idl_column *column; + const struct ovsdb_idl_index *index; + size_t i; + + index = CONST_CAST(struct ovsdb_idl_index *, conf); + + if (a == b) { + return 0; + } + + for (i = 0; i < index->n_columns; i++) { + int val; + if (index->columns[i].comparer) { + val = index->columns[i].comparer(a, b); + } else { + column = index->columns[i].column; + const struct ovsdb_idl_row *row_a, *row_b; + row_a = CONST_CAST(struct ovsdb_idl_row *, a); + row_b = CONST_CAST(struct ovsdb_idl_row *, b); + const struct ovsdb_datum *datum_a, *datum_b; + datum_a = ovsdb_idl_index_read(row_a, column, index->table->class); + datum_b = ovsdb_idl_index_read(row_b, column, index->table->class); + val = ovsdb_datum_compare_3way(datum_a, datum_b, &column->type); + } + + if (val) { + return val * index->columns[i].sorting_order; + } + } + + /* + * If row_sync is true then the IDL is synchronizing the replica's + * rows with the ones stored in the index. In this case is necessary + * to compare also by pointer value (eg: so the correct row is removed). + * In any other case (the user is doing a search) the values are + * already equal, so return 0. + * Also, the pointers obviously are random, so in different IDLs of the + * same OVSDB instance the index could have different ordering. + * Comparing first by UUID can guarantee the same order at any IDL. + */ + if (index->row_sync) { + const struct ovsdb_idl_row *row_a, *row_b; + + row_a = (const struct ovsdb_idl_row *) a; + row_b = (const struct ovsdb_idl_row *) b; + int value = uuid_compare_3way(&row_a->uuid, &row_b->uuid); + + return value ? value : (a < b) - (a > b); + } else { + return 0; + } +} + +static struct ovsdb_idl_index * +ovsdb_idl_create_index_(const struct ovsdb_idl_table *table, + size_t allocated_cols) +{ + struct ovsdb_idl_index *index; + + index = xmalloc(sizeof (struct ovsdb_idl_index)); + index->n_columns = 0; + index->alloc_columns = allocated_cols; + index->skiplist = skiplist_create(ovsdb_idl_index_generic_comparer, index); + index->columns = xmalloc(allocated_cols * + sizeof (struct ovsdb_idl_index_column)); + index->row_sync = false; + index->table = table; + return index; +} + +static void +ovsdb_idl_destroy_indexes(struct ovsdb_idl_table *table) +{ + struct ovsdb_idl_index *index; + struct shash_node *node; + + SHASH_FOR_EACH (node, &(table->indexes)) { + index = (struct ovsdb_idl_index *) node->data; + skiplist_destroy(index->skiplist, NULL); + free(index->columns); + } +} + +static void +ovsdb_idl_add_to_indexes(const struct ovsdb_idl_row *row) +{ + struct ovsdb_idl_table *table = row->table; + struct ovsdb_idl_index *index; + struct shash_node *node; + + SHASH_FOR_EACH (node, &(table->indexes)) { + index = (struct ovsdb_idl_index *) node->data; + index->row_sync = true; + skiplist_insert(index->skiplist, row); + index->row_sync = false; + } +} + +static void +ovsdb_idl_remove_from_indexes(const struct ovsdb_idl_row *row) +{ + struct ovsdb_idl_table *table = row->table; + struct ovsdb_idl_index *index; + struct shash_node *node; + + SHASH_FOR_EACH (node, &(table->indexes)) { + index = (struct ovsdb_idl_index *) node->data; + index->row_sync = true; + skiplist_delete(index->skiplist, row); + index->row_sync = false; + } +} + +/* + * Adds a column to an existing index (all columns must be inserted before + * the first ovsdb_idl_run is executed). + * In "order", accepts the values OVSDB_INDEX_ASC or OVSDB_INDEX_DESC + * (OVSDB_INDEX_ASC by default). + * In "custom_comparer" it accepts a custom comparison function. If given NULL + * it will use the default comparator for the column (only available for + * string, numeric or real columns). + */ +void +ovsdb_idl_index_add_column(struct ovsdb_idl_index *index, + const struct ovsdb_idl_column *column, + int order, column_comparator * custom_comparer) +{ + /* Check that the column or table is tracked */ + if (!index->table->need_table && + !((OVSDB_IDL_MONITOR | OVSDB_IDL_ALERT) & + *ovsdb_idl_get_mode(index->table->idl, column))) { + VLOG_ERR("Can't add unmonitored column '%s' at index '%s' in " + "table '%s'.", + column->name, index->index_name, index->table->class->name); + } + if (!ovsdb_type_is_scalar(&column->type) && !custom_comparer) { + VLOG_WARN("Comparing non-scalar values."); + } + + /* Allocate more memory for column configuration */ + if (index->n_columns == index->alloc_columns) { + index->alloc_columns++; + index->columns = xrealloc(index->columns, + index->alloc_columns * + sizeof(struct ovsdb_idl_index_column)); + } + + /* Append column to index */ + int i = index->n_columns; + + index->columns[i].column = column; + index->columns[i].comparer = custom_comparer ? custom_comparer : NULL; + if (order == OVSDB_INDEX_ASC) { + index->columns[i].sorting_order = OVSDB_INDEX_ASC; + } else { + index->columns[i].sorting_order = OVSDB_INDEX_DESC; + } + index->n_columns++; +} + +/* + * Initializes a index cursor + */ +bool +ovsdb_idl_initialize_cursor(struct ovsdb_idl *idl, + const struct ovsdb_idl_table_class *tc, + const char *index_name, + struct ovsdb_idl_index_cursor *cursor) +{ + size_t i; + + for (i = 0; i < idl->class->n_tables; i++) { + struct ovsdb_idl_table *table = &idl->tables[i]; + + if (table->class == tc) { + cursor->index = + (struct ovsdb_idl_index *) shash_find(&table->indexes, + index_name)->data; + if (!cursor->index) { + VLOG_ERR("Cursor initialization failed, " + "index %s at table %s does not exist.", + index_name, tc->name); + cursor->index = NULL; + cursor->position = NULL; + return false; + } + cursor->position = skiplist_first(cursor->index->skiplist); + return true; + } + } + VLOG_ERR("Cursor initialization failed, " + "index %s at table %s does not exist.", index_name, tc->name); + return false; +} + +/* + * ovsdb_idl_index_write_ writes a datum in an ovsdb_idl_row, + * and updates the corresponding field in the table record. + * Not intended for direct usage. + */ +void +ovsdb_idl_index_write_(struct ovsdb_idl_row *const_row, + const struct ovsdb_idl_column *column, + struct ovsdb_datum *datum, + const struct ovsdb_idl_table_class *class) +{ + struct ovsdb_idl_row *row = CONST_CAST(struct ovsdb_idl_row *, const_row); + size_t column_idx = column - class->columns; + + /* Extracted from ovsdb_idl_txn_write__ */ + if (row->old == row->new) { + row->new = xmalloc(class->n_columns * sizeof *row->new); + } + if (!row->written) { + row->written = bitmap_allocate(class->n_columns); + } + if (bitmap_is_set(row->written, column_idx)) { + ovsdb_datum_destroy(&row->new[column_idx], &column->type); + } else { + bitmap_set1(row->written, column_idx); + } + row->new[column_idx] = *datum; + (column->unparse)(row); + (column->parse)(row, &row->new[column_idx]); +} + +/* + * Initializes a row for usage with the compound indexes query. + * Not intended for direct usage. + */ +struct ovsdb_idl_row * +ovsdb_idl_index_init_row(struct ovsdb_idl * idl, + const struct ovsdb_idl_table_class *class) +{ + struct ovsdb_idl_row *row = xzalloc(class->allocation_size); + class->row_init(row); + row->written = bitmap_allocate(class->n_columns); + row->table = ovsdb_idl_table_from_class(idl, class); + return row; +} + +/* Not intended for direct ussage. + * Destroys 'row_' and frees associated memory + * This function is intended to be used indirectly through one of the + * "index_destroy_row" functions generated by ovsdb-idlc. + */ +void +ovsdb_idl_index_destroy_row__(const struct ovsdb_idl_row *row_) +{ + struct ovsdb_idl_row *row = CONST_CAST(struct ovsdb_idl_row *, row_); + const struct ovsdb_idl_table_class *class = row->table->class; + size_t i; + + if (row->new) { + const struct ovsdb_idl_column *c; + if (row->written) { + BITMAP_FOR_EACH_1 (i, class->n_columns, row->written) { + c = &class->columns[i]; + (c->unparse) (row); + ovsdb_datum_destroy(&row->new[i], &c->type); + } + } + if (row->old) { + free(row->old); + } + free(row->new); + free(row->written); + free(row); + } +} + +/* + * Moves the cursor to the first entry in the index. + * Returns a pointer to the corresponding ovsdb_idl_row, or NULL. + */ +struct ovsdb_idl_row * +ovsdb_idl_index_first(struct ovsdb_idl_index_cursor *cursor) +{ + cursor->position = skiplist_first(cursor->index->skiplist); + return ovsdb_idl_index_data(cursor); +} + +/* + * Moves the cursor to the following record in the index. + */ +struct ovsdb_idl_row * +ovsdb_idl_index_next(struct ovsdb_idl_index_cursor *cursor) +{ + if (!cursor->position) { + return NULL; + } + cursor->position = skiplist_next(cursor->position); + return ovsdb_idl_index_data(cursor); + } + +/* + * Returns the ovsdb_idl_row pointer corresponding to the current record + */ +struct ovsdb_idl_row * +ovsdb_idl_index_data(struct ovsdb_idl_index_cursor *cursor) +{ + return skiplist_get_data(cursor->position); +} + +/* + * Moves the cursor to the first entry with a value equal to the given value. + * If the value given is NULL then the function will behave like + * ovsdb_idl_index_first. + * Returns a pointer to the corresponding ovsdb_idl_row (that can be casted + * to a ovsrec) or NULL. + */ +struct ovsdb_idl_row * +ovsdb_idl_index_find(struct ovsdb_idl_index_cursor *cursor, + struct ovsdb_idl_row *value) +{ + if (value) { + cursor->position = skiplist_find(cursor->index->skiplist, value); + } else { + cursor->position = skiplist_first(cursor->index->skiplist); + } + return ovsdb_idl_index_data(cursor); +} + +/* + * Moves the cursor to the first entry with a value greater or equal + * to the given value. + * If the value given is NULL then the function will behave like + * ovsdb_idl_index_first. + * Returns a pointer to the corresponding ovsdb_idl_row (that can be casted + * to a ovsrec) or NULL. + */ +struct ovsdb_idl_row * +ovsdb_idl_index_forward_to(struct ovsdb_idl_index_cursor *cursor, + struct ovsdb_idl_row *value) +{ + if (value) { + cursor->position = skiplist_forward_to(cursor->index->skiplist, value); + } else { + cursor->position = skiplist_first(cursor->index->skiplist); + } + return ovsdb_idl_index_data(cursor); +} + +/* + * Returns the result of comparing two ovsrecs (casted to ovsdb_idl_row), + * using the comparer defined in the index. + * Returns: + * < 0 if a < b + * 0 if a == b + * > 0 if a > b + * When some input is NULL this function considers NULL to be greater than + * any other value, and NULL == NULL. + */ +int +ovsdb_idl_index_compare(struct ovsdb_idl_index_cursor *cursor, + struct ovsdb_idl_row *a, struct ovsdb_idl_row *b) +{ + if (a && b) { + return ovsdb_idl_index_generic_comparer(a, b, cursor->index); + } else if (!a && !b) { + return 0; + } else if (a) { + return -1; + } else { + return 1; + } +} + static void ovsdb_idl_row_clear_old(struct ovsdb_idl_row *row) { @@ -2235,11 +2692,13 @@ ovsdb_idl_insert_row(struct ovsdb_idl_row *row, const struct json *row_json) ovsdb_idl_row_parse(row); ovsdb_idl_row_reparse_backrefs(row); + ovsdb_idl_add_to_indexes(row); } static void ovsdb_idl_delete_row(struct ovsdb_idl_row *row) { + ovsdb_idl_remove_from_indexes(row); ovsdb_idl_row_unparse(row); ovsdb_idl_row_clear_arcs(row, true); ovsdb_idl_row_clear_old(row); @@ -2257,10 +2716,12 @@ ovsdb_idl_modify_row(struct ovsdb_idl_row *row, const struct json *row_json) { bool changed; + ovsdb_idl_remove_from_indexes(row); ovsdb_idl_row_unparse(row); ovsdb_idl_row_clear_arcs(row, true); changed = ovsdb_idl_row_update(row, row_json, OVSDB_IDL_CHANGE_MODIFY); ovsdb_idl_row_parse(row); + ovsdb_idl_add_to_indexes(row); return changed; } diff --git a/lib/ovsdb-idl.h b/lib/ovsdb-idl.h index 281873d..77a01a6 100644 --- a/lib/ovsdb-idl.h +++ b/lib/ovsdb-idl.h @@ -1,4 +1,5 @@ /* Copyright (c) 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016 Nicira, Inc. + * Copyright (C) 2016 Hewlett Packard Enterprise Development LP * * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License. @@ -41,6 +42,7 @@ #include "ovsdb-data.h" #include "openvswitch/list.h" #include "ovsdb-condition.h" +#include "skiplist.h" struct json; struct ovsdb_datum; @@ -358,4 +360,61 @@ unsigned int ovsdb_idl_set_condition(struct ovsdb_idl *, const struct ovsdb_idl_condition *); unsigned int ovsdb_idl_get_condition_seqno(const struct ovsdb_idl *); + +/* + * The OVSDB-IDL Compound Indexes feature allows the creation of + * custom indexes over several columns in the IDL. + * With these custom indexes, the user can retrieve rows matching a search + * criteria or iterate of rows sorted in a specific order. + */ + +struct ovsdb_idl_index * +ovsdb_idl_create_index(struct ovsdb_idl *idl, + const struct ovsdb_idl_table_class *tc, + const char *index_name); + +#define OVSDB_INDEX_DESC -1 +#define OVSDB_INDEX_ASC 1 + +/* + * Skiplist comparison function. Allows to store sorted data. + */ +typedef int +(column_comparator)(const void *a, const void *b); + +struct ovsdb_idl_index_cursor { + struct ovsdb_idl_index *index; /* Index used by this cursor */ + struct skiplist_node *position; /* Current position in the index */ +}; + +int ovsdb_idl_index_strcmp(char *, char *); +int ovsdb_idl_index_intcmp(int64_t, int64_t); +int ovsdb_idl_index_doublecmp(double, double); +void ovsdb_idl_index_add_column(struct ovsdb_idl_index *, + const struct ovsdb_idl_column *, + int order, + column_comparator *custom_comparer + ); +bool ovsdb_idl_initialize_cursor(struct ovsdb_idl *, + const struct ovsdb_idl_table_class *tc, + const char *index_name, + struct ovsdb_idl_index_cursor *cursor); +void ovsdb_idl_index_write_(struct ovsdb_idl_row *, + const struct ovsdb_idl_column *, + struct ovsdb_datum *, + const struct ovsdb_idl_table_class *); +struct ovsdb_idl_row *ovsdb_idl_index_init_row(struct ovsdb_idl *, + const struct ovsdb_idl_table_class *); +void ovsdb_idl_index_destroy_row__(const struct ovsdb_idl_row *); +struct ovsdb_idl_row *ovsdb_idl_index_first(struct ovsdb_idl_index_cursor *); +struct ovsdb_idl_row *ovsdb_idl_index_next(struct ovsdb_idl_index_cursor *); +struct ovsdb_idl_row *ovsdb_idl_index_data(struct ovsdb_idl_index_cursor *); +struct ovsdb_idl_row *ovsdb_idl_index_find(struct ovsdb_idl_index_cursor *, + struct ovsdb_idl_row *); +struct ovsdb_idl_row *ovsdb_idl_index_forward_to( + struct ovsdb_idl_index_cursor *, + struct ovsdb_idl_row *); +int ovsdb_idl_index_compare(struct ovsdb_idl_index_cursor *, + struct ovsdb_idl_row *a, + struct ovsdb_idl_row *b); #endif /* ovsdb-idl.h */