diff mbox series

[ovs-dev] ovsdb-idl: Support change tracking of write-only columns.

Message ID 20220401112304.30647-1-dceara@redhat.com
State Changes Requested
Headers show
Series [ovs-dev] ovsdb-idl: Support change tracking of write-only columns. | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed
ovsrobot/intel-ovs-compilation success test: success

Commit Message

Dumitru Ceara April 1, 2022, 11:23 a.m. UTC
At a first glance, change tracking should never be allowed for
write-only columns.  However, some clients (e.g., ovn-northd) that are
mostly exclusive writers of a database, use change tracking to avoid
duplicating the IDL row records into a local cache when implementing
incremental processing.

The default behavior of the IDL is to automatically turn a write-only
column into a read-write column whenever the client enables change
tracking for that column.

For the afore mentioned clients, this becomes a performance issue.
Commit 1cc618c32524 ("ovsdb-idl: Fix atomicity of writes that don't
change a column's value.") explains why writes that don't change a
column's value cannot be optimized out early if the column is
read/write.

Furthermore, if there is at least one record in any table that
changed during a transaction, then *all* records that have been
written are added to the transaction, even if their values didn't
change.  If there are many such rows (e.g., like in ovn-northd's
case) this incurs a significant overhead because:
a. the client has to build this large transaction
b. the transaction has to be sent over the network
c. the server needs to parse this (mostly) no-op update

We now introduce a new IDL API allowing users to explicitly request
change tracking of a column *without* forcing the column mode to
read-write.

We benchmarked ovn-northd performance when using this new mode
against NB and SB databases taken from ovn-kubernetes scale tests.
We noticed that when a minor change is performed to the Northbound
database (e.g., NB_Global.nb_cfg is incremented) the time it takes to
build the Southbound transaction becomes negligible (vs ~1.5 seconds
before this change).

End-to-end ovn-kubernetes scale tests on 120-node clusters also show
significant reduction of latency to bring up pods; both average and P99
latency decreased by ~30%.

Signed-off-by: Dumitru Ceara <dceara@redhat.com>
---
Note: The OVN counter part change is:
https://github.com/dceara/ovn/commit/c051e774120967e7046031098410dd521ee430b7
---
 NEWS               |  2 ++
 lib/ovsdb-idl.c    | 62 ++++++++++++++++++++++++++++++++++++----------
 lib/ovsdb-idl.h    |  1 +
 tests/ovsdb-idl.at | 20 ++++++++++++---
 tests/test-ovsdb.c | 21 ++++++++++++----
 5 files changed, 85 insertions(+), 21 deletions(-)

Comments

Dumitru Ceara April 14, 2022, 9:33 a.m. UTC | #1
Hi Han,

I was wondering if you had some time by any chance to think about the
proposal below.

Thank you!

Regards,
Dumitru

On 4/1/22 13:23, Dumitru Ceara wrote:
> At a first glance, change tracking should never be allowed for
> write-only columns.  However, some clients (e.g., ovn-northd) that are
> mostly exclusive writers of a database, use change tracking to avoid
> duplicating the IDL row records into a local cache when implementing
> incremental processing.
> 
> The default behavior of the IDL is to automatically turn a write-only
> column into a read-write column whenever the client enables change
> tracking for that column.
> 
> For the afore mentioned clients, this becomes a performance issue.
> Commit 1cc618c32524 ("ovsdb-idl: Fix atomicity of writes that don't
> change a column's value.") explains why writes that don't change a
> column's value cannot be optimized out early if the column is
> read/write.
> 
> Furthermore, if there is at least one record in any table that
> changed during a transaction, then *all* records that have been
> written are added to the transaction, even if their values didn't
> change.  If there are many such rows (e.g., like in ovn-northd's
> case) this incurs a significant overhead because:
> a. the client has to build this large transaction
> b. the transaction has to be sent over the network
> c. the server needs to parse this (mostly) no-op update
> 
> We now introduce a new IDL API allowing users to explicitly request
> change tracking of a column *without* forcing the column mode to
> read-write.
> 
> We benchmarked ovn-northd performance when using this new mode
> against NB and SB databases taken from ovn-kubernetes scale tests.
> We noticed that when a minor change is performed to the Northbound
> database (e.g., NB_Global.nb_cfg is incremented) the time it takes to
> build the Southbound transaction becomes negligible (vs ~1.5 seconds
> before this change).
> 
> End-to-end ovn-kubernetes scale tests on 120-node clusters also show
> significant reduction of latency to bring up pods; both average and P99
> latency decreased by ~30%.
> 
> Signed-off-by: Dumitru Ceara <dceara@redhat.com>
> ---
> Note: The OVN counter part change is:
> https://github.com/dceara/ovn/commit/c051e774120967e7046031098410dd521ee430b7
> ---
>  NEWS               |  2 ++
>  lib/ovsdb-idl.c    | 62 ++++++++++++++++++++++++++++++++++++----------
>  lib/ovsdb-idl.h    |  1 +
>  tests/ovsdb-idl.at | 20 ++++++++++++---
>  tests/test-ovsdb.c | 21 ++++++++++++----
>  5 files changed, 85 insertions(+), 21 deletions(-)
> 
> diff --git a/NEWS b/NEWS
> index 8fa57836a928..2e87d1735aa2 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -3,6 +3,8 @@ Post-v2.17.0
>     - OVSDB:
>       * 'relay' service model now supports transaction history, i.e. honors the
>         'last-txn-id' field in 'monitor_cond_since' requests from clients.
> +   - OVSDB-IDL:
> +     * New interface for enabling change tracking of write-only columns.
>  
>  
>  v2.17.0 - 17 Feb 2022
> diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c
> index 882ede75598d..6898d0911602 100644
> --- a/lib/ovsdb-idl.c
> +++ b/lib/ovsdb-idl.c
> @@ -93,6 +93,8 @@ struct ovsdb_idl {
>      struct ovsdb_idl_txn *txn;
>      struct hmap outstanding_txns;
>      bool verify_write_only;
> +    bool track_changes_noalert; /* If true then the IDL allows change tracking
> +                                 * of write-only columns. */
>      struct ovs_list deleted_untracked_rows; /* Stores rows deleted in the
>                                               * current run, that are not yet
>                                               * added to the track_list. */
> @@ -264,6 +266,7 @@ ovsdb_idl_create_unconnected(const struct ovsdb_idl_class *class,
>          .txn = NULL,
>          .outstanding_txns = HMAP_INITIALIZER(&idl->outstanding_txns),
>          .verify_write_only = false,
> +        .track_changes_noalert = false,
>          .deleted_untracked_rows
>              = OVS_LIST_INITIALIZER(&idl->deleted_untracked_rows),
>          .rows_to_reparse
> @@ -586,6 +589,19 @@ ovsdb_idl_verify_write_only(struct ovsdb_idl *idl)
>      idl->verify_write_only = true;
>  }
>  
> +/* In regular operation mode, IDL change tracking of a column implies
> + * that the column is read-write (OVSDB_IDL_MONITOR | OVSDB_IDL_ALERT set).
> + * This comes at a cost as no-op updates to read-write columns cannot be
> + * easily optimized out (see ovsdb_idl_txn_write()).  This function allows
> + * users to enable a mode in which change tracking is allowed for write-only
> + * columns.
> + */
> +void
> +ovsdb_idl_track_changes_noalert(struct ovsdb_idl *idl)
> +{
> +    idl->track_changes_noalert = true;
> +}
> +
>  /* Returns true if 'idl' is currently connected or trying to connect
>   * and a negative response to a schema request has not been received */
>  bool
> @@ -889,6 +905,16 @@ add_ref_table(struct ovsdb_idl *idl, const struct ovsdb_base_type *base)
>      }
>  }
>  
> +static void
> +ovsdb_idl_add_column__(struct ovsdb_idl *idl,
> +                       const struct ovsdb_idl_column *column,
> +                       unsigned char mode)
> +{
> +    ovsdb_idl_set_mode(idl, column, mode);
> +    add_ref_table(idl, &column->type.key);
> +    add_ref_table(idl, &column->type.value);
> +}
> +
>  /* Turns on OVSDB_IDL_MONITOR and OVSDB_IDL_ALERT for 'column' in 'idl'.  Also
>   * ensures that any tables referenced by 'column' will be replicated, even if
>   * no columns in that table are selected for replication (see
> @@ -902,9 +928,7 @@ void
>  ovsdb_idl_add_column(struct ovsdb_idl *idl,
>                       const struct ovsdb_idl_column *column)
>  {
> -    ovsdb_idl_set_mode(idl, column, OVSDB_IDL_MONITOR | OVSDB_IDL_ALERT);
> -    add_ref_table(idl, &column->type.key);
> -    add_ref_table(idl, &column->type.value);
> +    ovsdb_idl_add_column__(idl, column, OVSDB_IDL_MONITOR | OVSDB_IDL_ALERT);
>  }
>  
>  /* Ensures that the table with class 'tc' will be replicated on 'idl' even if
> @@ -1236,15 +1260,25 @@ ovsdb_idl_row_get_seqno(const struct ovsdb_idl_row *row,
>   * functions.
>   *
>   * This function should be called between ovsdb_idl_create() and
> - * the first call to ovsdb_idl_run(). The column to be tracked
> - * should have OVSDB_IDL_ALERT turned on.
> + * the first call to ovsdb_idl_run(). The column to be traked should
> + * have:
> + * a. OVSDB_IDL_ALERT turned on if 'track_changes_noalert' is 'false'
> + * OR
> + * b. OVSDB_IDL_MONITOR turned on if 'track_changes_noalert' is 'true'
>   */
>  void
>  ovsdb_idl_track_add_column(struct ovsdb_idl *idl,
>                             const struct ovsdb_idl_column *column)
>  {
> -    if (!(*ovsdb_idl_get_mode(idl, column) & OVSDB_IDL_ALERT)) {
> -        ovsdb_idl_add_column(idl, column);
> +    if (idl->track_changes_noalert) {
> +        if (!(*ovsdb_idl_get_mode(idl, column) & OVSDB_IDL_MONITOR)) {
> +            ovsdb_idl_add_column__(idl, column, OVSDB_IDL_MONITOR);
> +        }
> +    } else {
> +        if (!(*ovsdb_idl_get_mode(idl, column) & OVSDB_IDL_ALERT)) {
> +            ovsdb_idl_add_column__(idl, column,
> +                                   OVSDB_IDL_MONITOR | OVSDB_IDL_ALERT);
> +        }
>      }
>      *ovsdb_idl_get_mode(idl, column) |= OVSDB_IDL_TRACK;
>  }
> @@ -1694,11 +1728,13 @@ ovsdb_idl_row_change(struct ovsdb_idl_row *row, const struct shash *values,
>              continue;
>          }
>  
> -        if (datum_changed && table->modes[column_idx] & OVSDB_IDL_ALERT) {
> -            changed = true;
> -            row->change_seqno[change]
> -                = row->table->change_seqno[change]
> -                = row->table->idl->change_seqno + 1;
> +        if (datum_changed) {
> +            if (table->modes[column_idx] & OVSDB_IDL_ALERT) {
> +                changed = true;
> +                row->change_seqno[change]
> +                    = row->table->change_seqno[change]
> +                    = row->table->idl->change_seqno + 1;
> +            }
>  
>              if (table->modes[column_idx] & OVSDB_IDL_TRACK) {
>                  if (ovs_list_is_empty(&row->track_node) &&
> @@ -3501,7 +3537,7 @@ ovsdb_idl_txn_write__(const struct ovsdb_idl_row *row_,
>  
>      class = row->table->class_;
>      column_idx = column - class->columns;
> -    write_only = row->table->modes[column_idx] == OVSDB_IDL_MONITOR;
> +    write_only = !(row->table->modes[column_idx] & OVSDB_IDL_ALERT);
>  
>      ovs_assert(row->new_datum != NULL);
>      ovs_assert(column_idx < class->n_columns);
> diff --git a/lib/ovsdb-idl.h b/lib/ovsdb-idl.h
> index d00599616ef9..bcb576359b91 100644
> --- a/lib/ovsdb-idl.h
> +++ b/lib/ovsdb-idl.h
> @@ -85,6 +85,7 @@ bool ovsdb_idl_has_ever_connected(const struct ovsdb_idl *);
>  void ovsdb_idl_enable_reconnect(struct ovsdb_idl *);
>  void ovsdb_idl_force_reconnect(struct ovsdb_idl *);
>  void ovsdb_idl_verify_write_only(struct ovsdb_idl *);
> +void ovsdb_idl_track_changes_noalert(struct ovsdb_idl *);
>  
>  bool ovsdb_idl_is_alive(const struct ovsdb_idl *);
>  bool ovsdb_idl_is_connected(const struct ovsdb_idl *idl);
> diff --git a/tests/ovsdb-idl.at b/tests/ovsdb-idl.at
> index 62e2b638320c..0436e0f77b0f 100644
> --- a/tests/ovsdb-idl.at
> +++ b/tests/ovsdb-idl.at
> @@ -1154,12 +1154,25 @@ OVSDB_CHECK_IDL_WO_MONITOR_COND([simple idl disable monitor-cond],
>  ]])
>  
>  m4_define([OVSDB_CHECK_IDL_TRACK_C],
> -  [AT_SETUP([$1 - C])
> +  [AT_SETUP([$1 - alert - C])
> +   AT_KEYWORDS([ovsdb server idl tracking positive $5])
> +   AT_CHECK([ovsdb_start_idltest])
> +   m4_if([$2], [], [],
> +     [AT_CHECK([ovsdb-client transact unix:socket $2], [0], [ignore], [ignore])])
> +   AT_CHECK([test-ovsdb '-vPATTERN:console:test-ovsdb|%c|%m' -vjsonrpc -t10 --change-track=alert idl unix:socket $3],
> +            [0], [stdout], [ignore])
> +   AT_CHECK([sort stdout | uuidfilt]m4_if([$6],,, [[| $6]]),
> +            [0], [$4])
> +   OVSDB_SERVER_SHUTDOWN
> +   AT_CLEANUP])
> +
> +m4_define([OVSDB_CHECK_IDL_TRACK_NOALERT_C],
> +  [AT_SETUP([$1 - noalert - C])
>     AT_KEYWORDS([ovsdb server idl tracking positive $5])
>     AT_CHECK([ovsdb_start_idltest])
>     m4_if([$2], [], [],
>       [AT_CHECK([ovsdb-client transact unix:socket $2], [0], [ignore], [ignore])])
> -   AT_CHECK([test-ovsdb '-vPATTERN:console:test-ovsdb|%c|%m' -vjsonrpc -t10 -c idl unix:socket $3],
> +   AT_CHECK([test-ovsdb '-vPATTERN:console:test-ovsdb|%c|%m' -vjsonrpc -t10 --change-track=noalert idl unix:socket $3],
>              [0], [stdout], [ignore])
>     AT_CHECK([sort stdout | uuidfilt]m4_if([$6],,, [[| $6]]),
>              [0], [$4])
> @@ -1167,7 +1180,8 @@ m4_define([OVSDB_CHECK_IDL_TRACK_C],
>     AT_CLEANUP])
>  
>  m4_define([OVSDB_CHECK_IDL_TRACK],
> -  [OVSDB_CHECK_IDL_TRACK_C($@)])
> +  [OVSDB_CHECK_IDL_TRACK_C($@)
> +   OVSDB_CHECK_IDL_TRACK_NOALERT_C($@)])
>  
>  OVSDB_CHECK_IDL_TRACK([track, simple idl, initially populated],
>    [['["idltest",
> diff --git a/tests/test-ovsdb.c b/tests/test-ovsdb.c
> index ca4e87b8115c..a5de7919d4c6 100644
> --- a/tests/test-ovsdb.c
> +++ b/tests/test-ovsdb.c
> @@ -56,6 +56,7 @@
>  VLOG_DEFINE_THIS_MODULE(test_ovsdb);
>  
>  struct test_ovsdb_pvt_context {
> +    bool track_noalert;
>      bool track;
>  };
>  
> @@ -122,6 +123,15 @@ parse_options(int argc, char *argv[], struct test_ovsdb_pvt_context *pvt)
>              break;
>  
>          case 'c':
> +            if (optarg) {
> +                if (!strcmp(optarg, "noalert")) {
> +                    pvt->track_noalert = true;
> +                } else if (!strcmp(optarg, "alert")) {
> +                    pvt->track_noalert = false;
> +                } else {
> +                    ovs_fatal(0, "value %s is invalid", optarg);
> +                }
> +            }
>              pvt->track = true;
>              break;
>  
> @@ -2610,6 +2620,7 @@ update_conditions(struct ovsdb_idl *idl, char *commands)
>  static void
>  do_idl(struct ovs_cmdl_context *ctx)
>  {
> +    struct test_ovsdb_pvt_context *pvt = ctx->pvt;
>      struct jsonrpc *rpc;
>      struct ovsdb_idl *idl;
>      unsigned int seqno = 0;
> @@ -2618,9 +2629,6 @@ do_idl(struct ovs_cmdl_context *ctx)
>      int step = 0;
>      int error;
>      int i;
> -    bool track;
> -
> -    track = ((struct test_ovsdb_pvt_context *)(ctx->pvt))->track;
>  
>      idl = ovsdb_idl_create(ctx->argv[1], &idltest_idl_class, true, true);
>      ovsdb_idl_set_leader_only(idl, false);
> @@ -2637,7 +2645,10 @@ do_idl(struct ovs_cmdl_context *ctx)
>          rpc = NULL;
>      }
>  
> -    if (track) {
> +    if (pvt->track_noalert) {
> +        ovsdb_idl_track_changes_noalert(idl);
> +    }
> +    if (pvt->track) {
>          ovsdb_idl_track_add_all(idl);
>      }
>  
> @@ -2683,7 +2694,7 @@ do_idl(struct ovs_cmdl_context *ctx)
>              }
>  
>              /* Print update. */
> -            if (track) {
> +            if (pvt->track) {
>                  print_idl_track(idl, step++, terse);
>                  ovsdb_idl_track_clear(idl);
>              } else {
Numan Siddique April 14, 2022, 4:45 p.m. UTC | #2
On Fri, Apr 1, 2022 at 7:23 AM Dumitru Ceara <dceara@redhat.com> wrote:
>
> At a first glance, change tracking should never be allowed for
> write-only columns.  However, some clients (e.g., ovn-northd) that are
> mostly exclusive writers of a database, use change tracking to avoid
> duplicating the IDL row records into a local cache when implementing
> incremental processing.
>
> The default behavior of the IDL is to automatically turn a write-only
> column into a read-write column whenever the client enables change
> tracking for that column.
>
> For the afore mentioned clients, this becomes a performance issue.
> Commit 1cc618c32524 ("ovsdb-idl: Fix atomicity of writes that don't
> change a column's value.") explains why writes that don't change a
> column's value cannot be optimized out early if the column is
> read/write.
>
> Furthermore, if there is at least one record in any table that
> changed during a transaction, then *all* records that have been
> written are added to the transaction, even if their values didn't
> change.  If there are many such rows (e.g., like in ovn-northd's
> case) this incurs a significant overhead because:
> a. the client has to build this large transaction
> b. the transaction has to be sent over the network
> c. the server needs to parse this (mostly) no-op update
>
> We now introduce a new IDL API allowing users to explicitly request
> change tracking of a column *without* forcing the column mode to
> read-write.
>
> We benchmarked ovn-northd performance when using this new mode
> against NB and SB databases taken from ovn-kubernetes scale tests.
> We noticed that when a minor change is performed to the Northbound
> database (e.g., NB_Global.nb_cfg is incremented) the time it takes to
> build the Southbound transaction becomes negligible (vs ~1.5 seconds
> before this change).
>
> End-to-end ovn-kubernetes scale tests on 120-node clusters also show
> significant reduction of latency to bring up pods; both average and P99
> latency decreased by ~30%.
>
> Signed-off-by: Dumitru Ceara <dceara@redhat.com>
> ---
> Note: The OVN counter part change is:
> https://github.com/dceara/ovn/commit/c051e774120967e7046031098410dd521ee430b7
> ---
>  NEWS               |  2 ++
>  lib/ovsdb-idl.c    | 62 ++++++++++++++++++++++++++++++++++++----------
>  lib/ovsdb-idl.h    |  1 +
>  tests/ovsdb-idl.at | 20 ++++++++++++---
>  tests/test-ovsdb.c | 21 ++++++++++++----
>  5 files changed, 85 insertions(+), 21 deletions(-)
>
> diff --git a/NEWS b/NEWS
> index 8fa57836a928..2e87d1735aa2 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -3,6 +3,8 @@ Post-v2.17.0
>     - OVSDB:
>       * 'relay' service model now supports transaction history, i.e. honors the
>         'last-txn-id' field in 'monitor_cond_since' requests from clients.
> +   - OVSDB-IDL:
> +     * New interface for enabling change tracking of write-only columns.
>
>
>  v2.17.0 - 17 Feb 2022
> diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c
> index 882ede75598d..6898d0911602 100644
> --- a/lib/ovsdb-idl.c
> +++ b/lib/ovsdb-idl.c
> @@ -93,6 +93,8 @@ struct ovsdb_idl {
>      struct ovsdb_idl_txn *txn;
>      struct hmap outstanding_txns;
>      bool verify_write_only;
> +    bool track_changes_noalert; /* If true then the IDL allows change tracking
> +                                 * of write-only columns. */
>      struct ovs_list deleted_untracked_rows; /* Stores rows deleted in the
>                                               * current run, that are not yet
>                                               * added to the track_list. */
> @@ -264,6 +266,7 @@ ovsdb_idl_create_unconnected(const struct ovsdb_idl_class *class,
>          .txn = NULL,
>          .outstanding_txns = HMAP_INITIALIZER(&idl->outstanding_txns),
>          .verify_write_only = false,
> +        .track_changes_noalert = false,
>          .deleted_untracked_rows
>              = OVS_LIST_INITIALIZER(&idl->deleted_untracked_rows),
>          .rows_to_reparse
> @@ -586,6 +589,19 @@ ovsdb_idl_verify_write_only(struct ovsdb_idl *idl)
>      idl->verify_write_only = true;
>  }
>
> +/* In regular operation mode, IDL change tracking of a column implies
> + * that the column is read-write (OVSDB_IDL_MONITOR | OVSDB_IDL_ALERT set).
> + * This comes at a cost as no-op updates to read-write columns cannot be
> + * easily optimized out (see ovsdb_idl_txn_write()).  This function allows
> + * users to enable a mode in which change tracking is allowed for write-only
> + * columns.
> + */
> +void
> +ovsdb_idl_track_changes_noalert(struct ovsdb_idl *idl)
> +{
> +    idl->track_changes_noalert = true;
> +}
> +
>  /* Returns true if 'idl' is currently connected or trying to connect
>   * and a negative response to a schema request has not been received */
>  bool
> @@ -889,6 +905,16 @@ add_ref_table(struct ovsdb_idl *idl, const struct ovsdb_base_type *base)
>      }
>  }
>
> +static void
> +ovsdb_idl_add_column__(struct ovsdb_idl *idl,
> +                       const struct ovsdb_idl_column *column,
> +                       unsigned char mode)
> +{
> +    ovsdb_idl_set_mode(idl, column, mode);
> +    add_ref_table(idl, &column->type.key);
> +    add_ref_table(idl, &column->type.value);
> +}
> +
>  /* Turns on OVSDB_IDL_MONITOR and OVSDB_IDL_ALERT for 'column' in 'idl'.  Also
>   * ensures that any tables referenced by 'column' will be replicated, even if
>   * no columns in that table are selected for replication (see
> @@ -902,9 +928,7 @@ void
>  ovsdb_idl_add_column(struct ovsdb_idl *idl,
>                       const struct ovsdb_idl_column *column)
>  {
> -    ovsdb_idl_set_mode(idl, column, OVSDB_IDL_MONITOR | OVSDB_IDL_ALERT);
> -    add_ref_table(idl, &column->type.key);
> -    add_ref_table(idl, &column->type.value);
> +    ovsdb_idl_add_column__(idl, column, OVSDB_IDL_MONITOR | OVSDB_IDL_ALERT);
>  }
>
>  /* Ensures that the table with class 'tc' will be replicated on 'idl' even if
> @@ -1236,15 +1260,25 @@ ovsdb_idl_row_get_seqno(const struct ovsdb_idl_row *row,
>   * functions.
>   *
>   * This function should be called between ovsdb_idl_create() and
> - * the first call to ovsdb_idl_run(). The column to be tracked
> - * should have OVSDB_IDL_ALERT turned on.
> + * the first call to ovsdb_idl_run(). The column to be traked should
> + * have:
> + * a. OVSDB_IDL_ALERT turned on if 'track_changes_noalert' is 'false'
> + * OR
> + * b. OVSDB_IDL_MONITOR turned on if 'track_changes_noalert' is 'true'
>   */
>  void
>  ovsdb_idl_track_add_column(struct ovsdb_idl *idl,
>                             const struct ovsdb_idl_column *column)
>  {
> -    if (!(*ovsdb_idl_get_mode(idl, column) & OVSDB_IDL_ALERT)) {
> -        ovsdb_idl_add_column(idl, column);
> +    if (idl->track_changes_noalert) {
> +        if (!(*ovsdb_idl_get_mode(idl, column) & OVSDB_IDL_MONITOR)) {
> +            ovsdb_idl_add_column__(idl, column, OVSDB_IDL_MONITOR);
> +        }
> +    } else {
> +        if (!(*ovsdb_idl_get_mode(idl, column) & OVSDB_IDL_ALERT)) {
> +            ovsdb_idl_add_column__(idl, column,
> +                                   OVSDB_IDL_MONITOR | OVSDB_IDL_ALERT);
> +        }
>      }
>      *ovsdb_idl_get_mode(idl, column) |= OVSDB_IDL_TRACK;
>  }
> @@ -1694,11 +1728,13 @@ ovsdb_idl_row_change(struct ovsdb_idl_row *row, const struct shash *values,
>              continue;
>          }
>
> -        if (datum_changed && table->modes[column_idx] & OVSDB_IDL_ALERT) {
> -            changed = true;
> -            row->change_seqno[change]
> -                = row->table->change_seqno[change]
> -                = row->table->idl->change_seqno + 1;
> +        if (datum_changed) {
> +            if (table->modes[column_idx] & OVSDB_IDL_ALERT) {
> +                changed = true;
> +                row->change_seqno[change]
> +                    = row->table->change_seqno[change]
> +                    = row->table->idl->change_seqno + 1;
> +            }
>
>              if (table->modes[column_idx] & OVSDB_IDL_TRACK) {
>                  if (ovs_list_is_empty(&row->track_node) &&
> @@ -3501,7 +3537,7 @@ ovsdb_idl_txn_write__(const struct ovsdb_idl_row *row_,
>
>      class = row->table->class_;
>      column_idx = column - class->columns;
> -    write_only = row->table->modes[column_idx] == OVSDB_IDL_MONITOR;
> +    write_only = !(row->table->modes[column_idx] & OVSDB_IDL_ALERT);
>
>      ovs_assert(row->new_datum != NULL);
>      ovs_assert(column_idx < class->n_columns);
> diff --git a/lib/ovsdb-idl.h b/lib/ovsdb-idl.h
> index d00599616ef9..bcb576359b91 100644
> --- a/lib/ovsdb-idl.h
> +++ b/lib/ovsdb-idl.h
> @@ -85,6 +85,7 @@ bool ovsdb_idl_has_ever_connected(const struct ovsdb_idl *);
>  void ovsdb_idl_enable_reconnect(struct ovsdb_idl *);
>  void ovsdb_idl_force_reconnect(struct ovsdb_idl *);
>  void ovsdb_idl_verify_write_only(struct ovsdb_idl *);
> +void ovsdb_idl_track_changes_noalert(struct ovsdb_idl *);
>

Hi Dumitru,

I've a few comments.  Since the proposed API applies to all IDL
columns,  does it make sense to provide the APIs to just enable
MONITOR + TRACK
for the columns which the client desires ?

Maybe we can add 2 new APIs

1.  ovsdb_idl_track_no_alert_add_all() ->  This would set MONITOR |
TRACK for all columns.
2.  ovsdb_idl_track_no_alert_add_column(idl, column).  which sets
MONITOR | TRACK for that column

What do you think ?

Thanks
Numan

>  bool ovsdb_idl_is_alive(const struct ovsdb_idl *);
>  bool ovsdb_idl_is_connected(const struct ovsdb_idl *idl);
> diff --git a/tests/ovsdb-idl.at b/tests/ovsdb-idl.at
> index 62e2b638320c..0436e0f77b0f 100644
> --- a/tests/ovsdb-idl.at
> +++ b/tests/ovsdb-idl.at
> @@ -1154,12 +1154,25 @@ OVSDB_CHECK_IDL_WO_MONITOR_COND([simple idl disable monitor-cond],
>  ]])
>
>  m4_define([OVSDB_CHECK_IDL_TRACK_C],
> -  [AT_SETUP([$1 - C])
> +  [AT_SETUP([$1 - alert - C])
> +   AT_KEYWORDS([ovsdb server idl tracking positive $5])
> +   AT_CHECK([ovsdb_start_idltest])
> +   m4_if([$2], [], [],
> +     [AT_CHECK([ovsdb-client transact unix:socket $2], [0], [ignore], [ignore])])
> +   AT_CHECK([test-ovsdb '-vPATTERN:console:test-ovsdb|%c|%m' -vjsonrpc -t10 --change-track=alert idl unix:socket $3],
> +            [0], [stdout], [ignore])
> +   AT_CHECK([sort stdout | uuidfilt]m4_if([$6],,, [[| $6]]),
> +            [0], [$4])
> +   OVSDB_SERVER_SHUTDOWN
> +   AT_CLEANUP])
> +
> +m4_define([OVSDB_CHECK_IDL_TRACK_NOALERT_C],
> +  [AT_SETUP([$1 - noalert - C])
>     AT_KEYWORDS([ovsdb server idl tracking positive $5])
>     AT_CHECK([ovsdb_start_idltest])
>     m4_if([$2], [], [],
>       [AT_CHECK([ovsdb-client transact unix:socket $2], [0], [ignore], [ignore])])
> -   AT_CHECK([test-ovsdb '-vPATTERN:console:test-ovsdb|%c|%m' -vjsonrpc -t10 -c idl unix:socket $3],
> +   AT_CHECK([test-ovsdb '-vPATTERN:console:test-ovsdb|%c|%m' -vjsonrpc -t10 --change-track=noalert idl unix:socket $3],
>              [0], [stdout], [ignore])
>     AT_CHECK([sort stdout | uuidfilt]m4_if([$6],,, [[| $6]]),
>              [0], [$4])
> @@ -1167,7 +1180,8 @@ m4_define([OVSDB_CHECK_IDL_TRACK_C],
>     AT_CLEANUP])
>
>  m4_define([OVSDB_CHECK_IDL_TRACK],
> -  [OVSDB_CHECK_IDL_TRACK_C($@)])
> +  [OVSDB_CHECK_IDL_TRACK_C($@)
> +   OVSDB_CHECK_IDL_TRACK_NOALERT_C($@)])
>
>  OVSDB_CHECK_IDL_TRACK([track, simple idl, initially populated],
>    [['["idltest",
> diff --git a/tests/test-ovsdb.c b/tests/test-ovsdb.c
> index ca4e87b8115c..a5de7919d4c6 100644
> --- a/tests/test-ovsdb.c
> +++ b/tests/test-ovsdb.c
> @@ -56,6 +56,7 @@
>  VLOG_DEFINE_THIS_MODULE(test_ovsdb);
>
>  struct test_ovsdb_pvt_context {
> +    bool track_noalert;
>      bool track;
>  };
>
> @@ -122,6 +123,15 @@ parse_options(int argc, char *argv[], struct test_ovsdb_pvt_context *pvt)
>              break;
>
>          case 'c':
> +            if (optarg) {
> +                if (!strcmp(optarg, "noalert")) {
> +                    pvt->track_noalert = true;
> +                } else if (!strcmp(optarg, "alert")) {
> +                    pvt->track_noalert = false;
> +                } else {
> +                    ovs_fatal(0, "value %s is invalid", optarg);
> +                }
> +            }
>              pvt->track = true;
>              break;
>
> @@ -2610,6 +2620,7 @@ update_conditions(struct ovsdb_idl *idl, char *commands)
>  static void
>  do_idl(struct ovs_cmdl_context *ctx)
>  {
> +    struct test_ovsdb_pvt_context *pvt = ctx->pvt;
>      struct jsonrpc *rpc;
>      struct ovsdb_idl *idl;
>      unsigned int seqno = 0;
> @@ -2618,9 +2629,6 @@ do_idl(struct ovs_cmdl_context *ctx)
>      int step = 0;
>      int error;
>      int i;
> -    bool track;
> -
> -    track = ((struct test_ovsdb_pvt_context *)(ctx->pvt))->track;
>
>      idl = ovsdb_idl_create(ctx->argv[1], &idltest_idl_class, true, true);
>      ovsdb_idl_set_leader_only(idl, false);
> @@ -2637,7 +2645,10 @@ do_idl(struct ovs_cmdl_context *ctx)
>          rpc = NULL;
>      }
>
> -    if (track) {
> +    if (pvt->track_noalert) {
> +        ovsdb_idl_track_changes_noalert(idl);
> +    }
> +    if (pvt->track) {
>          ovsdb_idl_track_add_all(idl);
>      }
>
> @@ -2683,7 +2694,7 @@ do_idl(struct ovs_cmdl_context *ctx)
>              }
>
>              /* Print update. */
> -            if (track) {
> +            if (pvt->track) {
>                  print_idl_track(idl, step++, terse);
>                  ovsdb_idl_track_clear(idl);
>              } else {
> --
> 2.27.0
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
Dumitru Ceara April 15, 2022, 11:27 a.m. UTC | #3
On 4/14/22 18:45, Numan Siddique wrote:
> On Fri, Apr 1, 2022 at 7:23 AM Dumitru Ceara <dceara@redhat.com> wrote:
>>
>> At a first glance, change tracking should never be allowed for
>> write-only columns.  However, some clients (e.g., ovn-northd) that are
>> mostly exclusive writers of a database, use change tracking to avoid
>> duplicating the IDL row records into a local cache when implementing
>> incremental processing.
>>
>> The default behavior of the IDL is to automatically turn a write-only
>> column into a read-write column whenever the client enables change
>> tracking for that column.
>>
>> For the afore mentioned clients, this becomes a performance issue.
>> Commit 1cc618c32524 ("ovsdb-idl: Fix atomicity of writes that don't
>> change a column's value.") explains why writes that don't change a
>> column's value cannot be optimized out early if the column is
>> read/write.
>>
>> Furthermore, if there is at least one record in any table that
>> changed during a transaction, then *all* records that have been
>> written are added to the transaction, even if their values didn't
>> change.  If there are many such rows (e.g., like in ovn-northd's
>> case) this incurs a significant overhead because:
>> a. the client has to build this large transaction
>> b. the transaction has to be sent over the network
>> c. the server needs to parse this (mostly) no-op update
>>
>> We now introduce a new IDL API allowing users to explicitly request
>> change tracking of a column *without* forcing the column mode to
>> read-write.
>>
>> We benchmarked ovn-northd performance when using this new mode
>> against NB and SB databases taken from ovn-kubernetes scale tests.
>> We noticed that when a minor change is performed to the Northbound
>> database (e.g., NB_Global.nb_cfg is incremented) the time it takes to
>> build the Southbound transaction becomes negligible (vs ~1.5 seconds
>> before this change).
>>
>> End-to-end ovn-kubernetes scale tests on 120-node clusters also show
>> significant reduction of latency to bring up pods; both average and P99
>> latency decreased by ~30%.
>>
>> Signed-off-by: Dumitru Ceara <dceara@redhat.com>
>> ---
>> Note: The OVN counter part change is:
>> https://github.com/dceara/ovn/commit/c051e774120967e7046031098410dd521ee430b7
>> ---
>>  NEWS               |  2 ++
>>  lib/ovsdb-idl.c    | 62 ++++++++++++++++++++++++++++++++++++----------
>>  lib/ovsdb-idl.h    |  1 +
>>  tests/ovsdb-idl.at | 20 ++++++++++++---
>>  tests/test-ovsdb.c | 21 ++++++++++++----
>>  5 files changed, 85 insertions(+), 21 deletions(-)
>>
>> diff --git a/NEWS b/NEWS
>> index 8fa57836a928..2e87d1735aa2 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -3,6 +3,8 @@ Post-v2.17.0
>>     - OVSDB:
>>       * 'relay' service model now supports transaction history, i.e. honors the
>>         'last-txn-id' field in 'monitor_cond_since' requests from clients.
>> +   - OVSDB-IDL:
>> +     * New interface for enabling change tracking of write-only columns.
>>
>>
>>  v2.17.0 - 17 Feb 2022
>> diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c
>> index 882ede75598d..6898d0911602 100644
>> --- a/lib/ovsdb-idl.c
>> +++ b/lib/ovsdb-idl.c
>> @@ -93,6 +93,8 @@ struct ovsdb_idl {
>>      struct ovsdb_idl_txn *txn;
>>      struct hmap outstanding_txns;
>>      bool verify_write_only;
>> +    bool track_changes_noalert; /* If true then the IDL allows change tracking
>> +                                 * of write-only columns. */
>>      struct ovs_list deleted_untracked_rows; /* Stores rows deleted in the
>>                                               * current run, that are not yet
>>                                               * added to the track_list. */
>> @@ -264,6 +266,7 @@ ovsdb_idl_create_unconnected(const struct ovsdb_idl_class *class,
>>          .txn = NULL,
>>          .outstanding_txns = HMAP_INITIALIZER(&idl->outstanding_txns),
>>          .verify_write_only = false,
>> +        .track_changes_noalert = false,
>>          .deleted_untracked_rows
>>              = OVS_LIST_INITIALIZER(&idl->deleted_untracked_rows),
>>          .rows_to_reparse
>> @@ -586,6 +589,19 @@ ovsdb_idl_verify_write_only(struct ovsdb_idl *idl)
>>      idl->verify_write_only = true;
>>  }
>>
>> +/* In regular operation mode, IDL change tracking of a column implies
>> + * that the column is read-write (OVSDB_IDL_MONITOR | OVSDB_IDL_ALERT set).
>> + * This comes at a cost as no-op updates to read-write columns cannot be
>> + * easily optimized out (see ovsdb_idl_txn_write()).  This function allows
>> + * users to enable a mode in which change tracking is allowed for write-only
>> + * columns.
>> + */
>> +void
>> +ovsdb_idl_track_changes_noalert(struct ovsdb_idl *idl)
>> +{
>> +    idl->track_changes_noalert = true;
>> +}
>> +
>>  /* Returns true if 'idl' is currently connected or trying to connect
>>   * and a negative response to a schema request has not been received */
>>  bool
>> @@ -889,6 +905,16 @@ add_ref_table(struct ovsdb_idl *idl, const struct ovsdb_base_type *base)
>>      }
>>  }
>>
>> +static void
>> +ovsdb_idl_add_column__(struct ovsdb_idl *idl,
>> +                       const struct ovsdb_idl_column *column,
>> +                       unsigned char mode)
>> +{
>> +    ovsdb_idl_set_mode(idl, column, mode);
>> +    add_ref_table(idl, &column->type.key);
>> +    add_ref_table(idl, &column->type.value);
>> +}
>> +
>>  /* Turns on OVSDB_IDL_MONITOR and OVSDB_IDL_ALERT for 'column' in 'idl'.  Also
>>   * ensures that any tables referenced by 'column' will be replicated, even if
>>   * no columns in that table are selected for replication (see
>> @@ -902,9 +928,7 @@ void
>>  ovsdb_idl_add_column(struct ovsdb_idl *idl,
>>                       const struct ovsdb_idl_column *column)
>>  {
>> -    ovsdb_idl_set_mode(idl, column, OVSDB_IDL_MONITOR | OVSDB_IDL_ALERT);
>> -    add_ref_table(idl, &column->type.key);
>> -    add_ref_table(idl, &column->type.value);
>> +    ovsdb_idl_add_column__(idl, column, OVSDB_IDL_MONITOR | OVSDB_IDL_ALERT);
>>  }
>>
>>  /* Ensures that the table with class 'tc' will be replicated on 'idl' even if
>> @@ -1236,15 +1260,25 @@ ovsdb_idl_row_get_seqno(const struct ovsdb_idl_row *row,
>>   * functions.
>>   *
>>   * This function should be called between ovsdb_idl_create() and
>> - * the first call to ovsdb_idl_run(). The column to be tracked
>> - * should have OVSDB_IDL_ALERT turned on.
>> + * the first call to ovsdb_idl_run(). The column to be traked should
>> + * have:
>> + * a. OVSDB_IDL_ALERT turned on if 'track_changes_noalert' is 'false'
>> + * OR
>> + * b. OVSDB_IDL_MONITOR turned on if 'track_changes_noalert' is 'true'
>>   */
>>  void
>>  ovsdb_idl_track_add_column(struct ovsdb_idl *idl,
>>                             const struct ovsdb_idl_column *column)
>>  {
>> -    if (!(*ovsdb_idl_get_mode(idl, column) & OVSDB_IDL_ALERT)) {
>> -        ovsdb_idl_add_column(idl, column);
>> +    if (idl->track_changes_noalert) {
>> +        if (!(*ovsdb_idl_get_mode(idl, column) & OVSDB_IDL_MONITOR)) {
>> +            ovsdb_idl_add_column__(idl, column, OVSDB_IDL_MONITOR);
>> +        }
>> +    } else {
>> +        if (!(*ovsdb_idl_get_mode(idl, column) & OVSDB_IDL_ALERT)) {
>> +            ovsdb_idl_add_column__(idl, column,
>> +                                   OVSDB_IDL_MONITOR | OVSDB_IDL_ALERT);
>> +        }
>>      }
>>      *ovsdb_idl_get_mode(idl, column) |= OVSDB_IDL_TRACK;
>>  }
>> @@ -1694,11 +1728,13 @@ ovsdb_idl_row_change(struct ovsdb_idl_row *row, const struct shash *values,
>>              continue;
>>          }
>>
>> -        if (datum_changed && table->modes[column_idx] & OVSDB_IDL_ALERT) {
>> -            changed = true;
>> -            row->change_seqno[change]
>> -                = row->table->change_seqno[change]
>> -                = row->table->idl->change_seqno + 1;
>> +        if (datum_changed) {
>> +            if (table->modes[column_idx] & OVSDB_IDL_ALERT) {
>> +                changed = true;
>> +                row->change_seqno[change]
>> +                    = row->table->change_seqno[change]
>> +                    = row->table->idl->change_seqno + 1;
>> +            }

Hi Numan,

Thanks for having a look!

I just realized this actually breaks change tracking, the test I had
added is wrong and failed to detect this.

>>
>>              if (table->modes[column_idx] & OVSDB_IDL_TRACK) {
>>                  if (ovs_list_is_empty(&row->track_node) &&
>> @@ -3501,7 +3537,7 @@ ovsdb_idl_txn_write__(const struct ovsdb_idl_row *row_,
>>
>>      class = row->table->class_;
>>      column_idx = column - class->columns;
>> -    write_only = row->table->modes[column_idx] == OVSDB_IDL_MONITOR;
>> +    write_only = !(row->table->modes[column_idx] & OVSDB_IDL_ALERT);
>>
>>      ovs_assert(row->new_datum != NULL);
>>      ovs_assert(column_idx < class->n_columns);
>> diff --git a/lib/ovsdb-idl.h b/lib/ovsdb-idl.h
>> index d00599616ef9..bcb576359b91 100644
>> --- a/lib/ovsdb-idl.h
>> +++ b/lib/ovsdb-idl.h
>> @@ -85,6 +85,7 @@ bool ovsdb_idl_has_ever_connected(const struct ovsdb_idl *);
>>  void ovsdb_idl_enable_reconnect(struct ovsdb_idl *);
>>  void ovsdb_idl_force_reconnect(struct ovsdb_idl *);
>>  void ovsdb_idl_verify_write_only(struct ovsdb_idl *);
>> +void ovsdb_idl_track_changes_noalert(struct ovsdb_idl *);
>>
> 
> Hi Dumitru,
> 
> I've a few comments.  Since the proposed API applies to all IDL
> columns,  does it make sense to provide the APIs to just enable
> MONITOR + TRACK
> for the columns which the client desires ?
> 
> Maybe we can add 2 new APIs
> 
> 1.  ovsdb_idl_track_no_alert_add_all() ->  This would set MONITOR |
> TRACK for all columns.
> 2.  ovsdb_idl_track_no_alert_add_column(idl, column).  which sets
> MONITOR | TRACK for that column
> 
> What do you think ?

I think that's a good idea but I what if we make them more generic?
Something like:

void ovsdb_idl_set_column_mode(struct ovsdb_idl *,
                               const struct ovsdb_idl_column *,
                               unsigned char mode);
void ovsdb_idl_set_column_mode_all(struct ovsdb_idl *, unsigned char mode);

Where 'mode' may be a combination of:

#define OVSDB_IDL_MONITOR (1 << 0) /* Replicate this column? */
#define OVSDB_IDL_ALERT   (1 << 1) /* Alert client when column changes? */
#define OVSDB_IDL_TRACK   (1 << 2)

As the current version of the patch has functional issues, I'll go
ahead and make this API change and post v2 to make review easier.

> 
> Thanks
> Numan
> 
>>  bool ovsdb_idl_is_alive(const struct ovsdb_idl *);
>>  bool ovsdb_idl_is_connected(const struct ovsdb_idl *idl);
>> diff --git a/tests/ovsdb-idl.at b/tests/ovsdb-idl.at
>> index 62e2b638320c..0436e0f77b0f 100644
>> --- a/tests/ovsdb-idl.at
>> +++ b/tests/ovsdb-idl.at
>> @@ -1154,12 +1154,25 @@ OVSDB_CHECK_IDL_WO_MONITOR_COND([simple idl disable monitor-cond],
>>  ]])
>>
>>  m4_define([OVSDB_CHECK_IDL_TRACK_C],
>> -  [AT_SETUP([$1 - C])
>> +  [AT_SETUP([$1 - alert - C])
>> +   AT_KEYWORDS([ovsdb server idl tracking positive $5])
>> +   AT_CHECK([ovsdb_start_idltest])
>> +   m4_if([$2], [], [],
>> +     [AT_CHECK([ovsdb-client transact unix:socket $2], [0], [ignore], [ignore])])
>> +   AT_CHECK([test-ovsdb '-vPATTERN:console:test-ovsdb|%c|%m' -vjsonrpc -t10 --change-track=alert idl unix:socket $3],
>> +            [0], [stdout], [ignore])
>> +   AT_CHECK([sort stdout | uuidfilt]m4_if([$6],,, [[| $6]]),
>> +            [0], [$4])
>> +   OVSDB_SERVER_SHUTDOWN
>> +   AT_CLEANUP])
>> +
>> +m4_define([OVSDB_CHECK_IDL_TRACK_NOALERT_C],
>> +  [AT_SETUP([$1 - noalert - C])
>>     AT_KEYWORDS([ovsdb server idl tracking positive $5])
>>     AT_CHECK([ovsdb_start_idltest])
>>     m4_if([$2], [], [],
>>       [AT_CHECK([ovsdb-client transact unix:socket $2], [0], [ignore], [ignore])])
>> -   AT_CHECK([test-ovsdb '-vPATTERN:console:test-ovsdb|%c|%m' -vjsonrpc -t10 -c idl unix:socket $3],
>> +   AT_CHECK([test-ovsdb '-vPATTERN:console:test-ovsdb|%c|%m' -vjsonrpc -t10 --change-track=noalert idl unix:socket $3],
>>              [0], [stdout], [ignore])
>>     AT_CHECK([sort stdout | uuidfilt]m4_if([$6],,, [[| $6]]),
>>              [0], [$4])
>> @@ -1167,7 +1180,8 @@ m4_define([OVSDB_CHECK_IDL_TRACK_C],
>>     AT_CLEANUP])
>>
>>  m4_define([OVSDB_CHECK_IDL_TRACK],
>> -  [OVSDB_CHECK_IDL_TRACK_C($@)])
>> +  [OVSDB_CHECK_IDL_TRACK_C($@)
>> +   OVSDB_CHECK_IDL_TRACK_NOALERT_C($@)])
>>
>>  OVSDB_CHECK_IDL_TRACK([track, simple idl, initially populated],
>>    [['["idltest",
>> diff --git a/tests/test-ovsdb.c b/tests/test-ovsdb.c
>> index ca4e87b8115c..a5de7919d4c6 100644
>> --- a/tests/test-ovsdb.c
>> +++ b/tests/test-ovsdb.c
>> @@ -56,6 +56,7 @@
>>  VLOG_DEFINE_THIS_MODULE(test_ovsdb);
>>
>>  struct test_ovsdb_pvt_context {
>> +    bool track_noalert;
>>      bool track;
>>  };
>>
>> @@ -122,6 +123,15 @@ parse_options(int argc, char *argv[], struct test_ovsdb_pvt_context *pvt)
>>              break;
>>
>>          case 'c':
>> +            if (optarg) {
>> +                if (!strcmp(optarg, "noalert")) {
>> +                    pvt->track_noalert = true;
>> +                } else if (!strcmp(optarg, "alert")) {
>> +                    pvt->track_noalert = false;
>> +                } else {
>> +                    ovs_fatal(0, "value %s is invalid", optarg);
>> +                }
>> +            }
>>              pvt->track = true;
>>              break;
>>
>> @@ -2610,6 +2620,7 @@ update_conditions(struct ovsdb_idl *idl, char *commands)
>>  static void
>>  do_idl(struct ovs_cmdl_context *ctx)
>>  {
>> +    struct test_ovsdb_pvt_context *pvt = ctx->pvt;
>>      struct jsonrpc *rpc;
>>      struct ovsdb_idl *idl;
>>      unsigned int seqno = 0;
>> @@ -2618,9 +2629,6 @@ do_idl(struct ovs_cmdl_context *ctx)
>>      int step = 0;
>>      int error;
>>      int i;
>> -    bool track;
>> -
>> -    track = ((struct test_ovsdb_pvt_context *)(ctx->pvt))->track;
>>
>>      idl = ovsdb_idl_create(ctx->argv[1], &idltest_idl_class, true, true);

This already sets the mode for all columns to:

OVSDB_IDL_MONITOR | OVSDB_IDL_ALERT

And OVSDB_IDL_ALERT never gets removed.  I'll fix this up in v2, sorry
for the noise.

Thanks,
Dumitru

>>      ovsdb_idl_set_leader_only(idl, false);
>> @@ -2637,7 +2645,10 @@ do_idl(struct ovs_cmdl_context *ctx)
>>          rpc = NULL;
>>      }
>>
>> -    if (track) {
>> +    if (pvt->track_noalert) {
>> +        ovsdb_idl_track_changes_noalert(idl);
>> +    }
>> +    if (pvt->track) {
>>          ovsdb_idl_track_add_all(idl);
>>      }
>>
>> @@ -2683,7 +2694,7 @@ do_idl(struct ovs_cmdl_context *ctx)
>>              }
>>
>>              /* Print update. */
>> -            if (track) {
>> +            if (pvt->track) {
>>                  print_idl_track(idl, step++, terse);
>>                  ovsdb_idl_track_clear(idl);
>>              } else {
>> --
>> 2.27.0
>>
>> _______________________________________________
>> dev mailing list
>> dev@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
>
Dumitru Ceara April 15, 2022, 3:20 p.m. UTC | #4
On 4/15/22 13:27, Dumitru Ceara wrote:
> As the current version of the patch has functional issues, I'll go
> ahead and make this API change and post v2 to make review easier.

V2 posted:
https://patchwork.ozlabs.org/project/openvswitch/patch/20220415142146.13169-1-dceara@redhat.com/

Thanks!
diff mbox series

Patch

diff --git a/NEWS b/NEWS
index 8fa57836a928..2e87d1735aa2 100644
--- a/NEWS
+++ b/NEWS
@@ -3,6 +3,8 @@  Post-v2.17.0
    - OVSDB:
      * 'relay' service model now supports transaction history, i.e. honors the
        'last-txn-id' field in 'monitor_cond_since' requests from clients.
+   - OVSDB-IDL:
+     * New interface for enabling change tracking of write-only columns.
 
 
 v2.17.0 - 17 Feb 2022
diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c
index 882ede75598d..6898d0911602 100644
--- a/lib/ovsdb-idl.c
+++ b/lib/ovsdb-idl.c
@@ -93,6 +93,8 @@  struct ovsdb_idl {
     struct ovsdb_idl_txn *txn;
     struct hmap outstanding_txns;
     bool verify_write_only;
+    bool track_changes_noalert; /* If true then the IDL allows change tracking
+                                 * of write-only columns. */
     struct ovs_list deleted_untracked_rows; /* Stores rows deleted in the
                                              * current run, that are not yet
                                              * added to the track_list. */
@@ -264,6 +266,7 @@  ovsdb_idl_create_unconnected(const struct ovsdb_idl_class *class,
         .txn = NULL,
         .outstanding_txns = HMAP_INITIALIZER(&idl->outstanding_txns),
         .verify_write_only = false,
+        .track_changes_noalert = false,
         .deleted_untracked_rows
             = OVS_LIST_INITIALIZER(&idl->deleted_untracked_rows),
         .rows_to_reparse
@@ -586,6 +589,19 @@  ovsdb_idl_verify_write_only(struct ovsdb_idl *idl)
     idl->verify_write_only = true;
 }
 
+/* In regular operation mode, IDL change tracking of a column implies
+ * that the column is read-write (OVSDB_IDL_MONITOR | OVSDB_IDL_ALERT set).
+ * This comes at a cost as no-op updates to read-write columns cannot be
+ * easily optimized out (see ovsdb_idl_txn_write()).  This function allows
+ * users to enable a mode in which change tracking is allowed for write-only
+ * columns.
+ */
+void
+ovsdb_idl_track_changes_noalert(struct ovsdb_idl *idl)
+{
+    idl->track_changes_noalert = true;
+}
+
 /* Returns true if 'idl' is currently connected or trying to connect
  * and a negative response to a schema request has not been received */
 bool
@@ -889,6 +905,16 @@  add_ref_table(struct ovsdb_idl *idl, const struct ovsdb_base_type *base)
     }
 }
 
+static void
+ovsdb_idl_add_column__(struct ovsdb_idl *idl,
+                       const struct ovsdb_idl_column *column,
+                       unsigned char mode)
+{
+    ovsdb_idl_set_mode(idl, column, mode);
+    add_ref_table(idl, &column->type.key);
+    add_ref_table(idl, &column->type.value);
+}
+
 /* Turns on OVSDB_IDL_MONITOR and OVSDB_IDL_ALERT for 'column' in 'idl'.  Also
  * ensures that any tables referenced by 'column' will be replicated, even if
  * no columns in that table are selected for replication (see
@@ -902,9 +928,7 @@  void
 ovsdb_idl_add_column(struct ovsdb_idl *idl,
                      const struct ovsdb_idl_column *column)
 {
-    ovsdb_idl_set_mode(idl, column, OVSDB_IDL_MONITOR | OVSDB_IDL_ALERT);
-    add_ref_table(idl, &column->type.key);
-    add_ref_table(idl, &column->type.value);
+    ovsdb_idl_add_column__(idl, column, OVSDB_IDL_MONITOR | OVSDB_IDL_ALERT);
 }
 
 /* Ensures that the table with class 'tc' will be replicated on 'idl' even if
@@ -1236,15 +1260,25 @@  ovsdb_idl_row_get_seqno(const struct ovsdb_idl_row *row,
  * functions.
  *
  * This function should be called between ovsdb_idl_create() and
- * the first call to ovsdb_idl_run(). The column to be tracked
- * should have OVSDB_IDL_ALERT turned on.
+ * the first call to ovsdb_idl_run(). The column to be traked should
+ * have:
+ * a. OVSDB_IDL_ALERT turned on if 'track_changes_noalert' is 'false'
+ * OR
+ * b. OVSDB_IDL_MONITOR turned on if 'track_changes_noalert' is 'true'
  */
 void
 ovsdb_idl_track_add_column(struct ovsdb_idl *idl,
                            const struct ovsdb_idl_column *column)
 {
-    if (!(*ovsdb_idl_get_mode(idl, column) & OVSDB_IDL_ALERT)) {
-        ovsdb_idl_add_column(idl, column);
+    if (idl->track_changes_noalert) {
+        if (!(*ovsdb_idl_get_mode(idl, column) & OVSDB_IDL_MONITOR)) {
+            ovsdb_idl_add_column__(idl, column, OVSDB_IDL_MONITOR);
+        }
+    } else {
+        if (!(*ovsdb_idl_get_mode(idl, column) & OVSDB_IDL_ALERT)) {
+            ovsdb_idl_add_column__(idl, column,
+                                   OVSDB_IDL_MONITOR | OVSDB_IDL_ALERT);
+        }
     }
     *ovsdb_idl_get_mode(idl, column) |= OVSDB_IDL_TRACK;
 }
@@ -1694,11 +1728,13 @@  ovsdb_idl_row_change(struct ovsdb_idl_row *row, const struct shash *values,
             continue;
         }
 
-        if (datum_changed && table->modes[column_idx] & OVSDB_IDL_ALERT) {
-            changed = true;
-            row->change_seqno[change]
-                = row->table->change_seqno[change]
-                = row->table->idl->change_seqno + 1;
+        if (datum_changed) {
+            if (table->modes[column_idx] & OVSDB_IDL_ALERT) {
+                changed = true;
+                row->change_seqno[change]
+                    = row->table->change_seqno[change]
+                    = row->table->idl->change_seqno + 1;
+            }
 
             if (table->modes[column_idx] & OVSDB_IDL_TRACK) {
                 if (ovs_list_is_empty(&row->track_node) &&
@@ -3501,7 +3537,7 @@  ovsdb_idl_txn_write__(const struct ovsdb_idl_row *row_,
 
     class = row->table->class_;
     column_idx = column - class->columns;
-    write_only = row->table->modes[column_idx] == OVSDB_IDL_MONITOR;
+    write_only = !(row->table->modes[column_idx] & OVSDB_IDL_ALERT);
 
     ovs_assert(row->new_datum != NULL);
     ovs_assert(column_idx < class->n_columns);
diff --git a/lib/ovsdb-idl.h b/lib/ovsdb-idl.h
index d00599616ef9..bcb576359b91 100644
--- a/lib/ovsdb-idl.h
+++ b/lib/ovsdb-idl.h
@@ -85,6 +85,7 @@  bool ovsdb_idl_has_ever_connected(const struct ovsdb_idl *);
 void ovsdb_idl_enable_reconnect(struct ovsdb_idl *);
 void ovsdb_idl_force_reconnect(struct ovsdb_idl *);
 void ovsdb_idl_verify_write_only(struct ovsdb_idl *);
+void ovsdb_idl_track_changes_noalert(struct ovsdb_idl *);
 
 bool ovsdb_idl_is_alive(const struct ovsdb_idl *);
 bool ovsdb_idl_is_connected(const struct ovsdb_idl *idl);
diff --git a/tests/ovsdb-idl.at b/tests/ovsdb-idl.at
index 62e2b638320c..0436e0f77b0f 100644
--- a/tests/ovsdb-idl.at
+++ b/tests/ovsdb-idl.at
@@ -1154,12 +1154,25 @@  OVSDB_CHECK_IDL_WO_MONITOR_COND([simple idl disable monitor-cond],
 ]])
 
 m4_define([OVSDB_CHECK_IDL_TRACK_C],
-  [AT_SETUP([$1 - C])
+  [AT_SETUP([$1 - alert - C])
+   AT_KEYWORDS([ovsdb server idl tracking positive $5])
+   AT_CHECK([ovsdb_start_idltest])
+   m4_if([$2], [], [],
+     [AT_CHECK([ovsdb-client transact unix:socket $2], [0], [ignore], [ignore])])
+   AT_CHECK([test-ovsdb '-vPATTERN:console:test-ovsdb|%c|%m' -vjsonrpc -t10 --change-track=alert idl unix:socket $3],
+            [0], [stdout], [ignore])
+   AT_CHECK([sort stdout | uuidfilt]m4_if([$6],,, [[| $6]]),
+            [0], [$4])
+   OVSDB_SERVER_SHUTDOWN
+   AT_CLEANUP])
+
+m4_define([OVSDB_CHECK_IDL_TRACK_NOALERT_C],
+  [AT_SETUP([$1 - noalert - C])
    AT_KEYWORDS([ovsdb server idl tracking positive $5])
    AT_CHECK([ovsdb_start_idltest])
    m4_if([$2], [], [],
      [AT_CHECK([ovsdb-client transact unix:socket $2], [0], [ignore], [ignore])])
-   AT_CHECK([test-ovsdb '-vPATTERN:console:test-ovsdb|%c|%m' -vjsonrpc -t10 -c idl unix:socket $3],
+   AT_CHECK([test-ovsdb '-vPATTERN:console:test-ovsdb|%c|%m' -vjsonrpc -t10 --change-track=noalert idl unix:socket $3],
             [0], [stdout], [ignore])
    AT_CHECK([sort stdout | uuidfilt]m4_if([$6],,, [[| $6]]),
             [0], [$4])
@@ -1167,7 +1180,8 @@  m4_define([OVSDB_CHECK_IDL_TRACK_C],
    AT_CLEANUP])
 
 m4_define([OVSDB_CHECK_IDL_TRACK],
-  [OVSDB_CHECK_IDL_TRACK_C($@)])
+  [OVSDB_CHECK_IDL_TRACK_C($@)
+   OVSDB_CHECK_IDL_TRACK_NOALERT_C($@)])
 
 OVSDB_CHECK_IDL_TRACK([track, simple idl, initially populated],
   [['["idltest",
diff --git a/tests/test-ovsdb.c b/tests/test-ovsdb.c
index ca4e87b8115c..a5de7919d4c6 100644
--- a/tests/test-ovsdb.c
+++ b/tests/test-ovsdb.c
@@ -56,6 +56,7 @@ 
 VLOG_DEFINE_THIS_MODULE(test_ovsdb);
 
 struct test_ovsdb_pvt_context {
+    bool track_noalert;
     bool track;
 };
 
@@ -122,6 +123,15 @@  parse_options(int argc, char *argv[], struct test_ovsdb_pvt_context *pvt)
             break;
 
         case 'c':
+            if (optarg) {
+                if (!strcmp(optarg, "noalert")) {
+                    pvt->track_noalert = true;
+                } else if (!strcmp(optarg, "alert")) {
+                    pvt->track_noalert = false;
+                } else {
+                    ovs_fatal(0, "value %s is invalid", optarg);
+                }
+            }
             pvt->track = true;
             break;
 
@@ -2610,6 +2620,7 @@  update_conditions(struct ovsdb_idl *idl, char *commands)
 static void
 do_idl(struct ovs_cmdl_context *ctx)
 {
+    struct test_ovsdb_pvt_context *pvt = ctx->pvt;
     struct jsonrpc *rpc;
     struct ovsdb_idl *idl;
     unsigned int seqno = 0;
@@ -2618,9 +2629,6 @@  do_idl(struct ovs_cmdl_context *ctx)
     int step = 0;
     int error;
     int i;
-    bool track;
-
-    track = ((struct test_ovsdb_pvt_context *)(ctx->pvt))->track;
 
     idl = ovsdb_idl_create(ctx->argv[1], &idltest_idl_class, true, true);
     ovsdb_idl_set_leader_only(idl, false);
@@ -2637,7 +2645,10 @@  do_idl(struct ovs_cmdl_context *ctx)
         rpc = NULL;
     }
 
-    if (track) {
+    if (pvt->track_noalert) {
+        ovsdb_idl_track_changes_noalert(idl);
+    }
+    if (pvt->track) {
         ovsdb_idl_track_add_all(idl);
     }
 
@@ -2683,7 +2694,7 @@  do_idl(struct ovs_cmdl_context *ctx)
             }
 
             /* Print update. */
-            if (track) {
+            if (pvt->track) {
                 print_idl_track(idl, step++, terse);
                 ovsdb_idl_track_clear(idl);
             } else {