From patchwork Wed Apr 20 19:31:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dumitru Ceara X-Patchwork-Id: 1619769 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: bilbo.ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=CEx7Type; dkim-atps=neutral Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=2605:bc80:3010::133; helo=smtp2.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Received: from smtp2.osuosl.org (smtp2.osuosl.org [IPv6:2605:bc80:3010::133]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by bilbo.ozlabs.org (Postfix) with ESMTPS id 4Kk9lj6h0Nz9sG2 for ; Thu, 21 Apr 2022 05:31:45 +1000 (AEST) Received: from localhost (localhost [127.0.0.1]) by smtp2.osuosl.org (Postfix) with ESMTP id C07DC410F5; Wed, 20 Apr 2022 19:31:42 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp2.osuosl.org ([127.0.0.1]) by localhost (smtp2.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id NUhLte5caiZI; Wed, 20 Apr 2022 19:31:41 +0000 (UTC) Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [140.211.9.56]) by smtp2.osuosl.org (Postfix) with ESMTPS id 4FA024055A; Wed, 20 Apr 2022 19:31:40 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id 070ADC002F; Wed, 20 Apr 2022 19:31:40 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from smtp3.osuosl.org (smtp3.osuosl.org [IPv6:2605:bc80:3010::136]) by lists.linuxfoundation.org (Postfix) with ESMTP id EB2D0C002C for ; Wed, 20 Apr 2022 19:31:38 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp3.osuosl.org (Postfix) with ESMTP id D2CE060BD6 for ; Wed, 20 Apr 2022 19:31:38 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Authentication-Results: smtp3.osuosl.org (amavisd-new); dkim=fail (1024-bit key) reason="fail (body has been altered)" header.d=redhat.com Received: from smtp3.osuosl.org ([127.0.0.1]) by localhost (smtp3.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id U7k3_bjMN4UN for ; Wed, 20 Apr 2022 19:31:37 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.8.0 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by smtp3.osuosl.org (Postfix) with ESMTPS id 4992F60B72 for ; Wed, 20 Apr 2022 19:31:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1650483095; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=pksdGhdsJPHwmRsItBofPwdBBQSC9fNwmIlu0qw+A7A=; b=CEx7TypeaHfl81wCGtcrlplaUuWGgAQtx1NZ6z8YmDN24uec0Z8SCyxx0wB/rVtz/AkDxf EMKrMCrYsigHGrILyVlXQkPq50jcvwht9xhmqAxFeL/ZWJDea/m3GvhtunfOf6QbmxerFT 9pvZVmJ5hl1nrQ8Qc8X8yPh8dbXtNrw= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-665-zdpNvFxAOdqXCrdDpYxfXQ-1; Wed, 20 Apr 2022 15:31:32 -0400 X-MC-Unique: zdpNvFxAOdqXCrdDpYxfXQ-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 4A1BF86B8A7; Wed, 20 Apr 2022 19:31:32 +0000 (UTC) Received: from dceara.remote.csb (unknown [10.39.192.121]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3DF7C145BA53; Wed, 20 Apr 2022 19:31:31 +0000 (UTC) From: Dumitru Ceara To: dev@openvswitch.org Date: Wed, 20 Apr 2022 21:31:22 +0200 Message-Id: <20220420193122.22651-1-dceara@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.85 on 10.11.54.7 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=dceara@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Cc: i.maximets@ovn.org Subject: [ovs-dev] [PATCH v3] ovsdb-idl: Support write-only-changed IDL monitor mode. X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" At a first glance, change tracking should never be allowed for write-only columns. However, some clients (e.g., ovn-northd) that are mostly exclusive writers of a database, use change tracking to avoid duplicating the IDL row records into a local cache when implementing incremental processing. The default behavior of the IDL is to automatically turn a write-only column into a read-write column whenever the client enables change tracking for that column. For the afore mentioned clients, this becomes a performance issue. Commit 1cc618c32524 ("ovsdb-idl: Fix atomicity of writes that don't change a column's value.") explains why writes that don't change a column's value cannot be optimized out early if the column is read/write. Furthermore, if there is at least one record in any table that changed during a transaction, then *all* records that have been written are added to the transaction, even if their values didn't change. If there are many such rows (e.g., like in ovn-northd's case) this incurs a significant overhead because: a. the client has to build this large transaction b. the transaction has to be sent over the network c. the server needs to parse this (mostly) no-op update We now introduce new IDL APIs allowing users to set a new monitoring mode flag, OVSDB_IDL_WRITE_CHANGED_ONLY, to indicate to the IDL that the atomicity constraints may be relaxed and written columns that don't change value can be skipped from the current transaction. We benchmarked ovn-northd performance when using this new mode against NB and SB databases taken from ovn-kubernetes scale tests. We noticed that when a minor change is performed to the Northbound database (e.g., NB_Global.nb_cfg is incremented) the time it takes to build the Southbound transaction becomes negligible (vs ~1.5 seconds before this change). End-to-end ovn-kubernetes scale tests on 120-node clusters also show significant reduction of latency to bring up pods; both average and P99 latency decreased by ~30%. Signed-off-by: Dumitru Ceara Acked-by: Han Zhou --- V3: - Addressed Han's comments: - Add a new IDL monitor mode flag, OVSDB_IDL_WRITE_CHANGED_ONLY. - Rephrased commit log. - Changed commit title to reflect the new approach. - Old patch (v2) was: https://patchwork.ozlabs.org/project/openvswitch/patch/20220415142146.13169-1-dceara@redhat.com/ V2: - Addressed Numan's comments: - Added APIs to allow per column configuration of modes. - Fixed unit tests to actually enable change tracking for write-only columns. - Fixed ovsdb_idl_row_change() to correctly update row/table/idl change seqnos if change tracking is enabled (even for write-only rows). Note: The OVN counter part change is: https://github.com/dceara/ovn/commit/5afba6f3aa51f508df41d8f0b7b9d739b00b1ee2 --- NEWS | 4 ++++ lib/ovsdb-idl.c | 57 +++++++++++++++++++++++++++++++++++++++++----- lib/ovsdb-idl.h | 14 +++++++++++- tests/ovsdb-idl.at | 31 ++++++++++++++++++++++++- tests/test-ovsdb.c | 18 +++++++++++---- 5 files changed, 111 insertions(+), 13 deletions(-) diff --git a/NEWS b/NEWS index 5074b97aa51c..60932791bfcf 100644 --- a/NEWS +++ b/NEWS @@ -14,6 +14,10 @@ Post-v2.17.0 - OVSDB: * 'relay' service model now supports transaction history, i.e. honors the 'last-txn-id' field in 'monitor_cond_since' requests from clients. + - OVSDB-IDL: + * New monitor mode flag, OVSDB_IDL_WRITE_CHANGED_ONLY, allowing + applications to relax atomicity requirements when dealing with + columns whose value has been rewritten (but not changed). v2.17.0 - 17 Feb 2022 diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c index 882ede75598d..e5bb6cca5274 100644 --- a/lib/ovsdb-idl.c +++ b/lib/ovsdb-idl.c @@ -1396,6 +1396,42 @@ ovsdb_idl_track_clear(struct ovsdb_idl *idl) { ovsdb_idl_track_clear__(idl, false); } + +/* Sets or clears (depending on 'enable') OVSDB_IDL_WRITE_CHANGED_ONLY + * for 'column' in 'idl', ensuring that the column will be included in a + * transaction only if its value has actually changed locally. Normally + * read/write columns that are written to are always included in the + * transaction but, in specific cases, when the application doesn't + * require atomicity of writes across different columns the ones that + * don't chave value may be skipped. + * + * This function should be called between ovsdb_idl_create() and + * the first call to ovsdb_idl_run(). + */ +void +ovsdb_idl_set_write_change_only(struct ovsdb_idl *idl, + const struct ovsdb_idl_column *column, + bool enable) +{ + if (enable) { + *ovsdb_idl_get_mode(idl, column) |= OVSDB_IDL_WRITE_CHANGED_ONLY; + } else { + *ovsdb_idl_get_mode(idl, column) &= ~OVSDB_IDL_WRITE_CHANGED_ONLY; + } +} + +void +ovsdb_idl_set_write_change_only_all(struct ovsdb_idl *idl, bool enable) +{ + for (size_t i = 0; i < idl->class_->n_tables; i++) { + const struct ovsdb_idl_table_class *tc = &idl->class_->tables[i]; + + for (size_t j = 0; j < tc->n_columns; j++) { + const struct ovsdb_idl_column *column = &tc->columns[j]; + ovsdb_idl_set_write_change_only(idl, column, enable); + } + } +} static void log_parse_update_error(struct ovsdb_error *error) @@ -3491,6 +3527,8 @@ ovsdb_idl_txn_write__(const struct ovsdb_idl_row *row_, { struct ovsdb_idl_row *row = CONST_CAST(struct ovsdb_idl_row *, row_); const struct ovsdb_idl_table_class *class; + unsigned char column_mode; + bool optimize_rewritten; size_t column_idx; bool write_only; @@ -3501,12 +3539,15 @@ ovsdb_idl_txn_write__(const struct ovsdb_idl_row *row_, class = row->table->class_; column_idx = column - class->columns; - write_only = row->table->modes[column_idx] == OVSDB_IDL_MONITOR; + column_mode = row->table->modes[column_idx]; + write_only = column_mode == OVSDB_IDL_MONITOR; + optimize_rewritten = + write_only || (column_mode & OVSDB_IDL_WRITE_CHANGED_ONLY); + ovs_assert(row->new_datum != NULL); ovs_assert(column_idx < class->n_columns); - ovs_assert(row->old_datum == NULL || - row->table->modes[column_idx] & OVSDB_IDL_MONITOR); + ovs_assert(row->old_datum == NULL || column_mode & OVSDB_IDL_MONITOR); if (row->table->idl->verify_write_only && !write_only) { VLOG_ERR("Bug: Attempt to write to a read/write column (%s:%s) when" @@ -3524,9 +3565,13 @@ ovsdb_idl_txn_write__(const struct ovsdb_idl_row *row_, * different value in that column since we read it. (But if a whole * transaction only does writes of existing values, without making any real * changes, we will drop the whole transaction later in - * ovsdb_idl_txn_commit().) */ - if (write_only && ovsdb_datum_equals(ovsdb_idl_read(row, column), - datum, &column->type)) { + * ovsdb_idl_txn_commit().) + * + * The application may choose to bypass this restriction and always + * optimize by setting OVSDB_IDL_WRITE_CHANGED_ONLY. + */ + if (optimize_rewritten && ovsdb_datum_equals(ovsdb_idl_read(row, column), + datum, &column->type)) { goto discard_datum; } diff --git a/lib/ovsdb-idl.h b/lib/ovsdb-idl.h index d00599616ef9..1716955ab61c 100644 --- a/lib/ovsdb-idl.h +++ b/lib/ovsdb-idl.h @@ -181,10 +181,17 @@ bool ovsdb_idl_server_has_column(const struct ovsdb_idl *, * - (OVSDB_IDL_MONITOR | OVSDB_IDL_ALERT | OVSDB_IDL_TRACK), for a column * that a client wants to track using the change tracking * ovsdb_idl_track_get_*() functions. + * + * - (OVSDB_IDL_MONITOR | OVSDB_IDL_ALERT | OVSDB_IDL_WRITE_CHANGED_ONLY) + * is similar to (OVSDB_IDL_MONITOR | OVSDB_IDL_ALERT) except that it + * only adds a written column to a transaction if the column's value + * has actually changed. */ #define OVSDB_IDL_MONITOR (1 << 0) /* Replicate this column? */ #define OVSDB_IDL_ALERT (1 << 1) /* Alert client when column changes? */ -#define OVSDB_IDL_TRACK (1 << 2) +#define OVSDB_IDL_TRACK (1 << 2) /* Track column changes? */ +#define OVSDB_IDL_WRITE_CHANGED_ONLY \ + (1 << 3) /* Write back only changed columns? */ void ovsdb_idl_add_column(struct ovsdb_idl *, const struct ovsdb_idl_column *); void ovsdb_idl_add_table(struct ovsdb_idl *, @@ -233,6 +240,11 @@ bool ovsdb_idl_track_is_updated(const struct ovsdb_idl_row *row, const struct ovsdb_idl_column *column); void ovsdb_idl_track_clear(struct ovsdb_idl *); +void ovsdb_idl_set_write_change_only(struct ovsdb_idl *idl, + const struct ovsdb_idl_column *column, + bool enable); +void ovsdb_idl_set_write_change_only_all(struct ovsdb_idl *idl, bool enable); + /* Reading the database replica. */ diff --git a/tests/ovsdb-idl.at b/tests/ovsdb-idl.at index 62e2b638320c..d1cfa59c5758 100644 --- a/tests/ovsdb-idl.at +++ b/tests/ovsdb-idl.at @@ -87,6 +87,20 @@ m4_define([OVSDB_CHECK_IDL_C], OVSDB_SERVER_SHUTDOWN AT_CLEANUP]) +# same as OVSDB_CHECK_IDL but uses OVSDB_IDL_WRITE_CHANGED_ONLY. +m4_define([OVSDB_CHECK_IDL_WRITE_CHANGED_ONLY_C], + [AT_SETUP([$1 - write-changed-only - C]) + AT_KEYWORDS([ovsdb server idl positive $5]) + AT_CHECK([ovsdb_start_idltest]) + m4_if([$2], [], [], + [AT_CHECK([ovsdb-client transact unix:socket $2], [0], [ignore], [ignore])]) + AT_CHECK([test-ovsdb '-vPATTERN:console:test-ovsdb|%c|%m' -vjsonrpc -t10 idl unix:socket $3], + [0], [stdout], [ignore]) + AT_CHECK([sort stdout | uuidfilt]m4_if([$6],,, [[| $6]]), + [0], [$4]) + OVSDB_SERVER_SHUTDOWN + AT_CLEANUP]) + # same as OVSDB_CHECK_IDL but uses tcp. m4_define([OVSDB_CHECK_IDL_TCP_C], [AT_SETUP([$1 - C - tcp]) @@ -257,6 +271,7 @@ m4_define([OVSDB_CHECK_IDL_SSL_PY], m4_define([OVSDB_CHECK_IDL], [OVSDB_CHECK_IDL_C($@) + OVSDB_CHECK_IDL_WRITE_CHANGED_ONLY_C($@) OVSDB_CHECK_IDL_TCP_C($@) OVSDB_CHECK_IDL_TCP6_C($@) OVSDB_CHECK_IDL_PY($@) @@ -1166,8 +1181,22 @@ m4_define([OVSDB_CHECK_IDL_TRACK_C], OVSDB_SERVER_SHUTDOWN AT_CLEANUP]) +m4_define([OVSDB_CHECK_IDL_TRACK_WRITE_CHANGED_ONLY_C], + [AT_SETUP([$1 - write-changed-only - C]) + AT_KEYWORDS([ovsdb server idl tracking positive $5]) + AT_CHECK([ovsdb_start_idltest]) + m4_if([$2], [], [], + [AT_CHECK([ovsdb-client transact unix:socket $2], [0], [ignore], [ignore])]) + AT_CHECK([test-ovsdb '-vPATTERN:console:test-ovsdb|%c|%m' -vjsonrpc -t10 -c -w idl unix:socket $3], + [0], [stdout], [ignore]) + AT_CHECK([sort stdout | uuidfilt]m4_if([$6],,, [[| $6]]), + [0], [$4]) + OVSDB_SERVER_SHUTDOWN + AT_CLEANUP]) + m4_define([OVSDB_CHECK_IDL_TRACK], - [OVSDB_CHECK_IDL_TRACK_C($@)]) + [OVSDB_CHECK_IDL_TRACK_C($@) + OVSDB_CHECK_IDL_TRACK_WRITE_CHANGED_ONLY_C($@)]) OVSDB_CHECK_IDL_TRACK([track, simple idl, initially populated], [['["idltest", diff --git a/tests/test-ovsdb.c b/tests/test-ovsdb.c index ca4e87b8115c..808b15355743 100644 --- a/tests/test-ovsdb.c +++ b/tests/test-ovsdb.c @@ -56,6 +56,7 @@ VLOG_DEFINE_THIS_MODULE(test_ovsdb); struct test_ovsdb_pvt_context { + bool write_changed_only; bool track; }; @@ -91,6 +92,7 @@ parse_options(int argc, char *argv[], struct test_ovsdb_pvt_context *pvt) {"timeout", required_argument, NULL, 't'}, {"verbose", optional_argument, NULL, 'v'}, {"change-track", optional_argument, NULL, 'c'}, + {"write-changed-only", optional_argument, NULL, 'w'}, {"magic", required_argument, NULL, OPT_MAGIC}, {"no-rename-open-files", no_argument, NULL, OPT_NO_RENAME_OPEN_FILES}, {"help", no_argument, NULL, 'h'}, @@ -125,6 +127,10 @@ parse_options(int argc, char *argv[], struct test_ovsdb_pvt_context *pvt) pvt->track = true; break; + case 'w': + pvt->write_changed_only = true; + break; + case OPT_MAGIC: magic = optarg; break; @@ -2610,6 +2616,7 @@ update_conditions(struct ovsdb_idl *idl, char *commands) static void do_idl(struct ovs_cmdl_context *ctx) { + struct test_ovsdb_pvt_context *pvt = ctx->pvt; struct jsonrpc *rpc; struct ovsdb_idl *idl; unsigned int seqno = 0; @@ -2618,9 +2625,6 @@ do_idl(struct ovs_cmdl_context *ctx) int step = 0; int error; int i; - bool track; - - track = ((struct test_ovsdb_pvt_context *)(ctx->pvt))->track; idl = ovsdb_idl_create(ctx->argv[1], &idltest_idl_class, true, true); ovsdb_idl_set_leader_only(idl, false); @@ -2637,10 +2641,14 @@ do_idl(struct ovs_cmdl_context *ctx) rpc = NULL; } - if (track) { + if (pvt->track) { ovsdb_idl_track_add_all(idl); } + if (pvt->write_changed_only) { + ovsdb_idl_set_write_change_only_all(idl, true); + } + setvbuf(stdout, NULL, _IONBF, 0); symtab = ovsdb_symbol_table_create(); @@ -2683,7 +2691,7 @@ do_idl(struct ovs_cmdl_context *ctx) } /* Print update. */ - if (track) { + if (pvt->track) { print_idl_track(idl, step++, terse); ovsdb_idl_track_clear(idl); } else {