Message ID | 20240112230018.34044-6-frode.nordahl@canonical.com |
---|---|
State | Changes Requested |
Delegated to: | Ilya Maximets |
Headers | show |
Series | Introduce cooperative multitasking to improve OVSDB RAFT cluster operation. | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
ovsrobot/intel-ovs-compilation | fail | test: fail |
On 1/13/24 00:00, Frode Nordahl wrote: > Initialize the cooperative multitasking module for the > ovsdb-server. > > The server side schema conversion process used for storage > engines such as RAFT is time consuming, yield during > processing. > > After the schema conversion is done, the processing of JSON-RPC > sessions and OVSDB monitors for reconnecting clients can > overrun the configured election timer. > > The destruction of JSON objects representing the database > contents has been identified as one of the primary offenders. > Make use of yielding version of the JSON object destroy function > to mitigate. > > This series has been tested by checking success of schema > conversion, ensuring no involuntary leader change occurs with > election timer configurations as low as 750 msec, on a 75MB > database with ~ 100 connected clients as produced by the > ovn-heater ocp-120-density-light test scenario. > > Signed-off-by: Frode Nordahl <frode.nordahl@canonical.com> > --- > NEWS | 2 ++ > ovsdb/file.c | 3 +++ > ovsdb/jsonrpc-server.c | 3 +++ > ovsdb/monitor.c | 15 +++++++++++---- > ovsdb/ovsdb-server.c | 2 ++ > ovsdb/trigger.c | 3 +++ > 6 files changed, 24 insertions(+), 4 deletions(-) > > diff --git a/NEWS b/NEWS > index 270ed6673..512b290c6 100644 > --- a/NEWS > +++ b/NEWS > @@ -6,6 +6,8 @@ Post-v3.2.0 > from older version is supported but it may trigger more leader elections > during the process, and error logs complaining unrecognized fields may > be observed on old nodes. > + * Make use of cooperative multitasking to improve maintenance of RAFT > + cluster during long running processing such as online schema conversion. > - OpenFlow: > * NXT_CT_FLUSH extension is updated to support flushing connections > based on mark and labels. 'ct-flush' command of ovs-ofctl updated > diff --git a/ovsdb/file.c b/ovsdb/file.c > index 8bd1d4af3..38a5b29a9 100644 > --- a/ovsdb/file.c > +++ b/ovsdb/file.c > @@ -23,6 +23,7 @@ > > #include "bitmap.h" > #include "column.h" > +#include "cooperative-multitasking.h" > #include "log.h" > #include "openvswitch/json.h" > #include "lockfile.h" > @@ -311,6 +312,8 @@ ovsdb_convert_table(struct ovsdb_txn *txn, > struct ovsdb_row *dst_row = ovsdb_row_create(dst_table); > *ovsdb_row_get_uuid_rw(dst_row) = *ovsdb_row_get_uuid(src_row); > > + cooperative_multitasking_yield(); > + > SHASH_FOR_EACH (node, &src_table->schema->columns) { > const struct ovsdb_column *src_column = node->data; > const struct ovsdb_column *dst_column; > diff --git a/ovsdb/jsonrpc-server.c b/ovsdb/jsonrpc-server.c > index 45f7c8038..b39ca618f 100644 > --- a/ovsdb/jsonrpc-server.c > +++ b/ovsdb/jsonrpc-server.c > @@ -21,6 +21,7 @@ > > #include "bitmap.h" > #include "column.h" > +#include "cooperative-multitasking.h" > #include "openvswitch/dynamic-string.h" > #include "monitor.h" > #include "openvswitch/json.h" > @@ -600,6 +601,8 @@ ovsdb_jsonrpc_session_run_all(struct ovsdb_jsonrpc_remote *remote) > struct ovsdb_jsonrpc_session *s; > > LIST_FOR_EACH_SAFE (s, node, &remote->sessions) { > + cooperative_multitasking_yield(); > + > int error = ovsdb_jsonrpc_session_run(s); > if (error) { > ovsdb_jsonrpc_session_close(s); > diff --git a/ovsdb/monitor.c b/ovsdb/monitor.c > index d1e466faa..718f16db0 100644 > --- a/ovsdb/monitor.c > +++ b/ovsdb/monitor.c > @@ -20,8 +20,10 @@ > > #include "bitmap.h" > #include "column.h" > +#include "cooperative-multitasking.h" > #include "openvswitch/dynamic-string.h" > #include "openvswitch/json.h" > +#include "json.h" > #include "jsonrpc.h" > #include "ovsdb-error.h" > #include "ovsdb-parser.h" > @@ -262,7 +264,7 @@ ovsdb_monitor_json_cache_flush(struct ovsdb_monitor *dbmon) > struct ovsdb_monitor_json_cache_node *node; > > HMAP_FOR_EACH_POP(node, hmap_node, &dbmon->json_cache) { > - json_destroy(node->json); > + json_destroy_with_yield(node->json); > free(node); > } > } > @@ -278,7 +280,7 @@ ovsdb_monitor_json_cache_destroy(struct ovsdb_monitor *dbmon, > = ovsdb_monitor_json_cache_search(dbmon, v, change_set); > if (node) { > hmap_remove(&dbmon->json_cache, &node->hmap_node); > - json_destroy(node->json); > + json_destroy_with_yield(node->json); > free(node); > } > } > @@ -1172,6 +1174,8 @@ ovsdb_monitor_compose_update( > struct ovsdb_monitor_table *mt = mcst->mt; > > HMAP_FOR_EACH_SAFE (row, hmap_node, &mcst->rows) { > + cooperative_multitasking_yield(); > + > struct json *row_json; > row_json = (*row_update)(mt, condition, OVSDB_MONITOR_ROW, row, > initial, changed, mcst->n_columns); > @@ -1217,6 +1221,8 @@ ovsdb_monitor_compose_cond_change_update( > HMAP_FOR_EACH (row, hmap_node, &mt->table->rows) { > struct json *row_json; > > + cooperative_multitasking_yield(); > + > row_json = ovsdb_monitor_compose_row_update2(mt, condition, > OVSDB_ROW, row, > false, changed, > @@ -1286,8 +1292,9 @@ ovsdb_monitor_get_update( > > /* Pre-serializing the object to avoid doing this > * for every client. */ > - json_serialized = json_serialized_object_create(json); > - json_destroy(json); > + json_serialized = > + json_serialized_object_create(json); Suggesting to yield inside this call, see my reply for a previous patch. Otherwise, the changes here looks fine. Best regards, Ilya Maximets.
diff --git a/NEWS b/NEWS index 270ed6673..512b290c6 100644 --- a/NEWS +++ b/NEWS @@ -6,6 +6,8 @@ Post-v3.2.0 from older version is supported but it may trigger more leader elections during the process, and error logs complaining unrecognized fields may be observed on old nodes. + * Make use of cooperative multitasking to improve maintenance of RAFT + cluster during long running processing such as online schema conversion. - OpenFlow: * NXT_CT_FLUSH extension is updated to support flushing connections based on mark and labels. 'ct-flush' command of ovs-ofctl updated diff --git a/ovsdb/file.c b/ovsdb/file.c index 8bd1d4af3..38a5b29a9 100644 --- a/ovsdb/file.c +++ b/ovsdb/file.c @@ -23,6 +23,7 @@ #include "bitmap.h" #include "column.h" +#include "cooperative-multitasking.h" #include "log.h" #include "openvswitch/json.h" #include "lockfile.h" @@ -311,6 +312,8 @@ ovsdb_convert_table(struct ovsdb_txn *txn, struct ovsdb_row *dst_row = ovsdb_row_create(dst_table); *ovsdb_row_get_uuid_rw(dst_row) = *ovsdb_row_get_uuid(src_row); + cooperative_multitasking_yield(); + SHASH_FOR_EACH (node, &src_table->schema->columns) { const struct ovsdb_column *src_column = node->data; const struct ovsdb_column *dst_column; diff --git a/ovsdb/jsonrpc-server.c b/ovsdb/jsonrpc-server.c index 45f7c8038..b39ca618f 100644 --- a/ovsdb/jsonrpc-server.c +++ b/ovsdb/jsonrpc-server.c @@ -21,6 +21,7 @@ #include "bitmap.h" #include "column.h" +#include "cooperative-multitasking.h" #include "openvswitch/dynamic-string.h" #include "monitor.h" #include "openvswitch/json.h" @@ -600,6 +601,8 @@ ovsdb_jsonrpc_session_run_all(struct ovsdb_jsonrpc_remote *remote) struct ovsdb_jsonrpc_session *s; LIST_FOR_EACH_SAFE (s, node, &remote->sessions) { + cooperative_multitasking_yield(); + int error = ovsdb_jsonrpc_session_run(s); if (error) { ovsdb_jsonrpc_session_close(s); diff --git a/ovsdb/monitor.c b/ovsdb/monitor.c index d1e466faa..718f16db0 100644 --- a/ovsdb/monitor.c +++ b/ovsdb/monitor.c @@ -20,8 +20,10 @@ #include "bitmap.h" #include "column.h" +#include "cooperative-multitasking.h" #include "openvswitch/dynamic-string.h" #include "openvswitch/json.h" +#include "json.h" #include "jsonrpc.h" #include "ovsdb-error.h" #include "ovsdb-parser.h" @@ -262,7 +264,7 @@ ovsdb_monitor_json_cache_flush(struct ovsdb_monitor *dbmon) struct ovsdb_monitor_json_cache_node *node; HMAP_FOR_EACH_POP(node, hmap_node, &dbmon->json_cache) { - json_destroy(node->json); + json_destroy_with_yield(node->json); free(node); } } @@ -278,7 +280,7 @@ ovsdb_monitor_json_cache_destroy(struct ovsdb_monitor *dbmon, = ovsdb_monitor_json_cache_search(dbmon, v, change_set); if (node) { hmap_remove(&dbmon->json_cache, &node->hmap_node); - json_destroy(node->json); + json_destroy_with_yield(node->json); free(node); } } @@ -1172,6 +1174,8 @@ ovsdb_monitor_compose_update( struct ovsdb_monitor_table *mt = mcst->mt; HMAP_FOR_EACH_SAFE (row, hmap_node, &mcst->rows) { + cooperative_multitasking_yield(); + struct json *row_json; row_json = (*row_update)(mt, condition, OVSDB_MONITOR_ROW, row, initial, changed, mcst->n_columns); @@ -1217,6 +1221,8 @@ ovsdb_monitor_compose_cond_change_update( HMAP_FOR_EACH (row, hmap_node, &mt->table->rows) { struct json *row_json; + cooperative_multitasking_yield(); + row_json = ovsdb_monitor_compose_row_update2(mt, condition, OVSDB_ROW, row, false, changed, @@ -1286,8 +1292,9 @@ ovsdb_monitor_get_update( /* Pre-serializing the object to avoid doing this * for every client. */ - json_serialized = json_serialized_object_create(json); - json_destroy(json); + json_serialized = + json_serialized_object_create(json); + json_destroy_with_yield(json); json = json_serialized; } ovsdb_monitor_json_cache_insert(dbmon, version, mcs, diff --git a/ovsdb/ovsdb-server.c b/ovsdb/ovsdb-server.c index 4d29043f4..546370b4a 100644 --- a/ovsdb/ovsdb-server.c +++ b/ovsdb/ovsdb-server.c @@ -24,6 +24,7 @@ #include "column.h" #include "command-line.h" +#include "cooperative-multitasking.h" #include "daemon.h" #include "dirs.h" #include "dns-resolve.h" @@ -530,6 +531,7 @@ main(int argc, char *argv[]) } dns_resolve_destroy(); perf_counters_destroy(); + cooperative_multitasking_destroy(); service_stop(); return 0; } diff --git a/ovsdb/trigger.c b/ovsdb/trigger.c index 2a48ccc64..8c00fec18 100644 --- a/ovsdb/trigger.c +++ b/ovsdb/trigger.c @@ -20,6 +20,7 @@ #include <limits.h> #include <string.h> +#include "cooperative-multitasking.h" #include "file.h" #include "openvswitch/json.h" #include "jsonrpc.h" @@ -181,6 +182,8 @@ ovsdb_trigger_run(struct ovsdb *db, long long int now) bool disconnect_all = false; LIST_FOR_EACH_SAFE (t, node, &db->triggers) { + cooperative_multitasking_yield(); + if (run_triggers || now - t->created >= t->timeout_msec || t->progress || t->txn_forward) {
Initialize the cooperative multitasking module for the ovsdb-server. The server side schema conversion process used for storage engines such as RAFT is time consuming, yield during processing. After the schema conversion is done, the processing of JSON-RPC sessions and OVSDB monitors for reconnecting clients can overrun the configured election timer. The destruction of JSON objects representing the database contents has been identified as one of the primary offenders. Make use of yielding version of the JSON object destroy function to mitigate. This series has been tested by checking success of schema conversion, ensuring no involuntary leader change occurs with election timer configurations as low as 750 msec, on a 75MB database with ~ 100 connected clients as produced by the ovn-heater ocp-120-density-light test scenario. Signed-off-by: Frode Nordahl <frode.nordahl@canonical.com> --- NEWS | 2 ++ ovsdb/file.c | 3 +++ ovsdb/jsonrpc-server.c | 3 +++ ovsdb/monitor.c | 15 +++++++++++---- ovsdb/ovsdb-server.c | 2 ++ ovsdb/trigger.c | 3 +++ 6 files changed, 24 insertions(+), 4 deletions(-)