Message ID | 20221208152151.350120-1-dceara@redhat.com |
---|---|
State | Changes Requested |
Headers | show |
Series | [ovs-dev] ovsdb-cs: Always send first explicitly set monitor condition. | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
ovsrobot/intel-ovs-compilation | fail | test: fail |
On 12/8/22 16:21, Dumitru Ceara wrote: > Applications request specific monitor conditions via the > ovsdb_cs_set_condition() API. This returns the condition sequence > number at which the client can assume that the server acknowledged > and applied the new monitor condition. > > A corner case is when the client's first request is to set the condition > to a value that's equivalent to the default one ("<true>"). In such > cases, because it's requested explicitly by the client, we now > explicitly send the monitor_cond_change request to the server. This > avoids cases in which the expected condition sequence number and the > acked condition sequence number get out of sync. Sorry, I think I'm lost again. What exactly we're trying to fix here? [True] is a default condition if no other conditions are supplied for the table. Why do we need to send it explicitly? At first I thought that newly created condition from ovsdb_cs_db_get_table() has to be sent out, but... if it is the first time we're setting conditions for that table, default condition should be [True], hence equal to the condition we're trying to set. The test case is a bit synthetic as it doesn't seem to test the issue that exists in OVN. The fact that python IDL passes the test without changes also doesn't make a lot of sense to me. Could you provide some more data on what exactly is going on and what we're trying to fix? With jsonrpc logs, if possible. Thanks. Best regards, Ilya Maximets. > > Reported-by: Numan Siddique <numans@ovn.org> > Signed-off-by: Dumitru Ceara <dceara@redhat.com> > --- > lib/ovsdb-cs.c | 1 + > tests/ovsdb-idl.at | 27 +++++++++++++++++++++++++++ > tests/test-ovsdb.c | 28 ++++++++++++++++++++++++---- > tests/test-ovsdb.py | 20 +++++++++++++++++--- > 4 files changed, 69 insertions(+), 7 deletions(-) > > diff --git a/lib/ovsdb-cs.c b/lib/ovsdb-cs.c > index a6fbd290c87d..16c8f6a2cf55 100644 > --- a/lib/ovsdb-cs.c > +++ b/lib/ovsdb-cs.c > @@ -894,6 +894,7 @@ ovsdb_cs_db_get_table(struct ovsdb_cs_db *db, const char *table) > t->name = xstrdup(table); > t->new_cond = json_array_create_1(json_boolean_create(true)); > hmap_insert(&db->tables, &t->hmap_node, hash); > + db->cond_changed = true; > return t; > } > > diff --git a/tests/ovsdb-idl.at b/tests/ovsdb-idl.at > index c2970984bae3..57ed0ac9d12c 100644 > --- a/tests/ovsdb-idl.at > +++ b/tests/ovsdb-idl.at > @@ -599,6 +599,33 @@ OVSDB_CHECK_IDL([simple idl, conditional, true condition], > 004: done > ]]) > > +dnl This test ensures that the first explicitly set monitor condition > +dnl is sent to the server. > +OVSDB_CHECK_IDL([simple idl, conditional, wait for condition], > + [], > + [['["idltest", > + {"op": "insert", > + "table": "simple", > + "row": {"i": 1, > + "r": 2.0, > + "b": true}}]' \ > + 'condition simple [true]' \ > + '^["idltest", > + {"op": "insert", > + "table": "simple", > + "row": {"i": 2, > + "r": 4.0, > + "b": true}}]']], > + [[000: empty > +001: {"error":null,"result":[{"uuid":["uuid","<0>"]}]} > +002: table simple: i=1 r=2 b=true s= u=<1> ia=[] ra=[] ba=[] sa=[] ua=[] uuid=<0> > +003: change conditions > +004: {"error":null,"result":[{"uuid":["uuid","<2>"]}]} > +005: table simple: i=1 r=2 b=true s= u=<1> ia=[] ra=[] ba=[] sa=[] ua=[] uuid=<0> > +005: table simple: i=2 r=4 b=true s= u=<1> ia=[] ra=[] ba=[] sa=[] ua=[] uuid=<2> > +006: done > +]]) > + > OVSDB_CHECK_IDL([simple idl, conditional, multiple clauses in condition], > [['["idltest", > {"op": "insert", > diff --git a/tests/test-ovsdb.c b/tests/test-ovsdb.c > index 84fe232765aa..23cfd79a433e 100644 > --- a/tests/test-ovsdb.c > +++ b/tests/test-ovsdb.c > @@ -2627,11 +2627,12 @@ parse_link2_json_clause(struct ovsdb_idl_condition *cond, > } > } > > -static void > +static unsigned int > update_conditions(struct ovsdb_idl *idl, char *commands) > { > - char *cmd, *save_ptr1 = NULL; > const struct ovsdb_idl_table_class *tc; > + unsigned int next_cond_seqno = 0; > + char *cmd, *save_ptr1 = NULL; > > for (cmd = strtok_r(commands, ";", &save_ptr1); cmd; > cmd = strtok_r(NULL, ";", &save_ptr1)) { > @@ -2688,9 +2689,11 @@ update_conditions(struct ovsdb_idl *idl, char *commands) > if (next_seqno != new_next_seqno) { > ovs_fatal(0, "condition expected seqno changed"); > } > + next_cond_seqno = MAX(next_cond_seqno, next_seqno); > ovsdb_idl_condition_destroy(&cond); > json_destroy(json); > } > + return next_cond_seqno; > } > > static void > @@ -2699,6 +2702,7 @@ do_idl(struct ovs_cmdl_context *ctx) > struct test_ovsdb_pvt_context *pvt = ctx->pvt; > struct jsonrpc *rpc; > struct ovsdb_idl *idl; > + unsigned int next_cond_seqno = 0; > unsigned int seqno = 0; > struct ovsdb_symbol_table *symtab; > size_t n_uuids = 0; > @@ -2735,7 +2739,8 @@ do_idl(struct ovs_cmdl_context *ctx) > const char remote_s[] = "set-remote "; > const char cond_s[] = "condition "; > if (ctx->argc > 2 && strstr(ctx->argv[2], cond_s)) { > - update_conditions(idl, ctx->argv[2] + strlen(cond_s)); > + next_cond_seqno = > + update_conditions(idl, ctx->argv[2] + strlen(cond_s)); > print_and_log("%03d: change conditions", step++); > i = 3; > } else { > @@ -2755,6 +2760,21 @@ do_idl(struct ovs_cmdl_context *ctx) > if (*arg == '+') { > /* The previous transaction didn't change anything. */ > arg++; > + } else if (*arg == '^') { > + /* Wait for condition change to be acked by the server. */ > + arg++; > + for (;;) { > + ovsdb_idl_run(idl); > + ovsdb_idl_check_consistency(idl); > + if (ovsdb_idl_get_condition_seqno(idl) == next_cond_seqno) { > + break; > + } > + jsonrpc_run(rpc); > + > + ovsdb_idl_wait(idl); > + jsonrpc_wait(rpc); > + poll_block(); > + } > } else { > /* Wait for update. */ > for (;;) { > @@ -2789,7 +2809,7 @@ do_idl(struct ovs_cmdl_context *ctx) > arg + strlen(remote_s), > ovsdb_idl_is_connected(idl) ? "true" : "false"); > } else if (!strncmp(arg, cond_s, strlen(cond_s))) { > - update_conditions(idl, arg + strlen(cond_s)); > + next_cond_seqno = update_conditions(idl, arg + strlen(cond_s)); > print_and_log("%03d: change conditions", step++); > } else if (arg[0] != '[') { > if (!idl_set(idl, arg, step++)) { > diff --git a/tests/test-ovsdb.py b/tests/test-ovsdb.py > index cca1818ea3aa..86184e6d4e4d 100644 > --- a/tests/test-ovsdb.py > +++ b/tests/test-ovsdb.py > @@ -627,6 +627,7 @@ def idl_set(idl, commands, step): > > > def update_condition(idl, commands): > + expected_seqno = 0 > commands = commands[len("condition "):].split(";") > for command in commands: > command = command.split(" ") > @@ -637,7 +638,8 @@ def update_condition(idl, commands): > table = command[0] > cond = ovs.json.from_string(command[1]) > > - idl.cond_change(table, cond) > + expected_seqno = max(expected_seqno, idl.cond_change(table, cond)) > + return expected_seqno > > > def do_idl(schema_file, remote, *commands): > @@ -694,6 +696,7 @@ def do_idl(schema_file, remote, *commands): > else: > rpc = None > > + next_cond_seqno = 0 > symtab = {} > seqno = 0 > step = 0 > @@ -717,7 +720,7 @@ def do_idl(schema_file, remote, *commands): > > commands = list(commands) > if len(commands) >= 1 and "condition" in commands[0]: > - update_condition(idl, commands.pop(0)) > + next_cond_seqno = update_condition(idl, commands.pop(0)) > sys.stdout.write("%03d: change conditions\n" % step) > sys.stdout.flush() > step += 1 > @@ -732,6 +735,17 @@ def do_idl(schema_file, remote, *commands): > if command.startswith("+"): > # The previous transaction didn't change anything. > command = command[1:] > + elif command.startswith("^"): > + # Wait for condition change to be acked by the server. > + command = command[1:] > + while idl.cond_seqno != next_cond_seqno and \ > + not idl.run(): > + rpc.run() > + > + poller = ovs.poller.Poller() > + idl.wait(poller) > + rpc.wait(poller) > + poller.block() > else: > # Wait for update. > while idl.change_seqno == seqno and not idl.run(): > @@ -753,7 +767,7 @@ def do_idl(schema_file, remote, *commands): > step += 1 > idl.force_reconnect() > elif "condition" in command: > - update_condition(idl, command) > + next_cond_seqno = update_condition(idl, command) > sys.stdout.write("%03d: change conditions\n" % step) > sys.stdout.flush() > step += 1
On 12/12/22 20:42, Ilya Maximets wrote: > On 12/8/22 16:21, Dumitru Ceara wrote: >> Applications request specific monitor conditions via the >> ovsdb_cs_set_condition() API. This returns the condition sequence >> number at which the client can assume that the server acknowledged >> and applied the new monitor condition. >> >> A corner case is when the client's first request is to set the condition >> to a value that's equivalent to the default one ("<true>"). In such >> cases, because it's requested explicitly by the client, we now >> explicitly send the monitor_cond_change request to the server. This >> avoids cases in which the expected condition sequence number and the >> acked condition sequence number get out of sync. > > Sorry, I think I'm lost again. What exactly we're trying to fix here? > [True] is a default condition if no other conditions are supplied for > the table. Why do we need to send it explicitly? > > At first I thought that newly created condition from ovsdb_cs_db_get_table() > has to be sent out, but... if it is the first time we're setting > conditions for that table, default condition should be [True], hence > equal to the condition we're trying to set. > > > The test case is a bit synthetic as it doesn't seem to test the issue > that exists in OVN. The fact that python IDL passes the test without > changes also doesn't make a lot of sense to me. > > Could you provide some more data on what exactly is going on and what > we're trying to fix? With jsonrpc logs, if possible. Should we do this instead: diff --git a/lib/ovsdb-cs.c b/lib/ovsdb-cs.c index a6fbd290c..0fca03d72 100644 --- a/lib/ovsdb-cs.c +++ b/lib/ovsdb-cs.c @@ -892,7 +892,7 @@ ovsdb_cs_db_get_table(struct ovsdb_cs_db *db, const char *table) t = xzalloc(sizeof *t); t->name = xstrdup(table); - t->new_cond = json_array_create_1(json_boolean_create(true)); + t->ack_cond = json_array_create_1(json_boolean_create(true)); hmap_insert(&db->tables, &t->hmap_node, hash); return t; } --- ? This breaks a few tests, but it seems to be a correct change because it is indeed a default condition that server already has for the table. What do you think? Will that fix the OVN oproblem? > > Thanks. > Best regards, Ilya Maximets. > >> >> Reported-by: Numan Siddique <numans@ovn.org> >> Signed-off-by: Dumitru Ceara <dceara@redhat.com> >> --- >> lib/ovsdb-cs.c | 1 + >> tests/ovsdb-idl.at | 27 +++++++++++++++++++++++++++ >> tests/test-ovsdb.c | 28 ++++++++++++++++++++++++---- >> tests/test-ovsdb.py | 20 +++++++++++++++++--- >> 4 files changed, 69 insertions(+), 7 deletions(-)
On 12/13/22 11:57, Ilya Maximets wrote: > On 12/12/22 20:42, Ilya Maximets wrote: >> On 12/8/22 16:21, Dumitru Ceara wrote: >>> Applications request specific monitor conditions via the >>> ovsdb_cs_set_condition() API. This returns the condition sequence >>> number at which the client can assume that the server acknowledged >>> and applied the new monitor condition. >>> >>> A corner case is when the client's first request is to set the condition >>> to a value that's equivalent to the default one ("<true>"). In such >>> cases, because it's requested explicitly by the client, we now >>> explicitly send the monitor_cond_change request to the server. This >>> avoids cases in which the expected condition sequence number and the >>> acked condition sequence number get out of sync. >> >> Sorry, I think I'm lost again. What exactly we're trying to fix here? >> [True] is a default condition if no other conditions are supplied for >> the table. Why do we need to send it explicitly? >> The problem is that there's a discrepancy between the returned expected condition seqno and what happens if a condition is explicitly set to [True] before vs after the initial connection happened. >> At first I thought that newly created condition from ovsdb_cs_db_get_table() >> has to be sent out, but... if it is the first time we're setting >> conditions for that table, default condition should be [True], hence >> equal to the condition we're trying to set. >> >> >> The test case is a bit synthetic as it doesn't seem to test the issue >> that exists in OVN. The fact that python IDL passes the test without >> changes also doesn't make a lot of sense to me. >> Actually it should test exactly what's happenning in OVN. That is: - connect to database, no explicit monitor condition set - get initial contents - explicitly set monitor condition to [True] and get seqno S at which the condition should be "acked". - expect that ovsdb_idl_get_condition_seqno() eventually reaches S The Python IDL behaves differently unfortunately, I'll give more details below. >> Could you provide some more data on what exactly is going on and what >> we're trying to fix? With jsonrpc logs, if possible. > For the C client, in the ovn-controller case, let's focus on 2 tables Chassis_Private and Chassis_Template_Var. The difference between these 2 is that Chassis_Template_Var was recently added to the schema so ovn-controller only sets its monitor condition if sbrec_server_has_chassis_template_var_table() is true. So that's only after the remote schema was received. Sorry for the wide table below but this is the best way I could find that should explain what's happening: operation/event reconnect-state ovsdb-cs-state Chassis_Private-new_cond Chassis_Template_Var-new_cond cond-seqno expected-cond-seqno ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ 1. set_cond(Chassis_Private, [True]) BACKOFF SERVER_SCHEMA_REQUESTED [True] non-existent 0 1 2. reconnect->CONNECTING - ovsdb_cs_db_sync_condition(OVN_Southbound) is called and ack_all == true - this will move all ovsdb_cs table->req_cond to table->new_cond; ovsdb_cs only knows of the existence of the Chassis_Private table. CONNECTING SERVER_SCHEMA_REQUESTED nil non-existent 0 1 # new event loop iteration 3. set_cond(Chassis_Private, [True]) CONNECTING SERVER_SCHEMA_REQUESTED nil non-existent 0 0 4. reconnect->CONNECTED - ovsdb_cs requests server schema jsonrpc|DBG|ssl:127.0.0.1:6642: send request, method="get_schema", params=["_Server"], id=7 CONNECTED SERVER_SCHEMA_REQUESTED nil non-existent 0 0 # new event loop iteration 5. set_cond(Chassis_Private, [True]) CONNECTED SERVER_SCHEMA_REQUESTED nil non-existent 0 0 6. ovsdb_cs receives server schema: jsonrpc|DBG|ssl:127.0.0.1:6642: received reply, result={...,"name":"_Server",...} - ovsdb_cs sends monitor_cond for the server db: jsonrpc|DBG|ssl:127.0.0.1:6642: send request, method="monitor_cond", params=["_Server",...] CONNECTED SERVER_MONITOR_REQUESTED nil non-existent 0 0 # new event loop iteration 7. set_cond(Chassis_Private, [True]) CONNECTED SERVER_MONITOR_REQUESTED nil non-existent 0 0 8. ovsdb_cs receives database schema: jsonrpc|DBG|ssl:127.0.0.1:6642: received reply, result={"Database":{"1eaa91c6-771d-4456-93d1-23fe3e559a05":{"initial":{"schema":"{...\"name\":\"OVN_Southbound\", - at this point the Chassis_Template_Var table becomes "available". - ovsdb_cs issues a monitor_cond_since with the current condition (everything default, no "where" clause): jsonrpc|DBG|ssl:127.0.0.1:6642: send request, method="monitor_cond_since", params=["OVN_Southbound"... - ovsdb_cs issues a set_db_change_aware request: jsonrpc|DBG|ssl:127.0.0.1:6642: send request, method="set_db_change_aware", params=[true] CONNECTED DATA_MONITOR_COND_SINCE_REQUESTED nil non-existent 0 0 # new event loop iteration 9. set_cond(Chassis_Private, [True]) CONNECTED SERVER_MONITOR_REQUESTED nil non-existent 0 0 10. set_cond(Chassis_Template_Var, [True]) - the table didn't exist in ovsdb-cs so it's created, with new_cond set to [True] - because new_cond is the only condition set and the requested condition is also [True] then db->cond_changed is not touched so no monitor_cond_change request will be sent - but the expected cond seqno (returned value) is computed as 1 CONNECTED SERVER_MONITOR_REQUESTED nil [True] 0 1 # new event loop iteration 9. set_cond(Chassis_Private, [True]) CONNECTED SERVER_MONITOR_REQUESTED nil nil 0 1 10. set_cond(Chassis_Template_Var, [True]) - nothing changed so expected cond seqno stays 1 and db->cond_changed is still false CONNECTED SERVER_MONITOR_REQUESTED nil [True] 0 1 At this point we have a discrepancy between the expected cond seqno and the actual cond seqno. This causes an issue because ovn-controller relies on these being equal for determining if all monitor condition changes are up to date. Moreover, when ovn-monitor-all is set to True, no other condition changes will be triggered by ovn-controller. So we stay in this broken state for ever. Now, for the Python IDL part, the main difference there is in step "2": 2. reconnect->CONNECTING - Idl.sync_conditions() is called and ack_all == true - this will move all IDL table->req_cond to table->new_cond; This applies to ALL tables the client registered for, including the Chassis_Template_Var table. CONNECTING SERVER_SCHEMA_REQUESTED nil nil 0 1 So steps 9 and 10 would look like: 9. set_cond(Chassis_Private, [True]) CONNECTED SERVER_MONITOR_REQUESTED nil nil 0 0 10. set_cond(Chassis_Template_Var, [True]) - nothing really changed, we shouldn't send anything, seqnos stay 0 CONNECTED SERVER_MONITOR_REQUESTED nil nil 0 0 > > Should we do this instead: > > diff --git a/lib/ovsdb-cs.c b/lib/ovsdb-cs.c > index a6fbd290c..0fca03d72 100644 > --- a/lib/ovsdb-cs.c > +++ b/lib/ovsdb-cs.c > @@ -892,7 +892,7 @@ ovsdb_cs_db_get_table(struct ovsdb_cs_db *db, const char *table) > > t = xzalloc(sizeof *t); > t->name = xstrdup(table); > - t->new_cond = json_array_create_1(json_boolean_create(true)); > + t->ack_cond = json_array_create_1(json_boolean_create(true)); > hmap_insert(&db->tables, &t->hmap_node, hash); > return t; > } > --- > > ? > > This breaks a few tests, but it seems to be a correct change because it > is indeed a default condition that server already has for the table. > > What do you think? Will that fix the OVN oproblem? > That would fix the OVN problem I think. I'm not sure how to rewrite the "[track, simple idl, initially populated, strong references, conditional]" test however. On the other had, I had something else in mind: what if we just do the same as the python IDL? That is, create all ovsdb-cs tables immediately after the cs was created. Like that we we don't change any other IDL behavior. I'm also a bit worried about fundamental changes given that we need this change in OVN before the release that should happen this week. My suggested change (on top of master) is a bit larger.. Let me know what you think. Thanks, Dumitru --- diff --git a/lib/ovsdb-cs.c b/lib/ovsdb-cs.c index a6fbd290c87d..7dd07c75eebe 100644 --- a/lib/ovsdb-cs.c +++ b/lib/ovsdb-cs.c @@ -815,6 +815,8 @@ struct ovsdb_cs_db_table { struct json *ack_cond; /* Last condition acked by the server. */ struct json *req_cond; /* Last condition requested to the server. */ struct json *new_cond; /* Latest condition set by the IDL client. */ + bool in_server_schema; /* Indicates if this table is in the server schema + * or not. */ }; /* A kind of condition, so that we can treat equivalent JSON as equivalent. */ @@ -876,6 +878,17 @@ condition_clone(const struct json *condition) OVS_NOT_REACHED(); } +static struct ovsdb_cs_db_table * +ovsdb_cs_db_add_table(struct ovsdb_cs_db *db, const char *table) +{ + struct ovsdb_cs_db_table *t = xzalloc(sizeof *t); + + t->name = xstrdup(table); + t->new_cond = json_array_create_1(json_boolean_create(true)); + hmap_insert(&db->tables, &t->hmap_node, hash_string(table, 0)); + return t; +} + /* Returns the ovsdb_cs_db_table associated with 'table' in 'db', creating an * empty one if necessary. */ static struct ovsdb_cs_db_table * @@ -890,11 +903,7 @@ ovsdb_cs_db_get_table(struct ovsdb_cs_db *db, const char *table) } } - t = xzalloc(sizeof *t); - t->name = xstrdup(table); - t->new_cond = json_array_create_1(json_boolean_create(true)); - hmap_insert(&db->tables, &t->hmap_node, hash); - return t; + return ovsdb_cs_db_add_table(db, table); } static void @@ -953,6 +962,21 @@ ovsdb_cs_db_set_condition(struct ovsdb_cs_db *db, const char *table, } } +void +ovsdb_cs_set_table_in_schema(struct ovsdb_cs *cs, const char *table, + bool in_schema) +{ + struct ovsdb_cs_db_table *t = ovsdb_cs_db_get_table(&cs->data, table); + + t->in_server_schema = in_schema; +} + +bool +ovsdb_cs_get_table_in_schema(struct ovsdb_cs *cs, const char *table) +{ + return ovsdb_cs_db_get_table(&cs->data, table)->in_server_schema; +} + /* Sets the replication condition for 'tc' in 'cs' to 'condition' and arranges * to send the new condition to the database server. * @@ -1004,6 +1028,10 @@ ovsdb_cs_db_compose_cond_change(struct ovsdb_cs_db *db) struct json *monitor_cond_change_requests = NULL; struct ovsdb_cs_db_table *table; HMAP_FOR_EACH (table, hmap_node, &db->tables) { + if (!table->in_server_schema) { + continue; + } + /* Always use the most recent conditions set by the CS client when * requesting monitor_cond_change, i.e., table->new_cond. */ @@ -1483,6 +1511,9 @@ ovsdb_cs_send_monitor_request(struct ovsdb_cs *cs, struct ovsdb_cs_db *db, if (version > 1) { struct ovsdb_cs_db_table *table; HMAP_FOR_EACH (table, hmap_node, &db->tables) { + if (!table->in_server_schema) { + continue; + } if (table->ack_cond) { struct json *mr = shash_find_data(json_object(mrs), table->name); diff --git a/lib/ovsdb-cs.h b/lib/ovsdb-cs.h index 5d5b58f0a0a6..c3d9e82df5fd 100644 --- a/lib/ovsdb-cs.h +++ b/lib/ovsdb-cs.h @@ -133,6 +133,10 @@ int ovsdb_cs_get_last_error(const struct ovsdb_cs *); void ovsdb_cs_set_probe_interval(const struct ovsdb_cs *, int probe_interval); +void ovsdb_cs_set_table_in_schema(struct ovsdb_cs *, const char *table, + bool in_schema); +bool ovsdb_cs_get_table_in_schema(struct ovsdb_cs *, const char *table); + /* Conditional monitoring (specifying that only rows matching particular * criteria should be monitored). * diff --git a/lib/ovsdb-idl-provider.h b/lib/ovsdb-idl-provider.h index 8d2b7d6b9140..c0e1d1f4345d 100644 --- a/lib/ovsdb-idl-provider.h +++ b/lib/ovsdb-idl-provider.h @@ -125,8 +125,6 @@ struct ovsdb_idl_table { struct hmap rows; /* Contains "struct ovsdb_idl_row"s. */ struct ovsdb_idl *idl; /* Containing IDL instance. */ unsigned int change_seqno[OVSDB_IDL_CHANGE_MAX]; - bool in_server_schema; /* Indicates if this table is in the server schema - * or not. */ struct ovs_list indexes; /* Contains "struct ovsdb_idl_index"s */ struct ovs_list track_list; /* Tracked rows (ovsdb_idl_row.track_node). */ }; diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c index dbdfe45d87ea..ead4a5857d73 100644 --- a/lib/ovsdb-idl.c +++ b/lib/ovsdb-idl.c @@ -295,8 +295,8 @@ ovsdb_idl_create_unconnected(const struct ovsdb_idl_class *class, = table->change_seqno[OVSDB_IDL_CHANGE_MODIFY] = table->change_seqno[OVSDB_IDL_CHANGE_DELETE] = 0; table->idl = idl; - table->in_server_schema = false; sset_init(&table->schema_columns); + ovsdb_cs_set_table_in_schema(idl->cs, tc->name, false); } return idl; @@ -807,10 +807,10 @@ ovsdb_idl_compose_monitor_request(const struct json *schema_json, void *idl_) "(database needs upgrade?)", idl->class_->database, table->class_->name); json_destroy(columns); - table->in_server_schema = false; + ovsdb_cs_set_table_in_schema(idl->cs, tc->name, false); continue; } else if (schema && table_schema) { - table->in_server_schema = true; + ovsdb_cs_set_table_in_schema(idl->cs, tc->name, true); } monitor_request = json_object_create(); @@ -964,7 +964,7 @@ ovsdb_idl_server_has_table(const struct ovsdb_idl *idl, const struct ovsdb_idl_table *table = ovsdb_idl_table_from_class(idl, table_class); - return (table && table->in_server_schema); + return (table && ovsdb_cs_get_table_in_schema(idl->cs, table_class->name)); } /* Returns 'true' if the 'idl' has seen the 'column' in the schema @@ -984,8 +984,8 @@ ovsdb_idl_server_has_column(const struct ovsdb_idl *idl, const struct ovsdb_idl_table *table = ovsdb_idl_table_from_column(idl, column); - return (table->in_server_schema && sset_find(&table->schema_columns, - column->name)); + return (ovsdb_cs_get_table_in_schema(idl->cs, table->class_->name) + && sset_find(&table->schema_columns, column->name)); } ^L /* A single clause within an ovsdb_idl_condition. */ ---
On 12/13/22 13:21, Dumitru Ceara wrote: > On 12/13/22 11:57, Ilya Maximets wrote: >> On 12/12/22 20:42, Ilya Maximets wrote: >>> On 12/8/22 16:21, Dumitru Ceara wrote: >>>> Applications request specific monitor conditions via the >>>> ovsdb_cs_set_condition() API. This returns the condition sequence >>>> number at which the client can assume that the server acknowledged >>>> and applied the new monitor condition. >>>> >>>> A corner case is when the client's first request is to set the condition >>>> to a value that's equivalent to the default one ("<true>"). In such >>>> cases, because it's requested explicitly by the client, we now >>>> explicitly send the monitor_cond_change request to the server. This >>>> avoids cases in which the expected condition sequence number and the >>>> acked condition sequence number get out of sync. >>> >>> Sorry, I think I'm lost again. What exactly we're trying to fix here? >>> [True] is a default condition if no other conditions are supplied for >>> the table. Why do we need to send it explicitly? >>> > > The problem is that there's a discrepancy between the returned expected > condition seqno and what happens if a condition is explicitly set to > [True] before vs after the initial connection happened. > >>> At first I thought that newly created condition from ovsdb_cs_db_get_table() >>> has to be sent out, but... if it is the first time we're setting >>> conditions for that table, default condition should be [True], hence >>> equal to the condition we're trying to set. >>> >>> >>> The test case is a bit synthetic as it doesn't seem to test the issue >>> that exists in OVN. The fact that python IDL passes the test without >>> changes also doesn't make a lot of sense to me. >>> > > Actually it should test exactly what's happenning in OVN. That is: > > - connect to database, no explicit monitor condition set > - get initial contents > - explicitly set monitor condition to [True] and get seqno S at > which the condition should be "acked". > - expect that ovsdb_idl_get_condition_seqno() eventually reaches S > > The Python IDL behaves differently unfortunately, I'll give more details > below. > >>> Could you provide some more data on what exactly is going on and what >>> we're trying to fix? With jsonrpc logs, if possible. >> > > For the C client, in the ovn-controller case, let's focus on 2 tables > Chassis_Private and Chassis_Template_Var. The difference between these > 2 is that Chassis_Template_Var was recently added to the schema so > ovn-controller only sets its monitor condition if > sbrec_server_has_chassis_template_var_table() is true. So that's only > after the remote schema was received. > > Sorry for the wide table below but this is the best way I could find > that should explain what's happening: > > operation/event reconnect-state ovsdb-cs-state Chassis_Private-new_cond Chassis_Template_Var-new_cond cond-seqno expected-cond-seqno > ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ > 1. set_cond(Chassis_Private, [True]) > BACKOFF SERVER_SCHEMA_REQUESTED [True] non-existent 0 1 I think, this very first operation/event is wrong. The condition change is never going to be sent, but the expected sequence number is reported as 1. The only reason this doesn't cause any problems right away is that we're going to reconnect here and drop the state, so the application will not rely on that number in the end. But it is still an incorrect return value. The change that I suggested should fix that fundamental issue. <snip> >> >> Should we do this instead: >> >> diff --git a/lib/ovsdb-cs.c b/lib/ovsdb-cs.c >> index a6fbd290c..0fca03d72 100644 >> --- a/lib/ovsdb-cs.c >> +++ b/lib/ovsdb-cs.c >> @@ -892,7 +892,7 @@ ovsdb_cs_db_get_table(struct ovsdb_cs_db *db, const char *table) >> >> t = xzalloc(sizeof *t); >> t->name = xstrdup(table); >> - t->new_cond = json_array_create_1(json_boolean_create(true)); >> + t->ack_cond = json_array_create_1(json_boolean_create(true)); >> hmap_insert(&db->tables, &t->hmap_node, hash); >> return t; >> } >> --- >> >> ? >> >> This breaks a few tests, but it seems to be a correct change because it >> is indeed a default condition that server already has for the table. >> >> What do you think? Will that fix the OVN oproblem? >> > > That would fix the OVN problem I think. I'm not sure how to rewrite the > "[track, simple idl, initially populated, strong references, conditional]" > test however. Something like this should fix it: diff --git a/tests/ovsdb-idl.at b/tests/ovsdb-idl.at index 57ed0ac9d..1846cb521 100644 --- a/tests/ovsdb-idl.at +++ b/tests/ovsdb-idl.at @@ -1585,10 +1585,12 @@ OVSDB_CHECK_IDL([simple idl, initially populated, strong references, conditional {"op": "insert", "table": "simple", "row": {"s": "row0_s"}}]']], - [[000: change conditions + [[000: simple3: conditions unchanged +000: simple4: conditions unchanged +000: simple: conditions unchanged 001: table simple3: name=row0_s3 uset=[] uref=[<0>] uuid=<1> 001: table simple4: name=row0_s4 uuid=<0> -002: change conditions +002: simple4: change conditions 003: table simple3: name=row0_s3 uset=[] uref=[] uuid=<1> 004: {"error":null,"result":[{"uuid":["uuid","<2>"]}]} 005: table simple3: name=row0_s3 uset=[] uref=[] uuid=<1> diff --git a/tests/test-ovsdb.c b/tests/test-ovsdb.c index 23cfd79a4..1bc5ac17a 100644 --- a/tests/test-ovsdb.c +++ b/tests/test-ovsdb.c @@ -2628,7 +2628,7 @@ parse_link2_json_clause(struct ovsdb_idl_condition *cond, } static unsigned int -update_conditions(struct ovsdb_idl *idl, char *commands) +update_conditions(struct ovsdb_idl *idl, char *commands, int step) { const struct ovsdb_idl_table_class *tc; unsigned int next_cond_seqno = 0; @@ -2683,7 +2683,10 @@ update_conditions(struct ovsdb_idl *idl, char *commands) unsigned int seqno = ovsdb_idl_get_condition_seqno(idl); unsigned int next_seqno = ovsdb_idl_set_condition(idl, tc, &cond); if (seqno == next_seqno ) { - ovs_fatal(0, "condition unchanged"); + print_and_log("%03d: %s: conditions unchanged", + step, table_name); + } else { + print_and_log("%03d: %s: change conditions", step, table_name); } unsigned int new_next_seqno = ovsdb_idl_set_condition(idl, tc, &cond); if (next_seqno != new_next_seqno) { @@ -2740,8 +2743,7 @@ do_idl(struct ovs_cmdl_context *ctx) const char cond_s[] = "condition "; if (ctx->argc > 2 && strstr(ctx->argv[2], cond_s)) { next_cond_seqno = - update_conditions(idl, ctx->argv[2] + strlen(cond_s)); - print_and_log("%03d: change conditions", step++); + update_conditions(idl, ctx->argv[2] + strlen(cond_s), step++); i = 3; } else { i = 2; @@ -2809,8 +2811,8 @@ do_idl(struct ovs_cmdl_context *ctx) arg + strlen(remote_s), ovsdb_idl_is_connected(idl) ? "true" : "false"); } else if (!strncmp(arg, cond_s, strlen(cond_s))) { - next_cond_seqno = update_conditions(idl, arg + strlen(cond_s)); - print_and_log("%03d: change conditions", step++); + next_cond_seqno = update_conditions(idl, arg + strlen(cond_s), + step++); } else if (arg[0] != '[') { if (!idl_set(idl, arg, step++)) { /* If idl_set() returns false, then no transaction --- I.e. just don't crash if canditions didn't change, because that is expected. > > On the other had, I had something else in mind: what if we just do the > same as the python IDL? That is, create all ovsdb-cs tables immediately > after the cs was created. Like that we we don't change any other IDL > behavior. I'm also a bit worried about fundamental changes given > that we need this change in OVN before the release that should happen > this week. > > My suggested change (on top of master) is a bit larger.. Let me know > what you think. I'm not sure about that. This doesn't change the fundamental issue, IMO, but masks it out. Best regards, Ilya Maximets. > > Thanks, > Dumitru > > --- > diff --git a/lib/ovsdb-cs.c b/lib/ovsdb-cs.c > index a6fbd290c87d..7dd07c75eebe 100644 > --- a/lib/ovsdb-cs.c > +++ b/lib/ovsdb-cs.c > @@ -815,6 +815,8 @@ struct ovsdb_cs_db_table { > struct json *ack_cond; /* Last condition acked by the server. */ > struct json *req_cond; /* Last condition requested to the server. */ > struct json *new_cond; /* Latest condition set by the IDL client. */ > + bool in_server_schema; /* Indicates if this table is in the server schema > + * or not. */ > }; > > /* A kind of condition, so that we can treat equivalent JSON as equivalent. */ > @@ -876,6 +878,17 @@ condition_clone(const struct json *condition) > OVS_NOT_REACHED(); > } > > +static struct ovsdb_cs_db_table * > +ovsdb_cs_db_add_table(struct ovsdb_cs_db *db, const char *table) > +{ > + struct ovsdb_cs_db_table *t = xzalloc(sizeof *t); > + > + t->name = xstrdup(table); > + t->new_cond = json_array_create_1(json_boolean_create(true)); > + hmap_insert(&db->tables, &t->hmap_node, hash_string(table, 0)); > + return t; > +} > + > /* Returns the ovsdb_cs_db_table associated with 'table' in 'db', creating an > * empty one if necessary. */ > static struct ovsdb_cs_db_table * > @@ -890,11 +903,7 @@ ovsdb_cs_db_get_table(struct ovsdb_cs_db *db, const char *table) > } > } > > - t = xzalloc(sizeof *t); > - t->name = xstrdup(table); > - t->new_cond = json_array_create_1(json_boolean_create(true)); > - hmap_insert(&db->tables, &t->hmap_node, hash); > - return t; > + return ovsdb_cs_db_add_table(db, table); > } > > static void > @@ -953,6 +962,21 @@ ovsdb_cs_db_set_condition(struct ovsdb_cs_db *db, const char *table, > } > } > > +void > +ovsdb_cs_set_table_in_schema(struct ovsdb_cs *cs, const char *table, > + bool in_schema) > +{ > + struct ovsdb_cs_db_table *t = ovsdb_cs_db_get_table(&cs->data, table); > + > + t->in_server_schema = in_schema; > +} > + > +bool > +ovsdb_cs_get_table_in_schema(struct ovsdb_cs *cs, const char *table) > +{ > + return ovsdb_cs_db_get_table(&cs->data, table)->in_server_schema; > +} > + > /* Sets the replication condition for 'tc' in 'cs' to 'condition' and arranges > * to send the new condition to the database server. > * > @@ -1004,6 +1028,10 @@ ovsdb_cs_db_compose_cond_change(struct ovsdb_cs_db *db) > struct json *monitor_cond_change_requests = NULL; > struct ovsdb_cs_db_table *table; > HMAP_FOR_EACH (table, hmap_node, &db->tables) { > + if (!table->in_server_schema) { > + continue; > + } > + > /* Always use the most recent conditions set by the CS client when > * requesting monitor_cond_change, i.e., table->new_cond. > */ > @@ -1483,6 +1511,9 @@ ovsdb_cs_send_monitor_request(struct ovsdb_cs *cs, struct ovsdb_cs_db *db, > if (version > 1) { > struct ovsdb_cs_db_table *table; > HMAP_FOR_EACH (table, hmap_node, &db->tables) { > + if (!table->in_server_schema) { > + continue; > + } > if (table->ack_cond) { > struct json *mr = shash_find_data(json_object(mrs), > table->name); > diff --git a/lib/ovsdb-cs.h b/lib/ovsdb-cs.h > index 5d5b58f0a0a6..c3d9e82df5fd 100644 > --- a/lib/ovsdb-cs.h > +++ b/lib/ovsdb-cs.h > @@ -133,6 +133,10 @@ int ovsdb_cs_get_last_error(const struct ovsdb_cs *); > > void ovsdb_cs_set_probe_interval(const struct ovsdb_cs *, int probe_interval); > > +void ovsdb_cs_set_table_in_schema(struct ovsdb_cs *, const char *table, > + bool in_schema); > +bool ovsdb_cs_get_table_in_schema(struct ovsdb_cs *, const char *table); > + > /* Conditional monitoring (specifying that only rows matching particular > * criteria should be monitored). > * > diff --git a/lib/ovsdb-idl-provider.h b/lib/ovsdb-idl-provider.h > index 8d2b7d6b9140..c0e1d1f4345d 100644 > --- a/lib/ovsdb-idl-provider.h > +++ b/lib/ovsdb-idl-provider.h > @@ -125,8 +125,6 @@ struct ovsdb_idl_table { > struct hmap rows; /* Contains "struct ovsdb_idl_row"s. */ > struct ovsdb_idl *idl; /* Containing IDL instance. */ > unsigned int change_seqno[OVSDB_IDL_CHANGE_MAX]; > - bool in_server_schema; /* Indicates if this table is in the server schema > - * or not. */ > struct ovs_list indexes; /* Contains "struct ovsdb_idl_index"s */ > struct ovs_list track_list; /* Tracked rows (ovsdb_idl_row.track_node). */ > }; > diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c > index dbdfe45d87ea..ead4a5857d73 100644 > --- a/lib/ovsdb-idl.c > +++ b/lib/ovsdb-idl.c > @@ -295,8 +295,8 @@ ovsdb_idl_create_unconnected(const struct ovsdb_idl_class *class, > = table->change_seqno[OVSDB_IDL_CHANGE_MODIFY] > = table->change_seqno[OVSDB_IDL_CHANGE_DELETE] = 0; > table->idl = idl; > - table->in_server_schema = false; > sset_init(&table->schema_columns); > + ovsdb_cs_set_table_in_schema(idl->cs, tc->name, false); > } > > return idl; > @@ -807,10 +807,10 @@ ovsdb_idl_compose_monitor_request(const struct json *schema_json, void *idl_) > "(database needs upgrade?)", > idl->class_->database, table->class_->name); > json_destroy(columns); > - table->in_server_schema = false; > + ovsdb_cs_set_table_in_schema(idl->cs, tc->name, false); > continue; > } else if (schema && table_schema) { > - table->in_server_schema = true; > + ovsdb_cs_set_table_in_schema(idl->cs, tc->name, true); > } > > monitor_request = json_object_create(); > @@ -964,7 +964,7 @@ ovsdb_idl_server_has_table(const struct ovsdb_idl *idl, > const struct ovsdb_idl_table *table = > ovsdb_idl_table_from_class(idl, table_class); > > - return (table && table->in_server_schema); > + return (table && ovsdb_cs_get_table_in_schema(idl->cs, table_class->name)); > } > > /* Returns 'true' if the 'idl' has seen the 'column' in the schema > @@ -984,8 +984,8 @@ ovsdb_idl_server_has_column(const struct ovsdb_idl *idl, > const struct ovsdb_idl_table *table = > ovsdb_idl_table_from_column(idl, column); > > - return (table->in_server_schema && sset_find(&table->schema_columns, > - column->name)); > + return (ovsdb_cs_get_table_in_schema(idl->cs, table->class_->name) > + && sset_find(&table->schema_columns, column->name)); > } > ^L > /* A single clause within an ovsdb_idl_condition. */ > --- >
On 12/13/22 14:56, Ilya Maximets wrote: > On 12/13/22 13:21, Dumitru Ceara wrote: >> On 12/13/22 11:57, Ilya Maximets wrote: >>> On 12/12/22 20:42, Ilya Maximets wrote: >>>> On 12/8/22 16:21, Dumitru Ceara wrote: >>>>> Applications request specific monitor conditions via the >>>>> ovsdb_cs_set_condition() API. This returns the condition sequence >>>>> number at which the client can assume that the server acknowledged >>>>> and applied the new monitor condition. >>>>> >>>>> A corner case is when the client's first request is to set the condition >>>>> to a value that's equivalent to the default one ("<true>"). In such >>>>> cases, because it's requested explicitly by the client, we now >>>>> explicitly send the monitor_cond_change request to the server. This >>>>> avoids cases in which the expected condition sequence number and the >>>>> acked condition sequence number get out of sync. >>>> >>>> Sorry, I think I'm lost again. What exactly we're trying to fix here? >>>> [True] is a default condition if no other conditions are supplied for >>>> the table. Why do we need to send it explicitly? >>>> >> >> The problem is that there's a discrepancy between the returned expected >> condition seqno and what happens if a condition is explicitly set to >> [True] before vs after the initial connection happened. >> >>>> At first I thought that newly created condition from ovsdb_cs_db_get_table() >>>> has to be sent out, but... if it is the first time we're setting >>>> conditions for that table, default condition should be [True], hence >>>> equal to the condition we're trying to set. >>>> >>>> >>>> The test case is a bit synthetic as it doesn't seem to test the issue >>>> that exists in OVN. The fact that python IDL passes the test without >>>> changes also doesn't make a lot of sense to me. >>>> >> >> Actually it should test exactly what's happenning in OVN. That is: >> >> - connect to database, no explicit monitor condition set >> - get initial contents >> - explicitly set monitor condition to [True] and get seqno S at >> which the condition should be "acked". >> - expect that ovsdb_idl_get_condition_seqno() eventually reaches S >> >> The Python IDL behaves differently unfortunately, I'll give more details >> below. >> >>>> Could you provide some more data on what exactly is going on and what >>>> we're trying to fix? With jsonrpc logs, if possible. >>> >> >> For the C client, in the ovn-controller case, let's focus on 2 tables >> Chassis_Private and Chassis_Template_Var. The difference between these >> 2 is that Chassis_Template_Var was recently added to the schema so >> ovn-controller only sets its monitor condition if >> sbrec_server_has_chassis_template_var_table() is true. So that's only >> after the remote schema was received. >> >> Sorry for the wide table below but this is the best way I could find >> that should explain what's happening: >> >> operation/event reconnect-state ovsdb-cs-state Chassis_Private-new_cond Chassis_Template_Var-new_cond cond-seqno expected-cond-seqno >> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ >> 1. set_cond(Chassis_Private, [True]) >> BACKOFF SERVER_SCHEMA_REQUESTED [True] non-existent 0 1 > > > I think, this very first operation/event is wrong. > The condition change is never going to be sent, but the expected > sequence number is reported as 1. The only reason this doesn't > cause any problems right away is that we're going to reconnect > here and drop the state, so the application will not rely on that > number in the end. But it is still an incorrect return value. > > The change that I suggested should fix that fundamental issue. > It should, yes. > <snip> > >>> >>> Should we do this instead: >>> >>> diff --git a/lib/ovsdb-cs.c b/lib/ovsdb-cs.c >>> index a6fbd290c..0fca03d72 100644 >>> --- a/lib/ovsdb-cs.c >>> +++ b/lib/ovsdb-cs.c >>> @@ -892,7 +892,7 @@ ovsdb_cs_db_get_table(struct ovsdb_cs_db *db, const char *table) >>> >>> t = xzalloc(sizeof *t); >>> t->name = xstrdup(table); >>> - t->new_cond = json_array_create_1(json_boolean_create(true)); >>> + t->ack_cond = json_array_create_1(json_boolean_create(true)); >>> hmap_insert(&db->tables, &t->hmap_node, hash); >>> return t; >>> } >>> --- >>> >>> ? >>> >>> This breaks a few tests, but it seems to be a correct change because it >>> is indeed a default condition that server already has for the table. >>> >>> What do you think? Will that fix the OVN oproblem? >>> >> >> That would fix the OVN problem I think. I'm not sure how to rewrite the >> "[track, simple idl, initially populated, strong references, conditional]" >> test however. > > Something like this should fix it: > > diff --git a/tests/ovsdb-idl.at b/tests/ovsdb-idl.at > index 57ed0ac9d..1846cb521 100644 > --- a/tests/ovsdb-idl.at > +++ b/tests/ovsdb-idl.at > @@ -1585,10 +1585,12 @@ OVSDB_CHECK_IDL([simple idl, initially populated, strong references, conditional > {"op": "insert", > "table": "simple", > "row": {"s": "row0_s"}}]']], > - [[000: change conditions > + [[000: simple3: conditions unchanged > +000: simple4: conditions unchanged > +000: simple: conditions unchanged > 001: table simple3: name=row0_s3 uset=[] uref=[<0>] uuid=<1> > 001: table simple4: name=row0_s4 uuid=<0> > -002: change conditions > +002: simple4: change conditions > 003: table simple3: name=row0_s3 uset=[] uref=[] uuid=<1> > 004: {"error":null,"result":[{"uuid":["uuid","<2>"]}]} > 005: table simple3: name=row0_s3 uset=[] uref=[] uuid=<1> > diff --git a/tests/test-ovsdb.c b/tests/test-ovsdb.c > index 23cfd79a4..1bc5ac17a 100644 > --- a/tests/test-ovsdb.c > +++ b/tests/test-ovsdb.c > @@ -2628,7 +2628,7 @@ parse_link2_json_clause(struct ovsdb_idl_condition *cond, > } > > static unsigned int > -update_conditions(struct ovsdb_idl *idl, char *commands) > +update_conditions(struct ovsdb_idl *idl, char *commands, int step) > { > const struct ovsdb_idl_table_class *tc; > unsigned int next_cond_seqno = 0; > @@ -2683,7 +2683,10 @@ update_conditions(struct ovsdb_idl *idl, char *commands) > unsigned int seqno = ovsdb_idl_get_condition_seqno(idl); > unsigned int next_seqno = ovsdb_idl_set_condition(idl, tc, &cond); > if (seqno == next_seqno ) { > - ovs_fatal(0, "condition unchanged"); > + print_and_log("%03d: %s: conditions unchanged", > + step, table_name); > + } else { > + print_and_log("%03d: %s: change conditions", step, table_name); > } > unsigned int new_next_seqno = ovsdb_idl_set_condition(idl, tc, &cond); > if (next_seqno != new_next_seqno) { > @@ -2740,8 +2743,7 @@ do_idl(struct ovs_cmdl_context *ctx) > const char cond_s[] = "condition "; > if (ctx->argc > 2 && strstr(ctx->argv[2], cond_s)) { > next_cond_seqno = > - update_conditions(idl, ctx->argv[2] + strlen(cond_s)); > - print_and_log("%03d: change conditions", step++); > + update_conditions(idl, ctx->argv[2] + strlen(cond_s), step++); > i = 3; > } else { > i = 2; > @@ -2809,8 +2811,8 @@ do_idl(struct ovs_cmdl_context *ctx) > arg + strlen(remote_s), > ovsdb_idl_is_connected(idl) ? "true" : "false"); > } else if (!strncmp(arg, cond_s, strlen(cond_s))) { > - next_cond_seqno = update_conditions(idl, arg + strlen(cond_s)); > - print_and_log("%03d: change conditions", step++); > + next_cond_seqno = update_conditions(idl, arg + strlen(cond_s), > + step++); > } else if (arg[0] != '[') { > if (!idl_set(idl, arg, step++)) { > /* If idl_set() returns false, then no transaction > --- > > I.e. just don't crash if canditions didn't change, because that is expected. > > Ok. >> >> On the other had, I had something else in mind: what if we just do the >> same as the python IDL? That is, create all ovsdb-cs tables immediately >> after the cs was created. Like that we we don't change any other IDL >> behavior. I'm also a bit worried about fundamental changes given >> that we need this change in OVN before the release that should happen >> this week. >> >> My suggested change (on top of master) is a bit larger.. Let me know >> what you think. > > I'm not sure about that. This doesn't change the fundamental issue, IMO, > but masks it out. > I'm ok either way in the end. Do you plan to post a formal patch with your version or should I? Thanks, Dumitru
On 12/13/22 17:02, Dumitru Ceara wrote: > On 12/13/22 14:56, Ilya Maximets wrote: >> On 12/13/22 13:21, Dumitru Ceara wrote: >>> On 12/13/22 11:57, Ilya Maximets wrote: >>>> On 12/12/22 20:42, Ilya Maximets wrote: >>>>> On 12/8/22 16:21, Dumitru Ceara wrote: >>>>>> Applications request specific monitor conditions via the >>>>>> ovsdb_cs_set_condition() API. This returns the condition sequence >>>>>> number at which the client can assume that the server acknowledged >>>>>> and applied the new monitor condition. >>>>>> >>>>>> A corner case is when the client's first request is to set the condition >>>>>> to a value that's equivalent to the default one ("<true>"). In such >>>>>> cases, because it's requested explicitly by the client, we now >>>>>> explicitly send the monitor_cond_change request to the server. This >>>>>> avoids cases in which the expected condition sequence number and the >>>>>> acked condition sequence number get out of sync. >>>>> >>>>> Sorry, I think I'm lost again. What exactly we're trying to fix here? >>>>> [True] is a default condition if no other conditions are supplied for >>>>> the table. Why do we need to send it explicitly? >>>>> >>> >>> The problem is that there's a discrepancy between the returned expected >>> condition seqno and what happens if a condition is explicitly set to >>> [True] before vs after the initial connection happened. >>> >>>>> At first I thought that newly created condition from ovsdb_cs_db_get_table() >>>>> has to be sent out, but... if it is the first time we're setting >>>>> conditions for that table, default condition should be [True], hence >>>>> equal to the condition we're trying to set. >>>>> >>>>> >>>>> The test case is a bit synthetic as it doesn't seem to test the issue >>>>> that exists in OVN. The fact that python IDL passes the test without >>>>> changes also doesn't make a lot of sense to me. >>>>> >>> >>> Actually it should test exactly what's happenning in OVN. That is: >>> >>> - connect to database, no explicit monitor condition set >>> - get initial contents >>> - explicitly set monitor condition to [True] and get seqno S at >>> which the condition should be "acked". >>> - expect that ovsdb_idl_get_condition_seqno() eventually reaches S >>> >>> The Python IDL behaves differently unfortunately, I'll give more details >>> below. >>> >>>>> Could you provide some more data on what exactly is going on and what >>>>> we're trying to fix? With jsonrpc logs, if possible. >>>> >>> >>> For the C client, in the ovn-controller case, let's focus on 2 tables >>> Chassis_Private and Chassis_Template_Var. The difference between these >>> 2 is that Chassis_Template_Var was recently added to the schema so >>> ovn-controller only sets its monitor condition if >>> sbrec_server_has_chassis_template_var_table() is true. So that's only >>> after the remote schema was received. >>> >>> Sorry for the wide table below but this is the best way I could find >>> that should explain what's happening: >>> >>> operation/event reconnect-state ovsdb-cs-state Chassis_Private-new_cond Chassis_Template_Var-new_cond cond-seqno expected-cond-seqno >>> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ >>> 1. set_cond(Chassis_Private, [True]) >>> BACKOFF SERVER_SCHEMA_REQUESTED [True] non-existent 0 1 >> >> >> I think, this very first operation/event is wrong. >> The condition change is never going to be sent, but the expected >> sequence number is reported as 1. The only reason this doesn't >> cause any problems right away is that we're going to reconnect >> here and drop the state, so the application will not rely on that >> number in the end. But it is still an incorrect return value. >> >> The change that I suggested should fix that fundamental issue. >> > > It should, yes. > >> <snip> >> >>>> >>>> Should we do this instead: >>>> >>>> diff --git a/lib/ovsdb-cs.c b/lib/ovsdb-cs.c >>>> index a6fbd290c..0fca03d72 100644 >>>> --- a/lib/ovsdb-cs.c >>>> +++ b/lib/ovsdb-cs.c >>>> @@ -892,7 +892,7 @@ ovsdb_cs_db_get_table(struct ovsdb_cs_db *db, const char *table) >>>> >>>> t = xzalloc(sizeof *t); >>>> t->name = xstrdup(table); >>>> - t->new_cond = json_array_create_1(json_boolean_create(true)); >>>> + t->ack_cond = json_array_create_1(json_boolean_create(true)); >>>> hmap_insert(&db->tables, &t->hmap_node, hash); >>>> return t; >>>> } >>>> --- >>>> >>>> ? >>>> >>>> This breaks a few tests, but it seems to be a correct change because it >>>> is indeed a default condition that server already has for the table. >>>> >>>> What do you think? Will that fix the OVN oproblem? >>>> >>> >>> That would fix the OVN problem I think. I'm not sure how to rewrite the >>> "[track, simple idl, initially populated, strong references, conditional]" >>> test however. >> >> Something like this should fix it: >> >> diff --git a/tests/ovsdb-idl.at b/tests/ovsdb-idl.at >> index 57ed0ac9d..1846cb521 100644 >> --- a/tests/ovsdb-idl.at >> +++ b/tests/ovsdb-idl.at >> @@ -1585,10 +1585,12 @@ OVSDB_CHECK_IDL([simple idl, initially populated, strong references, conditional >> {"op": "insert", >> "table": "simple", >> "row": {"s": "row0_s"}}]']], >> - [[000: change conditions >> + [[000: simple3: conditions unchanged >> +000: simple4: conditions unchanged >> +000: simple: conditions unchanged >> 001: table simple3: name=row0_s3 uset=[] uref=[<0>] uuid=<1> >> 001: table simple4: name=row0_s4 uuid=<0> >> -002: change conditions >> +002: simple4: change conditions >> 003: table simple3: name=row0_s3 uset=[] uref=[] uuid=<1> >> 004: {"error":null,"result":[{"uuid":["uuid","<2>"]}]} >> 005: table simple3: name=row0_s3 uset=[] uref=[] uuid=<1> >> diff --git a/tests/test-ovsdb.c b/tests/test-ovsdb.c >> index 23cfd79a4..1bc5ac17a 100644 >> --- a/tests/test-ovsdb.c >> +++ b/tests/test-ovsdb.c >> @@ -2628,7 +2628,7 @@ parse_link2_json_clause(struct ovsdb_idl_condition *cond, >> } >> >> static unsigned int >> -update_conditions(struct ovsdb_idl *idl, char *commands) >> +update_conditions(struct ovsdb_idl *idl, char *commands, int step) >> { >> const struct ovsdb_idl_table_class *tc; >> unsigned int next_cond_seqno = 0; >> @@ -2683,7 +2683,10 @@ update_conditions(struct ovsdb_idl *idl, char *commands) >> unsigned int seqno = ovsdb_idl_get_condition_seqno(idl); >> unsigned int next_seqno = ovsdb_idl_set_condition(idl, tc, &cond); >> if (seqno == next_seqno ) { >> - ovs_fatal(0, "condition unchanged"); >> + print_and_log("%03d: %s: conditions unchanged", >> + step, table_name); >> + } else { >> + print_and_log("%03d: %s: change conditions", step, table_name); >> } >> unsigned int new_next_seqno = ovsdb_idl_set_condition(idl, tc, &cond); >> if (next_seqno != new_next_seqno) { >> @@ -2740,8 +2743,7 @@ do_idl(struct ovs_cmdl_context *ctx) >> const char cond_s[] = "condition "; >> if (ctx->argc > 2 && strstr(ctx->argv[2], cond_s)) { >> next_cond_seqno = >> - update_conditions(idl, ctx->argv[2] + strlen(cond_s)); >> - print_and_log("%03d: change conditions", step++); >> + update_conditions(idl, ctx->argv[2] + strlen(cond_s), step++); >> i = 3; >> } else { >> i = 2; >> @@ -2809,8 +2811,8 @@ do_idl(struct ovs_cmdl_context *ctx) >> arg + strlen(remote_s), >> ovsdb_idl_is_connected(idl) ? "true" : "false"); >> } else if (!strncmp(arg, cond_s, strlen(cond_s))) { >> - next_cond_seqno = update_conditions(idl, arg + strlen(cond_s)); >> - print_and_log("%03d: change conditions", step++); >> + next_cond_seqno = update_conditions(idl, arg + strlen(cond_s), >> + step++); >> } else if (arg[0] != '[') { >> if (!idl_set(idl, arg, step++)) { >> /* If idl_set() returns false, then no transaction >> --- >> >> I.e. just don't crash if canditions didn't change, because that is expected. >> >> > > Ok. > >>> >>> On the other had, I had something else in mind: what if we just do the >>> same as the python IDL? That is, create all ovsdb-cs tables immediately >>> after the cs was created. Like that we we don't change any other IDL >>> behavior. I'm also a bit worried about fundamental changes given >>> that we need this change in OVN before the release that should happen >>> this week. >>> >>> My suggested change (on top of master) is a bit larger.. Let me know >>> what you think. >> >> I'm not sure about that. This doesn't change the fundamental issue, IMO, >> but masks it out. >> > > I'm ok either way in the end. Do you plan to post a formal patch with > your version or should I? Please, go ahead. Thanks! > > Thanks, > Dumitru > >
diff --git a/lib/ovsdb-cs.c b/lib/ovsdb-cs.c index a6fbd290c87d..16c8f6a2cf55 100644 --- a/lib/ovsdb-cs.c +++ b/lib/ovsdb-cs.c @@ -894,6 +894,7 @@ ovsdb_cs_db_get_table(struct ovsdb_cs_db *db, const char *table) t->name = xstrdup(table); t->new_cond = json_array_create_1(json_boolean_create(true)); hmap_insert(&db->tables, &t->hmap_node, hash); + db->cond_changed = true; return t; } diff --git a/tests/ovsdb-idl.at b/tests/ovsdb-idl.at index c2970984bae3..57ed0ac9d12c 100644 --- a/tests/ovsdb-idl.at +++ b/tests/ovsdb-idl.at @@ -599,6 +599,33 @@ OVSDB_CHECK_IDL([simple idl, conditional, true condition], 004: done ]]) +dnl This test ensures that the first explicitly set monitor condition +dnl is sent to the server. +OVSDB_CHECK_IDL([simple idl, conditional, wait for condition], + [], + [['["idltest", + {"op": "insert", + "table": "simple", + "row": {"i": 1, + "r": 2.0, + "b": true}}]' \ + 'condition simple [true]' \ + '^["idltest", + {"op": "insert", + "table": "simple", + "row": {"i": 2, + "r": 4.0, + "b": true}}]']], + [[000: empty +001: {"error":null,"result":[{"uuid":["uuid","<0>"]}]} +002: table simple: i=1 r=2 b=true s= u=<1> ia=[] ra=[] ba=[] sa=[] ua=[] uuid=<0> +003: change conditions +004: {"error":null,"result":[{"uuid":["uuid","<2>"]}]} +005: table simple: i=1 r=2 b=true s= u=<1> ia=[] ra=[] ba=[] sa=[] ua=[] uuid=<0> +005: table simple: i=2 r=4 b=true s= u=<1> ia=[] ra=[] ba=[] sa=[] ua=[] uuid=<2> +006: done +]]) + OVSDB_CHECK_IDL([simple idl, conditional, multiple clauses in condition], [['["idltest", {"op": "insert", diff --git a/tests/test-ovsdb.c b/tests/test-ovsdb.c index 84fe232765aa..23cfd79a433e 100644 --- a/tests/test-ovsdb.c +++ b/tests/test-ovsdb.c @@ -2627,11 +2627,12 @@ parse_link2_json_clause(struct ovsdb_idl_condition *cond, } } -static void +static unsigned int update_conditions(struct ovsdb_idl *idl, char *commands) { - char *cmd, *save_ptr1 = NULL; const struct ovsdb_idl_table_class *tc; + unsigned int next_cond_seqno = 0; + char *cmd, *save_ptr1 = NULL; for (cmd = strtok_r(commands, ";", &save_ptr1); cmd; cmd = strtok_r(NULL, ";", &save_ptr1)) { @@ -2688,9 +2689,11 @@ update_conditions(struct ovsdb_idl *idl, char *commands) if (next_seqno != new_next_seqno) { ovs_fatal(0, "condition expected seqno changed"); } + next_cond_seqno = MAX(next_cond_seqno, next_seqno); ovsdb_idl_condition_destroy(&cond); json_destroy(json); } + return next_cond_seqno; } static void @@ -2699,6 +2702,7 @@ do_idl(struct ovs_cmdl_context *ctx) struct test_ovsdb_pvt_context *pvt = ctx->pvt; struct jsonrpc *rpc; struct ovsdb_idl *idl; + unsigned int next_cond_seqno = 0; unsigned int seqno = 0; struct ovsdb_symbol_table *symtab; size_t n_uuids = 0; @@ -2735,7 +2739,8 @@ do_idl(struct ovs_cmdl_context *ctx) const char remote_s[] = "set-remote "; const char cond_s[] = "condition "; if (ctx->argc > 2 && strstr(ctx->argv[2], cond_s)) { - update_conditions(idl, ctx->argv[2] + strlen(cond_s)); + next_cond_seqno = + update_conditions(idl, ctx->argv[2] + strlen(cond_s)); print_and_log("%03d: change conditions", step++); i = 3; } else { @@ -2755,6 +2760,21 @@ do_idl(struct ovs_cmdl_context *ctx) if (*arg == '+') { /* The previous transaction didn't change anything. */ arg++; + } else if (*arg == '^') { + /* Wait for condition change to be acked by the server. */ + arg++; + for (;;) { + ovsdb_idl_run(idl); + ovsdb_idl_check_consistency(idl); + if (ovsdb_idl_get_condition_seqno(idl) == next_cond_seqno) { + break; + } + jsonrpc_run(rpc); + + ovsdb_idl_wait(idl); + jsonrpc_wait(rpc); + poll_block(); + } } else { /* Wait for update. */ for (;;) { @@ -2789,7 +2809,7 @@ do_idl(struct ovs_cmdl_context *ctx) arg + strlen(remote_s), ovsdb_idl_is_connected(idl) ? "true" : "false"); } else if (!strncmp(arg, cond_s, strlen(cond_s))) { - update_conditions(idl, arg + strlen(cond_s)); + next_cond_seqno = update_conditions(idl, arg + strlen(cond_s)); print_and_log("%03d: change conditions", step++); } else if (arg[0] != '[') { if (!idl_set(idl, arg, step++)) { diff --git a/tests/test-ovsdb.py b/tests/test-ovsdb.py index cca1818ea3aa..86184e6d4e4d 100644 --- a/tests/test-ovsdb.py +++ b/tests/test-ovsdb.py @@ -627,6 +627,7 @@ def idl_set(idl, commands, step): def update_condition(idl, commands): + expected_seqno = 0 commands = commands[len("condition "):].split(";") for command in commands: command = command.split(" ") @@ -637,7 +638,8 @@ def update_condition(idl, commands): table = command[0] cond = ovs.json.from_string(command[1]) - idl.cond_change(table, cond) + expected_seqno = max(expected_seqno, idl.cond_change(table, cond)) + return expected_seqno def do_idl(schema_file, remote, *commands): @@ -694,6 +696,7 @@ def do_idl(schema_file, remote, *commands): else: rpc = None + next_cond_seqno = 0 symtab = {} seqno = 0 step = 0 @@ -717,7 +720,7 @@ def do_idl(schema_file, remote, *commands): commands = list(commands) if len(commands) >= 1 and "condition" in commands[0]: - update_condition(idl, commands.pop(0)) + next_cond_seqno = update_condition(idl, commands.pop(0)) sys.stdout.write("%03d: change conditions\n" % step) sys.stdout.flush() step += 1 @@ -732,6 +735,17 @@ def do_idl(schema_file, remote, *commands): if command.startswith("+"): # The previous transaction didn't change anything. command = command[1:] + elif command.startswith("^"): + # Wait for condition change to be acked by the server. + command = command[1:] + while idl.cond_seqno != next_cond_seqno and \ + not idl.run(): + rpc.run() + + poller = ovs.poller.Poller() + idl.wait(poller) + rpc.wait(poller) + poller.block() else: # Wait for update. while idl.change_seqno == seqno and not idl.run(): @@ -753,7 +767,7 @@ def do_idl(schema_file, remote, *commands): step += 1 idl.force_reconnect() elif "condition" in command: - update_condition(idl, command) + next_cond_seqno = update_condition(idl, command) sys.stdout.write("%03d: change conditions\n" % step) sys.stdout.flush() step += 1
Applications request specific monitor conditions via the ovsdb_cs_set_condition() API. This returns the condition sequence number at which the client can assume that the server acknowledged and applied the new monitor condition. A corner case is when the client's first request is to set the condition to a value that's equivalent to the default one ("<true>"). In such cases, because it's requested explicitly by the client, we now explicitly send the monitor_cond_change request to the server. This avoids cases in which the expected condition sequence number and the acked condition sequence number get out of sync. Reported-by: Numan Siddique <numans@ovn.org> Signed-off-by: Dumitru Ceara <dceara@redhat.com> --- lib/ovsdb-cs.c | 1 + tests/ovsdb-idl.at | 27 +++++++++++++++++++++++++++ tests/test-ovsdb.c | 28 ++++++++++++++++++++++++---- tests/test-ovsdb.py | 20 +++++++++++++++++--- 4 files changed, 69 insertions(+), 7 deletions(-)