Message ID: 1596641273-5748-1-git-send-email-dceara@redhat.com
State: Superseded
Series: [ovs-dev] ovsdb-server: Replace in-memory DB contents at raft install_snapshot.
On Wed, Aug 5, 2020 at 8:28 AM Dumitru Ceara <dceara@redhat.com> wrote:
>
> Every time a follower has to install a snapshot received from the
> leader, it should also replace the data in memory. Right now this only
> happens when snapshots are installed that also change the schema.
>
> This can lead to inconsistent DB data on follower nodes and the snapshot
> may fail to get applied.
>
> CC: Han Zhou <hzhou@ovn.org>
> Fixes: bda1f6b60588 ("ovsdb-server: Don't disconnect clients after raft install_snapshot.")
> Signed-off-by: Dumitru Ceara <dceara@redhat.com>

Thanks Dumitru! This is a great finding, and sorry for my mistake.
This patch looks good to me. Just one minor comment below on the test
case. Otherwise:

Acked-by: Han Zhou <hzhou@ovn.org>

> ---
>  ovsdb/ovsdb-server.c    | 21 +++++++++++++--------
>  tests/idltest.ovsschema |  9 +++++++++
>  tests/ovsdb-cluster.at  | 28 +++++++++++++++++++++++++---
>  tests/ovsdb-idl.at      |  1 +
>  4 files changed, 48 insertions(+), 11 deletions(-)
>
> diff --git a/ovsdb/ovsdb-server.c b/ovsdb/ovsdb-server.c
> index ef4e996..fd7891a 100644
> --- a/ovsdb/ovsdb-server.c
> +++ b/ovsdb/ovsdb-server.c
> @@ -543,13 +543,14 @@ parse_txn(struct server_config *config, struct db *db,
>            const struct ovsdb_schema *schema, const struct json *txn_json,
>            const struct uuid *txnid)
>  {
> -    if (schema && (!db->db->schema || strcmp(schema->version,
> -                                             db->db->schema->version))) {
> +    if (schema) {
>          /* We're replacing the schema (and the data).  Destroy the database
>           * (first grabbing its storage), then replace it with the new schema.
>           * The transaction must also include the replacement data.
>           *
> -         * Only clustered database schema changes go through this path. */
> +         * Only clustered database schema changes and snapshot installs
> +         * go through this path.
> +         */
>          ovs_assert(txn_json);
>          ovs_assert(ovsdb_storage_is_clustered(db->db->storage));
>
> @@ -559,11 +560,15 @@ parse_txn(struct server_config *config, struct db *db,
>              return error;
>          }
>
> -        ovsdb_jsonrpc_server_reconnect(
> -            config->jsonrpc, false,
> -            (db->db->schema
> -             ? xasprintf("database %s schema changed", db->db->name)
> -             : xasprintf("database %s connected to storage", db->db->name)));
> +        if (!db->db->schema ||
> +            strcmp(schema->version, db->db->schema->version)) {
> +            ovsdb_jsonrpc_server_reconnect(
> +                config->jsonrpc, false,
> +                (db->db->schema
> +                 ? xasprintf("database %s schema changed", db->db->name)
> +                 : xasprintf("database %s connected to storage",
> +                             db->db->name)));
> +        }
>
>          ovsdb_replace(db->db, ovsdb_create(ovsdb_schema_clone(schema), NULL));
>
> diff --git a/tests/idltest.ovsschema b/tests/idltest.ovsschema
> index e02b975..e04755e 100644
> --- a/tests/idltest.ovsschema
> +++ b/tests/idltest.ovsschema
> @@ -54,6 +54,15 @@
>        },
>        "isRoot" : true
>      },
> +    "indexed": {
> +      "columns": {
> +        "i": {
> +          "type": "integer"
> +        }
> +      },
> +      "indexes": [["i"]],
> +      "isRoot" : true
> +    },
>      "simple": {
>        "columns": {
>          "b": {
> diff --git a/tests/ovsdb-cluster.at b/tests/ovsdb-cluster.at
> index 9714545..06d27c9 100644
> --- a/tests/ovsdb-cluster.at
> +++ b/tests/ovsdb-cluster.at
> @@ -332,15 +332,31 @@ for i in `seq $n`; do
>      AT_CHECK([ovsdb_client_wait unix:s$i.ovsdb $schema_name connected])
>  done
>
> +AT_CHECK([ovsdb-client transact unix:s1.ovsdb '[["idltest",
> +      {"op": "insert",
> +       "table": "indexed",
> +       "row": {"i": 1}}]]'], [0], [ignore], [ignore])
> +
>  # Kill one follower (s2) and write some data to cluster, so that the follower is falling behind
>  printf "\ns2: stopping\n"
>  OVS_APP_EXIT_AND_WAIT_BY_TARGET([`pwd`/s2], [s2.pid])
>
> +# Delete "i":1 and readd it to get a different UUID for it.
> +AT_CHECK([ovsdb-client transact unix:s1.ovsdb '[["idltest",
> +      {"op": "delete",
> +       "table": "indexed",
> +       "where": [["i", "==", 1]]}]]'], [0], [ignore], [ignore])
> +
>  AT_CHECK([ovsdb-client transact unix:s1.ovsdb '[["idltest",
>        {"op": "insert",
> -       "table": "simple",
> +       "table": "indexed",
>         "row": {"i": 1}}]]'], [0], [ignore], [ignore])
>
> +AT_CHECK([ovsdb-client transact unix:s1.ovsdb '[["idltest",
> +      {"op": "insert",
> +       "table": "indexed",
> +       "row": {"i": 2}}]]'], [0], [ignore], [ignore])
> +
>  # Compact leader online to generate snapshot
>  AT_CHECK([ovs-appctl -t "`pwd`"/s1 ovsdb-server/compact])
>
> @@ -355,8 +371,14 @@ AT_CHECK([ovsdb_client_wait unix:s2.ovsdb $schema_name connected])
>  # succeed.
>  AT_CHECK([ovsdb-client transact unix:s2.ovsdb '[["idltest",
>        {"op": "insert",
> -       "table": "simple",
> -       "row": {"i": 1}}]]'], [0], [ignore], [ignore])
> +       "table": "indexed",
> +       "row": {"i": 3}}]]'], [0], [ignore], [ignore])
> +
> +# The snapshot should overwrite the in-memory contents of the DB on S2
> +# without generating any constraint violations.
> +AT_CHECK([grep 'Transaction causes multiple rows in "indexed" table to have identical values (1) for index on column "i"' -c s2.log], [1], [dnl
> +0
> +])

Beside checking the error log, it may be better to simply check if the
i=2 row is found in S2. Without this fix i=2 shouldn't get inserted.
This can better ensure the correctness, in case the log text changes and
the grep doesn't detect the problem.

>
>  for i in `seq $n`; do
>      OVS_APP_EXIT_AND_WAIT_BY_TARGET([`pwd`/s$i], [s$i.pid])
> diff --git a/tests/ovsdb-idl.at b/tests/ovsdb-idl.at
> index 4efed88..789ae23 100644
> --- a/tests/ovsdb-idl.at
> +++ b/tests/ovsdb-idl.at
> @@ -954,6 +954,7 @@ AT_CHECK([sort stdout | uuidfilt], [0],
>
>  # Check that ovsdb-idl figured out that table link2 and column l2 are missing.
>  AT_CHECK([grep ovsdb_idl stderr | sort], [0], [dnl
> +test-ovsdb|ovsdb_idl|idltest database lacks indexed table (database needs upgrade?)
>  test-ovsdb|ovsdb_idl|idltest database lacks link2 table (database needs upgrade?)
>  test-ovsdb|ovsdb_idl|idltest database lacks simple5 table (database needs upgrade?)
>  test-ovsdb|ovsdb_idl|idltest database lacks singleton table (database needs upgrade?)
> --
> 1.8.3.1
>
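For reference, the check Han suggests amounts to issuing a select against s2 and asserting that the i=2 row is present. Per the OVSDB management protocol (RFC 7047), the JSON-RPC request for such a check would look like the following sketch (illustrative only, not the exact AT_CHECK that went into v2):

```json
{
  "method": "transact",
  "params": ["idltest",
             {"op": "select",
              "table": "indexed",
              "where": [["i", "==", 2]],
              "columns": ["i"]}],
  "id": 0
}
```

A result containing `{"rows": [{"i": 2}]}` from s2 would confirm that the snapshot's data actually replaced the follower's in-memory contents.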
On 8/5/20 7:48 PM, Han Zhou wrote:
> On Wed, Aug 5, 2020 at 8:28 AM Dumitru Ceara <dceara@redhat.com> wrote:
>>
>> Every time a follower has to install a snapshot received from the
>> leader, it should also replace the data in memory. Right now this only
>> happens when snapshots are installed that also change the schema.
>>
>> This can lead to inconsistent DB data on follower nodes and the snapshot
>> may fail to get applied.
>>
>> CC: Han Zhou <hzhou@ovn.org>
>> Fixes: bda1f6b60588 ("ovsdb-server: Don't disconnect clients after raft install_snapshot.")
>> Signed-off-by: Dumitru Ceara <dceara@redhat.com>
>
> Thanks Dumitru! This is a great finding, and sorry for my mistake.
> This patch looks good to me. Just one minor comment below on the test
> case. Otherwise:
>
> Acked-by: Han Zhou <hzhou@ovn.org>

Thanks Han for the review! I fixed the test case as you suggested and
sent v2.

I was wondering if this is also the root cause for the issue you
reported a while back during the OVN meeting. In my scenario, if a
follower ends up in this situation, and if the DB gets compacted online
afterwards, the DB file also becomes inconsistent and in some cases
(after the DB server is restarted) all write transactions from clients
are rejected with "ovsdb-error: inconsistent data".

Related to that I also sent the following patch to make the ovsdb-server
storage state available via appctl commands:

https://patchwork.ozlabs.org/project/openvswitch/patch/1596467128-13004-1-git-send-email-dceara@redhat.com/

Regards,
Dumitru

[...]
On Wed, Aug 5, 2020 at 12:48 PM Dumitru Ceara <dceara@redhat.com> wrote:
>
> Thanks Han for the review! I fixed the test case as you suggested and
> sent v2.
>
> I was wondering if this is also the root cause for the issue you
> reported a while back during the OVN meeting. In my scenario, if a
> follower ends up in this situation, and if the DB gets compacted online
> afterwards, the DB file also becomes inconsistent and in some cases
> (after the DB server is restarted) all write transactions from clients
> are rejected with "ovsdb-error: inconsistent data".
>

Yes, I believe it is the root cause. I thought this patch was exactly
for that issue. Is it also for something else?

> Related to that I also sent the following patch to make the ovsdb-server
> storage state available via appctl commands:
>
> https://patchwork.ozlabs.org/project/openvswitch/patch/1596467128-13004-1-git-send-email-dceara@redhat.com/
>

I will take a look.

Thanks,
Han

[...]
On 8/5/20 11:58 PM, Han Zhou wrote:
> On Wed, Aug 5, 2020 at 12:48 PM Dumitru Ceara <dceara@redhat.com> wrote:
>>
>> I was wondering if this is also the root cause for the issue you
>> reported a while back during the OVN meeting. In my scenario, if a
>> follower ends up in this situation, and if the DB gets compacted online
>> afterwards, the DB file also becomes inconsistent and in some cases
>> (after the DB server is restarted) all write transactions from clients
>> are rejected with "ovsdb-error: inconsistent data".
>>
> Yes, I believe it is the root cause. I thought this patch was exactly
> for that issue. Is it also for something else?

This patch is for the issue I described above: inconsistent DB on
follower followed by online compacting of the DB, which corrupts the DB
file too. I wasn't sure if this was also what you were hitting in your
deployment; I just wanted to check if there are any other known
potential issues we need to investigate.

>> Related to that I also sent the following patch to make the ovsdb-server
>> storage state available via appctl commands:
>>
>> https://patchwork.ozlabs.org/project/openvswitch/patch/1596467128-13004-1-git-send-email-dceara@redhat.com/
>>
> I will take a look.

Thanks!
Dumitru
On Wed, Aug 5, 2020 at 3:04 PM Dumitru Ceara <dceara@redhat.com> wrote: > > On 8/5/20 11:58 PM, Han Zhou wrote: > > > > > > On Wed, Aug 5, 2020 at 12:48 PM Dumitru Ceara <dceara@redhat.com > > <mailto:dceara@redhat.com>> wrote: > >> > >> On 8/5/20 7:48 PM, Han Zhou wrote: > >> > > >> > > >> > On Wed, Aug 5, 2020 at 8:28 AM Dumitru Ceara <dceara@redhat.com > > <mailto:dceara@redhat.com> > >> > <mailto:dceara@redhat.com <mailto:dceara@redhat.com>>> wrote: > >> >> > >> >> Every time a follower has to install a snapshot received from the > >> >> leader, it should also replace the data in memory. Right now this only > >> >> happens when snapshots are installed that also change the schema. > >> >> > >> >> This can lead to inconsistent DB data on follower nodes and the > > snapshot > >> >> may fail to get applied. > >> >> > >> >> CC: Han Zhou <hzhou@ovn.org <mailto:hzhou@ovn.org> > > <mailto:hzhou@ovn.org <mailto:hzhou@ovn.org>>> > >> >> Fixes: bda1f6b60588 ("ovsdb-server: Don't disconnect clients after > >> > raft install_snapshot.") > >> >> Signed-off-by: Dumitru Ceara <dceara@redhat.com > > <mailto:dceara@redhat.com> > >> > <mailto:dceara@redhat.com <mailto:dceara@redhat.com>>> > >> > > >> > Thanks Dumitru! This is a great finding, and sorry for my mistake. > >> > This patch looks good to me. Just one minor comment below on the test > >> > case. Otherwise: > >> > > >> > Acked-by: Han Zhou <hzhou@ovn.org <mailto:hzhou@ovn.org> > > <mailto:hzhou@ovn.org <mailto:hzhou@ovn.org>>> > >> > > >> > >> Thanks Han for the review! I fixed the test case as you suggested and > >> sent v2. > >> > >> I was wondering if this is also the root cause for the issue you > >> reported a while back during the OVN meeting. 
In my scenario, if a > >> follower ends up in this situation, and if the DB gets compacted online > >> afterwards, the DB file also becomes inconsistent and in some cases > >> (after the DB server is restarted) all write transactions from clients > >> are rejected with "ovsdb-error: inconsistent data". > >> > > Yes, I believe it is the root cause. I thought this patch was exactly > > for that issue. Is it also for something else? > > > > This patch is for the issue I described above: inconsistent DB on > follower followed by online compacting of the DB which corrupts the DB > file too. I wasn't sure if this was also what you were hitting in your > deployment, I just wanted to check if there are any other known > potential issues we need to investigate. > OK, I think it should be the same issue. There are no other potential issues related to "inconsistent data" discovered so far. > >> Related to that I also sent the following patch to make the ovsdb-server > >> storage state available via appctl commands: > >> > >> > > https://patchwork.ozlabs.org/project/openvswitch/patch/1596467128-13004-1-git-send-email-dceara@redhat.com/ > >> > > > > I will take a look. > > > > Thanks, > > Han > > > > Thanks! > Dumitru >
diff --git a/ovsdb/ovsdb-server.c b/ovsdb/ovsdb-server.c
index ef4e996..fd7891a 100644
--- a/ovsdb/ovsdb-server.c
+++ b/ovsdb/ovsdb-server.c
@@ -543,13 +543,14 @@ parse_txn(struct server_config *config, struct db *db,
           const struct ovsdb_schema *schema, const struct json *txn_json,
           const struct uuid *txnid)
 {
-    if (schema && (!db->db->schema || strcmp(schema->version,
-                                             db->db->schema->version))) {
+    if (schema) {
         /* We're replacing the schema (and the data).  Destroy the database
          * (first grabbing its storage), then replace it with the new schema.
          * The transaction must also include the replacement data.
          *
-         * Only clustered database schema changes go through this path. */
+         * Only clustered database schema changes and snapshot installs
+         * go through this path.
+         */
         ovs_assert(txn_json);
         ovs_assert(ovsdb_storage_is_clustered(db->db->storage));
 
@@ -559,11 +560,15 @@ parse_txn(struct server_config *config, struct db *db,
             return error;
         }
 
-        ovsdb_jsonrpc_server_reconnect(
-            config->jsonrpc, false,
-            (db->db->schema
-             ? xasprintf("database %s schema changed", db->db->name)
-             : xasprintf("database %s connected to storage", db->db->name)));
+        if (!db->db->schema ||
+            strcmp(schema->version, db->db->schema->version)) {
+            ovsdb_jsonrpc_server_reconnect(
+                config->jsonrpc, false,
+                (db->db->schema
+                 ? xasprintf("database %s schema changed", db->db->name)
+                 : xasprintf("database %s connected to storage",
+                             db->db->name)));
+        }
 
         ovsdb_replace(db->db, ovsdb_create(ovsdb_schema_clone(schema), NULL));
 
diff --git a/tests/idltest.ovsschema b/tests/idltest.ovsschema
index e02b975..e04755e 100644
--- a/tests/idltest.ovsschema
+++ b/tests/idltest.ovsschema
@@ -54,6 +54,15 @@
       },
       "isRoot" : true
     },
+    "indexed": {
+      "columns": {
+        "i": {
+          "type": "integer"
+        }
+      },
+      "indexes": [["i"]],
+      "isRoot" : true
+    },
     "simple": {
       "columns": {
         "b": {
diff --git a/tests/ovsdb-cluster.at b/tests/ovsdb-cluster.at
index 9714545..06d27c9 100644
--- a/tests/ovsdb-cluster.at
+++ b/tests/ovsdb-cluster.at
@@ -332,15 +332,31 @@ for i in `seq $n`; do
     AT_CHECK([ovsdb_client_wait unix:s$i.ovsdb $schema_name connected])
 done
 
+AT_CHECK([ovsdb-client transact unix:s1.ovsdb '[["idltest",
+      {"op": "insert",
+       "table": "indexed",
+       "row": {"i": 1}}]]'], [0], [ignore], [ignore])
+
 # Kill one follower (s2) and write some data to cluster, so that the follower is falling behind
 printf "\ns2: stopping\n"
 OVS_APP_EXIT_AND_WAIT_BY_TARGET([`pwd`/s2], [s2.pid])
 
+# Delete "i":1 and readd it to get a different UUID for it.
+AT_CHECK([ovsdb-client transact unix:s1.ovsdb '[["idltest",
+      {"op": "delete",
+       "table": "indexed",
+       "where": [["i", "==", 1]]}]]'], [0], [ignore], [ignore])
+
 AT_CHECK([ovsdb-client transact unix:s1.ovsdb '[["idltest",
       {"op": "insert",
-       "table": "simple",
+       "table": "indexed",
        "row": {"i": 1}}]]'], [0], [ignore], [ignore])
 
+AT_CHECK([ovsdb-client transact unix:s1.ovsdb '[["idltest",
+      {"op": "insert",
+       "table": "indexed",
+       "row": {"i": 2}}]]'], [0], [ignore], [ignore])
+
 # Compact leader online to generate snapshot
 AT_CHECK([ovs-appctl -t "`pwd`"/s1 ovsdb-server/compact])
 
@@ -355,8 +371,14 @@ AT_CHECK([ovsdb_client_wait unix:s2.ovsdb $schema_name connected])
 # succeed.
 AT_CHECK([ovsdb-client transact unix:s2.ovsdb '[["idltest",
       {"op": "insert",
-       "table": "simple",
-       "row": {"i": 1}}]]'], [0], [ignore], [ignore])
+       "table": "indexed",
+       "row": {"i": 3}}]]'], [0], [ignore], [ignore])
+
+# The snapshot should overwrite the in-memory contents of the DB on S2
+# without generating any constraint violations.
+AT_CHECK([grep 'Transaction causes multiple rows in "indexed" table to have identical values (1) for index on column "i"' -c s2.log], [1], [dnl
+0
+])
 
 for i in `seq $n`; do
     OVS_APP_EXIT_AND_WAIT_BY_TARGET([`pwd`/s$i], [s$i.pid])
diff --git a/tests/ovsdb-idl.at b/tests/ovsdb-idl.at
index 4efed88..789ae23 100644
--- a/tests/ovsdb-idl.at
+++ b/tests/ovsdb-idl.at
@@ -954,6 +954,7 @@ AT_CHECK([sort stdout | uuidfilt], [0],
 
 # Check that ovsdb-idl figured out that table link2 and column l2 are missing.
 AT_CHECK([grep ovsdb_idl stderr | sort], [0], [dnl
+test-ovsdb|ovsdb_idl|idltest database lacks indexed table (database needs upgrade?)
 test-ovsdb|ovsdb_idl|idltest database lacks link2 table (database needs upgrade?)
 test-ovsdb|ovsdb_idl|idltest database lacks simple5 table (database needs upgrade?)
 test-ovsdb|ovsdb_idl|idltest database lacks singleton table (database needs upgrade?)
Every time a follower has to install a snapshot received from the
leader, it should also replace the data in memory. Right now this only
happens when snapshots are installed that also change the schema.

This can lead to inconsistent DB data on follower nodes and the snapshot
may fail to get applied.

CC: Han Zhou <hzhou@ovn.org>
Fixes: bda1f6b60588 ("ovsdb-server: Don't disconnect clients after raft install_snapshot.")
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
---
 ovsdb/ovsdb-server.c    | 21 +++++++++++++--------
 tests/idltest.ovsschema |  9 +++++++++
 tests/ovsdb-cluster.at  | 28 +++++++++++++++++++++++++---
 tests/ovsdb-idl.at      |  1 +
 4 files changed, 48 insertions(+), 11 deletions(-)