[ovs-dev,v2] ovsdb-tool: Add a db consistency check to the ovsdb-tool check-cluster command

Message ID CAAFK5zx8COFRUwAYbR5CAO3aOqX8CFx7cbJVkz7KagizeWUE6Q@mail.gmail.com
State Superseded

Commit Message

Federico Paolinelli July 9, 2020, 4:04 p.m. UTC
There are some occurrences where the database ends up in an inconsistent
state. This happened in ovn-k8s and is described in
https://bugzilla.redhat.com/show_bug.cgi?id=1837953#c23.
Here we are adding a supported way to check that a given db is consistent,
which is less error prone than checking the logs.

This was only tested against a valid database, as I did not manage to
obtain a corrupted one.

Signed-off-by: Federico Paolinelli <fpaoline@redhat.com>
Suggested-by: Dumitru Ceara <dceara@redhat.com>
---
 ovsdb/ovsdb-tool.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

Comments

Dumitru Ceara July 9, 2020, 7:13 p.m. UTC | #1
On 7/9/20 6:04 PM, Federico Paolinelli wrote:
> There are some occurrences where the database ends up in an inconsistent
> state. This happened in ovn-k8s and is described in
> https://bugzilla.redhat.com/show_bug.cgi?id=1837953#c23.
> Here we are adding a supported way to check that a given db is consistent,
> which is less error prone than checking the logs.
> 
> This was only tested against a valid database, as did not manage to get a
> corrupted one.

Hi Federico,

The NB DB [0] on master-3 in the BZ [1] is an example of a corrupted DB.

I tested your patch with it and I get:

ovsdb-tool check-cluster /tmp/kni1-vmaster-3-ovnnb_db.db

ovsdb-tool: /tmp/kni1-vmaster-3-ovnnb_db.db: server d5db not found in server list

Tested-by: Dumitru Ceara <dceara@redhat.com>

I do have a few more minor comments on the patch itself.

Thanks,
Dumitru

[0] https://bugzilla.redhat.com/attachment.cgi?id=1697595
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1837953#c23

> 
> Signed-off-by: Federico Paolinelli <fpaoline@redhat.com>
> Suggested-by: Dumitru Ceara <dceara@redhat.com>
> ---
>  ovsdb/ovsdb-tool.c | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
> 
> diff --git a/ovsdb/ovsdb-tool.c b/ovsdb/ovsdb-tool.c
> index 91662cab8..d5ada0c2d 100644
> --- a/ovsdb/ovsdb-tool.c
> +++ b/ovsdb/ovsdb-tool.c
> @@ -1497,6 +1497,27 @@ do_check_cluster(struct ovs_cmdl_context *ctx)
>          }
>      }
> 
> +    /* Check for db consistency:
> +     * The serverid must be in the servers list

Please add a '.' at the end of the sentence in the comment.

> +     */
> +
> +    for (struct server *s = c.servers; s < &c.servers[c.n_servers]; s++) {
> +        struct shash *servers_obj = json_object(s->snap->servers);
> +        char *server_id = xasprintf(SID_FMT, SID_ARGS(&s->header.sid));
> +        bool found = false;
> +        const struct shash_node *node;

Please add a blank line for readability.

> +        SHASH_FOR_EACH (node, servers_obj) {
> +            if (!strncmp(server_id, node->name, SID_LEN)) {
> +                found = true;
> +            }
> +        }
> +        if (!found) {
> +            ovs_fatal(0, "%s: server %s not found in server list",
> +                          s->filename, server_id);

This should be indented such that the arguments on the second line are
aligned right after the '(' above.

> +        }
> +        free(server_id);
> +    }
> +
>      /* Clean up. */
> 
>      for (size_t i = 0; i < c.n_servers; i++) {
>
Federico Paolinelli July 10, 2020, 7:24 a.m. UTC | #2
On Thu, Jul 9, 2020 at 9:13 PM Dumitru Ceara <dceara@redhat.com> wrote:
>
> On 7/9/20 6:04 PM, Federico Paolinelli wrote:
> > There are some occurrences where the database ends up in an inconsistent
> > state. This happened in ovn-k8s and is described in
> > https://bugzilla.redhat.com/show_bug.cgi?id=1837953#c23.
> > Here we are adding a supported way to check that a given db is consistent,
> > which is less error prone than checking the logs.
> >
> > This was only tested against a valid database, as did not manage to get a
> > corrupted one.
>
> Hi Federico,
>
> The NB DB [0] on master-3 in the BZ [1] is an example of corrupted DB.
>

Ah, right. I forgot about it, thanks for taking care!

> I tested your patch with it and I get:
>
> ovsdb-tool check-cluster /tmp/kni1-vmaster-3-ovnnb_db.db
>
> ovsdb-tool: /tmp/kni1-vmaster-3-ovnnb_db.db: server d5db not found in
> server list
>
> Tested-by: Dumitru Ceara <dceara@redhat.com>
>
> I do have a few more minor comments on the patch itself.
>
> Thanks,
> Dumitru
>
> [0] https://bugzilla.redhat.com/attachment.cgi?id=1697595
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1837953#c23
>
> >
> > Signed-off-by: Federico Paolinelli <fpaoline@redhat.com>
> > Suggested-by: Dumitru Ceara <dceara@redhat.com>
> > ---
> >  ovsdb/ovsdb-tool.c | 21 +++++++++++++++++++++
> >  1 file changed, 21 insertions(+)
> >
> > diff --git a/ovsdb/ovsdb-tool.c b/ovsdb/ovsdb-tool.c
> > index 91662cab8..d5ada0c2d 100644
> > --- a/ovsdb/ovsdb-tool.c
> > +++ b/ovsdb/ovsdb-tool.c
> > @@ -1497,6 +1497,27 @@ do_check_cluster(struct ovs_cmdl_context *ctx)
> >          }
> >      }
> >
> > +    /* Check for db consistency:
> > +     * The serverid must be in the servers list
>
> Please add a '.' at the end of the sentence in the comment.
>
> > +     */
> > +
> > +    for (struct server *s = c.servers; s < &c.servers[c.n_servers]; s++) {
> > +        struct shash *servers_obj = json_object(s->snap->servers);
> > +        char *server_id = xasprintf(SID_FMT, SID_ARGS(&s->header.sid));
> > +        bool found = false;
> > +        const struct shash_node *node;
>
> Please add a blank line for readability.
>
> > +        SHASH_FOR_EACH (node, servers_obj) {
> > +            if (!strncmp(server_id, node->name, SID_LEN)) {
> > +                found = true;
> > +            }
> > +        }
> > +        if (!found) {
> > +            ovs_fatal(0, "%s: server %s not found in server list",
> > +                          s->filename, server_id);
>
> This should be indented such that the arguments on the second line are
> aligned right after the '(' above.
>
> > +        }
> > +        free(server_id);
> > +    }
> > +
> >      /* Clean up. */
> >
> >      for (size_t i = 0; i < c.n_servers; i++) {
> >
>

Patch

diff --git a/ovsdb/ovsdb-tool.c b/ovsdb/ovsdb-tool.c
index 91662cab8..d5ada0c2d 100644
--- a/ovsdb/ovsdb-tool.c
+++ b/ovsdb/ovsdb-tool.c
@@ -1497,6 +1497,27 @@  do_check_cluster(struct ovs_cmdl_context *ctx)
         }
     }

+    /* Check for db consistency:
+     * The serverid must be in the servers list
+     */
+
+    for (struct server *s = c.servers; s < &c.servers[c.n_servers]; s++) {
+        struct shash *servers_obj = json_object(s->snap->servers);
+        char *server_id = xasprintf(SID_FMT, SID_ARGS(&s->header.sid));
+        bool found = false;
+        const struct shash_node *node;
+        SHASH_FOR_EACH (node, servers_obj) {
+            if (!strncmp(server_id, node->name, SID_LEN)) {
+                found = true;
+            }
+        }
+        if (!found) {
+            ovs_fatal(0, "%s: server %s not found in server list",
+                          s->filename, server_id);
+        }
+        free(server_id);
+    }
+
     /* Clean up. */

     for (size_t i = 0; i < c.n_servers; i++) {
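
For readers without an OVS tree at hand, the membership test that this patch adds can be approximated in standalone C. This is only a sketch under stated assumptions, not the code from the patch: the helper name sid_in_server_list and the constant SHORT_SID_LEN are hypothetical, the server list is modelled as a plain array of strings rather than an OVS shash, and the 4-character short ID length is inferred from the "d5db" printed in the error output earlier in the thread. Unlike the loop in the patch, the sketch returns at the first match instead of scanning the whole list.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed length of the short server ID, inferred from the 4-character
 * "d5db" in the error message quoted in this thread. */
#define SHORT_SID_LEN 4

/* Hypothetical helper, not part of OVS: returns true if the first
 * SHORT_SID_LEN characters of 'short_sid' match the start of any of the
 * 'n' entries in 'server_ids'.  This mirrors the strncmp()-based
 * comparison in the patch, with a plain array standing in for the keys
 * of the snapshot's servers object. */
static bool
sid_in_server_list(const char *short_sid,
                   const char *const server_ids[], size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!strncmp(short_sid, server_ids[i], SHORT_SID_LEN)) {
            return true;    /* Stop at the first match. */
        }
    }
    return false;
}
```

A caller would pass the short ID of the server that owns the database file together with the IDs listed in that file's snapshot, and report an error when the function returns false, which corresponds to the "not found in server list" case reported above.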