diff mbox series

[ovs-dev,RFC,39/52] ovn-sbctl: Allow retries by default.

Message ID 20170919220125.32535-40-blp@ovn.org
State RFC
Headers show
Series clustering implementation | expand

Commit Message

Ben Pfaff Sept. 19, 2017, 10:01 p.m. UTC
Most of the OVS database-manipulation utilities (ovn-sbctl, ovn-nbctl,
ovs-vsctl, vtep-ctl) don't retry their connections by default because
they assume that the database is either up or down and likely to stay
that way.  The OVN southbound database, however, is a likely candidate
for high availability clustering, so that even if it appears to be
down for a moment it will be available again soon.  So, prepare for
the clustering implementation by enabling retry by default in
ovn-sbctl.

Signed-off-by: Ben Pfaff <blp@ovn.org>
---
 ovn/utilities/ovn-sbctl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Miguel Angel Ajo Sept. 21, 2017, 11:17 a.m. UTC | #1
Makes sense. Is there any way to control the number of retries, and the
retry time?

Acked-by: Miguel Angel Ajo <majopela@redhat.com>

On Wed, Sep 20, 2017 at 12:01 AM, Ben Pfaff <blp@ovn.org> wrote:

> Most of the OVS database-manipulation utilities (ovn-sbctl, ovn-nbctl,
> ovs-vsctl, vtep-ctl) don't retry their connections by default because
> they assume that the database is either up or down and likely to stay
> that way.  The OVN southbound database, however, is a likely candidate
> for high availability clustering, so that even if it appears to be
> down for a moment it will be available again soon.  So, prepare for
> the clustering implementation by enabling retry by default in
> ovn-sbctl.
>
> Signed-off-by: Ben Pfaff <blp@ovn.org>
> ---
>  ovn/utilities/ovn-sbctl.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/ovn/utilities/ovn-sbctl.c b/ovn/utilities/ovn-sbctl.c
> index 7af5863b08fc..f4452fa6ee85 100644
> --- a/ovn/utilities/ovn-sbctl.c
> +++ b/ovn/utilities/ovn-sbctl.c
> @@ -120,7 +120,7 @@ main(int argc, char *argv[])
>      }
>
>      /* Initialize IDL. */
> -    idl = the_idl = ovsdb_idl_create(db, &sbrec_idl_class, false, false);
> +    idl = the_idl = ovsdb_idl_create(db, &sbrec_idl_class, false, true);
>      run_prerequisites(commands, n_commands, idl);
>
>      /* Execute the commands.
> --
> 2.10.2
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
Ben Pfaff Sept. 22, 2017, 7:48 p.m. UTC | #2
Thank you for the review.

There isn't currently a way to control the number of retries.  If that
is a valuable feature, then it could be added.

You can control the timeout for these utilities with the --timeout
option (or use the "timeout" program from GNU coreutils).

On Thu, Sep 21, 2017 at 01:17:09PM +0200, Miguel Angel Ajo Pelayo wrote:
> Makes sense. Is there any way to control the number of retries, and the
> retry time?
> 
> Acked-by: Miguel Angel Ajo <majopela@redhat.com>
> 
> On Wed, Sep 20, 2017 at 12:01 AM, Ben Pfaff <blp@ovn.org> wrote:
> 
> > Most of the OVS database-manipulation utilities (ovn-sbctl, ovn-nbctl,
> > ovs-vsctl, vtep-ctl) don't retry their connections by default because
> > they assume that the database is either up or down and likely to stay
> > that way.  The OVN southbound database, however, is a likely candidate
> > for high availability clustering, so that even if it appears to be
> > down for a moment it will be available again soon.  So, prepare for
> > the clustering implementation by enabling retry by default in
> > ovn-sbctl.
> >
> > Signed-off-by: Ben Pfaff <blp@ovn.org>
> > ---
> >  ovn/utilities/ovn-sbctl.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/ovn/utilities/ovn-sbctl.c b/ovn/utilities/ovn-sbctl.c
> > index 7af5863b08fc..f4452fa6ee85 100644
> > --- a/ovn/utilities/ovn-sbctl.c
> > +++ b/ovn/utilities/ovn-sbctl.c
> > @@ -120,7 +120,7 @@ main(int argc, char *argv[])
> >      }
> >
> >      /* Initialize IDL. */
> > -    idl = the_idl = ovsdb_idl_create(db, &sbrec_idl_class, false, false);
> > +    idl = the_idl = ovsdb_idl_create(db, &sbrec_idl_class, false, true);
> >      run_prerequisites(commands, n_commands, idl);
> >
> >      /* Execute the commands.
> > --
> > 2.10.2
> >
> > _______________________________________________
> > dev mailing list
> > dev@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
Miguel Angel Ajo Sept. 25, 2017, 8:53 a.m. UTC | #3
It makes sense.

As an idea for future improvements (also applicable to the python client
and the ovsdbapp): something that I've found valuable in distributed
systems is the ability to set an exponential retry time (with a ceiling)
because it's a good failsafe mechanism when one of the clients sends some
sort of big/buggy request that it's going to trigger a failure/long
execution time on the server and finally timeout.

The exponential backoff makes it more likely than the server will recover,
or have time to serve other clients.

On Fri, Sep 22, 2017 at 9:48 PM, Ben Pfaff <blp@ovn.org> wrote:

> Thank you for the review.
>
> There isn't currently a way to control the number of retries.  If that
> is a valuable feature, then it could be added.
>
> You can control the timeout for these utilities with the --timeout
> option (or use the "timeout" program from GNU coreutils).
>
> On Thu, Sep 21, 2017 at 01:17:09PM +0200, Miguel Angel Ajo Pelayo wrote:
> > Makes sense. Is there any way to control the number of retries, and the
> > retry time?
> >
> > Acked-by: Miguel Angel Ajo <majopela@redhat.com>
> >
> > On Wed, Sep 20, 2017 at 12:01 AM, Ben Pfaff <blp@ovn.org> wrote:
> >
> > > Most of the OVS database-manipulation utilities (ovn-sbctl, ovn-nbctl,
> > > ovs-vsctl, vtep-ctl) don't retry their connections by default because
> > > they assume that the database is either up or down and likely to stay
> > > that way.  The OVN southbound database, however, is a likely candidate
> > > for high availability clustering, so that even if it appears to be
> > > down for a moment it will be available again soon.  So, prepare for
> > > the clustering implementation by enabling retry by default in
> > > ovn-sbctl.
> > >
> > > Signed-off-by: Ben Pfaff <blp@ovn.org>
> > > ---
> > >  ovn/utilities/ovn-sbctl.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/ovn/utilities/ovn-sbctl.c b/ovn/utilities/ovn-sbctl.c
> > > index 7af5863b08fc..f4452fa6ee85 100644
> > > --- a/ovn/utilities/ovn-sbctl.c
> > > +++ b/ovn/utilities/ovn-sbctl.c
> > > @@ -120,7 +120,7 @@ main(int argc, char *argv[])
> > >      }
> > >
> > >      /* Initialize IDL. */
> > > -    idl = the_idl = ovsdb_idl_create(db, &sbrec_idl_class, false,
> false);
> > > +    idl = the_idl = ovsdb_idl_create(db, &sbrec_idl_class, false,
> true);
> > >      run_prerequisites(commands, n_commands, idl);
> > >
> > >      /* Execute the commands.
> > > --
> > > 2.10.2
> > >
> > > _______________________________________________
> > > dev mailing list
> > > dev@openvswitch.org
> > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> > >
>
Ben Pfaff Sept. 25, 2017, 12:34 p.m. UTC | #4
OVS has that already too, but the retry time caps out at 8 seconds
currently.  It goes from 1 to 2 to 4 to 8 seconds.  The maximum could
easily be raised.  I don't know a good way to decide what the maximum
should be.

Possibly, it should be a random time within a range, to avoid
synchronizing clients' retries when a server goes down.  That is not yet
implemented (I just thought of it now).

On Mon, Sep 25, 2017 at 10:53:09AM +0200, Miguel Angel Ajo Pelayo wrote:
> It makes sense.
> 
> As an idea for future improvements (also applicable to the python client
> and the ovsdbapp): something that I've found valuable in distributed
> systems is the ability to set an exponential retry time (with a ceiling)
> because it's a good failsafe mechanism when one of the clients sends some
> sort of big/buggy request that it's going to trigger a failure/long
> execution time on the server and finally timeout.
> 
> The exponential backoff makes it more likely than the server will recover,
> or have time to serve other clients.
> 
> On Fri, Sep 22, 2017 at 9:48 PM, Ben Pfaff <blp@ovn.org> wrote:
> 
> > Thank you for the review.
> >
> > There isn't currently a way to control the number of retries.  If that
> > is a valuable feature, then it could be added.
> >
> > You can control the timeout for these utilities with the --timeout
> > option (or use the "timeout" program from GNU coreutils).
> >
> > On Thu, Sep 21, 2017 at 01:17:09PM +0200, Miguel Angel Ajo Pelayo wrote:
> > > Makes sense. Is there any way to control the number of retries, and the
> > > retry time?
> > >
> > > Acked-by: Miguel Angel Ajo <majopela@redhat.com>
> > >
> > > On Wed, Sep 20, 2017 at 12:01 AM, Ben Pfaff <blp@ovn.org> wrote:
> > >
> > > > Most of the OVS database-manipulation utilities (ovn-sbctl, ovn-nbctl,
> > > > ovs-vsctl, vtep-ctl) don't retry their connections by default because
> > > > they assume that the database is either up or down and likely to stay
> > > > that way.  The OVN southbound database, however, is a likely candidate
> > > > for high availability clustering, so that even if it appears to be
> > > > down for a moment it will be available again soon.  So, prepare for
> > > > the clustering implementation by enabling retry by default in
> > > > ovn-sbctl.
> > > >
> > > > Signed-off-by: Ben Pfaff <blp@ovn.org>
> > > > ---
> > > >  ovn/utilities/ovn-sbctl.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/ovn/utilities/ovn-sbctl.c b/ovn/utilities/ovn-sbctl.c
> > > > index 7af5863b08fc..f4452fa6ee85 100644
> > > > --- a/ovn/utilities/ovn-sbctl.c
> > > > +++ b/ovn/utilities/ovn-sbctl.c
> > > > @@ -120,7 +120,7 @@ main(int argc, char *argv[])
> > > >      }
> > > >
> > > >      /* Initialize IDL. */
> > > > -    idl = the_idl = ovsdb_idl_create(db, &sbrec_idl_class, false,
> > false);
> > > > +    idl = the_idl = ovsdb_idl_create(db, &sbrec_idl_class, false,
> > true);
> > > >      run_prerequisites(commands, n_commands, idl);
> > > >
> > > >      /* Execute the commands.
> > > > --
> > > > 2.10.2
> > > >
> > > > _______________________________________________
> > > > dev mailing list
> > > > dev@openvswitch.org
> > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> > > >
> >
Miguel Angel Ajo Sept. 26, 2017, 1:45 p.m. UTC | #5
Wow, great :)

Having random cap times within a range sounds like a nice improvement,
that's a good idea.

On Mon, Sep 25, 2017 at 2:34 PM, Ben Pfaff <blp@ovn.org> wrote:

> OVS has that already too, but the retry time caps out at 8 seconds
> currently.  It goes from 1 to 2 to 4 to 8 seconds.  The maximum could
> easily be raised.  I don't know a good way to decide what the maximum
> should be.
>
> Possibly, it should be a random time within a range, to avoid
> synchronizing clients' retries when a server goes down.  That is not yet
> implemented (I just thought of it now).
>
> On Mon, Sep 25, 2017 at 10:53:09AM +0200, Miguel Angel Ajo Pelayo wrote:
> > It makes sense.
> >
> > As an idea for future improvements (also applicable to the python client
> > and the ovsdbapp): something that I've found valuable in distributed
> > systems is the ability to set an exponential retry time (with a ceiling)
> > because it's a good failsafe mechanism when one of the clients sends some
> > sort of big/buggy request that it's going to trigger a failure/long
> > execution time on the server and finally timeout.
> >
> > The exponential backoff makes it more likely than the server will
> recover,
> > or have time to serve other clients.
> >
> > On Fri, Sep 22, 2017 at 9:48 PM, Ben Pfaff <blp@ovn.org> wrote:
> >
> > > Thank you for the review.
> > >
> > > There isn't currently a way to control the number of retries.  If that
> > > is a valuable feature, then it could be added.
> > >
> > > You can control the timeout for these utilities with the --timeout
> > > option (or use the "timeout" program from GNU coreutils).
> > >
> > > On Thu, Sep 21, 2017 at 01:17:09PM +0200, Miguel Angel Ajo Pelayo
> wrote:
> > > > Makes sense. Is there any way to control the number of retries, and
> the
> > > > retry time?
> > > >
> > > > Acked-by: Miguel Angel Ajo <majopela@redhat.com>
> > > >
> > > > On Wed, Sep 20, 2017 at 12:01 AM, Ben Pfaff <blp@ovn.org> wrote:
> > > >
> > > > > Most of the OVS database-manipulation utilities (ovn-sbctl,
> ovn-nbctl,
> > > > > ovs-vsctl, vtep-ctl) don't retry their connections by default
> because
> > > > > they assume that the database is either up or down and likely to
> stay
> > > > > that way.  The OVN southbound database, however, is a likely
> candidate
> > > > > for high availability clustering, so that even if it appears to be
> > > > > down for a moment it will be available again soon.  So, prepare for
> > > > > the clustering implementation by enabling retry by default in
> > > > > ovn-sbctl.
> > > > >
> > > > > Signed-off-by: Ben Pfaff <blp@ovn.org>
> > > > > ---
> > > > >  ovn/utilities/ovn-sbctl.c | 2 +-
> > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/ovn/utilities/ovn-sbctl.c b/ovn/utilities/ovn-sbctl.c
> > > > > index 7af5863b08fc..f4452fa6ee85 100644
> > > > > --- a/ovn/utilities/ovn-sbctl.c
> > > > > +++ b/ovn/utilities/ovn-sbctl.c
> > > > > @@ -120,7 +120,7 @@ main(int argc, char *argv[])
> > > > >      }
> > > > >
> > > > >      /* Initialize IDL. */
> > > > > -    idl = the_idl = ovsdb_idl_create(db, &sbrec_idl_class, false,
> > > false);
> > > > > +    idl = the_idl = ovsdb_idl_create(db, &sbrec_idl_class, false,
> > > true);
> > > > >      run_prerequisites(commands, n_commands, idl);
> > > > >
> > > > >      /* Execute the commands.
> > > > > --
> > > > > 2.10.2
> > > > >
> > > > > _______________________________________________
> > > > > dev mailing list
> > > > > dev@openvswitch.org
> > > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> > > > >
> > >
>
diff mbox series

Patch

diff --git a/ovn/utilities/ovn-sbctl.c b/ovn/utilities/ovn-sbctl.c
index 7af5863b08fc..f4452fa6ee85 100644
--- a/ovn/utilities/ovn-sbctl.c
+++ b/ovn/utilities/ovn-sbctl.c
@@ -120,7 +120,7 @@  main(int argc, char *argv[])
     }
 
     /* Initialize IDL. */
-    idl = the_idl = ovsdb_idl_create(db, &sbrec_idl_class, false, false);
+    idl = the_idl = ovsdb_idl_create(db, &sbrec_idl_class, false, true);
     run_prerequisites(commands, n_commands, idl);
 
     /* Execute the commands.