Message ID | 20170919220125.32535-40-blp@ovn.org |
---|---|
State | RFC |
Headers | show |
Series | clustering implementation | expand |
Makes sense. Is there any way to control the number of retries, and the retry time? Acked-by: Miguel Angel Ajo <majopela@redhat.com> On Wed, Sep 20, 2017 at 12:01 AM, Ben Pfaff <blp@ovn.org> wrote: > Most of the OVS database-manipulation utilities (ovn-sbctl, ovn-nbctl, > ovs-vsctl, vtep-ctl) don't retry their connections by default because > they assume that the database is either up or down and likely to stay > that way. The OVN southbound database, however, is a likely candidate > for high availability clustering, so that even if it appears to be > down for a moment it will be available again soon. So, prepare for > the clustering implementation by enabling retry by default in > ovn-sbctl. > > Signed-off-by: Ben Pfaff <blp@ovn.org> > --- > ovn/utilities/ovn-sbctl.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/ovn/utilities/ovn-sbctl.c b/ovn/utilities/ovn-sbctl.c > index 7af5863b08fc..f4452fa6ee85 100644 > --- a/ovn/utilities/ovn-sbctl.c > +++ b/ovn/utilities/ovn-sbctl.c > @@ -120,7 +120,7 @@ main(int argc, char *argv[]) > } > > /* Initialize IDL. */ > - idl = the_idl = ovsdb_idl_create(db, &sbrec_idl_class, false, false); > + idl = the_idl = ovsdb_idl_create(db, &sbrec_idl_class, false, true); > run_prerequisites(commands, n_commands, idl); > > /* Execute the commands. > -- > 2.10.2 > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev >
Thank you for the review. There isn't currently a way to control the number of retries. If that is a valuable feature, then it could be added. You can control the timeout for these utilities with the --timeout option (or use the "timeout" program from GNU coreutils). On Thu, Sep 21, 2017 at 01:17:09PM +0200, Miguel Angel Ajo Pelayo wrote: > Makes sense. Is there any way to control the number of retries, and the > retry time? > > Acked-by: Miguel Angel Ajo <majopela@redhat.com> > > On Wed, Sep 20, 2017 at 12:01 AM, Ben Pfaff <blp@ovn.org> wrote: > > > Most of the OVS database-manipulation utilities (ovn-sbctl, ovn-nbctl, > > ovs-vsctl, vtep-ctl) don't retry their connections by default because > > they assume that the database is either up or down and likely to stay > > that way. The OVN southbound database, however, is a likely candidate > > for high availability clustering, so that even if it appears to be > > down for a moment it will be available again soon. So, prepare for > > the clustering implementation by enabling retry by default in > > ovn-sbctl. > > > > Signed-off-by: Ben Pfaff <blp@ovn.org> > > --- > > ovn/utilities/ovn-sbctl.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/ovn/utilities/ovn-sbctl.c b/ovn/utilities/ovn-sbctl.c > > index 7af5863b08fc..f4452fa6ee85 100644 > > --- a/ovn/utilities/ovn-sbctl.c > > +++ b/ovn/utilities/ovn-sbctl.c > > @@ -120,7 +120,7 @@ main(int argc, char *argv[]) > > } > > > > /* Initialize IDL. */ > > - idl = the_idl = ovsdb_idl_create(db, &sbrec_idl_class, false, false); > > + idl = the_idl = ovsdb_idl_create(db, &sbrec_idl_class, false, true); > > run_prerequisites(commands, n_commands, idl); > > > > /* Execute the commands. > > -- > > 2.10.2 > > > > _______________________________________________ > > dev mailing list > > dev@openvswitch.org > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >
It makes sense. As an idea for future improvements (also applicable to the python client and the ovsdbapp): something that I've found valuable in distributed systems is the ability to set an exponential retry time (with a ceiling) because it's a good failsafe mechanism when one of the clients sends some sort of big/buggy request that it's going to trigger a failure/long execution time on the server and finally timeout. The exponential backoff makes it more likely than the server will recover, or have time to serve other clients. On Fri, Sep 22, 2017 at 9:48 PM, Ben Pfaff <blp@ovn.org> wrote: > Thank you for the review. > > There isn't currently a way to control the number of retries. If that > is a valuable feature, then it could be added. > > You can control the timeout for these utilities with the --timeout > option (or use the "timeout" program from GNU coreutils). > > On Thu, Sep 21, 2017 at 01:17:09PM +0200, Miguel Angel Ajo Pelayo wrote: > > Makes sense. Is there any way to control the number of retries, and the > > retry time? > > > > Acked-by: Miguel Angel Ajo <majopela@redhat.com> > > > > On Wed, Sep 20, 2017 at 12:01 AM, Ben Pfaff <blp@ovn.org> wrote: > > > > > Most of the OVS database-manipulation utilities (ovn-sbctl, ovn-nbctl, > > > ovs-vsctl, vtep-ctl) don't retry their connections by default because > > > they assume that the database is either up or down and likely to stay > > > that way. The OVN southbound database, however, is a likely candidate > > > for high availability clustering, so that even if it appears to be > > > down for a moment it will be available again soon. So, prepare for > > > the clustering implementation by enabling retry by default in > > > ovn-sbctl. > > > > > > Signed-off-by: Ben Pfaff <blp@ovn.org> > > > --- > > > ovn/utilities/ovn-sbctl.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/ovn/utilities/ovn-sbctl.c b/ovn/utilities/ovn-sbctl.c > > > index 7af5863b08fc..f4452fa6ee85 100644 > > > --- a/ovn/utilities/ovn-sbctl.c > > > +++ b/ovn/utilities/ovn-sbctl.c > > > @@ -120,7 +120,7 @@ main(int argc, char *argv[]) > > > } > > > > > > /* Initialize IDL. */ > > > - idl = the_idl = ovsdb_idl_create(db, &sbrec_idl_class, false, > false); > > > + idl = the_idl = ovsdb_idl_create(db, &sbrec_idl_class, false, > true); > > > run_prerequisites(commands, n_commands, idl); > > > > > > /* Execute the commands. > > > -- > > > 2.10.2 > > > > > > _______________________________________________ > > > dev mailing list > > > dev@openvswitch.org > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > > >
OVS has that already too, but the retry time caps out at 8 seconds currently. It goes from 1 to 2 to 4 to 8 seconds. The maximum could easily be raised. I don't know a good way to decide what the maximum should be. Possibly, it should be a random time within a range, to avoid synchronizing clients' retries when a server goes down. That is not yet implemented (I just thought of it now). On Mon, Sep 25, 2017 at 10:53:09AM +0200, Miguel Angel Ajo Pelayo wrote: > It makes sense. > > As an idea for future improvements (also applicable to the python client > and the ovsdbapp): something that I've found valuable in distributed > systems is the ability to set an exponential retry time (with a ceiling) > because it's a good failsafe mechanism when one of the clients sends some > sort of big/buggy request that it's going to trigger a failure/long > execution time on the server and finally timeout. > > The exponential backoff makes it more likely than the server will recover, > or have time to serve other clients. > > On Fri, Sep 22, 2017 at 9:48 PM, Ben Pfaff <blp@ovn.org> wrote: > > > Thank you for the review. > > > > There isn't currently a way to control the number of retries. If that > > is a valuable feature, then it could be added. > > > > You can control the timeout for these utilities with the --timeout > > option (or use the "timeout" program from GNU coreutils). > > > > On Thu, Sep 21, 2017 at 01:17:09PM +0200, Miguel Angel Ajo Pelayo wrote: > > > Makes sense. Is there any way to control the number of retries, and the > > > retry time? > > > > > > Acked-by: Miguel Angel Ajo <majopela@redhat.com> > > > > > > On Wed, Sep 20, 2017 at 12:01 AM, Ben Pfaff <blp@ovn.org> wrote: > > > > > > > Most of the OVS database-manipulation utilities (ovn-sbctl, ovn-nbctl, > > > > ovs-vsctl, vtep-ctl) don't retry their connections by default because > > > > they assume that the database is either up or down and likely to stay > > > > that way. The OVN southbound database, however, is a likely candidate > > > > for high availability clustering, so that even if it appears to be > > > > down for a moment it will be available again soon. So, prepare for > > > > the clustering implementation by enabling retry by default in > > > > ovn-sbctl. > > > > > > > > Signed-off-by: Ben Pfaff <blp@ovn.org> > > > > --- > > > > ovn/utilities/ovn-sbctl.c | 2 +- > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > diff --git a/ovn/utilities/ovn-sbctl.c b/ovn/utilities/ovn-sbctl.c > > > > index 7af5863b08fc..f4452fa6ee85 100644 > > > > --- a/ovn/utilities/ovn-sbctl.c > > > > +++ b/ovn/utilities/ovn-sbctl.c > > > > @@ -120,7 +120,7 @@ main(int argc, char *argv[]) > > > > } > > > > > > > > /* Initialize IDL. */ > > > > - idl = the_idl = ovsdb_idl_create(db, &sbrec_idl_class, false, > > false); > > > > + idl = the_idl = ovsdb_idl_create(db, &sbrec_idl_class, false, > > true); > > > > run_prerequisites(commands, n_commands, idl); > > > > > > > > /* Execute the commands. > > > > -- > > > > 2.10.2 > > > > > > > > _______________________________________________ > > > > dev mailing list > > > > dev@openvswitch.org > > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > > > > >
Wow, great :) Having random cap times within a range sounds like a nice improvement, that's a good idea. On Mon, Sep 25, 2017 at 2:34 PM, Ben Pfaff <blp@ovn.org> wrote: > OVS has that already too, but the retry time caps out at 8 seconds > currently. It goes from 1 to 2 to 4 to 8 seconds. The maximum could > easily be raised. I don't know a good way to decide what the maximum > should be. > > Possibly, it should be a random time within a range, to avoid > synchronizing clients' retries when a server goes down. That is not yet > implemented (I just thought of it now). > > On Mon, Sep 25, 2017 at 10:53:09AM +0200, Miguel Angel Ajo Pelayo wrote: > > It makes sense. > > > > As an idea for future improvements (also applicable to the python client > > and the ovsdbapp): something that I've found valuable in distributed > > systems is the ability to set an exponential retry time (with a ceiling) > > because it's a good failsafe mechanism when one of the clients sends some > > sort of big/buggy request that it's going to trigger a failure/long > > execution time on the server and finally timeout. > > > > The exponential backoff makes it more likely than the server will > recover, > > or have time to serve other clients. > > > > On Fri, Sep 22, 2017 at 9:48 PM, Ben Pfaff <blp@ovn.org> wrote: > > > > > Thank you for the review. > > > > > > There isn't currently a way to control the number of retries. If that > > > is a valuable feature, then it could be added. > > > > > > You can control the timeout for these utilities with the --timeout > > > option (or use the "timeout" program from GNU coreutils). > > > > > > On Thu, Sep 21, 2017 at 01:17:09PM +0200, Miguel Angel Ajo Pelayo > wrote: > > > > Makes sense. Is there any way to control the number of retries, and > the > > > > retry time? > > > > > > > > Acked-by: Miguel Angel Ajo <majopela@redhat.com> > > > > > > > > On Wed, Sep 20, 2017 at 12:01 AM, Ben Pfaff <blp@ovn.org> wrote: > > > > > > > > > Most of the OVS database-manipulation utilities (ovn-sbctl, > ovn-nbctl, > > > > > ovs-vsctl, vtep-ctl) don't retry their connections by default > because > > > > > they assume that the database is either up or down and likely to > stay > > > > > that way. The OVN southbound database, however, is a likely > candidate > > > > > for high availability clustering, so that even if it appears to be > > > > > down for a moment it will be available again soon. So, prepare for > > > > > the clustering implementation by enabling retry by default in > > > > > ovn-sbctl. > > > > > > > > > > Signed-off-by: Ben Pfaff <blp@ovn.org> > > > > > --- > > > > > ovn/utilities/ovn-sbctl.c | 2 +- > > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > > > diff --git a/ovn/utilities/ovn-sbctl.c b/ovn/utilities/ovn-sbctl.c > > > > > index 7af5863b08fc..f4452fa6ee85 100644 > > > > > --- a/ovn/utilities/ovn-sbctl.c > > > > > +++ b/ovn/utilities/ovn-sbctl.c > > > > > @@ -120,7 +120,7 @@ main(int argc, char *argv[]) > > > > > } > > > > > > > > > > /* Initialize IDL. */ > > > > > - idl = the_idl = ovsdb_idl_create(db, &sbrec_idl_class, false, > > > false); > > > > > + idl = the_idl = ovsdb_idl_create(db, &sbrec_idl_class, false, > > > true); > > > > > run_prerequisites(commands, n_commands, idl); > > > > > > > > > > /* Execute the commands. > > > > > -- > > > > > 2.10.2 > > > > > > > > > > _______________________________________________ > > > > > dev mailing list > > > > > dev@openvswitch.org > > > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > > > > > > > >
diff --git a/ovn/utilities/ovn-sbctl.c b/ovn/utilities/ovn-sbctl.c index 7af5863b08fc..f4452fa6ee85 100644 --- a/ovn/utilities/ovn-sbctl.c +++ b/ovn/utilities/ovn-sbctl.c @@ -120,7 +120,7 @@ main(int argc, char *argv[]) } /* Initialize IDL. */ - idl = the_idl = ovsdb_idl_create(db, &sbrec_idl_class, false, false); + idl = the_idl = ovsdb_idl_create(db, &sbrec_idl_class, false, true); run_prerequisites(commands, n_commands, idl); /* Execute the commands.
Most of the OVS database-manipulation utilities (ovn-sbctl, ovn-nbctl, ovs-vsctl, vtep-ctl) don't retry their connections by default because they assume that the database is either up or down and likely to stay that way. The OVN southbound database, however, is a likely candidate for high availability clustering, so that even if it appears to be down for a moment it will be available again soon. So, prepare for the clustering implementation by enabling retry by default in ovn-sbctl. Signed-off-by: Ben Pfaff <blp@ovn.org> --- ovn/utilities/ovn-sbctl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)