diff mbox series

[ovs-dev] ovs-lib: accept optional timeout value for upgrade_cluster

Message ID 40a63f02d1e24cd98dde4c7dd5ee0346c4415a8a.camel@redhat.com
State Deferred
Headers show
Series [ovs-dev] ovs-lib: accept optional timeout value for upgrade_cluster

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed
ovsrobot/intel-ovs-compilation success test: success

Commit Message

Dan Williams Dec. 15, 2022, 5:57 p.m. UTC
Signed-off-by: Dan Williams <dcbw@redhat.com>
---
 utilities/ovs-lib.in | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

Comments

Frode Nordahl Dec. 15, 2022, 6:04 p.m. UTC | #1
Hello, Dan,

On Thu, Dec 15, 2022 at 6:57 PM Dan Williams <dcbw@redhat.com> wrote:
>
> Signed-off-by: Dan Williams <dcbw@redhat.com>
> ---
>  utilities/ovs-lib.in | 13 +++++++++----
>  1 file changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/utilities/ovs-lib.in b/utilities/ovs-lib.in
> index 13477a6a9e991..36e63312b6cba 100644
> --- a/utilities/ovs-lib.in
> +++ b/utilities/ovs-lib.in
> @@ -475,11 +475,16 @@ upgrade_db () {
>  }
>
>  upgrade_cluster () {
> -    local DB_SCHEMA=$1 DB_SERVER=$2
> +    local DB_SCHEMA=$1 DB_SERVER=$2 TIMEOUT_SECONDS=$3
>      local schema_name=$(ovsdb-tool schema-name $1) || return 1
>
> -    action "Waiting for $schema_name to come up" ovsdb-client -t 30 wait "$DB_SERVER" "$schema_name" connected || return $?
> -    local db_version=$(ovsdb-client -t 10 get-schema-version "$DB_SERVER" "$schema_name") || return $?
> +    timeout_arg=30
> +    if [ -n "$TIMEOUT_SECONDS" ]; then
> +      timeout_arg="$TIMEOUT_SECONDS"
> +    fi
> +
> +    action "Waiting for $schema_name to come up" ovsdb-client -t $timeout_arg wait "$DB_SERVER" "$schema_name" connected || return $?
> +    local db_version=$(ovsdb-client -t $timeout_arg get-schema-version "$DB_SERVER" "$schema_name") || return $?
>      local target_version=$(ovsdb-tool schema-version "$DB_SCHEMA") || return $?
>
>      if ovsdb-tool compare-versions "$db_version" == "$target_version"; then
> @@ -487,7 +492,7 @@ upgrade_cluster () {
>      elif ovsdb-tool compare-versions "$db_version" ">" "$target_version"; then
>          log_warning_msg "Database $schema_name has newer schema version ($db_version) than our local schema ($target_version), possibly an upgrade is partially complete?"
>      else
> -        action "Upgrading database $schema_name from schema version $db_version to $target_version" ovsdb-client -t 30 convert "$DB_SERVER" "$DB_SCHEMA"
> +        action "Upgrading database $schema_name from schema version $db_version to $target_version" ovsdb-client -t $timeout_arg convert "$DB_SERVER" "$DB_SCHEMA"
>      fi
>  }
>
> --
> 2.38.1
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Thank you for proposing this patch!

We've been seeing reports of schema upgrades failing in the field too and
have been waiting for a way to reproduce them to see whether this would be
a fix.

Are you seeing this with clustered databases, and could your problem
be related to the election timer? If it is, raising the client-side
timeout alone could be problematic.

I recently raised a discussion about this on the list to figure out
possible paths forward [0][1].

0: https://mail.openvswitch.org/pipermail/ovs-discuss/2022-December/052140.html
1: https://bugs.launchpad.net/bugs/1999605
Dan Williams Dec. 15, 2022, 6:16 p.m. UTC | #2
On Thu, 2022-12-15 at 19:04 +0100, Frode Nordahl wrote:
> Hello, Dan,
> 
> On Thu, Dec 15, 2022 at 6:57 PM Dan Williams <dcbw@redhat.com> wrote:
> > 
> > Signed-off-by: Dan Williams <dcbw@redhat.com>
> > ---
> >  utilities/ovs-lib.in | 13 +++++++++----
> >  1 file changed, 9 insertions(+), 4 deletions(-)
> > 
> > diff --git a/utilities/ovs-lib.in b/utilities/ovs-lib.in
> > index 13477a6a9e991..36e63312b6cba 100644
> > --- a/utilities/ovs-lib.in
> > +++ b/utilities/ovs-lib.in
> > @@ -475,11 +475,16 @@ upgrade_db () {
> >  }
> > 
> >  upgrade_cluster () {
> > -    local DB_SCHEMA=$1 DB_SERVER=$2
> > +    local DB_SCHEMA=$1 DB_SERVER=$2 TIMEOUT_SECONDS=$3
> >      local schema_name=$(ovsdb-tool schema-name $1) || return 1
> > 
> > -    action "Waiting for $schema_name to come up" ovsdb-client -t
> > 30 wait "$DB_SERVER" "$schema_name" connected || return $?
> > -    local db_version=$(ovsdb-client -t 10 get-schema-version
> > "$DB_SERVER" "$schema_name") || return $?
> > +    timeout_arg=30
> > +    if [ -n "$TIMEOUT_SECONDS" ]; then
> > +      timeout_arg="$TIMEOUT_SECONDS"
> > +    fi
> > +
> > +    action "Waiting for $schema_name to come up" ovsdb-client -t
> > $timeout_arg wait "$DB_SERVER" "$schema_name" connected || return
> > $?
> > +    local db_version=$(ovsdb-client -t $timeout_arg get-schema-
> > version "$DB_SERVER" "$schema_name") || return $?
> >      local target_version=$(ovsdb-tool schema-version "$DB_SCHEMA")
> > || return $?
> > 
> >      if ovsdb-tool compare-versions "$db_version" ==
> > "$target_version"; then
> > @@ -487,7 +492,7 @@ upgrade_cluster () {
> >      elif ovsdb-tool compare-versions "$db_version" ">"
> > "$target_version"; then
> >          log_warning_msg "Database $schema_name has newer schema
> > version ($db_version) than our local schema ($target_version),
> > possibly an upgrade is partially complete?"
> >      else
> > -        action "Upgrading database $schema_name from schema
> > version $db_version to $target_version" ovsdb-client -t 30 convert
> > "$DB_SERVER" "$DB_SCHEMA"
> > +        action "Upgrading database $schema_name from schema
> > version $db_version to $target_version" ovsdb-client -t
> > $timeout_arg convert "$DB_SERVER" "$DB_SCHEMA"
> >      fi
> >  }
> > 
> > --
> > 2.38.1
> > 
> > _______________________________________________
> > dev mailing list
> > dev@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> 
> Thank you for proposing this patch!
> 
> We've been seeing reports of schema upgrades failing in the too and
> have waited for some way to reproduce and see if this would be a fix.
> 
> Are you seeing this with clustered databases, and could your problem
> be related to the election timer? If it is, raising the client side
> timer alone could be problematic.

Yes clustered.

A 450MB LB-heavy database generated by ovn-kubernetes with OVN 22.06
(which lacks DP groups for SB load balancers) was being upgraded from
OVN 22.06 -> 22.09 (but using the same OVS 2.17 version, so no change to
ovsdb-server). When the ovsdb-servers got restarted as part of the
ovn-kube container upgrade, they took longer to read+parse the database
than the 30s upgrade_cluster() timeout, so the container failed and was
put in CrashLoopBackOff by Kubernetes.

ovn-ctl was never able to update the schema, and thus 22.09 ovn-northd
was never able to reduce the DB size by rewriting it to use DP groups
for SB LBs and recover.

This patch is really a workaround and needs a corresponding ovn-ctl
patch to accept a timeout for the NB/SB DB start functions that our
OpenShift container scripts would pass.
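
As a rough sketch of what such a pass-through could look like (note: OVN_DB_UPGRADE_TIMEOUT and start_nb_ovsdb are invented names for illustration, not real ovn-ctl options or functions, and upgrade_cluster is stubbed out here):

```shell
#!/bin/sh
# Hypothetical sketch: an ovn-ctl-style start function threading an
# optional timeout down to ovs-lib's upgrade_cluster.

upgrade_cluster() {
    # Stand-in for the real ovs-lib function; just reports the
    # timeout it would pass to ovsdb-client -t.
    local DB_SCHEMA=$1 DB_SERVER=$2 TIMEOUT_SECONDS=$3
    echo "upgrade_cluster timeout=${TIMEOUT_SECONDS:-30}"
}

start_nb_ovsdb() {
    # A container script would export OVN_DB_UPGRADE_TIMEOUT before
    # calling this; if unset, upgrade_cluster keeps its 30s default.
    upgrade_cluster "ovn-nb.ovsschema" "unix:nb.sock" "$OVN_DB_UPGRADE_TIMEOUT"
}

OVN_DB_UPGRADE_TIMEOUT=60
start_nb_ovsdb
```
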

The real fix is, like Ilya suggests, to "reduce the size of the DB", as
we've found that to be the most effective scaling strategy for OpenShift
and ovn-kubernetes. And I think that strategy has paid off tremendously
over the last 2 years we've been working on OVN & ovsdb-server scale.

Huge credit to Ilya and the OVN team for making that happen...

Dan


> 
> I recently raised a discussion about this on the list to figure out
> possible paths forward [0][1].
> 
> 0:
> https://mail.openvswitch.org/pipermail/ovs-discuss/2022-December/052140.html
> 1: https://bugs.launchpad.net/bugs/1999605
>
Frode Nordahl Dec. 15, 2022, 6:46 p.m. UTC | #3
On Thu, Dec 15, 2022 at 7:16 PM Dan Williams <dcbw@redhat.com> wrote:
> On Thu, 2022-12-15 at 19:04 +0100, Frode Nordahl wrote:

[ snip ]

> > Thank you for proposing this patch!
> >
> > We've been seeing reports of schema upgrades failing in the too and
> > have waited for some way to reproduce and see if this would be a fix.
> >
> > Are you seeing this with clustered databases, and could your problem
> > be related to the election timer? If it is, raising the client side
> > timer alone could be problematic.
>
> Yes clustered.
>
> A 450MB LB-heavy database generated by ovn-kubernetes with OVN 22.06
> (which lacks DP groups for SB load balancers) was being upgraded from
> OVN 22.06 -> 22.09 (but using same OVS 2.17 version, so no change to
> ovsdb-server) and when the ovsdb-servers got restarted as part of the
> ovn-kube container upgrade, they took longer to read+parse the database
> than the 30s upgrade_cluster() timer, thus the container failed and was
> put in CrashLoopBackoff by Kubernetes.
>
> ovn-ctl was never able to update the schema, and thus 22.09 ovn-northd
> was never able to reduce the DB size by rewriting it to use DP groups
> for SB LBs and recover.
>
> This patch is really a workaround and needs a corresponding ovn-ctl
> patch to accept a timeout for the NB/SB DB start functions that our
> OpenShift container scripts would pass.

Right, so you plan to couple the value of the ovsdb-client timeout to
the value of the ovsdb-server election timer; that makes sense.

> The real fix is, like Ilya suggests, "reduce the size of the DB" as
> we've found that the most effective scale strategy for OpenShift and
> ovn-kubernetes. And I think that strategy has paid off tremendously
> over the last 2 years we've been working on OVN & ovsdb-server scale.
>
> Huge credit to Ilya and the OVN team for making that happen...

Hear hear, wonders have been worked in many of the OVS and OVN
components over the past years; my hat is off to everyone involved.

The best place to continue the discussion would probably be in the
thread below, but in short, I agree that reducing the size of the DB
is of course the best medicine. However, this specific piece of the
machinery is involved in upgrades, and upgrades are unfortunately
required to move our shared user base forward to the new promised land
of software that produces smaller databases :)

> > I recently raised a discussion about this on the list to figure out
> > possible paths forward [0][1].
> >
> > 0:
> > https://mail.openvswitch.org/pipermail/ovs-discuss/2022-December/052140.html
> > 1: https://bugs.launchpad.net/bugs/1999605
Dan Williams Dec. 15, 2022, 7:45 p.m. UTC | #4
On Thu, 2022-12-15 at 19:46 +0100, Frode Nordahl wrote:
> On Thu, Dec 15, 2022 at 7:16 PM Dan Williams <dcbw@redhat.com> wrote:
> > On Thu, 2022-12-15 at 19:04 +0100, Frode Nordahl wrote:
> 
> [ snip ]
> 
> > > Thank you for proposing this patch!
> > > 
> > > We've been seeing reports of schema upgrades failing in the too
> > > and
> > > have waited for some way to reproduce and see if this would be a
> > > fix.
> > > 
> > > Are you seeing this with clustered databases, and could your
> > > problem
> > > be related to the election timer? If it is, raising the client
> > > side
> > > timer alone could be problematic.
> > 
> > Yes clustered.
> > 
> > A 450MB LB-heavy database generated by ovn-kubernetes with OVN
> > 22.06
> > (which lacks DP groups for SB load balancers) was being upgraded
> > from
> > OVN 22.06 -> 22.09 (but using same OVS 2.17 version, so no change
> > to
> > ovsdb-server) and when the ovsdb-servers got restarted as part of
> > the
> > ovn-kube container upgrade, they took longer to read+parse the
> > database
> > than the 30s upgrade_cluster() timer, thus the container failed and
> > was
> > put in CrashLoopBackoff by Kubernetes.
> > 
> > ovn-ctl was never able to update the schema, and thus 22.09 ovn-
> > northd
> > was never able to reduce the DB size by rewriting it to use DP
> > groups
> > for SB LBs and recover.
> > 
> > This patch is really a workaround and needs a corresponding ovn-ctl
> > patch to accept a timeout for the NB/SB DB start functions that our
> > OpenShift container scripts would pass.
> 
> Right, so you plan to couple the value of the ovsdb-client timeout to
> the value of the ovsdb-server election timer, that makes sense.

Actually no, I hadn't. AFAIK they aren't related. The election timer is
more relevant while ovsdb-server is running (eg, number of connected
clients and what monitors they requested), while the timeout in this
patch is only about how long the schema upgrade process waits for
ovsdb-server to respond at startup (eg, the size of the on-disk database
it has to parse in and allocate memory for).

FWIW our OpenShift + ovn-kubernetes election timers are 16 seconds and
that works in clusters of the same scale as the problem DB we have. But
reading that DB in takes ~28 or 30 seconds. So I'd probably bump the DB
startup timeout we use to > 45s, since there are 2 other databases still
running to take the load while the 3rd one is restarting.
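
In shell, that kind of headroom calculation might be sketched as follows (the 50% margin and variable names are assumptions for illustration, not anything ovn-ctl provides):

```shell
#!/bin/sh
# Hypothetical: derive a DB startup timeout from a measured read+parse
# time plus a safety margin, rather than hardcoding 30s.
db_read_seconds=30      # observed time to read+parse the large DB
margin_percent=50       # headroom for slower disks / busier nodes
startup_timeout=$(( db_read_seconds + db_read_seconds * margin_percent / 100 ))
echo "$startup_timeout"
```
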

> 
> > The real fix is, like Ilya suggests, "reduce the size of the DB" as
> > we've found that the most effective scale strategy for OpenShift
> > and
> > ovn-kubernetes. And I think that strategy has paid off tremendously
> > over the last 2 years we've been working on OVN & ovsdb-server
> > scale.
> > 
> > Huge credit to Ilya and the OVN team for making that happen...
> 
> Hear hear, wonders has been worked on in many of the OVS and OVN
> components over the past years, my hat off to everyone involved.
> 
> The best place to continue the discussion would probably be in the
> thread below, but in short, I agree that reducing the size of the DB
> is of course the best medicine. However, this specific piece of the
> machinery is involved in upgrades, and upgrades are unfortunately
> required to move our shared user base forward to the new promised
> land
> of software that produces smaller databases :)

Yep. Maybe some of this can be backported to help ease that upgrade
process.

Dan

> 
> > > I recently raised a discussion about this on the list to figure
> > > out
> > > possible paths forward [0][1].
> > > 
> > > 0:
> > > https://mail.openvswitch.org/pipermail/ovs-discuss/2022-December/052140.html
> > > 1: https://bugs.launchpad.net/bugs/1999605
>
Frode Nordahl Dec. 15, 2022, 8:46 p.m. UTC | #5
tor. 15. des. 2022, 20:45 skrev Dan Williams <dcbw@redhat.com>:

> On Thu, 2022-12-15 at 19:46 +0100, Frode Nordahl wrote:
> > On Thu, Dec 15, 2022 at 7:16 PM Dan Williams <dcbw@redhat.com> wrote:
> > > On Thu, 2022-12-15 at 19:04 +0100, Frode Nordahl wrote:
> >
> > [ snip ]
> >
> > > > Thank you for proposing this patch!
> > > >
> > > > We've been seeing reports of schema upgrades failing in the too
> > > > and
> > > > have waited for some way to reproduce and see if this would be a
> > > > fix.
> > > >
> > > > Are you seeing this with clustered databases, and could your
> > > > problem
> > > > be related to the election timer? If it is, raising the client
> > > > side
> > > > timer alone could be problematic.
> > >
> > > Yes clustered.
> > >
> > > A 450MB LB-heavy database generated by ovn-kubernetes with OVN
> > > 22.06
> > > (which lacks DP groups for SB load balancers) was being upgraded
> > > from
> > > OVN 22.06 -> 22.09 (but using same OVS 2.17 version, so no change
> > > to
> > > ovsdb-server) and when the ovsdb-servers got restarted as part of
> > > the
> > > ovn-kube container upgrade, they took longer to read+parse the
> > > database
> > > than the 30s upgrade_cluster() timer, thus the container failed and
> > > was
> > > put in CrashLoopBackoff by Kubernetes.
> > >
> > > ovn-ctl was never able to update the schema, and thus 22.09 ovn-
> > > northd
> > > was never able to reduce the DB size by rewriting it to use DP
> > > groups
> > > for SB LBs and recover.
> > >
> > > This patch is really a workaround and needs a corresponding ovn-ctl
> > > patch to accept a timeout for the NB/SB DB start functions that our
> > > OpenShift container scripts would pass.
> >
> > Right, so you plan to couple the value of the ovsdb-client timeout to
> > the value of the ovsdb-server election timer, that makes sense.
>
> Actually no, I hadn't. AFAIK they aren't related. Election timer is
> more relevant while ovsdb-server is running (eg, number of connected
> client and what monitors they requested), while the timeout in this
> patch is only about how long the schema upgrade process waits for
> ovsdb-server to respond at startup (eg, size of the on-disk database it
> has to parse in and allocate memory for).
>

This is the crux of the issue, as laid out in the discussion [0] and bug
report [1].

At this point in time, the time spent on an online schema conversion in a
clustered database cannot exceed the server-side election timer; increasing
the client-side timeout in isolation will only prolong the amount of time one
of the cluster nodes spends retrying the conversion, with nowhere to write
the result.


> FWIW our OpenShift + ovn-kubernetes election timers are 16 seconds and
> that works in clusters the same scale as the problem DB we have. But
> reading that DB in takes ~28 or 30 seconds. So I'd probably bump the DB
> startup timeout we use to > 45s since there are 2 other databases still
> running to take the load while the 3rd one is restarting.
>

That is reasonable, but the conversion is happening on the server side for
a clustered database, so you would also need to increase the election timer.


> >
> > > The real fix is, like Ilya suggests, "reduce the size of the DB" as
> > > we've found that the most effective scale strategy for OpenShift
> > > and
> > > ovn-kubernetes. And I think that strategy has paid off tremendously
> > > over the last 2 years we've been working on OVN & ovsdb-server
> > > scale.
> > >
> > > Huge credit to Ilya and the OVN team for making that happen...
> >
> > Hear hear, wonders has been worked on in many of the OVS and OVN
> > components over the past years, my hat off to everyone involved.
> >
> > The best place to continue the discussion would probably be in the
> > thread below, but in short, I agree that reducing the size of the DB
> > is of course the best medicine. However, this specific piece of the
> > machinery is involved in upgrades, and upgrades are unfortunately
> > required to move our shared user base forward to the new promised
> > land
> > of software that produces smaller databases :)
>
> Yep. Maybe some of this can be backported to help ease that upgrade
> process.
>

Luckily, the schema conversion process can often be done as the first thing
after an upgrade, so any changes here would definitely be easier to get
out there.

--
Frode Nordahl

> Dan
>
> >
> > > > I recently raised a discussion about this on the list to figure
> > > > out
> > > > possible paths forward [0][1].
> > > >
> > > > 0:
> > > >
> https://mail.openvswitch.org/pipermail/ovs-discuss/2022-December/052140.html
> > > > 1: https://bugs.launchpad.net/bugs/1999605
> >
>
>
Simon Horman Oct. 16, 2023, 8:16 a.m. UTC | #6
On Thu, Dec 15, 2022 at 11:57:27AM -0600, Dan Williams wrote:
> Signed-off-by: Dan Williams <dcbw@redhat.com>

Hi Dan,

This patch appears to have gone stale in patchwork, for one reason or
another. If it is still relevant then I think it needs to be revisited,
by being reposted after appropriate preparation.

As such I'm marking this patch as "Deferred" in patchwork.

No action is required unless there is a desire to revisit this patch.

Patch

diff --git a/utilities/ovs-lib.in b/utilities/ovs-lib.in
index 13477a6a9e991..36e63312b6cba 100644
--- a/utilities/ovs-lib.in
+++ b/utilities/ovs-lib.in
@@ -475,11 +475,16 @@  upgrade_db () {
 }
 
 upgrade_cluster () {
-    local DB_SCHEMA=$1 DB_SERVER=$2
+    local DB_SCHEMA=$1 DB_SERVER=$2 TIMEOUT_SECONDS=$3
     local schema_name=$(ovsdb-tool schema-name $1) || return 1
 
-    action "Waiting for $schema_name to come up" ovsdb-client -t 30 wait "$DB_SERVER" "$schema_name" connected || return $?
-    local db_version=$(ovsdb-client -t 10 get-schema-version "$DB_SERVER" "$schema_name") || return $?
+    timeout_arg=30
+    if [ -n "$TIMEOUT_SECONDS" ]; then
+      timeout_arg="$TIMEOUT_SECONDS"
+    fi
+
+    action "Waiting for $schema_name to come up" ovsdb-client -t $timeout_arg wait "$DB_SERVER" "$schema_name" connected || return $?
+    local db_version=$(ovsdb-client -t $timeout_arg get-schema-version "$DB_SERVER" "$schema_name") || return $?
     local target_version=$(ovsdb-tool schema-version "$DB_SCHEMA") || return $?
 
     if ovsdb-tool compare-versions "$db_version" == "$target_version"; then
@@ -487,7 +492,7 @@  upgrade_cluster () {
     elif ovsdb-tool compare-versions "$db_version" ">" "$target_version"; then
         log_warning_msg "Database $schema_name has newer schema version ($db_version) than our local schema ($target_version), possibly an upgrade is partially complete?"
     else
-        action "Upgrading database $schema_name from schema version $db_version to $target_version" ovsdb-client -t 30 convert "$DB_SERVER" "$DB_SCHEMA"
+        action "Upgrading database $schema_name from schema version $db_version to $target_version" ovsdb-client -t $timeout_arg convert "$DB_SERVER" "$DB_SCHEMA"
     fi
 }
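
As a side note, the default-if-empty logic in the patch could equivalently be written with POSIX parameter expansion (`${3:-30}`). A minimal standalone sketch of the behavior (pick_timeout is an invented helper name, not part of ovs-lib):

```shell
#!/bin/sh
# The defaulting logic from the patch, extracted: use the caller's
# timeout if given, otherwise fall back to the historical 30s.
pick_timeout() {
    local TIMEOUT_SECONDS=$1
    local timeout_arg=30
    if [ -n "$TIMEOUT_SECONDS" ]; then
        timeout_arg="$TIMEOUT_SECONDS"
    fi
    # Equivalent one-liner: timeout_arg="${1:-30}"
    echo "$timeout_arg"
}

pick_timeout        # default
pick_timeout 60     # explicit override
```
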