| Message ID | 40a63f02d1e24cd98dde4c7dd5ee0346c4415a8a.camel@redhat.com |
|---|---|
| State | Deferred |
| Series | [ovs-dev] ovs-lib: accept optional timeout value for upgrade_cluster |
| Context | Check | Description |
|---|---|---|
| ovsrobot/apply-robot | success | apply and check: success |
| ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
| ovsrobot/intel-ovs-compilation | success | test: success |
Hello, Dan,

On Thu, Dec 15, 2022 at 6:57 PM Dan Williams <dcbw@redhat.com> wrote:
>
> Signed-off-by: Dan Williams <dcbw@redhat.com>
> ---
>  utilities/ovs-lib.in | 13 +++++++++----
>  1 file changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/utilities/ovs-lib.in b/utilities/ovs-lib.in
> index 13477a6a9e991..36e63312b6cba 100644
> --- a/utilities/ovs-lib.in
> +++ b/utilities/ovs-lib.in
> @@ -475,11 +475,16 @@ upgrade_db () {
>  }
>
>  upgrade_cluster () {
> -    local DB_SCHEMA=$1 DB_SERVER=$2
> +    local DB_SCHEMA=$1 DB_SERVER=$2 TIMEOUT_SECONDS=$3
>      local schema_name=$(ovsdb-tool schema-name $1) || return 1
>
> -    action "Waiting for $schema_name to come up" ovsdb-client -t 30 wait "$DB_SERVER" "$schema_name" connected || return $?
> -    local db_version=$(ovsdb-client -t 10 get-schema-version "$DB_SERVER" "$schema_name") || return $?
> +    timeout_arg=30
> +    if [ -n "$TIMEOUT_SECONDS" ]; then
> +        timeout_arg="$TIMEOUT_SECONDS"
> +    fi
> +
> +    action "Waiting for $schema_name to come up" ovsdb-client -t $timeout_arg wait "$DB_SERVER" "$schema_name" connected || return $?
> +    local db_version=$(ovsdb-client -t $timeout_arg get-schema-version "$DB_SERVER" "$schema_name") || return $?
>      local target_version=$(ovsdb-tool schema-version "$DB_SCHEMA") || return $?
>
>      if ovsdb-tool compare-versions "$db_version" == "$target_version"; then
> @@ -487,7 +492,7 @@ upgrade_cluster () {
>      elif ovsdb-tool compare-versions "$db_version" ">" "$target_version"; then
>          log_warning_msg "Database $schema_name has newer schema version ($db_version) than our local schema ($target_version), possibly an upgrade is partially complete?"
>      else
> -        action "Upgrading database $schema_name from schema version $db_version to $target_version" ovsdb-client -t 30 convert "$DB_SERVER" "$DB_SCHEMA"
> +        action "Upgrading database $schema_name from schema version $db_version to $target_version" ovsdb-client -t $timeout_arg convert "$DB_SERVER" "$DB_SCHEMA"
>      fi
>  }
>
> --
> 2.38.1

Thank you for proposing this patch!

We've been seeing reports of schema upgrades failing in the wild too
and have been waiting for a way to reproduce them to see whether this
would be a fix.

Are you seeing this with clustered databases, and could your problem be
related to the election timer? If it is, raising the client-side timer
alone could be problematic.

I recently raised a discussion about this on the list to figure out
possible paths forward [0][1].

0: https://mail.openvswitch.org/pipermail/ovs-discuss/2022-December/052140.html
1: https://bugs.launchpad.net/bugs/1999605
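An aside on the shell in the patch: the explicit default assignment
plus `if [ -n ... ]` test could equivalently be written with a single
POSIX parameter expansion; a minimal sketch of the alternative:

```sh
# Equivalent to the patch's timeout_arg=30 / if [ -n "$TIMEOUT_SECONDS" ]
# block: ${TIMEOUT_SECONDS:-30} expands to $TIMEOUT_SECONDS when it is
# set and non-empty, and to 30 otherwise.
timeout_arg="${TIMEOUT_SECONDS:-30}"
```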
On Thu, 2022-12-15 at 19:04 +0100, Frode Nordahl wrote:
> Hello, Dan,
>
> On Thu, Dec 15, 2022 at 6:57 PM Dan Williams <dcbw@redhat.com> wrote:

[ snip ]

> Thank you for proposing this patch!
>
> We've been seeing reports of schema upgrades failing in the wild too
> and have been waiting for a way to reproduce them to see whether this
> would be a fix.
>
> Are you seeing this with clustered databases, and could your problem
> be related to the election timer? If it is, raising the client-side
> timer alone could be problematic.

Yes, clustered.

A 450MB LB-heavy database generated by ovn-kubernetes with OVN 22.06
(which lacks DP groups for SB load balancers) was being upgraded from
OVN 22.06 -> 22.09 (but using the same OVS 2.17 version, so no change
to ovsdb-server). When the ovsdb-servers were restarted as part of the
ovn-kube container upgrade, they took longer to read and parse the
database than the 30s upgrade_cluster() timer, so the container failed
and was put into CrashLoopBackOff by Kubernetes.

ovn-ctl was never able to update the schema, and thus 22.09 ovn-northd
was never able to reduce the DB size by rewriting it to use DP groups
for SB LBs and recover.

This patch is really a workaround and needs a corresponding ovn-ctl
patch to accept a timeout for the NB/SB DB start functions that our
OpenShift container scripts would pass.

The real fix is, as Ilya suggests, to "reduce the size of the DB"; we
have found that to be the most effective scale strategy for OpenShift
and ovn-kubernetes, and I think that strategy has paid off tremendously
over the last two years we've been working on OVN & ovsdb-server scale.

Huge credit to Ilya and the OVN team for making that happen...

Dan

> I recently raised a discussion about this on the list to figure out
> possible paths forward [0][1].
>
> 0: https://mail.openvswitch.org/pipermail/ovs-discuss/2022-December/052140.html
> 1: https://bugs.launchpad.net/bugs/1999605
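To illustrate the corresponding ovn-ctl change Dan describes, here is a
hypothetical sketch of a start function forwarding a configurable
timeout. The option name `DB_CLUSTER_UPGRADE_TIMEOUT`, the function
name, and both paths are invented for illustration; they are not
existing ovn-ctl flags.

```sh
# Hypothetical sketch only: how an ovn-ctl-style start function might
# forward a configurable timeout to the patched upgrade_cluster().
start_nb_ovsdb_upgrade () {
    local schema=/usr/share/ovn/ovn-nb.ovsschema
    local server=unix:/var/run/ovn/ovnnb_db.sock
    # An empty third argument keeps upgrade_cluster's existing 30s
    # default; a container script could export the variable to raise it.
    upgrade_cluster "$schema" "$server" "${DB_CLUSTER_UPGRADE_TIMEOUT:-}"
}
```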
On Thu, Dec 15, 2022 at 7:16 PM Dan Williams <dcbw@redhat.com> wrote:
> On Thu, 2022-12-15 at 19:04 +0100, Frode Nordahl wrote:

[ snip ]

> > Thank you for proposing this patch!
> >
> > We've been seeing reports of schema upgrades failing in the wild too
> > and have been waiting for a way to reproduce them to see whether
> > this would be a fix.
> >
> > Are you seeing this with clustered databases, and could your problem
> > be related to the election timer? If it is, raising the client-side
> > timer alone could be problematic.
>
> Yes, clustered.
>
> A 450MB LB-heavy database generated by ovn-kubernetes with OVN 22.06
> (which lacks DP groups for SB load balancers) was being upgraded from
> OVN 22.06 -> 22.09 (but using the same OVS 2.17 version, so no change
> to ovsdb-server). When the ovsdb-servers were restarted as part of the
> ovn-kube container upgrade, they took longer to read and parse the
> database than the 30s upgrade_cluster() timer, so the container failed
> and was put into CrashLoopBackOff by Kubernetes.
>
> ovn-ctl was never able to update the schema, and thus 22.09 ovn-northd
> was never able to reduce the DB size by rewriting it to use DP groups
> for SB LBs and recover.
>
> This patch is really a workaround and needs a corresponding ovn-ctl
> patch to accept a timeout for the NB/SB DB start functions that our
> OpenShift container scripts would pass.

Right, so you plan to couple the value of the ovsdb-client timeout to
the value of the ovsdb-server election timer; that makes sense.

> The real fix is, as Ilya suggests, to "reduce the size of the DB"; we
> have found that to be the most effective scale strategy for OpenShift
> and ovn-kubernetes, and I think that strategy has paid off
> tremendously over the last two years we've been working on OVN &
> ovsdb-server scale.
>
> Huge credit to Ilya and the OVN team for making that happen...

Hear hear, wonders have been worked in many of the OVS and OVN
components over the past years; my hat off to everyone involved.

The best place to continue the discussion would probably be in the
thread below, but in short, I agree that reducing the size of the DB is
of course the best medicine. However, this specific piece of the
machinery is involved in upgrades, and upgrades are unfortunately
required to move our shared user base forward to the new promised land
of software that produces smaller databases :)

> > I recently raised a discussion about this on the list to figure out
> > possible paths forward [0][1].
> >
> > 0: https://mail.openvswitch.org/pipermail/ovs-discuss/2022-December/052140.html
> > 1: https://bugs.launchpad.net/bugs/1999605
On Thu, 2022-12-15 at 19:46 +0100, Frode Nordahl wrote:
> On Thu, Dec 15, 2022 at 7:16 PM Dan Williams <dcbw@redhat.com> wrote:
> > On Thu, 2022-12-15 at 19:04 +0100, Frode Nordahl wrote:

[ snip ]

> > This patch is really a workaround and needs a corresponding ovn-ctl
> > patch to accept a timeout for the NB/SB DB start functions that our
> > OpenShift container scripts would pass.
>
> Right, so you plan to couple the value of the ovsdb-client timeout to
> the value of the ovsdb-server election timer; that makes sense.

Actually, no, I hadn't. AFAIK they aren't related. The election timer
is more relevant while ovsdb-server is running (e.g., the number of
connected clients and what monitors they requested), while the timeout
in this patch is only about how long the schema upgrade process waits
for ovsdb-server to respond at startup (e.g., the size of the on-disk
database it has to parse in and allocate memory for).

FWIW, our OpenShift + ovn-kubernetes election timers are 16 seconds and
that works in clusters of the same scale as the problem DB we have. But
reading that DB in takes ~28 to 30 seconds. So I'd probably bump the DB
startup timeout we use to > 45s, since there are two other databases
still running to take the load while the third one is restarting.

[ snip ]

> The best place to continue the discussion would probably be in the
> thread below, but in short, I agree that reducing the size of the DB
> is of course the best medicine. However, this specific piece of the
> machinery is involved in upgrades, and upgrades are unfortunately
> required to move our shared user base forward to the new promised land
> of software that produces smaller databases :)

Yep. Maybe some of this can be backported to help ease that upgrade
process.

Dan

> > > I recently raised a discussion about this on the list to figure
> > > out possible paths forward [0][1].
> > >
> > > 0: https://mail.openvswitch.org/pipermail/ovs-discuss/2022-December/052140.html
> > > 1: https://bugs.launchpad.net/bugs/1999605
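Concretely, with this patch applied a container start script could pass
a larger value as the new optional third argument; an illustrative
invocation (the schema and socket paths are assumptions, the function
signature is from the patch):

```sh
# Wait up to 60 seconds instead of the default 30 for ovsdb-server to
# finish reading the database before the schema upgrade gives up.
upgrade_cluster /usr/share/ovn/ovn-sb.ovsschema \
                unix:/var/run/ovn/ovnsb_db.sock 60
```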
On Thu, Dec 15, 2022 at 8:45 PM Dan Williams <dcbw@redhat.com> wrote:
> On Thu, 2022-12-15 at 19:46 +0100, Frode Nordahl wrote:
> > On Thu, Dec 15, 2022 at 7:16 PM Dan Williams <dcbw@redhat.com> wrote:
> > > On Thu, 2022-12-15 at 19:04 +0100, Frode Nordahl wrote:

[ snip ]

> > Right, so you plan to couple the value of the ovsdb-client timeout
> > to the value of the ovsdb-server election timer; that makes sense.
>
> Actually, no, I hadn't. AFAIK they aren't related. The election timer
> is more relevant while ovsdb-server is running (e.g., the number of
> connected clients and what monitors they requested), while the timeout
> in this patch is only about how long the schema upgrade process waits
> for ovsdb-server to respond at startup (e.g., the size of the on-disk
> database it has to parse in and allocate memory for).

This is the crux of the issue, as laid out in the discussion [0] and
bug report [1]. At this point in time, the time spent on an online
schema conversion in a clustered database cannot exceed the server-side
election timer; increasing the client-side timer in isolation will only
prolong the amount of time one of the cluster nodes spends retrying the
conversion, with nowhere to write the result.

> FWIW, our OpenShift + ovn-kubernetes election timers are 16 seconds
> and that works in clusters of the same scale as the problem DB we
> have. But reading that DB in takes ~28 to 30 seconds. So I'd probably
> bump the DB startup timeout we use to > 45s, since there are two other
> databases still running to take the load while the third one is
> restarting.

That is reasonable, but the conversion is happening on the server side
for a clustered database, so you would also need to increase the
election timer.

[ snip ]

> > The best place to continue the discussion would probably be in the
> > thread below, but in short, I agree that reducing the size of the DB
> > is of course the best medicine. However, this specific piece of the
> > machinery is involved in upgrades, and upgrades are unfortunately
> > required to move our shared user base forward to the new promised
> > land of software that produces smaller databases :)
>
> Yep. Maybe some of this can be backported to help ease that upgrade
> process.

Luckily, the schema conversion process can often be done as the first
thing after an upgrade, so any changes here would definitely be easier
to get out there.

--
Frode Nordahl

> Dan
>
> > > > I recently raised a discussion about this on the list to figure
> > > > out possible paths forward [0][1].
> > > >
> > > > 0: https://mail.openvswitch.org/pipermail/ovs-discuss/2022-December/052140.html
> > > > 1: https://bugs.launchpad.net/bugs/1999605
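For reference, the server-side election timer Frode mentions can be
raised at runtime on the cluster leader with ovsdb-server's
`cluster/change-election-timer` command; a sketch with an assumed
control-socket path:

```sh
# Raise the OVN_Southbound election timer to 16 seconds (the value is
# in milliseconds). ovsdb-server only accepts an increase of up to 2x
# the current value per call, so large jumps take several invocations.
ovs-appctl -t /var/run/ovn/ovnsb_db.ctl \
    cluster/change-election-timer OVN_Southbound 16000
```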
On Thu, Dec 15, 2022 at 11:57:27AM -0600, Dan Williams wrote:
> Signed-off-by: Dan Williams <dcbw@redhat.com>
Hi Dan,
This patch appears to have gone stale in patchwork, for one reason or
another. If it is still relevant, then I think it needs to be revisited
by being reposted after appropriate preparation.
As such I'm marking this patch as "Deferred" in patchwork.
No action is required unless there is a desire to revisit this patch.
diff --git a/utilities/ovs-lib.in b/utilities/ovs-lib.in
index 13477a6a9e991..36e63312b6cba 100644
--- a/utilities/ovs-lib.in
+++ b/utilities/ovs-lib.in
@@ -475,11 +475,16 @@ upgrade_db () {
 }
 
 upgrade_cluster () {
-    local DB_SCHEMA=$1 DB_SERVER=$2
+    local DB_SCHEMA=$1 DB_SERVER=$2 TIMEOUT_SECONDS=$3
     local schema_name=$(ovsdb-tool schema-name $1) || return 1
 
-    action "Waiting for $schema_name to come up" ovsdb-client -t 30 wait "$DB_SERVER" "$schema_name" connected || return $?
-    local db_version=$(ovsdb-client -t 10 get-schema-version "$DB_SERVER" "$schema_name") || return $?
+    timeout_arg=30
+    if [ -n "$TIMEOUT_SECONDS" ]; then
+        timeout_arg="$TIMEOUT_SECONDS"
+    fi
+
+    action "Waiting for $schema_name to come up" ovsdb-client -t $timeout_arg wait "$DB_SERVER" "$schema_name" connected || return $?
+    local db_version=$(ovsdb-client -t $timeout_arg get-schema-version "$DB_SERVER" "$schema_name") || return $?
     local target_version=$(ovsdb-tool schema-version "$DB_SCHEMA") || return $?
 
     if ovsdb-tool compare-versions "$db_version" == "$target_version"; then
@@ -487,7 +492,7 @@ upgrade_cluster () {
     elif ovsdb-tool compare-versions "$db_version" ">" "$target_version"; then
         log_warning_msg "Database $schema_name has newer schema version ($db_version) than our local schema ($target_version), possibly an upgrade is partially complete?"
     else
-        action "Upgrading database $schema_name from schema version $db_version to $target_version" ovsdb-client -t 30 convert "$DB_SERVER" "$DB_SCHEMA"
+        action "Upgrading database $schema_name from schema version $db_version to $target_version" ovsdb-client -t $timeout_arg convert "$DB_SERVER" "$DB_SCHEMA"
     fi
 }
 
Signed-off-by: Dan Williams <dcbw@redhat.com>
---
 utilities/ovs-lib.in | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)
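A quick way to sanity-check the default-versus-override behavior from
the patch in isolation, with the ovsdb-client calls left out; the
helper name `pick_timeout` is invented for illustration:

```sh
#!/bin/sh
# Reproduces just the timeout-selection logic from upgrade_cluster():
# an empty third argument keeps the 30s default, anything else wins.
pick_timeout () {
    TIMEOUT_SECONDS=$1
    timeout_arg=30
    if [ -n "$TIMEOUT_SECONDS" ]; then
        timeout_arg="$TIMEOUT_SECONDS"
    fi
    echo "timeout_arg=$timeout_arg"
}

pick_timeout ""    # prints timeout_arg=30 (default preserved)
pick_timeout 60    # prints timeout_arg=60 (override applied)
```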