diff mbox series

[ovs-dev,v2] ofctrl: Wait at S_WAIT_BEFORE_CLEAR only once.

Message ID 20240328065835.789596-1-hzhou@ovn.org
State Accepted
Headers show
Series [ovs-dev,v2] ofctrl: Wait at S_WAIT_BEFORE_CLEAR only once. | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed
ovsrobot/github-robot-_ovn-kubernetes success github build: passed

Commit Message

Han Zhou March 28, 2024, 6:58 a.m. UTC
The ovn-ofctrl-wait-before-clear setting is designed to minimize
downtime during the initial start-up of the ovn-controller. For this
purpose, the ovn-controller should wait only once upon entering the
S_WAIT_BEFORE_CLEAR state for the first time. Subsequent reconnections
to the OVS, such as those occurring during an OVS restart/upgrade,
should not trigger this wait. However, the current implemention always
waits for the configured time in the S_WAIT_BEFORE_CLEAR state, which
can inadvertently delay flow installations during OVS restart/upgrade,
potentially causing more harm than good. (The extent of the impact
varies based on the method used to restart OVS, including whether flow
save/restore tools and the flow-restore-wait feature are employed.)

This patch avoids the unnecessary wait after the initial one.

Fixes: 896adfd2d8b3 ("ofctrl: Support ovn-ofctrl-wait-before-clear to reduce down time during upgrade.")
Signed-off-by: Han Zhou <hzhou@ovn.org>
---
v2: Addressed Mark's comments - made test case more reliable.

 controller/ofctrl.c     | 1 -
 tests/ovn-controller.at | 9 +++++++--
 2 files changed, 7 insertions(+), 3 deletions(-)

Comments

Mark Michelson March 28, 2024, 8:29 p.m. UTC | #1
Thanks Han,

Acked-by: Mark Michelson <mmichels@redhat.com>

On 3/28/24 02:58, Han Zhou wrote:
> The ovn-ofctrl-wait-before-clear setting is designed to minimize
> downtime during the initial start-up of the ovn-controller. For this
> purpose, the ovn-controller should wait only once upon entering the
> S_WAIT_BEFORE_CLEAR state for the first time. Subsequent reconnections
> to the OVS, such as those occurring during an OVS restart/upgrade,
> should not trigger this wait. However, the current implemention always
> waits for the configured time in the S_WAIT_BEFORE_CLEAR state, which
> can inadvertently delay flow installations during OVS restart/upgrade,
> potentially causing more harm than good. (The extent of the impact
> varies based on the method used to restart OVS, including whether flow
> save/restore tools and the flow-restore-wait feature are employed.)
> 
> This patch avoids the unnecessary wait after the initial one.
> 
> Fixes: 896adfd2d8b3 ("ofctrl: Support ovn-ofctrl-wait-before-clear to reduce down time during upgrade.")
> Signed-off-by: Han Zhou <hzhou@ovn.org>
> ---
> v2: Addressed Mark's comments - made test case more reliable.
> 
>   controller/ofctrl.c     | 1 -
>   tests/ovn-controller.at | 9 +++++++--
>   2 files changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/controller/ofctrl.c b/controller/ofctrl.c
> index 6ca2ea4ce63d..6a2564604aa1 100644
> --- a/controller/ofctrl.c
> +++ b/controller/ofctrl.c
> @@ -634,7 +634,6 @@ run_S_WAIT_BEFORE_CLEAR(void)
>       if (!wait_before_clear_time ||
>           (wait_before_clear_expire &&
>            time_msec() >= wait_before_clear_expire)) {
> -        wait_before_clear_expire = 0;
>           state = S_CLEAR_FLOWS;
>           return;
>       }
> diff --git a/tests/ovn-controller.at b/tests/ovn-controller.at
> index fdcc5aab2bcf..3202f0beff46 100644
> --- a/tests/ovn-controller.at
> +++ b/tests/ovn-controller.at
> @@ -2324,10 +2324,15 @@ lflow_run_2=$(ovn-appctl -t ovn-controller coverage/read-counter lflow_run)
>   AT_CHECK_UNQUOTED([echo $lflow_run_1], [0], [$lflow_run_2
>   ])
>   
> -# Restart OVS this time, and wait until flows are reinstalled
> +# Restart OVS this time. Flows should be reinstalled without waiting.
>   OVS_APP_EXIT_AND_WAIT([ovs-vswitchd])
>   start_daemon ovs-vswitchd --enable-dummy=system -vvconn -vofproto_dpif -vunixctl
> -OVS_WAIT_UNTIL([ovs-ofctl dump-flows br-int | grep -F 10.1.2.4 | grep -vF 2.2.2.2])
> +
> +# Sync to make sure ovn-controller is given enough time to install the flows.
> +check ovn-nbctl --wait=hv sync
> +
> +# Flow should be installed without any extra waiting.
> +AT_CHECK([ovs-ofctl dump-flows br-int | grep -F 10.1.2.4 | grep -vF 2.2.2.2], [0], [ignore])
>   
>   check ovn-nbctl --wait=hv lb-add lb3 3.3.3.3 10.1.2.5 \
>   -- ls-lb-add ls1 lb3
Han Zhou April 3, 2024, 5:48 a.m. UTC | #2
On Thu, Mar 28, 2024 at 1:29 PM Mark Michelson <mmichels@redhat.com> wrote:
>
> Thanks Han,
>
> Acked-by: Mark Michelson <mmichels@redhat.com>

Thanks Mark. Applied to main.

Han
>
> On 3/28/24 02:58, Han Zhou wrote:
> > The ovn-ofctrl-wait-before-clear setting is designed to minimize
> > downtime during the initial start-up of the ovn-controller. For this
> > purpose, the ovn-controller should wait only once upon entering the
> > S_WAIT_BEFORE_CLEAR state for the first time. Subsequent reconnections
> > to the OVS, such as those occurring during an OVS restart/upgrade,
> > should not trigger this wait. However, the current implemention always
> > waits for the configured time in the S_WAIT_BEFORE_CLEAR state, which
> > can inadvertently delay flow installations during OVS restart/upgrade,
> > potentially causing more harm than good. (The extent of the impact
> > varies based on the method used to restart OVS, including whether flow
> > save/restore tools and the flow-restore-wait feature are employed.)
> >
> > This patch avoids the unnecessary wait after the initial one.
> >
> > Fixes: 896adfd2d8b3 ("ofctrl: Support ovn-ofctrl-wait-before-clear to
reduce down time during upgrade.")
> > Signed-off-by: Han Zhou <hzhou@ovn.org>
> > ---
> > v2: Addressed Mark's comments - made test case more reliable.
> >
> >   controller/ofctrl.c     | 1 -
> >   tests/ovn-controller.at | 9 +++++++--
> >   2 files changed, 7 insertions(+), 3 deletions(-)
> >
> > diff --git a/controller/ofctrl.c b/controller/ofctrl.c
> > index 6ca2ea4ce63d..6a2564604aa1 100644
> > --- a/controller/ofctrl.c
> > +++ b/controller/ofctrl.c
> > @@ -634,7 +634,6 @@ run_S_WAIT_BEFORE_CLEAR(void)
> >       if (!wait_before_clear_time ||
> >           (wait_before_clear_expire &&
> >            time_msec() >= wait_before_clear_expire)) {
> > -        wait_before_clear_expire = 0;
> >           state = S_CLEAR_FLOWS;
> >           return;
> >       }
> > diff --git a/tests/ovn-controller.at b/tests/ovn-controller.at
> > index fdcc5aab2bcf..3202f0beff46 100644
> > --- a/tests/ovn-controller.at
> > +++ b/tests/ovn-controller.at
> > @@ -2324,10 +2324,15 @@ lflow_run_2=$(ovn-appctl -t ovn-controller
coverage/read-counter lflow_run)
> >   AT_CHECK_UNQUOTED([echo $lflow_run_1], [0], [$lflow_run_2
> >   ])
> >
> > -# Restart OVS this time, and wait until flows are reinstalled
> > +# Restart OVS this time. Flows should be reinstalled without waiting.
> >   OVS_APP_EXIT_AND_WAIT([ovs-vswitchd])
> >   start_daemon ovs-vswitchd --enable-dummy=system -vvconn
-vofproto_dpif -vunixctl
> > -OVS_WAIT_UNTIL([ovs-ofctl dump-flows br-int | grep -F 10.1.2.4 | grep
-vF 2.2.2.2])
> > +
> > +# Sync to make sure ovn-controller is given enough time to install the
flows.
> > +check ovn-nbctl --wait=hv sync
> > +
> > +# Flow should be installed without any extra waiting.
> > +AT_CHECK([ovs-ofctl dump-flows br-int | grep -F 10.1.2.4 | grep -vF
2.2.2.2], [0], [ignore])
> >
> >   check ovn-nbctl --wait=hv lb-add lb3 3.3.3.3 10.1.2.5 \
> >   -- ls-lb-add ls1 lb3
>
Han Zhou April 3, 2024, 6:11 a.m. UTC | #3
On Tue, Apr 2, 2024 at 10:48 PM Han Zhou <hzhou@ovn.org> wrote:
>
>
>
> On Thu, Mar 28, 2024 at 1:29 PM Mark Michelson <mmichels@redhat.com>
wrote:
> >
> > Thanks Han,
> >
> > Acked-by: Mark Michelson <mmichels@redhat.com>
>
> Thanks Mark. Applied to main.

Also backported down to branch-23.06

Han
>
> Han
> >
> > On 3/28/24 02:58, Han Zhou wrote:
> > > The ovn-ofctrl-wait-before-clear setting is designed to minimize
> > > downtime during the initial start-up of the ovn-controller. For this
> > > purpose, the ovn-controller should wait only once upon entering the
> > > S_WAIT_BEFORE_CLEAR state for the first time. Subsequent reconnections
> > > to the OVS, such as those occurring during an OVS restart/upgrade,
> > > should not trigger this wait. However, the current implemention always
> > > waits for the configured time in the S_WAIT_BEFORE_CLEAR state, which
> > > can inadvertently delay flow installations during OVS restart/upgrade,
> > > potentially causing more harm than good. (The extent of the impact
> > > varies based on the method used to restart OVS, including whether flow
> > > save/restore tools and the flow-restore-wait feature are employed.)
> > >
> > > This patch avoids the unnecessary wait after the initial one.
> > >
> > > Fixes: 896adfd2d8b3 ("ofctrl: Support ovn-ofctrl-wait-before-clear to
reduce down time during upgrade.")
> > > Signed-off-by: Han Zhou <hzhou@ovn.org>
> > > ---
> > > v2: Addressed Mark's comments - made test case more reliable.
> > >
> > >   controller/ofctrl.c     | 1 -
> > >   tests/ovn-controller.at | 9 +++++++--
> > >   2 files changed, 7 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/controller/ofctrl.c b/controller/ofctrl.c
> > > index 6ca2ea4ce63d..6a2564604aa1 100644
> > > --- a/controller/ofctrl.c
> > > +++ b/controller/ofctrl.c
> > > @@ -634,7 +634,6 @@ run_S_WAIT_BEFORE_CLEAR(void)
> > >       if (!wait_before_clear_time ||
> > >           (wait_before_clear_expire &&
> > >            time_msec() >= wait_before_clear_expire)) {
> > > -        wait_before_clear_expire = 0;
> > >           state = S_CLEAR_FLOWS;
> > >           return;
> > >       }
> > > diff --git a/tests/ovn-controller.at b/tests/ovn-controller.at
> > > index fdcc5aab2bcf..3202f0beff46 100644
> > > --- a/tests/ovn-controller.at
> > > +++ b/tests/ovn-controller.at
> > > @@ -2324,10 +2324,15 @@ lflow_run_2=$(ovn-appctl -t ovn-controller
coverage/read-counter lflow_run)
> > >   AT_CHECK_UNQUOTED([echo $lflow_run_1], [0], [$lflow_run_2
> > >   ])
> > >
> > > -# Restart OVS this time, and wait until flows are reinstalled
> > > +# Restart OVS this time. Flows should be reinstalled without waiting.
> > >   OVS_APP_EXIT_AND_WAIT([ovs-vswitchd])
> > >   start_daemon ovs-vswitchd --enable-dummy=system -vvconn
-vofproto_dpif -vunixctl
> > > -OVS_WAIT_UNTIL([ovs-ofctl dump-flows br-int | grep -F 10.1.2.4 |
grep -vF 2.2.2.2])
> > > +
> > > +# Sync to make sure ovn-controller is given enough time to install
the flows.
> > > +check ovn-nbctl --wait=hv sync
> > > +
> > > +# Flow should be installed without any extra waiting.
> > > +AT_CHECK([ovs-ofctl dump-flows br-int | grep -F 10.1.2.4 | grep -vF
2.2.2.2], [0], [ignore])
> > >
> > >   check ovn-nbctl --wait=hv lb-add lb3 3.3.3.3 10.1.2.5 \
> > >   -- ls-lb-add ls1 lb3
> >
Han Zhou April 3, 2024, 7:11 a.m. UTC | #4
On Tue, Apr 2, 2024 at 11:11 PM Han Zhou <hzhou@ovn.org> wrote:
>
>
>
> On Tue, Apr 2, 2024 at 10:48 PM Han Zhou <hzhou@ovn.org> wrote:
> >
> >
> >
> > On Thu, Mar 28, 2024 at 1:29 PM Mark Michelson <mmichels@redhat.com>
wrote:
> > >
> > > Thanks Han,
> > >
> > > Acked-by: Mark Michelson <mmichels@redhat.com>
> >
> > Thanks Mark. Applied to main.
>
> Also backported down to branch-23.06
>
Sorry that I found the test not stable in CI, although I am not able to
reproduce in my local environment.
With a quick look I realized that "check ovn-nbctl --wait=hv sync" may not
be sufficient to ensure the flows are installed after OVS restart. I will
try to find a way to fix this tomorrow.

Thanks,
Han

> Han
> >
> > Han
> > >
> > > On 3/28/24 02:58, Han Zhou wrote:
> > > > The ovn-ofctrl-wait-before-clear setting is designed to minimize
> > > > downtime during the initial start-up of the ovn-controller. For this
> > > > purpose, the ovn-controller should wait only once upon entering the
> > > > S_WAIT_BEFORE_CLEAR state for the first time. Subsequent
reconnections
> > > > to the OVS, such as those occurring during an OVS restart/upgrade,
> > > > should not trigger this wait. However, the current implemention
always
> > > > waits for the configured time in the S_WAIT_BEFORE_CLEAR state,
which
> > > > can inadvertently delay flow installations during OVS
restart/upgrade,
> > > > potentially causing more harm than good. (The extent of the impact
> > > > varies based on the method used to restart OVS, including whether
flow
> > > > save/restore tools and the flow-restore-wait feature are employed.)
> > > >
> > > > This patch avoids the unnecessary wait after the initial one.
> > > >
> > > > Fixes: 896adfd2d8b3 ("ofctrl: Support ovn-ofctrl-wait-before-clear
to reduce down time during upgrade.")
> > > > Signed-off-by: Han Zhou <hzhou@ovn.org>
> > > > ---
> > > > v2: Addressed Mark's comments - made test case more reliable.
> > > >
> > > >   controller/ofctrl.c     | 1 -
> > > >   tests/ovn-controller.at | 9 +++++++--
> > > >   2 files changed, 7 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/controller/ofctrl.c b/controller/ofctrl.c
> > > > index 6ca2ea4ce63d..6a2564604aa1 100644
> > > > --- a/controller/ofctrl.c
> > > > +++ b/controller/ofctrl.c
> > > > @@ -634,7 +634,6 @@ run_S_WAIT_BEFORE_CLEAR(void)
> > > >       if (!wait_before_clear_time ||
> > > >           (wait_before_clear_expire &&
> > > >            time_msec() >= wait_before_clear_expire)) {
> > > > -        wait_before_clear_expire = 0;
> > > >           state = S_CLEAR_FLOWS;
> > > >           return;
> > > >       }
> > > > diff --git a/tests/ovn-controller.at b/tests/ovn-controller.at
> > > > index fdcc5aab2bcf..3202f0beff46 100644
> > > > --- a/tests/ovn-controller.at
> > > > +++ b/tests/ovn-controller.at
> > > > @@ -2324,10 +2324,15 @@ lflow_run_2=$(ovn-appctl -t ovn-controller
coverage/read-counter lflow_run)
> > > >   AT_CHECK_UNQUOTED([echo $lflow_run_1], [0], [$lflow_run_2
> > > >   ])
> > > >
> > > > -# Restart OVS this time, and wait until flows are reinstalled
> > > > +# Restart OVS this time. Flows should be reinstalled without
waiting.
> > > >   OVS_APP_EXIT_AND_WAIT([ovs-vswitchd])
> > > >   start_daemon ovs-vswitchd --enable-dummy=system -vvconn
-vofproto_dpif -vunixctl
> > > > -OVS_WAIT_UNTIL([ovs-ofctl dump-flows br-int | grep -F 10.1.2.4 |
grep -vF 2.2.2.2])
> > > > +
> > > > +# Sync to make sure ovn-controller is given enough time to install
the flows.
> > > > +check ovn-nbctl --wait=hv sync
> > > > +
> > > > +# Flow should be installed without any extra waiting.
> > > > +AT_CHECK([ovs-ofctl dump-flows br-int | grep -F 10.1.2.4 | grep
-vF 2.2.2.2], [0], [ignore])
> > > >
> > > >   check ovn-nbctl --wait=hv lb-add lb3 3.3.3.3 10.1.2.5 \
> > > >   -- ls-lb-add ls1 lb3
> > >
diff mbox series

Patch

diff --git a/controller/ofctrl.c b/controller/ofctrl.c
index 6ca2ea4ce63d..6a2564604aa1 100644
--- a/controller/ofctrl.c
+++ b/controller/ofctrl.c
@@ -634,7 +634,6 @@  run_S_WAIT_BEFORE_CLEAR(void)
     if (!wait_before_clear_time ||
         (wait_before_clear_expire &&
          time_msec() >= wait_before_clear_expire)) {
-        wait_before_clear_expire = 0;
         state = S_CLEAR_FLOWS;
         return;
     }
diff --git a/tests/ovn-controller.at b/tests/ovn-controller.at
index fdcc5aab2bcf..3202f0beff46 100644
--- a/tests/ovn-controller.at
+++ b/tests/ovn-controller.at
@@ -2324,10 +2324,15 @@  lflow_run_2=$(ovn-appctl -t ovn-controller coverage/read-counter lflow_run)
 AT_CHECK_UNQUOTED([echo $lflow_run_1], [0], [$lflow_run_2
 ])
 
-# Restart OVS this time, and wait until flows are reinstalled
+# Restart OVS this time. Flows should be reinstalled without waiting.
 OVS_APP_EXIT_AND_WAIT([ovs-vswitchd])
 start_daemon ovs-vswitchd --enable-dummy=system -vvconn -vofproto_dpif -vunixctl
-OVS_WAIT_UNTIL([ovs-ofctl dump-flows br-int | grep -F 10.1.2.4 | grep -vF 2.2.2.2])
+
+# Sync to make sure ovn-controller is given enough time to install the flows.
+check ovn-nbctl --wait=hv sync
+
+# Flow should be installed without any extra waiting.
+AT_CHECK([ovs-ofctl dump-flows br-int | grep -F 10.1.2.4 | grep -vF 2.2.2.2], [0], [ignore])
 
 check ovn-nbctl --wait=hv lb-add lb3 3.3.3.3 10.1.2.5 \
 -- ls-lb-add ls1 lb3