[ovs-dev,ovn] Fix testsuite 85 - "ensure one gw controller restart in HA doesn't bounce the master ok".
diff mbox series

Message ID 20191111071434.1034644-1-numans@ovn.org
State Accepted
Headers show
Series
  • [ovs-dev,ovn] Fix testsuite 85 - "ensure one gw controller restart in HA doesn't bounce the master ok".
Related show

Commit Message

Numan Siddique Nov. 11, 2019, 7:14 a.m. UTC
From: Numan Siddique <numans@ovn.org>

This testsuite is failing frequently in travis CI and locally when tests are run with "-j5".

The test case deletes the chassis row for chassis 'gw2' and expects that this
doesn't cause the chassisresident port on the master ('gw1') to not bounce as 'gw1'
has higher priority than gw2. But since the ha chassis group has only 2 chassis,
'gw2' can claim the port momemtarily when 'gw2' chassis row is recreated back until
BFD session with 'gw1' is not established.

This patch changes the test assertion approach and makes sure that 'gw1' is the
owner of the chassisresident port eventually.

Signed-off-by: Numan Siddique <numans@ovn.org>
---
 tests/ovn.at | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

Comments

Dumitru Ceara Nov. 12, 2019, 11:27 a.m. UTC | #1
On Mon, Nov 11, 2019 at 8:15 AM <numans@ovn.org> wrote:
>
> From: Numan Siddique <numans@ovn.org>
>
> This testsuite is failing frequently in travis CI and locally when tests are run with "-j5".
>
> The test case deletes the chassis row for chassis 'gw2' and expects that this
> doesn't cause the chassisresident port on the master ('gw1') to not bounce as 'gw1'
> has higher priority than gw2. But since the ha chassis group has only 2 chassis,
> 'gw2' can claim the port momemtarily when 'gw2' chassis row is recreated back until
> BFD session with 'gw1' is not established.
>
> This patch changes the test assertion approach and makes sure that 'gw1' is the
> owner of the chassisresident port eventually.
>
> Signed-off-by: Numan Siddique <numans@ovn.org>

Looks good to me. I ran the autotests and everything passes. I also
executed test 85 in a 20 iterations loop and passed every time.

A small typo in the commit message: s/momemtarily/momentarily. Also,
(I'm not sure about this but) maybe s/recreated back/recreated.
Otherwise:

Acked-by: Dumitru Ceara <dceara@redhat.com>

Thanks,
Dumitru

> ---
>  tests/ovn.at | 20 +++++++++++++++++---
>  1 file changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/tests/ovn.at b/tests/ovn.at
> index cb7903db8..35ffaf331 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -11033,12 +11033,26 @@ ovn-nbctl --wait=hv --timeout=3 sync
>  # doesn't have the same effect because "name" is conserved, and the
>  # Chassis entry is not replaced.
>
> -> gw1/ovn-controller.log
> -
>  gw2_chassis=$(ovn-sbctl --bare --columns=_uuid find Chassis name=gw2)
>  ovn-sbctl destroy Chassis $gw2_chassis
>
> -OVS_WAIT_UNTIL([test 0 = `grep -c "Releasing lport" gw1/ovn-controller.log`])
> +# Wait for the gw2_chassis row is recreated.
> +OVS_WAIT_UNTIL([test 1 = `ovn-sbctl --bare --columns=_uuid find Chassis name=gw2 | wc -l`])
> +
> +gw1_chassis=$(ovn-sbctl --bare --columns=_uuid find Chassis name=gw1)
> +gw2_chassis=$(ovn-sbctl --bare --columns=_uuid find Chassis name=gw2)
> +
> +# When gw2 chassis row is destroyed, it gets recreated. There
> +# is a small window in which gw2 may claim the cr-outside port if
> +# it has not established bfd tunnel with gw1.
> +# So make sure that, cr-outside is claimed by gw1 finally.
> +OVS_WAIT_WHILE(
> +    [cr_outside_ch=`ovn-sbctl --bare --columns=chassis find Port_binding logical_port=cr-outside`
> +     test $cr_outside_ch = $gw2_chassis])
> +
> +OVS_WAIT_UNTIL(
> +    [cr_outside_ch=`ovn-sbctl --bare --columns=chassis find Port_binding logical_port=cr-outside`
> +     test $cr_outside_ch = $gw1_chassis])
>
>  OVN_CLEANUP([gw1],[gw2],[hv1])
>
> --
> 2.23.0
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
Lorenzo Bianconi Nov. 12, 2019, 11:43 a.m. UTC | #2
> From: Numan Siddique <numans@ovn.org>
> 
> This testsuite is failing frequently in travis CI and locally when tests are run with "-j5".
> 
> The test case deletes the chassis row for chassis 'gw2' and expects that this
> doesn't cause the chassisresident port on the master ('gw1') to not bounce as 'gw1'
> has higher priority than gw2. But since the ha chassis group has only 2 chassis,
> 'gw2' can claim the port momemtarily when 'gw2' chassis row is recreated back until
> BFD session with 'gw1' is not established.
> 
> This patch changes the test assertion approach and makes sure that 'gw1' is the
> owner of the chassisresident port eventually.
> 

Tested-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>

> Signed-off-by: Numan Siddique <numans@ovn.org>
> ---
>  tests/ovn.at | 20 +++++++++++++++++---
>  1 file changed, 17 insertions(+), 3 deletions(-)
> 
> diff --git a/tests/ovn.at b/tests/ovn.at
> index cb7903db8..35ffaf331 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -11033,12 +11033,26 @@ ovn-nbctl --wait=hv --timeout=3 sync
>  # doesn't have the same effect because "name" is conserved, and the
>  # Chassis entry is not replaced.
>  
> -> gw1/ovn-controller.log
> -
>  gw2_chassis=$(ovn-sbctl --bare --columns=_uuid find Chassis name=gw2)
>  ovn-sbctl destroy Chassis $gw2_chassis
>  
> -OVS_WAIT_UNTIL([test 0 = `grep -c "Releasing lport" gw1/ovn-controller.log`])
> +# Wait for the gw2_chassis row is recreated.
> +OVS_WAIT_UNTIL([test 1 = `ovn-sbctl --bare --columns=_uuid find Chassis name=gw2 | wc -l`])
> +
> +gw1_chassis=$(ovn-sbctl --bare --columns=_uuid find Chassis name=gw1)
> +gw2_chassis=$(ovn-sbctl --bare --columns=_uuid find Chassis name=gw2)
> +
> +# When gw2 chassis row is destroyed, it gets recreated. There
> +# is a small window in which gw2 may claim the cr-outside port if
> +# it has not established bfd tunnel with gw1.
> +# So make sure that, cr-outside is claimed by gw1 finally.
> +OVS_WAIT_WHILE(
> +    [cr_outside_ch=`ovn-sbctl --bare --columns=chassis find Port_binding logical_port=cr-outside`
> +     test $cr_outside_ch = $gw2_chassis])
> +
> +OVS_WAIT_UNTIL(
> +    [cr_outside_ch=`ovn-sbctl --bare --columns=chassis find Port_binding logical_port=cr-outside`
> +     test $cr_outside_ch = $gw1_chassis])
>  
>  OVN_CLEANUP([gw1],[gw2],[hv1])
>  
> -- 
> 2.23.0
> 
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
Numan Siddique Nov. 12, 2019, 12:06 p.m. UTC | #3
On Tue, Nov 12, 2019 at 4:58 PM Dumitru Ceara <dceara@redhat.com> wrote:
>
> On Mon, Nov 11, 2019 at 8:15 AM <numans@ovn.org> wrote:
> >
> > From: Numan Siddique <numans@ovn.org>
> >
> > This testsuite is failing frequently in travis CI and locally when tests are run with "-j5".
> >
> > The test case deletes the chassis row for chassis 'gw2' and expects that this
> > doesn't cause the chassisresident port on the master ('gw1') to not bounce as 'gw1'
> > has higher priority than gw2. But since the ha chassis group has only 2 chassis,
> > 'gw2' can claim the port momemtarily when 'gw2' chassis row is recreated back until
> > BFD session with 'gw1' is not established.
> >
> > This patch changes the test assertion approach and makes sure that 'gw1' is the
> > owner of the chassisresident port eventually.
> >
> > Signed-off-by: Numan Siddique <numans@ovn.org>
>
> Looks good to me. I ran the autotests and everything passes. I also
> executed test 85 in a 20 iterations loop and passed every time.
>
> A small typo in the commit message: s/momemtarily/momentarily. Also,
> (I'm not sure about this but) maybe s/recreated back/recreated.
> Otherwise:
>
> Acked-by: Dumitru Ceara <dceara@redhat.com>


Thanks Dumitru and Lorenzo for the reviews.
I applied this patch to master correcting the typos in the commit message.

Thanks
Numan

>
> Thanks,
> Dumitru
>
> > ---
> >  tests/ovn.at | 20 +++++++++++++++++---
> >  1 file changed, 17 insertions(+), 3 deletions(-)
> >
> > diff --git a/tests/ovn.at b/tests/ovn.at
> > index cb7903db8..35ffaf331 100644
> > --- a/tests/ovn.at
> > +++ b/tests/ovn.at
> > @@ -11033,12 +11033,26 @@ ovn-nbctl --wait=hv --timeout=3 sync
> >  # doesn't have the same effect because "name" is conserved, and the
> >  # Chassis entry is not replaced.
> >
> > -> gw1/ovn-controller.log
> > -
> >  gw2_chassis=$(ovn-sbctl --bare --columns=_uuid find Chassis name=gw2)
> >  ovn-sbctl destroy Chassis $gw2_chassis
> >
> > -OVS_WAIT_UNTIL([test 0 = `grep -c "Releasing lport" gw1/ovn-controller.log`])
> > +# Wait for the gw2_chassis row is recreated.
> > +OVS_WAIT_UNTIL([test 1 = `ovn-sbctl --bare --columns=_uuid find Chassis name=gw2 | wc -l`])
> > +
> > +gw1_chassis=$(ovn-sbctl --bare --columns=_uuid find Chassis name=gw1)
> > +gw2_chassis=$(ovn-sbctl --bare --columns=_uuid find Chassis name=gw2)
> > +
> > +# When gw2 chassis row is destroyed, it gets recreated. There
> > +# is a small window in which gw2 may claim the cr-outside port if
> > +# it has not established bfd tunnel with gw1.
> > +# So make sure that, cr-outside is claimed by gw1 finally.
> > +OVS_WAIT_WHILE(
> > +    [cr_outside_ch=`ovn-sbctl --bare --columns=chassis find Port_binding logical_port=cr-outside`
> > +     test $cr_outside_ch = $gw2_chassis])
> > +
> > +OVS_WAIT_UNTIL(
> > +    [cr_outside_ch=`ovn-sbctl --bare --columns=chassis find Port_binding logical_port=cr-outside`
> > +     test $cr_outside_ch = $gw1_chassis])
> >
> >  OVN_CLEANUP([gw1],[gw2],[hv1])
> >
> > --
> > 2.23.0
> >
> > _______________________________________________
> > dev mailing list
> > dev@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>

Patch
diff mbox series

diff --git a/tests/ovn.at b/tests/ovn.at
index cb7903db8..35ffaf331 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -11033,12 +11033,26 @@  ovn-nbctl --wait=hv --timeout=3 sync
 # doesn't have the same effect because "name" is conserved, and the
 # Chassis entry is not replaced.
 
-> gw1/ovn-controller.log
-
 gw2_chassis=$(ovn-sbctl --bare --columns=_uuid find Chassis name=gw2)
 ovn-sbctl destroy Chassis $gw2_chassis
 
-OVS_WAIT_UNTIL([test 0 = `grep -c "Releasing lport" gw1/ovn-controller.log`])
+# Wait for the gw2_chassis row is recreated.
+OVS_WAIT_UNTIL([test 1 = `ovn-sbctl --bare --columns=_uuid find Chassis name=gw2 | wc -l`])
+
+gw1_chassis=$(ovn-sbctl --bare --columns=_uuid find Chassis name=gw1)
+gw2_chassis=$(ovn-sbctl --bare --columns=_uuid find Chassis name=gw2)
+
+# When gw2 chassis row is destroyed, it gets recreated. There
+# is a small window in which gw2 may claim the cr-outside port if
+# it has not established bfd tunnel with gw1.
+# So make sure that, cr-outside is claimed by gw1 finally.
+OVS_WAIT_WHILE(
+    [cr_outside_ch=`ovn-sbctl --bare --columns=chassis find Port_binding logical_port=cr-outside`
+     test $cr_outside_ch = $gw2_chassis])
+
+OVS_WAIT_UNTIL(
+    [cr_outside_ch=`ovn-sbctl --bare --columns=chassis find Port_binding logical_port=cr-outside`
+     test $cr_outside_ch = $gw1_chassis])
 
 OVN_CLEANUP([gw1],[gw2],[hv1])