diff mbox series

[ovs-dev] ovn.at: Fix occasional failure - IPv6 Neighbor Solicitation for unknown MAC

Message ID 1526765002-28725-1-git-send-email-hzhou8@ebay.com
State Changes Requested
Headers show
Series [ovs-dev] ovn.at: Fix occasional failure - IPv6 Neighbor Solicitation for unknown MAC | expand

Commit Message

Han Zhou May 19, 2018, 9:23 p.m. UTC
This case fails occasionally because although it waits until port
binding is completed on HV, the patch port creation may not be
completed yet on HV for the localnet port, so if the packets are sent
out at this moment, the case will fail. This patch ensures patch
port is created and then do another sync before sending the packet
so that the ovn-controller is given a chance to handle the change
and install related flows to OVS.

Signed-off-by: Han Zhou <hzhou8@ebay.com>
---
 tests/ovn.at | 5 +++++
 1 file changed, 5 insertions(+)

Comments

Numan Siddique May 21, 2018, 9:19 a.m. UTC | #1
On Sun, May 20, 2018 at 2:53 AM, Han Zhou <zhouhan@gmail.com> wrote:

> This case fails occasionally because although it waits until port
> binding is completed on HV, the patch port creation may not be
> completed yet on HV for the localnet port, so if the packets are sent
> out at this moment, the case will fail. This patch ensures patch
> port is created and then do another sync before sending the packet
> so that the ovn-controller is given a chance to handle the change
> and install related flows to OVS.
>
> Signed-off-by: Han Zhou <hzhou8@ebay.com>
>

Acked-by: Numan Siddique <nusiddiq@redhat.com>



> ---
>  tests/ovn.at | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/tests/ovn.at b/tests/ovn.at
> index 8d9519e..2187f48 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -9111,6 +9111,8 @@ ovs-vsctl -- add-port br-int hv1-vif1 -- \
>      ofport-request=1
>  ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
>
> +OVS_WAIT_UNTIL([ovs-vsctl show | grep 'patch-ln-public-to-br-int'])
> +
>  OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up sw0_ip6-port1` = xup])
>  cr_uuid=`ovn-sbctl find port_binding logical_port=cr-ip6_public | grep
> _uuid | cut -f2 -d ":"`
>
> @@ -9118,6 +9120,9 @@ cr_uuid=`ovn-sbctl find port_binding
> logical_port=cr-ip6_public | grep _uuid | c
>  chassis_uuid=`ovn-sbctl list chassis | grep _uuid | cut -f2 -d ":"`
>  OVS_WAIT_UNTIL([test $chassis_uuid = `ovn-sbctl get port_binding $cr_uuid
> chassis`])
>
> +# Allow some time for ovn-northd and ovn-controller to catch up.
> +ovn-nbctl --wait=hv --timeout=3 sync
> +
>  trim_zeros() {
>      sed 's/\(00\)\{1,\}$//'
>  }
> --
> 2.1.0
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
Ben Pfaff May 23, 2018, 8:08 p.m. UTC | #2
On Sat, May 19, 2018 at 02:23:22PM -0700, Han Zhou wrote:
> This case fails occasionally because although it waits until port
> binding is completed on HV, the patch port creation may not be
> completed yet on HV for the localnet port, so if the packets are sent
> out at this moment, the case will fail. This patch ensures patch
> port is created and then do another sync before sending the packet
> so that the ovn-controller is given a chance to handle the change
> and install related flows to OVS.
> 
> Signed-off-by: Han Zhou <hzhou8@ebay.com>

It sounds to me like this is actually a bug in ovn-controller: it should
not report that it is up-to-date until the patch port creation has
completed.  There is a mechanism to find out that this is complete, and
ovn-controller should use it.  The outline of how it would be done is
something like this:

    * When ovn-controller commits a change to the OVS database, it
      should also increment the next_cfg column in the Open_vSwitch
      table.  ovs-vsctl does this, with:

        ovsdb_idl_txn_increment(txn, &ovs->header_,
                                &ovsrec_open_vswitch_col_next_cfg, false);

    * Before ovn-controller updates nb_cfg, it waits until cur_cfg in
      the Open_vSwitch table reaches at least the next_cfg that it set.
Han Zhou May 23, 2018, 8:56 p.m. UTC | #3
On Wed, May 23, 2018 at 1:08 PM, Ben Pfaff <blp@ovn.org> wrote:
>
> On Sat, May 19, 2018 at 02:23:22PM -0700, Han Zhou wrote:
> > This case fails occasionally because although it waits until port
> > binding is completed on HV, the patch port creation may not be
> > completed yet on HV for the localnet port, so if the packets are sent
> > out at this moment, the case will fail. This patch ensures patch
> > port is created and then do another sync before sending the packet
> > so that the ovn-controller is given a chance to handle the change
> > and install related flows to OVS.
> >
> > Signed-off-by: Han Zhou <hzhou8@ebay.com>
>
> It sounds to me like this is actually a bug in ovn-controller: it should
> not report that it is up-to-date until the patch port creation has
> completed.  There is a mechanism to find out that this is complete, and
> ovn-controller should use it.  The outline of how it would be done is
> something like this:
>
>     * When ovn-controller commits a change to the OVS database, it
>       should also increment the next_cfg column in the Open_vSwitch
>       table.  ovs-vsctl does this, with:
>
>         ovsdb_idl_txn_increment(txn, &ovs->header_,
>                                 &ovsrec_open_vswitch_col_next_cfg, false);
>
>     * Before ovn-controller updates nb_cfg, it waits until cur_cfg in
>       the Open_vSwitch table reaches at least the next_cfg that it set.

Thanks for pointing out! I didn't notice that Open_vSwitch has same
mechanism as OVN DBs. I will submit a new patc! later.
diff mbox series

Patch

diff --git a/tests/ovn.at b/tests/ovn.at
index 8d9519e..2187f48 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -9111,6 +9111,8 @@  ovs-vsctl -- add-port br-int hv1-vif1 -- \
     ofport-request=1
 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
 
+OVS_WAIT_UNTIL([ovs-vsctl show | grep 'patch-ln-public-to-br-int'])
+
 OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up sw0_ip6-port1` = xup])
 cr_uuid=`ovn-sbctl find port_binding logical_port=cr-ip6_public | grep _uuid | cut -f2 -d ":"`
 
@@ -9118,6 +9120,9 @@  cr_uuid=`ovn-sbctl find port_binding logical_port=cr-ip6_public | grep _uuid | c
 chassis_uuid=`ovn-sbctl list chassis | grep _uuid | cut -f2 -d ":"`
 OVS_WAIT_UNTIL([test $chassis_uuid = `ovn-sbctl get port_binding $cr_uuid chassis`])
 
+# Allow some time for ovn-northd and ovn-controller to catch up.
+ovn-nbctl --wait=hv --timeout=3 sync
+
 trim_zeros() {
     sed 's/\(00\)\{1,\}$//'
 }