diff mbox series

[ovs-dev] controller: Fix an issue wrt cleanup of stale patch port.

Message ID 20240328122842.36708-1-priyankar.jain@nutanix.com
State Changes Requested
Headers show
Series [ovs-dev] controller: Fix an issue wrt cleanup of stale patch port. | expand

Checks

Context Check Description
ovsrobot/apply-robot warning apply and check: warning
ovsrobot/github-robot-_Build_and_Test fail github build: failed
ovsrobot/github-robot-_ovn-kubernetes success github build: passed

Commit Message

Priyankar Jain March 28, 2024, 12:28 p.m. UTC
Issue:
Upon updating the network_name option of localnet port from one physical
bridge to another, ovn-controller fails to cleanup the peer localnet
patch port from the old bridge and ends up creating a duplicate peer
localnet patch port which fails in the following ovsdb transaction:

```
"Transaction causes multiple rows in \"Interface\" table to have
identical values
(patch-localnet_a7d47490-8a90-4c4d-8266-2915ad07c185-to-br-int)
```

Current workflow:
1. Keep a set of all existing localnet ports on br-int. Let us call this
   set: existing_ports.
2. For each localnet port in SB:
   2.1 Create a patch port on br-int. (if it already exists on br-int,
       do not create but remove the entry from exisitng_ports set).
   2.2 Create a peer patch port on br-phys-x. (if it already exists on the
       bridge, do not create but remove the entry from exisitng_ports set).
3. Finally, cleanup all the ports and its peers from exisiting_ports.

With the above workflow, upon network_name change of localnet LSP, since
ports on br-int does not change, only peer port needs to be move from
br-phys-x to br-phys-y, the localnet port is removed from
exisiting_ports in #2.1 and its peer never becomes eligible for cleanup.

Fix:
Include both patch port on br-int and its peer port in the
exisiting_ports set as part of #1.
This make sures that the peer port is only removed from existing_ports
only if it is already present on the correct bridge. (#2.1/#2.2)
Otherwise, during the cleanup in #3 it will be considered.

Fixes: b600316 ("Don't delete patch ports that don't belong to br-int")
Signed-off-by: Priyankar Jain <priyankar.jain@nutanix.com>
---
 controller/patch.c |  32 +++++++------
 tests/ovn.at       | 109 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 124 insertions(+), 17 deletions(-)

Comments

0-day Robot March 28, 2024, 1 p.m. UTC | #1
References:  <20240328122842.36708-1-priyankar.jain@nutanix.com>
 

Bleep bloop.  Greetings Priyankar Jain, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
ERROR: "Fixes" tag is malformed.
Use the following format:
  git log -1 --pretty=format:"Fixes: %h (\"%s\")" --abbrev=12 COMMIT_REF

39: Fixes: b600316 ("Don't delete patch ports that don't belong to br-int")

Lines checked: 220, Warnings: 0, Errors: 1


Please check this out.  If you feel there has been an error, please email aconole@redhat.com

Thanks,
0-day Robot
Dumitru Ceara April 19, 2024, 8:21 a.m. UTC | #2
On 3/28/24 13:28, Priyankar Jain wrote:
> Issue:
> Upon updating the network_name option of localnet port from one physical
> bridge to another, ovn-controller fails to cleanup the peer localnet
> patch port from the old bridge and ends up creating a duplicate peer
> localnet patch port which fails in the following ovsdb transaction:
> 
> ```
> "Transaction causes multiple rows in \"Interface\" table to have
> identical values
> (patch-localnet_a7d47490-8a90-4c4d-8266-2915ad07c185-to-br-int)
> ```
> 
> Current workflow:
> 1. Keep a set of all existing localnet ports on br-int. Let us call this
>    set: existing_ports.
> 2. For each localnet port in SB:
>    2.1 Create a patch port on br-int. (if it already exists on br-int,
>        do not create but remove the entry from exisitng_ports set).
>    2.2 Create a peer patch port on br-phys-x. (if it already exists on the
>        bridge, do not create but remove the entry from exisitng_ports set).
> 3. Finally, cleanup all the ports and its peers from exisiting_ports.
> 
> With the above workflow, upon network_name change of localnet LSP, since
> ports on br-int does not change, only peer port needs to be move from
> br-phys-x to br-phys-y, the localnet port is removed from
> exisiting_ports in #2.1 and its peer never becomes eligible for cleanup.
> 
> Fix:
> Include both patch port on br-int and its peer port in the
> exisiting_ports set as part of #1.
> This make sures that the peer port is only removed from existing_ports
> only if it is already present on the correct bridge. (#2.1/#2.2)
> Otherwise, during the cleanup in #3 it will be considered.
> 

Hi Priyankar,

Thanks for the patch!

> Fixes: b600316 ("Don't delete patch ports that don't belong to br-int")

This should actually be:

Fixes: b600316f252a ("Don't delete patch ports that don't belong to br-int")

> Signed-off-by: Priyankar Jain <priyankar.jain@nutanix.com>
> ---
>  controller/patch.c |  32 +++++++------
>  tests/ovn.at       | 109 +++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 124 insertions(+), 17 deletions(-)
> 
> diff --git a/controller/patch.c b/controller/patch.c
> index c1cd5060d..4fed6e375 100644
> --- a/controller/patch.c
> +++ b/controller/patch.c
> @@ -314,6 +314,21 @@ patch_run(struct ovsdb_idl_txn *ovs_idl_txn,
>              || smap_get(&port->external_ids, "ovn-l3gateway-port")
>              || smap_get(&port->external_ids, "ovn-logical-patch-port")) {
>              shash_add(&existing_ports, port->name, port);
> +            /* Also add peer ports to the list. */
> +            for (size_t j = 0; j < port->n_interfaces; j++) {
> +                struct ovsrec_interface *p_iface = port->interfaces[j];
> +                if (strcmp(p_iface->type, "patch")) {
> +                    continue;
> +                }
> +                const char *peer_name = smap_get(&p_iface->options, "peer");
> +                if (peer_name) {
> +                    const struct ovsrec_port *peer_port =
> +                        get_port(ovsrec_port_by_name, peer_name);
> +                    if (peer_port) {
> +                        shash_add(&existing_ports, peer_port->name, peer_port);
> +                    }
> +                }
> +            }
>          }
>      }
>  
> @@ -336,23 +351,6 @@ patch_run(struct ovsdb_idl_txn *ovs_idl_txn,
>           * ovn-controller.  Otherwise it may cause unncessary dataplane
>           * interruption during restart/upgrade. */
>          if (!daemon_started_recently()) {
> -            /* delete peer patch port first */
> -            for (size_t i = 0; i < port->n_interfaces; i++) {
> -                struct ovsrec_interface *iface = port->interfaces[i];
> -                if (strcmp(iface->type, "patch")) {
> -                    continue;
> -                }
> -                const char *iface_peer = smap_get(&iface->options, "peer");
> -                if (iface_peer) {
> -                    const struct ovsrec_port *peer_port =
> -                        get_port(ovsrec_port_by_name, iface_peer);
> -                    if (peer_port) {
> -                        remove_port(bridge_table, peer_port);
> -                    }
> -                }
> -            }
> -
> -            /* now delete integration bridge patch port */
>              remove_port(bridge_table, port);
>          }
>      }

The fix looks correct to me.

> diff --git a/tests/ovn.at b/tests/ovn.at
> index 4d0c7ad53..487e727c0 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -36454,6 +36454,115 @@ OVN_CLEANUP([hv1])
>  AT_CLEANUP
>  ])
>  
> +OVN_FOR_EACH_NORTHD([
> +AT_SETUP([ovn-controller update network_name option for localnet port])
> +ovn_start
> +net_add n1
> +
> +sim_add hv1
> +as hv1
> +
> +# Bridge Topology
> +# Initial: br-int and br-phys-1 connected through ovn-localnet patch port.
> +#
> +# br-phys-1 -- br-int
> +#
> +# Final: br-int and br-phys-3 connected through ovn-localnet patch port.
> +# Similarly br-int-2 and br-hys-2 connected through localnet patch port
> +# but not owned by this controller.
> +#
> +# br-phys-1  br-int -- br-phys-3
> +# br-int-2 -- br-phys-2
> +
> +# Create bridges
> +ovs-vsctl add-br br-int

Please prefix all applicable commands with "check " to ensure things don't
fail by accident.

> +ovs-vsctl add-br br-int-2
> +ovs-vsctl add-br br-phys-1
> +ovs-vsctl add-br br-phys-2
> +ovs-vsctl add-br br-phys-3
> +ovs-vsctl set open . external-ids:ovn-bridge-mappings-hv1=phys-1:br-phys-1,phy-2:br-phys-2,phys-3:br-phys-3
> +
> +ovn_attach n1 br-phys-1 192.168.1.1 24
> +
> +ovs-vsctl -- add-port br-int vif1 -- \
> +    set interface vif1 external-ids:iface-id=lp1
> +
> +ovn-nbctl ls-add ls
> +
> +ovn-nbctl lsp-add ls lp1
> +ovn-nbctl lsp-set-addresses lp1 "00:00:00:00:00:01 10.0.0.1"
> +
> +ovn-nbctl lsp-add ls ln_port
> +ovn-nbctl lsp-set-addresses ln_port unknown
> +ovn-nbctl lsp-set-type ln_port localnet
> +ovn-nbctl lsp-set-options ln_port network_name=phys-1
> +wait_for_ports_up
> +ovn-nbctl --wait=hv sync
> +
> +# Allow controller to immediately clean patch ports up if any.
> +check ovn-appctl -t ovn-controller debug/ignore-startup-delay
> +
> +# check that patch port created on br-int and br-phys-1.
> +OVS_WAIT_UNTIL([
> +    test 1 = $(ovs-vsctl --columns _uuid --bare find Port name=patch-br-int-to-ln_port | wc -l)
> +])
> +OVS_WAIT_UNTIL([
> +    test 1 = $(ovs-vsctl list-ports br-int | grep patch-br-int-to-ln_port | wc -l)
> +])
> +OVS_WAIT_UNTIL([
> +    test 1 = $(ovs-vsctl --columns _uuid --bare find Port name=patch-ln_port-to-br-int | wc -l)
> +])
> +OVS_WAIT_UNTIL([
> +    test 1 = $(ovs-vsctl list-ports br-phys-1 | grep patch-ln_port-to-br-int | wc -l)
> +])
> +
> +ovn-appctl debug/pause
> +
> +# Now move the localnet port from phys-1 to phys-3.
> +ovn-nbctl lsp-set-options ln_port network_name=phys-3
> +
> +# Also create fake localnet ports on br-int-2
> +ovs-vsctl -- add-port br-int-2 fake-int-2-to-phys-2 -- \
> +    set port fake-int-2-to-phys-2 external-ids:ovn-localnet-port=fake-port -- \
> +    set interface fake-int-2-to-phys-2 options:peer=fake-phys-2-to-int-2 type=patch
> +ovs-vsctl -- add-port br-phys-2 fake-phys-2-to-int-2 -- \
> +    set port fake-phys-2-to-int-2 external-ids:ovn-localnet-port=fake-port -- \
> +    set interface fake-phys-2-to-int-2 options:peer=fake-int-2-to-phys-2 type=patch
> +
> +ovn-appctl debug/resume
> +ovn-nbctl --wait=hv sync
> +
> +# patch for port on br-phys-1 is cleanedup.
> +OVS_WAIT_UNTIL([
> +    test 0 = $(ovs-vsctl list-ports br-phys-1 | grep patch-ln_port-to-br-int | wc -l)
> +])
> +
> +# Patch port created on br-int and br-phy-3
> +OVS_WAIT_UNTIL([
> +    test 1 = $(ovs-vsctl --columns _uuid --bare find Port name=patch-br-int-to-ln_port | wc -l)
> +])
> +OVS_WAIT_UNTIL([
> +    test 1 = $(ovs-vsctl list-ports br-int | grep patch-br-int-to-ln_port | wc -l)
> +])
> +OVS_WAIT_UNTIL([
> +    test 1 = $(ovs-vsctl --columns _uuid --bare find Port name=patch-ln_port-to-br-int | wc -l)
> +])
> +OVS_WAIT_UNTIL([
> +    test 1 = $(ovs-vsctl list-ports br-phys-3 | grep patch-ln_port-to-br-int | wc -l)
> +])
> +
> +# check that patch port on a different bridge is not touched
> +OVS_WAIT_UNTIL([
> +    test 1 = $(ovs-vsctl --columns _uuid --bare find Port name=fake-int-2-to-phys-2 | wc -l)
> +])
> +OVS_WAIT_UNTIL([
> +    test 1 = $(ovs-vsctl --columns _uuid --bare find Port name=fake-phys-2-to-int-2 | wc -l)
> +])
> +
> +OVN_CLEANUP([hv1])
> +AT_CLEANUP
> +])
> +
>  # NOTE: This test case runs two ovn-controllers inside the same sandbox (hv1).
>  # Each controller uses a unique chassis name - hv1 and hv2 - and manage
>  # different bridges with different ports. This is why all 'as' commands below

Regards,
Dumitru
diff mbox series

Patch

diff --git a/controller/patch.c b/controller/patch.c
index c1cd5060d..4fed6e375 100644
--- a/controller/patch.c
+++ b/controller/patch.c
@@ -314,6 +314,21 @@  patch_run(struct ovsdb_idl_txn *ovs_idl_txn,
             || smap_get(&port->external_ids, "ovn-l3gateway-port")
             || smap_get(&port->external_ids, "ovn-logical-patch-port")) {
             shash_add(&existing_ports, port->name, port);
+            /* Also add peer ports to the list. */
+            for (size_t j = 0; j < port->n_interfaces; j++) {
+                struct ovsrec_interface *p_iface = port->interfaces[j];
+                if (strcmp(p_iface->type, "patch")) {
+                    continue;
+                }
+                const char *peer_name = smap_get(&p_iface->options, "peer");
+                if (peer_name) {
+                    const struct ovsrec_port *peer_port =
+                        get_port(ovsrec_port_by_name, peer_name);
+                    if (peer_port) {
+                        shash_add(&existing_ports, peer_port->name, peer_port);
+                    }
+                }
+            }
         }
     }
 
@@ -336,23 +351,6 @@  patch_run(struct ovsdb_idl_txn *ovs_idl_txn,
          * ovn-controller.  Otherwise it may cause unncessary dataplane
          * interruption during restart/upgrade. */
         if (!daemon_started_recently()) {
-            /* delete peer patch port first */
-            for (size_t i = 0; i < port->n_interfaces; i++) {
-                struct ovsrec_interface *iface = port->interfaces[i];
-                if (strcmp(iface->type, "patch")) {
-                    continue;
-                }
-                const char *iface_peer = smap_get(&iface->options, "peer");
-                if (iface_peer) {
-                    const struct ovsrec_port *peer_port =
-                        get_port(ovsrec_port_by_name, iface_peer);
-                    if (peer_port) {
-                        remove_port(bridge_table, peer_port);
-                    }
-                }
-            }
-
-            /* now delete integration bridge patch port */
             remove_port(bridge_table, port);
         }
     }
diff --git a/tests/ovn.at b/tests/ovn.at
index 4d0c7ad53..487e727c0 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -36454,6 +36454,115 @@  OVN_CLEANUP([hv1])
 AT_CLEANUP
 ])
 
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([ovn-controller update network_name option for localnet port])
+ovn_start
+net_add n1
+
+sim_add hv1
+as hv1
+
+# Bridge Topology
+# Initial: br-int and br-phys-1 connected through ovn-localnet patch port.
+#
+# br-phys-1 -- br-int
+#
+# Final: br-int and br-phys-3 connected through ovn-localnet patch port.
+# Similarly br-int-2 and br-hys-2 connected through localnet patch port
+# but not owned by this controller.
+#
+# br-phys-1  br-int -- br-phys-3
+# br-int-2 -- br-phys-2
+
+# Create bridges
+ovs-vsctl add-br br-int
+ovs-vsctl add-br br-int-2
+ovs-vsctl add-br br-phys-1
+ovs-vsctl add-br br-phys-2
+ovs-vsctl add-br br-phys-3
+ovs-vsctl set open . external-ids:ovn-bridge-mappings-hv1=phys-1:br-phys-1,phy-2:br-phys-2,phys-3:br-phys-3
+
+ovn_attach n1 br-phys-1 192.168.1.1 24
+
+ovs-vsctl -- add-port br-int vif1 -- \
+    set interface vif1 external-ids:iface-id=lp1
+
+ovn-nbctl ls-add ls
+
+ovn-nbctl lsp-add ls lp1
+ovn-nbctl lsp-set-addresses lp1 "00:00:00:00:00:01 10.0.0.1"
+
+ovn-nbctl lsp-add ls ln_port
+ovn-nbctl lsp-set-addresses ln_port unknown
+ovn-nbctl lsp-set-type ln_port localnet
+ovn-nbctl lsp-set-options ln_port network_name=phys-1
+wait_for_ports_up
+ovn-nbctl --wait=hv sync
+
+# Allow controller to immediately clean patch ports up if any.
+check ovn-appctl -t ovn-controller debug/ignore-startup-delay
+
+# check that patch port created on br-int and br-phys-1.
+OVS_WAIT_UNTIL([
+    test 1 = $(ovs-vsctl --columns _uuid --bare find Port name=patch-br-int-to-ln_port | wc -l)
+])
+OVS_WAIT_UNTIL([
+    test 1 = $(ovs-vsctl list-ports br-int | grep patch-br-int-to-ln_port | wc -l)
+])
+OVS_WAIT_UNTIL([
+    test 1 = $(ovs-vsctl --columns _uuid --bare find Port name=patch-ln_port-to-br-int | wc -l)
+])
+OVS_WAIT_UNTIL([
+    test 1 = $(ovs-vsctl list-ports br-phys-1 | grep patch-ln_port-to-br-int | wc -l)
+])
+
+ovn-appctl debug/pause
+
+# Now move the localnet port from phys-1 to phys-3.
+ovn-nbctl lsp-set-options ln_port network_name=phys-3
+
+# Also create fake localnet ports on br-int-2
+ovs-vsctl -- add-port br-int-2 fake-int-2-to-phys-2 -- \
+    set port fake-int-2-to-phys-2 external-ids:ovn-localnet-port=fake-port -- \
+    set interface fake-int-2-to-phys-2 options:peer=fake-phys-2-to-int-2 type=patch
+ovs-vsctl -- add-port br-phys-2 fake-phys-2-to-int-2 -- \
+    set port fake-phys-2-to-int-2 external-ids:ovn-localnet-port=fake-port -- \
+    set interface fake-phys-2-to-int-2 options:peer=fake-int-2-to-phys-2 type=patch
+
+ovn-appctl debug/resume
+ovn-nbctl --wait=hv sync
+
+# patch for port on br-phys-1 is cleanedup.
+OVS_WAIT_UNTIL([
+    test 0 = $(ovs-vsctl list-ports br-phys-1 | grep patch-ln_port-to-br-int | wc -l)
+])
+
+# Patch port created on br-int and br-phy-3
+OVS_WAIT_UNTIL([
+    test 1 = $(ovs-vsctl --columns _uuid --bare find Port name=patch-br-int-to-ln_port | wc -l)
+])
+OVS_WAIT_UNTIL([
+    test 1 = $(ovs-vsctl list-ports br-int | grep patch-br-int-to-ln_port | wc -l)
+])
+OVS_WAIT_UNTIL([
+    test 1 = $(ovs-vsctl --columns _uuid --bare find Port name=patch-ln_port-to-br-int | wc -l)
+])
+OVS_WAIT_UNTIL([
+    test 1 = $(ovs-vsctl list-ports br-phys-3 | grep patch-ln_port-to-br-int | wc -l)
+])
+
+# check that patch port on a different bridge is not touched
+OVS_WAIT_UNTIL([
+    test 1 = $(ovs-vsctl --columns _uuid --bare find Port name=fake-int-2-to-phys-2 | wc -l)
+])
+OVS_WAIT_UNTIL([
+    test 1 = $(ovs-vsctl --columns _uuid --bare find Port name=fake-phys-2-to-int-2 | wc -l)
+])
+
+OVN_CLEANUP([hv1])
+AT_CLEANUP
+])
+
 # NOTE: This test case runs two ovn-controllers inside the same sandbox (hv1).
 # Each controller uses a unique chassis name - hv1 and hv2 - and manage
 # different bridges with different ports. This is why all 'as' commands below