Message ID | 20240328122842.36708-1-priyankar.jain@nutanix.com |
---|---|
State | Changes Requested |
Headers | show |
Series | [ovs-dev] controller: Fix an issue wrt cleanup of stale patch port. | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | warning | apply and check: warning |
ovsrobot/github-robot-_Build_and_Test | fail | github build: failed |
ovsrobot/github-robot-_ovn-kubernetes | success | github build: passed |
References: <20240328122842.36708-1-priyankar.jain@nutanix.com> Bleep bloop. Greetings Priyankar Jain, I am a robot and I have tried out your patch. Thanks for your contribution. I encountered some error that I wasn't expecting. See the details below. checkpatch: ERROR: "Fixes" tag is malformed. Use the following format: git log -1 --pretty=format:"Fixes: %h (\"%s\")" --abbrev=12 COMMIT_REF 39: Fixes: b600316 ("Don't delete patch ports that don't belong to br-int") Lines checked: 220, Warnings: 0, Errors: 1 Please check this out. If you feel there has been an error, please email aconole@redhat.com Thanks, 0-day Robot
On 3/28/24 13:28, Priyankar Jain wrote: > Issue: > Upon updating the network_name option of localnet port from one physical > bridge to another, ovn-controller fails to cleanup the peer localnet > patch port from the old bridge and ends up creating a duplicate peer > localnet patch port which fails in the following ovsdb transaction: > > ``` > "Transaction causes multiple rows in \"Interface\" table to have > identical values > (patch-localnet_a7d47490-8a90-4c4d-8266-2915ad07c185-to-br-int) > ``` > > Current workflow: > 1. Keep a set of all existing localnet ports on br-int. Let us call this > set: existing_ports. > 2. For each localnet port in SB: > 2.1 Create a patch port on br-int. (if it already exists on br-int, > do not create but remove the entry from exisitng_ports set). > 2.2 Create a peer patch port on br-phys-x. (if it already exists on the > bridge, do not create but remove the entry from exisitng_ports set). > 3. Finally, cleanup all the ports and its peers from exisiting_ports. > > With the above workflow, upon network_name change of localnet LSP, since > ports on br-int does not change, only peer port needs to be move from > br-phys-x to br-phys-y, the localnet port is removed from > exisiting_ports in #2.1 and its peer never becomes eligible for cleanup. > > Fix: > Include both patch port on br-int and its peer port in the > exisiting_ports set as part of #1. > This make sures that the peer port is only removed from existing_ports > only if it is already present on the correct bridge. (#2.1/#2.2) > Otherwise, during the cleanup in #3 it will be considered. > Hi Priyankar, Thanks for the patch! > Fixes: b600316 ("Don't delete patch ports that don't belong to br-int") This should actually be: Fixes: b600316f252a ("Don't delete patch ports that don't belong to br-int") > Signed-off-by: Priyankar Jain <priyankar.jain@nutanix.com> > --- > controller/patch.c | 32 +++++++------ > tests/ovn.at | 109 +++++++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 124 insertions(+), 17 deletions(-) > > diff --git a/controller/patch.c b/controller/patch.c > index c1cd5060d..4fed6e375 100644 > --- a/controller/patch.c > +++ b/controller/patch.c > @@ -314,6 +314,21 @@ patch_run(struct ovsdb_idl_txn *ovs_idl_txn, > || smap_get(&port->external_ids, "ovn-l3gateway-port") > || smap_get(&port->external_ids, "ovn-logical-patch-port")) { > shash_add(&existing_ports, port->name, port); > + /* Also add peer ports to the list. */ > + for (size_t j = 0; j < port->n_interfaces; j++) { > + struct ovsrec_interface *p_iface = port->interfaces[j]; > + if (strcmp(p_iface->type, "patch")) { > + continue; > + } > + const char *peer_name = smap_get(&p_iface->options, "peer"); > + if (peer_name) { > + const struct ovsrec_port *peer_port = > + get_port(ovsrec_port_by_name, peer_name); > + if (peer_port) { > + shash_add(&existing_ports, peer_port->name, peer_port); > + } > + } > + } > } > } > > @@ -336,23 +351,6 @@ patch_run(struct ovsdb_idl_txn *ovs_idl_txn, > * ovn-controller. Otherwise it may cause unncessary dataplane > * interruption during restart/upgrade. */ > if (!daemon_started_recently()) { > - /* delete peer patch port first */ > - for (size_t i = 0; i < port->n_interfaces; i++) { > - struct ovsrec_interface *iface = port->interfaces[i]; > - if (strcmp(iface->type, "patch")) { > - continue; > - } > - const char *iface_peer = smap_get(&iface->options, "peer"); > - if (iface_peer) { > - const struct ovsrec_port *peer_port = > - get_port(ovsrec_port_by_name, iface_peer); > - if (peer_port) { > - remove_port(bridge_table, peer_port); > - } > - } > - } > - > - /* now delete integration bridge patch port */ > remove_port(bridge_table, port); > } > } The fix looks correct to me. > diff --git a/tests/ovn.at b/tests/ovn.at > index 4d0c7ad53..487e727c0 100644 > --- a/tests/ovn.at > +++ b/tests/ovn.at > @@ -36454,6 +36454,115 @@ OVN_CLEANUP([hv1]) > AT_CLEANUP > ]) > > +OVN_FOR_EACH_NORTHD([ > +AT_SETUP([ovn-controller update network_name option for localnet port]) > +ovn_start > +net_add n1 > + > +sim_add hv1 > +as hv1 > + > +# Bridge Topology > +# Initial: br-int and br-phys-1 connected through ovn-localnet patch port. > +# > +# br-phys-1 -- br-int > +# > +# Final: br-int and br-phys-3 connected through ovn-localnet patch port. > +# Similarly br-int-2 and br-hys-2 connected through localnet patch port > +# but not owned by this controller. > +# > +# br-phys-1 br-int -- br-phys-3 > +# br-int-2 -- br-phys-2 > + > +# Create bridges > +ovs-vsctl add-br br-int Please prefix all applicable commands with "check " to ensure things don't fail by accident. > +ovs-vsctl add-br br-int-2 > +ovs-vsctl add-br br-phys-1 > +ovs-vsctl add-br br-phys-2 > +ovs-vsctl add-br br-phys-3 > +ovs-vsctl set open . external-ids:ovn-bridge-mappings-hv1=phys-1:br-phys-1,phy-2:br-phys-2,phys-3:br-phys-3 > + > +ovn_attach n1 br-phys-1 192.168.1.1 24 > + > +ovs-vsctl -- add-port br-int vif1 -- \ > + set interface vif1 external-ids:iface-id=lp1 > + > +ovn-nbctl ls-add ls > + > +ovn-nbctl lsp-add ls lp1 > +ovn-nbctl lsp-set-addresses lp1 "00:00:00:00:00:01 10.0.0.1" > + > +ovn-nbctl lsp-add ls ln_port > +ovn-nbctl lsp-set-addresses ln_port unknown > +ovn-nbctl lsp-set-type ln_port localnet > +ovn-nbctl lsp-set-options ln_port network_name=phys-1 > +wait_for_ports_up > +ovn-nbctl --wait=hv sync > + > +# Allow controller to immediately clean patch ports up if any. > +check ovn-appctl -t ovn-controller debug/ignore-startup-delay > + > +# check that patch port created on br-int and br-phys-1. > +OVS_WAIT_UNTIL([ > + test 1 = $(ovs-vsctl --columns _uuid --bare find Port name=patch-br-int-to-ln_port | wc -l) > +]) > +OVS_WAIT_UNTIL([ > + test 1 = $(ovs-vsctl list-ports br-int | grep patch-br-int-to-ln_port | wc -l) > +]) > +OVS_WAIT_UNTIL([ > + test 1 = $(ovs-vsctl --columns _uuid --bare find Port name=patch-ln_port-to-br-int | wc -l) > +]) > +OVS_WAIT_UNTIL([ > + test 1 = $(ovs-vsctl list-ports br-phys-1 | grep patch-ln_port-to-br-int | wc -l) > +]) > + > +ovn-appctl debug/pause > + > +# Now move the localnet port from phys-1 to phys-3. > +ovn-nbctl lsp-set-options ln_port network_name=phys-3 > + > +# Also create fake localnet ports on br-int-2 > +ovs-vsctl -- add-port br-int-2 fake-int-2-to-phys-2 -- \ > + set port fake-int-2-to-phys-2 external-ids:ovn-localnet-port=fake-port -- \ > + set interface fake-int-2-to-phys-2 options:peer=fake-phys-2-to-int-2 type=patch > +ovs-vsctl -- add-port br-phys-2 fake-phys-2-to-int-2 -- \ > + set port fake-phys-2-to-int-2 external-ids:ovn-localnet-port=fake-port -- \ > + set interface fake-phys-2-to-int-2 options:peer=fake-int-2-to-phys-2 type=patch > + > +ovn-appctl debug/resume > +ovn-nbctl --wait=hv sync > + > +# patch for port on br-phys-1 is cleanedup. > +OVS_WAIT_UNTIL([ > + test 0 = $(ovs-vsctl list-ports br-phys-1 | grep patch-ln_port-to-br-int | wc -l) > +]) > + > +# Patch port created on br-int and br-phy-3 > +OVS_WAIT_UNTIL([ > + test 1 = $(ovs-vsctl --columns _uuid --bare find Port name=patch-br-int-to-ln_port | wc -l) > +]) > +OVS_WAIT_UNTIL([ > + test 1 = $(ovs-vsctl list-ports br-int | grep patch-br-int-to-ln_port | wc -l) > +]) > +OVS_WAIT_UNTIL([ > + test 1 = $(ovs-vsctl --columns _uuid --bare find Port name=patch-ln_port-to-br-int | wc -l) > +]) > +OVS_WAIT_UNTIL([ > + test 1 = $(ovs-vsctl list-ports br-phys-3 | grep patch-ln_port-to-br-int | wc -l) > +]) > + > +# check that patch port on a different bridge is not touched > +OVS_WAIT_UNTIL([ > + test 1 = $(ovs-vsctl --columns _uuid --bare find Port name=fake-int-2-to-phys-2 | wc -l) > +]) > +OVS_WAIT_UNTIL([ > + test 1 = $(ovs-vsctl --columns _uuid --bare find Port name=fake-phys-2-to-int-2 | wc -l) > +]) > + > +OVN_CLEANUP([hv1]) > +AT_CLEANUP > +]) > + > # NOTE: This test case runs two ovn-controllers inside the same sandbox (hv1). > # Each controller uses a unique chassis name - hv1 and hv2 - and manage > # different bridges with different ports. This is why all 'as' commands below Regards, Dumitru
diff --git a/controller/patch.c b/controller/patch.c index c1cd5060d..4fed6e375 100644 --- a/controller/patch.c +++ b/controller/patch.c @@ -314,6 +314,21 @@ patch_run(struct ovsdb_idl_txn *ovs_idl_txn, || smap_get(&port->external_ids, "ovn-l3gateway-port") || smap_get(&port->external_ids, "ovn-logical-patch-port")) { shash_add(&existing_ports, port->name, port); + /* Also add peer ports to the list. */ + for (size_t j = 0; j < port->n_interfaces; j++) { + struct ovsrec_interface *p_iface = port->interfaces[j]; + if (strcmp(p_iface->type, "patch")) { + continue; + } + const char *peer_name = smap_get(&p_iface->options, "peer"); + if (peer_name) { + const struct ovsrec_port *peer_port = + get_port(ovsrec_port_by_name, peer_name); + if (peer_port) { + shash_add(&existing_ports, peer_port->name, peer_port); + } + } + } } } @@ -336,23 +351,6 @@ patch_run(struct ovsdb_idl_txn *ovs_idl_txn, * ovn-controller. Otherwise it may cause unncessary dataplane * interruption during restart/upgrade. */ if (!daemon_started_recently()) { - /* delete peer patch port first */ - for (size_t i = 0; i < port->n_interfaces; i++) { - struct ovsrec_interface *iface = port->interfaces[i]; - if (strcmp(iface->type, "patch")) { - continue; - } - const char *iface_peer = smap_get(&iface->options, "peer"); - if (iface_peer) { - const struct ovsrec_port *peer_port = - get_port(ovsrec_port_by_name, iface_peer); - if (peer_port) { - remove_port(bridge_table, peer_port); - } - } - } - - /* now delete integration bridge patch port */ remove_port(bridge_table, port); } } diff --git a/tests/ovn.at b/tests/ovn.at index 4d0c7ad53..487e727c0 100644 --- a/tests/ovn.at +++ b/tests/ovn.at @@ -36454,6 +36454,115 @@ OVN_CLEANUP([hv1]) AT_CLEANUP ]) +OVN_FOR_EACH_NORTHD([ +AT_SETUP([ovn-controller update network_name option for localnet port]) +ovn_start +net_add n1 + +sim_add hv1 +as hv1 + +# Bridge Topology +# Initial: br-int and br-phys-1 connected through ovn-localnet patch port. +# +# br-phys-1 -- br-int +# +# Final: br-int and br-phys-3 connected through ovn-localnet patch port. +# Similarly br-int-2 and br-hys-2 connected through localnet patch port +# but not owned by this controller. +# +# br-phys-1 br-int -- br-phys-3 +# br-int-2 -- br-phys-2 + +# Create bridges +ovs-vsctl add-br br-int +ovs-vsctl add-br br-int-2 +ovs-vsctl add-br br-phys-1 +ovs-vsctl add-br br-phys-2 +ovs-vsctl add-br br-phys-3 +ovs-vsctl set open . external-ids:ovn-bridge-mappings-hv1=phys-1:br-phys-1,phy-2:br-phys-2,phys-3:br-phys-3 + +ovn_attach n1 br-phys-1 192.168.1.1 24 + +ovs-vsctl -- add-port br-int vif1 -- \ + set interface vif1 external-ids:iface-id=lp1 + +ovn-nbctl ls-add ls + +ovn-nbctl lsp-add ls lp1 +ovn-nbctl lsp-set-addresses lp1 "00:00:00:00:00:01 10.0.0.1" + +ovn-nbctl lsp-add ls ln_port +ovn-nbctl lsp-set-addresses ln_port unknown +ovn-nbctl lsp-set-type ln_port localnet +ovn-nbctl lsp-set-options ln_port network_name=phys-1 +wait_for_ports_up +ovn-nbctl --wait=hv sync + +# Allow controller to immediately clean patch ports up if any. +check ovn-appctl -t ovn-controller debug/ignore-startup-delay + +# check that patch port created on br-int and br-phys-1. +OVS_WAIT_UNTIL([ + test 1 = $(ovs-vsctl --columns _uuid --bare find Port name=patch-br-int-to-ln_port | wc -l) +]) +OVS_WAIT_UNTIL([ + test 1 = $(ovs-vsctl list-ports br-int | grep patch-br-int-to-ln_port | wc -l) +]) +OVS_WAIT_UNTIL([ + test 1 = $(ovs-vsctl --columns _uuid --bare find Port name=patch-ln_port-to-br-int | wc -l) +]) +OVS_WAIT_UNTIL([ + test 1 = $(ovs-vsctl list-ports br-phys-1 | grep patch-ln_port-to-br-int | wc -l) +]) + +ovn-appctl debug/pause + +# Now move the localnet port from phys-1 to phys-3. +ovn-nbctl lsp-set-options ln_port network_name=phys-3 + +# Also create fake localnet ports on br-int-2 +ovs-vsctl -- add-port br-int-2 fake-int-2-to-phys-2 -- \ + set port fake-int-2-to-phys-2 external-ids:ovn-localnet-port=fake-port -- \ + set interface fake-int-2-to-phys-2 options:peer=fake-phys-2-to-int-2 type=patch +ovs-vsctl -- add-port br-phys-2 fake-phys-2-to-int-2 -- \ + set port fake-phys-2-to-int-2 external-ids:ovn-localnet-port=fake-port -- \ + set interface fake-phys-2-to-int-2 options:peer=fake-int-2-to-phys-2 type=patch + +ovn-appctl debug/resume +ovn-nbctl --wait=hv sync + +# patch for port on br-phys-1 is cleanedup. +OVS_WAIT_UNTIL([ + test 0 = $(ovs-vsctl list-ports br-phys-1 | grep patch-ln_port-to-br-int | wc -l) +]) + +# Patch port created on br-int and br-phy-3 +OVS_WAIT_UNTIL([ + test 1 = $(ovs-vsctl --columns _uuid --bare find Port name=patch-br-int-to-ln_port | wc -l) +]) +OVS_WAIT_UNTIL([ + test 1 = $(ovs-vsctl list-ports br-int | grep patch-br-int-to-ln_port | wc -l) +]) +OVS_WAIT_UNTIL([ + test 1 = $(ovs-vsctl --columns _uuid --bare find Port name=patch-ln_port-to-br-int | wc -l) +]) +OVS_WAIT_UNTIL([ + test 1 = $(ovs-vsctl list-ports br-phys-3 | grep patch-ln_port-to-br-int | wc -l) +]) + +# check that patch port on a different bridge is not touched +OVS_WAIT_UNTIL([ + test 1 = $(ovs-vsctl --columns _uuid --bare find Port name=fake-int-2-to-phys-2 | wc -l) +]) +OVS_WAIT_UNTIL([ + test 1 = $(ovs-vsctl --columns _uuid --bare find Port name=fake-phys-2-to-int-2 | wc -l) +]) + +OVN_CLEANUP([hv1]) +AT_CLEANUP +]) + # NOTE: This test case runs two ovn-controllers inside the same sandbox (hv1). # Each controller uses a unique chassis name - hv1 and hv2 - and manage # different bridges with different ports. This is why all 'as' commands below
Issue: Upon updating the network_name option of localnet port from one physical bridge to another, ovn-controller fails to cleanup the peer localnet patch port from the old bridge and ends up creating a duplicate peer localnet patch port which fails in the following ovsdb transaction: ``` "Transaction causes multiple rows in \"Interface\" table to have identical values (patch-localnet_a7d47490-8a90-4c4d-8266-2915ad07c185-to-br-int) ``` Current workflow: 1. Keep a set of all existing localnet ports on br-int. Let us call this set: existing_ports. 2. For each localnet port in SB: 2.1 Create a patch port on br-int. (if it already exists on br-int, do not create but remove the entry from exisitng_ports set). 2.2 Create a peer patch port on br-phys-x. (if it already exists on the bridge, do not create but remove the entry from exisitng_ports set). 3. Finally, cleanup all the ports and its peers from exisiting_ports. With the above workflow, upon network_name change of localnet LSP, since ports on br-int does not change, only peer port needs to be move from br-phys-x to br-phys-y, the localnet port is removed from exisiting_ports in #2.1 and its peer never becomes eligible for cleanup. Fix: Include both patch port on br-int and its peer port in the exisiting_ports set as part of #1. This make sures that the peer port is only removed from existing_ports only if it is already present on the correct bridge. (#2.1/#2.2) Otherwise, during the cleanup in #3 it will be considered. Fixes: b600316 ("Don't delete patch ports that don't belong to br-int") Signed-off-by: Priyankar Jain <priyankar.jain@nutanix.com> --- controller/patch.c | 32 +++++++------ tests/ovn.at | 109 +++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 124 insertions(+), 17 deletions(-)