diff mbox series

[ovs-dev,RFC] bridge: Retry tunnel port addition for conflict.

Message ID 20240227191418.2335620-1-hzhou@ovn.org
State RFC
Delegated to: Simon Horman
Headers show
Series [ovs-dev,RFC] bridge: Retry tunnel port addition for conflict. | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed
ovsrobot/intel-ovs-compilation success test: success

Commit Message

Han Zhou Feb. 27, 2024, 7:14 p.m. UTC
For kernel datapath, when a new tunnel port is created in the same
transaction in which an old tunnel port with the same tunnel
configuration is deleted, the new tunnel port creation will fail and
left in an error state. This can be easily reproduced in OVN by
triggering a chassis deletion and addition with the same encap in the
same SB DB transaction, such as:

ovn-sbctl chassis-add aaaaaa geneve 1.2.3.4
ovn-sbctl chassis-del aaaaaa -- chassis-add bbbbbb 1.2.3.4

ovs-vsctl show | grep error
error: "could not add network device ovn-bbbbbb-0 to ofproto (File exists)"

Related logs in OVS:
—
2024-02-23T05:41:49.978Z|405933|bridge|INFO|bridge br-int: deleted interface ovn-aaaaaa-0 on port 113
2024-02-23T05:41:49.989Z|405935|tunnel|WARN|ovn-bbbbbb-0: attempting to add tunnel port with same config as port 'ovn-aaaaaa-0' (::->1.2.3.4, key=flow, legacy_l2, dp port=9)
2024-02-23T05:41:49.989Z|405936|ofproto|WARN|br-int: could not add port ovn-bbbbbb-0 (File exists)
2024-02-23T05:41:49.989Z|405937|bridge|WARN|could not add network device ovn-bbbbbb-0 to ofproto (File exists)
—

Depending on when there are other OVSDB changes, it may take a long time
for the device to be added successfully, triggered by the next OVS
iteration.

(note: native tunnel ports do not have this problem)

The problem is how the tunnel port deletion and creation are handled. In
bridge_reconfigure(), port deletion is handled before creation, to avoid
such resource conflict. However, for kernel tunnel ports, the real clean
up is performed at the end of the bridge_reconfigure() in the:
bridge_run__()->ofproto_run()->...->ofproto_dpif:port_destruct()

We cannot call bridge_run__() at an earlier point before all
reconfigurations are done, so this patch tries a generic approach to
just re-run the bridge_reconfigure() when there are any port creations
encountered "File exists" error, which indicates a possible resource
conflict may have happened due to a later deleted port, and retry may
succeed.

Signed-off-by: Han Zhou <hzhou@ovn.org>
---
This is RFC because I am not sure if there is a better way to solve the problem
more specifically by executing the port_destruct for the old port before trying
to create the new port. The fix may be more complex though.

 tests/tunnel.at   |  1 +
 vswitchd/bridge.c | 47 ++++++++++++++++++++++++++++++++---------------
 2 files changed, 33 insertions(+), 15 deletions(-)

Comments

Ilya Maximets March 6, 2024, 9:39 p.m. UTC | #1
On 2/27/24 20:14, Han Zhou wrote:
> For kernel datapath, when a new tunnel port is created in the same
> transaction in which an old tunnel port with the same tunnel
> configuration is deleted, the new tunnel port creation will fail and
> left in an error state. This can be easily reproduced in OVN by
> triggering a chassis deletion and addition with the same encap in the
> same SB DB transaction, such as:
> 
> ovn-sbctl chassis-add aaaaaa geneve 1.2.3.4
> ovn-sbctl chassis-del aaaaaa -- chassis-add bbbbbb 1.2.3.4
> 
> ovs-vsctl show | grep error
> error: "could not add network device ovn-bbbbbb-0 to ofproto (File exists)"
> 
> Related logs in OVS:
> —
> 2024-02-23T05:41:49.978Z|405933|bridge|INFO|bridge br-int: deleted interface ovn-aaaaaa-0 on port 113
> 2024-02-23T05:41:49.989Z|405935|tunnel|WARN|ovn-bbbbbb-0: attempting to add tunnel port with same config as port 'ovn-aaaaaa-0' (::->1.2.3.4, key=flow, legacy_l2, dp port=9)
> 2024-02-23T05:41:49.989Z|405936|ofproto|WARN|br-int: could not add port ovn-bbbbbb-0 (File exists)
> 2024-02-23T05:41:49.989Z|405937|bridge|WARN|could not add network device ovn-bbbbbb-0 to ofproto (File exists)
> —

Hi, Han.  Thanks for the patch!

> 
> Depending on when there are other OVSDB changes, it may take a long time
> for the device to be added successfully, triggered by the next OVS
> iteration.
> 
> (note: native tunnel ports do not have this problem)

I don't think this is correct.  The code path is common for both system
and native tunnels.  I can reproduce the issues in a sandbox with:

$ make -j8 sandbox SANDBOXFLAGS="\-\-dummy='system'"
[tutorial]$ ovs-vsctl add-port br0 tunnel_port \
                -- set Interface tunnel_port \
                       type=geneve options:remote_ip=flow options:key=123
[tutorial]$ ovs-vsctl del-port tunnel_port \
                -- add-port br0 tunnel_port2 \
                -- set Interface tunnel_port2 \
                       type=geneve options:remote_ip=flow options:key=123
ovs-vsctl: Error detected while setting up 'tunnel_port2':
could not add network device tunnel_port2 to ofproto (File exists).
See ovs-vswitchd log for details.

The same should work in a testsuite as well, i.e. we should be able to
create a test for this scenario.

Note: The --dummy=system prevents OVS from replacing tunnel ports with
      dummy ones.

> 
> The problem is how the tunnel port deletion and creation are handled. In
> bridge_reconfigure(), port deletion is handled before creation, to avoid
> such resource conflict. However, for kernel tunnel ports, the real clean
> up is performed at the end of the bridge_reconfigure() in the:
> bridge_run__()->ofproto_run()->...->ofproto_dpif:port_destruct()
> 
> We cannot call bridge_run__() at an earlier point before all
> reconfigurations are done, so this patch tries a generic approach to
> just re-run the bridge_reconfigure() when there are any port creations
> encountered "File exists" error, which indicates a possible resource
> conflict may have happened due to a later deleted port, and retry may
> succeed.
> 
> Signed-off-by: Han Zhou <hzhou@ovn.org>
> ---
> This is RFC because I am not sure if there is a better way to solve the problem
> more specifically by executing the port_destruct for the old port before trying
> to create the new port. The fix may be more complex though.

I don't think re-trying is a good approach in general.  We should likely
just destroy the tnl_port structure right away, similarly to how we clean
up stp, lldp and bundles in ofproto_port_unregister().  Maybe we can add
a new callback similar to bundle_remove() and call tnl_port_del() from it?
(I didn't try, so I'm not 100% sure this will not cause any issues.)

What do you think?

Best regards, Ilya Maximets.
Tao Liu March 8, 2024, 8:17 a.m. UTC | #2
On 3/7/24 5:39 AM, Ilya Maximets wrote:
> On 2/27/24 20:14, Han Zhou wrote:
>> For kernel datapath, when a new tunnel port is created in the same
>> transaction in which an old tunnel port with the same tunnel
>> configuration is deleted, the new tunnel port creation will fail and
>> left in an error state. This can be easily reproduced in OVN by
>> triggering a chassis deletion and addition with the same encap in the
>> same SB DB transaction, such as:
>>
>> ovn-sbctl chassis-add aaaaaa geneve 1.2.3.4
>> ovn-sbctl chassis-del aaaaaa -- chassis-add bbbbbb 1.2.3.4
>>
>> ovs-vsctl show | grep error
>> error: "could not add network device ovn-bbbbbb-0 to ofproto (File exists)"
>>
>> Related logs in OVS:
>> —
>> 2024-02-23T05:41:49.978Z|405933|bridge|INFO|bridge br-int: deleted interface ovn-aaaaaa-0 on port 113
>> 2024-02-23T05:41:49.989Z|405935|tunnel|WARN|ovn-bbbbbb-0: attempting to add tunnel port with same config as port 'ovn-aaaaaa-0' (::->1.2.3.4, key=flow, legacy_l2, dp port=9)
>> 2024-02-23T05:41:49.989Z|405936|ofproto|WARN|br-int: could not add port ovn-bbbbbb-0 (File exists)
>> 2024-02-23T05:41:49.989Z|405937|bridge|WARN|could not add network device ovn-bbbbbb-0 to ofproto (File exists)
>> —
> 
> Hi, Han.  Thanks for the patch!
> 
>>
>> Depending on when there are other OVSDB changes, it may take a long time
>> for the device to be added successfully, triggered by the next OVS
>> iteration.
>>
>> (note: native tunnel ports do not have this problem)
> 
> I don't think this is correct.  The code path is common for both system
> and native tunnels.  I can reproduce the issues in a sandbox with:
> 
> $ make -j8 sandbox SANDBOXFLAGS="\-\-dummy='system'"
> [tutorial]$ ovs-vsctl add-port br0 tunnel_port \
>                  -- set Interface tunnel_port \
>                         type=geneve options:remote_ip=flow options:key=123
> [tutorial]$ ovs-vsctl del-port tunnel_port \
>                  -- add-port br0 tunnel_port2 \
>                  -- set Interface tunnel_port2 \
>                         type=geneve options:remote_ip=flow options:key=123
> ovs-vsctl: Error detected while setting up 'tunnel_port2':
> could not add network device tunnel_port2 to ofproto (File exists).
> See ovs-vswitchd log for details.
> 
> The same should work in a testsuite as well, i.e. we should be able to
> create a test for this scenario.
> 
> Note: The --dummy=system prevents OVS from replacing tunnel ports with
>        dummy ones.
> 
>>
>> The problem is how the tunnel port deletion and creation are handled. In
>> bridge_reconfigure(), port deletion is handled before creation, to avoid
>> such resource conflict. However, for kernel tunnel ports, the real clean
>> up is performed at the end of the bridge_reconfigure() in the:
>> bridge_run__()->ofproto_run()->...->ofproto_dpif:port_destruct()
>>
>> We cannot call bridge_run__() at an earlier point before all
>> reconfigurations are done, so this patch tries a generic approach to
>> just re-run the bridge_reconfigure() when there are any port creations
>> encountered "File exists" error, which indicates a possible resource
>> conflict may have happened due to a later deleted port, and retry may
>> succeed.
>>
>> Signed-off-by: Han Zhou <hzhou@ovn.org>
>> ---
>> This is RFC because I am not sure if there is a better way to solve the problem
>> more specifically by executing the port_destruct for the old port before trying
>> to create the new port. The fix may be more complex though.
> 
> I don't think re-trying is a good approach in general.  We should likely
> just destroy the tnl_port structure right away, similarly to how we clean
> up stp, lldp and bundles in ofproto_port_unregister().  Maybe we can add
> a new callback similar to bundle_remove() and call tnl_port_del() from it?
> (I didn't try, so I'm not 100% sure this will not cause any issues.)
> 
> What do you think?
> 
> Best regards, Ilya Maximets.
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Hi Ilya and Han,

We hit this issue some days ago. As our analysis, it was introduced by
commit fe83f81df977 ("netdev: Remove netdev from global shash when the 
user is changing interface configuration.")

Reproduce:
   ovs-vsctl add-port br-int vxlan0 \
     -- set interface vxlan0 type=vxlan options:remote_ip=10.10.10.1

   sleep 2

   ovs-vsctl --if-exists del-port vxlan0 \
     -- add-port br-int vxlan1 \
     -- set interface vxlan1 type=vxlan options:remote_ip=10.10.10.1
   ovs-vsctl: Error detected while setting up 'vxlan1': could not add
   network device vxlan1 to ofproto (File exists).

vswitchd log:
   bridge|INFO|bridge br-int: added interface vxlan0 on port 1106
   bridge|INFO|bridge br-int: deleted interface vxlan0 on port 1106
   tunnel|WARN|vxlan1: attempting to add tunnel port with same config as 
port 'vxlan0' (::->10.10.10.1, key=0, legacy_l2, dp port=122)
   ofproto|WARN|br-int: could not add port vxlan1 (File exists)
   bridge|WARN|could not add network device vxlan1 to ofproto (File exists)

CallTrace:
   bridge_reconfigure
     bridge_del_ports
       port_destroy
         iface_destroy__
           netdev_remove         <------ netdev vxlan0 removed
     bridge_delete_or_reconfigure_ports
       OFPROTO_PORT_FOR_EACH
         ofproto_port_dump_next
           port_dump_next
           port_query_by_name    <------ netdev_shash do not contain vxlan0
         ofproto_port_del        <------ vxlan0 do not del in ofproto
     bridge_add_ports
       bridge_add_ports__
         iface_create
           iface_do_create
             ofproto_port_add    <------ vxlan1 add failed


We tried to fix this by the following diff:

---
  ofproto/ofproto-dpif.c | 8 +++++---
  1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index d7544ef..654468e 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -3907,14 +3907,16 @@ port_query_by_name(const struct ofproto 
*ofproto_, const char *devname,

      if (sset_contains(&ofproto->ghost_ports, devname)) {
          const char *type = netdev_get_type_from_name(devname);
+        const struct ofport *ofport =
+                        shash_find_data(&ofproto->up.port_by_name, 
devname);
+        if (!type && ofport && ofport->netdev) {
+            type = netdev_get_type(ofport->netdev);
+        }

          /* We may be called before ofproto->up.port_by_name is 
populated with
           * the appropriate ofport.  For this reason, we must get the 
name and
           * type from the netdev layer directly. */
          if (type) {
-            const struct ofport *ofport;
-
-            ofport = shash_find_data(&ofproto->up.port_by_name, devname);
              ofproto_port->ofp_port = ofport ? ofport->ofp_port : 
OFPP_NONE;
              ofproto_port->name = xstrdup(devname);
              ofproto_port->type = xstrdup(type);

Best regards, Tao
Han Zhou March 11, 2024, 5:17 a.m. UTC | #3
On Fri, Mar 8, 2024 at 12:17 AM Tao Liu <taoliu828@163.com> wrote:
>
>
> On 3/7/24 5:39 AM, Ilya Maximets wrote:
> > On 2/27/24 20:14, Han Zhou wrote:
> >> For kernel datapath, when a new tunnel port is created in the same
> >> transaction in which an old tunnel port with the same tunnel
> >> configuration is deleted, the new tunnel port creation will fail and
> >> left in an error state. This can be easily reproduced in OVN by
> >> triggering a chassis deletion and addition with the same encap in the
> >> same SB DB transaction, such as:
> >>
> >> ovn-sbctl chassis-add aaaaaa geneve 1.2.3.4
> >> ovn-sbctl chassis-del aaaaaa -- chassis-add bbbbbb 1.2.3.4
> >>
> >> ovs-vsctl show | grep error
> >> error: "could not add network device ovn-bbbbbb-0 to ofproto (File
exists)"
> >>
> >> Related logs in OVS:
> >> —
> >> 2024-02-23T05:41:49.978Z|405933|bridge|INFO|bridge br-int: deleted
interface ovn-aaaaaa-0 on port 113
> >> 2024-02-23T05:41:49.989Z|405935|tunnel|WARN|ovn-bbbbbb-0: attempting
to add tunnel port with same config as port 'ovn-aaaaaa-0' (::->1.2.3.4,
key=flow, legacy_l2, dp port=9)
> >> 2024-02-23T05:41:49.989Z|405936|ofproto|WARN|br-int: could not add
port ovn-bbbbbb-0 (File exists)
> >> 2024-02-23T05:41:49.989Z|405937|bridge|WARN|could not add network
device ovn-bbbbbb-0 to ofproto (File exists)
> >> —
> >
> > Hi, Han.  Thanks for the patch!
> >
> >>
> >> Depending on when there are other OVSDB changes, it may take a long
time
> >> for the device to be added successfully, triggered by the next OVS
> >> iteration.
> >>
> >> (note: native tunnel ports do not have this problem)
> >
> > I don't think this is correct.  The code path is common for both system
> > and native tunnels.  I can reproduce the issues in a sandbox with:
> >
> > $ make -j8 sandbox SANDBOXFLAGS="\-\-dummy='system'"
> > [tutorial]$ ovs-vsctl add-port br0 tunnel_port \
> >                  -- set Interface tunnel_port \
> >                         type=geneve options:remote_ip=flow
options:key=123
> > [tutorial]$ ovs-vsctl del-port tunnel_port \
> >                  -- add-port br0 tunnel_port2 \
> >                  -- set Interface tunnel_port2 \
> >                         type=geneve options:remote_ip=flow
options:key=123
> > ovs-vsctl: Error detected while setting up 'tunnel_port2':
> > could not add network device tunnel_port2 to ofproto (File exists).
> > See ovs-vswitchd log for details.
> >
> > The same should work in a testsuite as well, i.e. we should be able to
> > create a test for this scenario.
> >
> > Note: The --dummy=system prevents OVS from replacing tunnel ports with
> >        dummy ones.
> >

Thanks Ilya for the correction! --dummy=system is very helpful.

> >>
> >> The problem is how the tunnel port deletion and creation are handled.
In
> >> bridge_reconfigure(), port deletion is handled before creation, to
avoid
> >> such resource conflict. However, for kernel tunnel ports, the real
clean
> >> up is performed at the end of the bridge_reconfigure() in the:
> >> bridge_run__()->ofproto_run()->...->ofproto_dpif:port_destruct()
> >>
> >> We cannot call bridge_run__() at an earlier point before all
> >> reconfigurations are done, so this patch tries a generic approach to
> >> just re-run the bridge_reconfigure() when there are any port creations
> >> encountered "File exists" error, which indicates a possible resource
> >> conflict may have happened due to a later deleted port, and retry may
> >> succeed.
> >>
> >> Signed-off-by: Han Zhou <hzhou@ovn.org>
> >> ---
> >> This is RFC because I am not sure if there is a better way to solve
the problem
> >> more specifically by executing the port_destruct for the old port
before trying
> >> to create the new port. The fix may be more complex though.
> >
> > I don't think re-trying is a good approach in general.  We should likely
> > just destroy the tnl_port structure right away, similarly to how we
clean
> > up stp, lldp and bundles in ofproto_port_unregister().  Maybe we can add
> > a new callback similar to bundle_remove() and call tnl_port_del() from
it?
> > (I didn't try, so I'm not 100% sure this will not cause any issues.)
> >
> > What do you think?
> >
> > Best regards, Ilya Maximets.
> > _______________________________________________
> > dev mailing list
> > dev@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
> Hi Ilya and Han,
>
> We hit this issue some days ago. As our analysis, it was introduced by
> commit fe83f81df977 ("netdev: Remove netdev from global shash when the
> user is changing interface configuration.")
>
> Reproduce:
>    ovs-vsctl add-port br-int vxlan0 \
>      -- set interface vxlan0 type=vxlan options:remote_ip=10.10.10.1
>
>    sleep 2
>
>    ovs-vsctl --if-exists del-port vxlan0 \
>      -- add-port br-int vxlan1 \
>      -- set interface vxlan1 type=vxlan options:remote_ip=10.10.10.1
>    ovs-vsctl: Error detected while setting up 'vxlan1': could not add
>    network device vxlan1 to ofproto (File exists).
>
> vswitchd log:
>    bridge|INFO|bridge br-int: added interface vxlan0 on port 1106
>    bridge|INFO|bridge br-int: deleted interface vxlan0 on port 1106
>    tunnel|WARN|vxlan1: attempting to add tunnel port with same config as
> port 'vxlan0' (::->10.10.10.1, key=0, legacy_l2, dp port=122)
>    ofproto|WARN|br-int: could not add port vxlan1 (File exists)
>    bridge|WARN|could not add network device vxlan1 to ofproto (File
exists)
>
> CallTrace:
>    bridge_reconfigure
>      bridge_del_ports
>        port_destroy
>          iface_destroy__
>            netdev_remove         <------ netdev vxlan0 removed
>      bridge_delete_or_reconfigure_ports
>        OFPROTO_PORT_FOR_EACH
>          ofproto_port_dump_next
>            port_dump_next
>            port_query_by_name    <------ netdev_shash do not contain
vxlan0
>          ofproto_port_del        <------ vxlan0 do not del in ofproto
>      bridge_add_ports
>        bridge_add_ports__
>          iface_create
>            iface_do_create
>              ofproto_port_add    <------ vxlan1 add failed
>
>
> We tried to fix this by the following diff:
>
> ---
>   ofproto/ofproto-dpif.c | 8 +++++---
>   1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
> index d7544ef..654468e 100644
> --- a/ofproto/ofproto-dpif.c
> +++ b/ofproto/ofproto-dpif.c
> @@ -3907,14 +3907,16 @@ port_query_by_name(const struct ofproto
> *ofproto_, const char *devname,
>
>       if (sset_contains(&ofproto->ghost_ports, devname)) {
>           const char *type = netdev_get_type_from_name(devname);
> +        const struct ofport *ofport =
> +                        shash_find_data(&ofproto->up.port_by_name,
> devname);
> +        if (!type && ofport && ofport->netdev) {
> +            type = netdev_get_type(ofport->netdev);
> +        }
>
>           /* We may be called before ofproto->up.port_by_name is
> populated with
>            * the appropriate ofport.  For this reason, we must get the
> name and
>            * type from the netdev layer directly. */
>           if (type) {
> -            const struct ofport *ofport;
> -
> -            ofport = shash_find_data(&ofproto->up.port_by_name, devname);
>               ofproto_port->ofp_port = ofport ? ofport->ofp_port :
> OFPP_NONE;
>               ofproto_port->name = xstrdup(devname);
>               ofproto_port->type = xstrdup(type);
>
> Best regards, Tao
>

Thanks Tao for the patch. Your patch is much cleaner. Would you like to
send a formal patch?
Ilya, what do you think?

Han
Ilya Maximets March 11, 2024, 10:37 a.m. UTC | #4
On 3/11/24 06:17, Han Zhou wrote:
> 
> 
> On Fri, Mar 8, 2024 at 12:17 AM Tao Liu <taoliu828@163.com <mailto:taoliu828@163.com>> wrote:
>>
>>
>> On 3/7/24 5:39 AM, Ilya Maximets wrote:
>> > On 2/27/24 20:14, Han Zhou wrote:
>> >> For kernel datapath, when a new tunnel port is created in the same
>> >> transaction in which an old tunnel port with the same tunnel
>> >> configuration is deleted, the new tunnel port creation will fail and
>> >> left in an error state. This can be easily reproduced in OVN by
>> >> triggering a chassis deletion and addition with the same encap in the
>> >> same SB DB transaction, such as:
>> >>
>> >> ovn-sbctl chassis-add aaaaaa geneve 1.2.3.4
>> >> ovn-sbctl chassis-del aaaaaa -- chassis-add bbbbbb 1.2.3.4
>> >>
>> >> ovs-vsctl show | grep error
>> >> error: "could not add network device ovn-bbbbbb-0 to ofproto (File exists)"
>> >>
>> >> Related logs in OVS:
>> >> —
>> >> 2024-02-23T05:41:49.978Z|405933|bridge|INFO|bridge br-int: deleted interface ovn-aaaaaa-0 on port 113
>> >> 2024-02-23T05:41:49.989Z|405935|tunnel|WARN|ovn-bbbbbb-0: attempting to add tunnel port with same config as port 'ovn-aaaaaa-0' (::->1.2.3.4, key=flow, legacy_l2, dp port=9)
>> >> 2024-02-23T05:41:49.989Z|405936|ofproto|WARN|br-int: could not add port ovn-bbbbbb-0 (File exists)
>> >> 2024-02-23T05:41:49.989Z|405937|bridge|WARN|could not add network device ovn-bbbbbb-0 to ofproto (File exists)
>> >> —
>> >
>> > Hi, Han.  Thanks for the patch!
>> >
>> >>
>> >> Depending on when there are other OVSDB changes, it may take a long time
>> >> for the device to be added successfully, triggered by the next OVS
>> >> iteration.
>> >>
>> >> (note: native tunnel ports do not have this problem)
>> >
>> > I don't think this is correct.  The code path is common for both system
>> > and native tunnels.  I can reproduce the issues in a sandbox with:
>> >
>> > $ make -j8 sandbox SANDBOXFLAGS="\-\-dummy='system'"
>> > [tutorial]$ ovs-vsctl add-port br0 tunnel_port \
>> >                  -- set Interface tunnel_port \
>> >                         type=geneve options:remote_ip=flow options:key=123
>> > [tutorial]$ ovs-vsctl del-port tunnel_port \
>> >                  -- add-port br0 tunnel_port2 \
>> >                  -- set Interface tunnel_port2 \
>> >                         type=geneve options:remote_ip=flow options:key=123
>> > ovs-vsctl: Error detected while setting up 'tunnel_port2':
>> > could not add network device tunnel_port2 to ofproto (File exists).
>> > See ovs-vswitchd log for details.
>> >
>> > The same should work in a testsuite as well, i.e. we should be able to
>> > create a test for this scenario.
>> >
>> > Note: The --dummy=system prevents OVS from replacing tunnel ports with
>> >        dummy ones.
>> >
> 
> Thanks Ilya for the correction! --dummy=system is very helpful.
> 
>> >>
>> >> The problem is how the tunnel port deletion and creation are handled. In
>> >> bridge_reconfigure(), port deletion is handled before creation, to avoid
>> >> such resource conflict. However, for kernel tunnel ports, the real clean
>> >> up is performed at the end of the bridge_reconfigure() in the:
>> >> bridge_run__()->ofproto_run()->...->ofproto_dpif:port_destruct()
>> >>
>> >> We cannot call bridge_run__() at an earlier point before all
>> >> reconfigurations are done, so this patch tries a generic approach to
>> >> just re-run the bridge_reconfigure() when there are any port creations
>> >> encountered "File exists" error, which indicates a possible resource
>> >> conflict may have happened due to a later deleted port, and retry may
>> >> succeed.
>> >>
>> >> Signed-off-by: Han Zhou <hzhou@ovn.org <mailto:hzhou@ovn.org>>
>> >> ---
>> >> This is RFC because I am not sure if there is a better way to solve the problem
>> >> more specifically by executing the port_destruct for the old port before trying
>> >> to create the new port. The fix may be more complex though.
>> >
>> > I don't think re-trying is a good approach in general.  We should likely
>> > just destroy the tnl_port structure right away, similarly to how we clean
>> > up stp, lldp and bundles in ofproto_port_unregister().  Maybe we can add
>> > a new callback similar to bundle_remove() and call tnl_port_del() from it?
>> > (I didn't try, so I'm not 100% sure this will not cause any issues.)
>> >
>> > What do you think?
>> >
>> > Best regards, Ilya Maximets.
>> > _______________________________________________
>> > dev mailing list
>> > dev@openvswitch.org <mailto:dev@openvswitch.org>
>> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev <https://mail.openvswitch.org/mailman/listinfo/ovs-dev>
>>
>> Hi Ilya and Han,
>>
>> We hit this issue some days ago. As our analysis, it was introduced by
>> commit fe83f81df977 ("netdev: Remove netdev from global shash when the
>> user is changing interface configuration.")
>>
>> Reproduce:
>>    ovs-vsctl add-port br-int vxlan0 \
>>      -- set interface vxlan0 type=vxlan options:remote_ip=10.10.10.1
>>
>>    sleep 2
>>
>>    ovs-vsctl --if-exists del-port vxlan0 \
>>      -- add-port br-int vxlan1 \
>>      -- set interface vxlan1 type=vxlan options:remote_ip=10.10.10.1
>>    ovs-vsctl: Error detected while setting up 'vxlan1': could not add
>>    network device vxlan1 to ofproto (File exists).
>>
>> vswitchd log:
>>    bridge|INFO|bridge br-int: added interface vxlan0 on port 1106
>>    bridge|INFO|bridge br-int: deleted interface vxlan0 on port 1106
>>    tunnel|WARN|vxlan1: attempting to add tunnel port with same config as
>> port 'vxlan0' (::->10.10.10.1, key=0, legacy_l2, dp port=122)
>>    ofproto|WARN|br-int: could not add port vxlan1 (File exists)
>>    bridge|WARN|could not add network device vxlan1 to ofproto (File exists)
>>
>> CallTrace:
>>    bridge_reconfigure
>>      bridge_del_ports
>>        port_destroy
>>          iface_destroy__
>>            netdev_remove         <------ netdev vxlan0 removed
>>      bridge_delete_or_reconfigure_ports
>>        OFPROTO_PORT_FOR_EACH
>>          ofproto_port_dump_next
>>            port_dump_next
>>            port_query_by_name    <------ netdev_shash do not contain vxlan0
>>          ofproto_port_del        <------ vxlan0 do not del in ofproto
>>      bridge_add_ports
>>        bridge_add_ports__
>>          iface_create
>>            iface_do_create
>>              ofproto_port_add    <------ vxlan1 add failed
>>
>>
>> We tried to fix this by the following diff:
>>
>> ---
>>   ofproto/ofproto-dpif.c | 8 +++++---
>>   1 file changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
>> index d7544ef..654468e 100644
>> --- a/ofproto/ofproto-dpif.c
>> +++ b/ofproto/ofproto-dpif.c
>> @@ -3907,14 +3907,16 @@ port_query_by_name(const struct ofproto
>> *ofproto_, const char *devname,
>>
>>       if (sset_contains(&ofproto->ghost_ports, devname)) {
>>           const char *type = netdev_get_type_from_name(devname);
>> +        const struct ofport *ofport =
>> +                        shash_find_data(&ofproto->up.port_by_name,
>> devname);
>> +        if (!type && ofport && ofport->netdev) {
>> +            type = netdev_get_type(ofport->netdev);
>> +        }
>>
>>           /* We may be called before ofproto->up.port_by_name is
>> populated with
>>            * the appropriate ofport.  For this reason, we must get the
>> name and
>>            * type from the netdev layer directly. */
>>           if (type) {
>> -            const struct ofport *ofport;
>> -
>> -            ofport = shash_find_data(&ofproto->up.port_by_name, devname);
>>               ofproto_port->ofp_port = ofport ? ofport->ofp_port :
>> OFPP_NONE;
>>               ofproto_port->name = xstrdup(devname);
>>               ofproto_port->type = xstrdup(type);
>>
>> Best regards, Tao
>>
> 
> Thanks Tao for the patch. Your patch is much cleaner. Would you like to send a formal patch?
> Ilya, what do you think?

I didn't test this one, but it does indeed look like a better solution
to the problem!

port_query_by_name() function will need an update in the comment that
talks why we're retrieving the type from the netdev layer directly and
maybe moving this comment above the netdev_get_type_from_name call.
And we'll need a unit test for this issue.  But otherwise the code above
looks good.  Tao, if you can made these changes and post a formal patch,
that would be great.

Thanks for the analysis!

Best regards, Ilya Maximets.
Tao Liu March 12, 2024, 2 p.m. UTC | #5
On 03/11  , Ilya Maximets wrote:
> On 3/11/24 06:17, Han Zhou wrote:
> > 
> > 
> > On Fri, Mar 8, 2024 at 12:17 AM Tao Liu <taoliu828@163.com <mailto:taoliu828@163.com>> wrote:
> >>
> >>
> >> On 3/7/24 5:39 AM, Ilya Maximets wrote:
> >> > On 2/27/24 20:14, Han Zhou wrote:
> >> >> For kernel datapath, when a new tunnel port is created in the same
> >> >> transaction in which an old tunnel port with the same tunnel
> >> >> configuration is deleted, the new tunnel port creation will fail and
> >> >> left in an error state. This can be easily reproduced in OVN by
> >> >> triggering a chassis deletion and addition with the same encap in the
> >> >> same SB DB transaction, such as:
> >> >>
> >> >> ovn-sbctl chassis-add aaaaaa geneve 1.2.3.4
> >> >> ovn-sbctl chassis-del aaaaaa -- chassis-add bbbbbb 1.2.3.4
> >> >>
> >> >> ovs-vsctl show | grep error
> >> >> error: "could not add network device ovn-bbbbbb-0 to ofproto (File exists)"
> >> >>
> >> >> Related logs in OVS:
> >> >> —
> >> >> 2024-02-23T05:41:49.978Z|405933|bridge|INFO|bridge br-int: deleted interface ovn-aaaaaa-0 on port 113
> >> >> 2024-02-23T05:41:49.989Z|405935|tunnel|WARN|ovn-bbbbbb-0: attempting to add tunnel port with same config as port 'ovn-aaaaaa-0' (::->1.2.3.4, key=flow, legacy_l2, dp port=9)
> >> >> 2024-02-23T05:41:49.989Z|405936|ofproto|WARN|br-int: could not add port ovn-bbbbbb-0 (File exists)
> >> >> 2024-02-23T05:41:49.989Z|405937|bridge|WARN|could not add network device ovn-bbbbbb-0 to ofproto (File exists)
> >> >> —
> >> >
> >> > Hi, Han.  Thanks for the patch!
> >> >
> >> >>
> >> >> Depending on when there are other OVSDB changes, it may take a long time
> >> >> for the device to be added successfully, triggered by the next OVS
> >> >> iteration.
> >> >>
> >> >> (note: native tunnel ports do not have this problem)
> >> >
> >> > I don't think this is correct.  The code path is common for both system
> >> > and native tunnels.  I can reproduce the issues in a sandbox with:
> >> >
> >> > $ make -j8 sandbox SANDBOXFLAGS="\-\-dummy='system'"
> >> > [tutorial]$ ovs-vsctl add-port br0 tunnel_port \
> >> >                  -- set Interface tunnel_port \
> >> >                         type=geneve options:remote_ip=flow options:key=123
> >> > [tutorial]$ ovs-vsctl del-port tunnel_port \
> >> >                  -- add-port br0 tunnel_port2 \
> >> >                  -- set Interface tunnel_port2 \
> >> >                         type=geneve options:remote_ip=flow options:key=123
> >> > ovs-vsctl: Error detected while setting up 'tunnel_port2':
> >> > could not add network device tunnel_port2 to ofproto (File exists).
> >> > See ovs-vswitchd log for details.
> >> >
> >> > The same should work in a testsuite as well, i.e. we should be able to
> >> > create a test for this scenario.
> >> >
> >> > Note: The --dummy=system prevents OVS from replacing tunnel ports with
> >> >        dummy ones.
> >> >
> > 
> > Thanks Ilya for the correction! --dummy=system is very helpful.
> > 
> >> >>
> >> >> The problem is how the tunnel port deletion and creation are handled. In
> >> >> bridge_reconfigure(), port deletion is handled before creation, to avoid
> >> >> such resource conflict. However, for kernel tunnel ports, the real clean
> >> >> up is performed at the end of the bridge_reconfigure() in the:
> >> >> bridge_run__()->ofproto_run()->...->ofproto_dpif:port_destruct()
> >> >>
> >> >> We cannot call bridge_run__() at an earlier point before all
> >> >> reconfigurations are done, so this patch tries a generic approach to
> >> >> just re-run the bridge_reconfigure() when there are any port creations
> >> >> encountered "File exists" error, which indicates a possible resource
> >> >> conflict may have happened due to a later deleted port, and retry may
> >> >> succeed.
> >> >>
> >> >> Signed-off-by: Han Zhou <hzhou@ovn.org <mailto:hzhou@ovn.org>>
> >> >> ---
> >> >> This is RFC because I am not sure if there is a better way to solve the problem
> >> >> more specifically by executing the port_destruct for the old port before trying
> >> >> to create the new port. The fix may be more complex though.
> >> >
> >> > I don't think re-trying is a good approach in general.  We should likely
> >> > just destroy the tnl_port structure right away, similarly to how we clean
> >> > up stp, lldp and bundles in ofproto_port_unregister().  Maybe we can add
> >> > a new callback similar to bundle_remove() and call tnl_port_del() from it?
> >> > (I didn't try, so I'm not 100% sure this will not cause any issues.)
> >> >
> >> > What do you think?
> >> >
> >> > Best regards, Ilya Maximets.
> >> > _______________________________________________
> >> > dev mailing list
> >> > dev@openvswitch.org <mailto:dev@openvswitch.org>
> >> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev <https://mail.openvswitch.org/mailman/listinfo/ovs-dev>
> >>
> >> Hi Ilya and Han,
> >>
> >> We hit this issue some days ago. As our analysis, it was introduced by
> >> commit fe83f81df977 ("netdev: Remove netdev from global shash when the
> >> user is changing interface configuration.")
> >>
> >> Reproduce:
> >>    ovs-vsctl add-port br-int vxlan0 \
> >>      -- set interface vxlan0 type=vxlan options:remote_ip=10.10.10.1
> >>
> >>    sleep 2
> >>
> >>    ovs-vsctl --if-exists del-port vxlan0 \
> >>      -- add-port br-int vxlan1 \
> >>      -- set interface vxlan1 type=vxlan options:remote_ip=10.10.10.1
> >>    ovs-vsctl: Error detected while setting up 'vxlan1': could not add
> >>    network device vxlan1 to ofproto (File exists).
> >>
> >> vswitchd log:
> >>    bridge|INFO|bridge br-int: added interface vxlan0 on port 1106
> >>    bridge|INFO|bridge br-int: deleted interface vxlan0 on port 1106
> >>    tunnel|WARN|vxlan1: attempting to add tunnel port with same config as
> >> port 'vxlan0' (::->10.10.10.1, key=0, legacy_l2, dp port=122)
> >>    ofproto|WARN|br-int: could not add port vxlan1 (File exists)
> >>    bridge|WARN|could not add network device vxlan1 to ofproto (File exists)
> >>
> >> CallTrace:
> >>    bridge_reconfigure
> >>      bridge_del_ports
> >>        port_destroy
> >>          iface_destroy__
> >>            netdev_remove         <------ netdev vxlan0 removed
> >>      bridge_delete_or_reconfigure_ports
> >>        OFPROTO_PORT_FOR_EACH
> >>          ofproto_port_dump_next
> >>            port_dump_next
> >>            port_query_by_name    <------ netdev_shash do not contain vxlan0
> >>          ofproto_port_del        <------ vxlan0 do not del in ofproto
> >>      bridge_add_ports
> >>        bridge_add_ports__
> >>          iface_create
> >>            iface_do_create
> >>              ofproto_port_add    <------ vxlan1 add failed
> >>
> >>
> >> We tried to fix this by the following diff:
> >>
> >> ---
> >>   ofproto/ofproto-dpif.c | 8 +++++---
> >>   1 file changed, 5 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
> >> index d7544ef..654468e 100644
> >> --- a/ofproto/ofproto-dpif.c
> >> +++ b/ofproto/ofproto-dpif.c
> >> @@ -3907,14 +3907,16 @@ port_query_by_name(const struct ofproto
> >> *ofproto_, const char *devname,
> >>
> >>       if (sset_contains(&ofproto->ghost_ports, devname)) {
> >>           const char *type = netdev_get_type_from_name(devname);
> >> +        const struct ofport *ofport =
> >> +                        shash_find_data(&ofproto->up.port_by_name,
> >> devname);
> >> +        if (!type && ofport && ofport->netdev) {
> >> +            type = netdev_get_type(ofport->netdev);
> >> +        }
> >>
> >>           /* We may be called before ofproto->up.port_by_name is
> >> populated with
> >>            * the appropriate ofport.  For this reason, we must get the
> >> name and
> >>            * type from the netdev layer directly. */
> >>           if (type) {
> >> -            const struct ofport *ofport;
> >> -
> >> -            ofport = shash_find_data(&ofproto->up.port_by_name, devname);
> >>               ofproto_port->ofp_port = ofport ? ofport->ofp_port :
> >> OFPP_NONE;
> >>               ofproto_port->name = xstrdup(devname);
> >>               ofproto_port->type = xstrdup(type);
> >>
> >> Best regards, Tao
> >>
> > 
> > Thanks Tao for the patch. Your patch is much cleaner. Would you like to send a formal patch?
> > Ilya, what do you think?
> 
> I didn't test this one, but it does indeed look like a better solution
> to the problem!
> 
> port_query_by_name() function will need an update in the comment that
> talks why we're retrieving the type from the netdev layer directly and
> maybe moving this comment above the netdev_get_type_from_name call.
> And we'll need a unit test for this issue.  But otherwise the code above
> looks good.  Tao, if you can made these changes and post a formal patch,
> that would be great.
> 
> Thanks for the analysis!
> 
> Best regards, Ilya Maximets.

Hi Ilya and Han,

Thanks for your comments, I will send a formal patch.

Best regards, Tao.
diff mbox series

Patch

diff --git a/tests/tunnel.at b/tests/tunnel.at
index 71e7c2df4eae..d570c78790c7 100644
--- a/tests/tunnel.at
+++ b/tests/tunnel.at
@@ -1059,6 +1059,7 @@  AT_CHECK([ovs-vsctl add-port br0 p2 -- set Interface p2 type=vxlan \
 
 AT_CHECK([grep 'p2: could not set configuration (File exists)' ovs-vswitchd.log | sed "s/^.*\(p2:.*\)$/\1/"], [0],
   [p2: could not set configuration (File exists)
+p2: could not set configuration (File exists)
 ])
 
 OVS_VSWITCHD_STOP(["/p2: VXLAN-GBP, and non-VXLAN-GBP tunnels can't be configured on the same dst_port/d
diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
index 95a65fcdcd5e..9057da98e6c0 100644
--- a/vswitchd/bridge.c
+++ b/vswitchd/bridge.c
@@ -278,7 +278,7 @@  static void bridge_delete_ofprotos(void);
 static void bridge_delete_or_reconfigure_ports(struct bridge *);
 static void bridge_del_ports(struct bridge *,
                              const struct shash *wanted_ports);
-static void bridge_add_ports(struct bridge *,
+static bool bridge_add_ports(struct bridge *,
                              const struct shash *wanted_ports);
 
 static void bridge_configure_datapath_id(struct bridge *);
@@ -333,8 +333,8 @@  static void mirror_refresh_stats(struct mirror *);
 
 static void iface_configure_lacp(struct iface *,
                                  struct lacp_member_settings *);
-static bool iface_create(struct bridge *, const struct ovsrec_interface *,
-                         const struct ovsrec_port *);
+static int iface_create(struct bridge *, const struct ovsrec_interface *,
+                        const struct ovsrec_port *);
 static bool iface_is_internal(const struct ovsrec_interface *iface,
                               const struct ovsrec_bridge *br);
 static const char *iface_get_type(const struct ovsrec_interface *,
@@ -858,7 +858,9 @@  datapath_reconfigure(const struct ovsrec_open_vswitch *cfg)
     }
 }
 
-static void
+/* Returns true if any ports addition failed and may need retry. Otherwise
+ * return false. */
+static bool
 bridge_reconfigure(const struct ovsrec_open_vswitch *ovs_cfg)
 {
     struct sockaddr_in *managers;
@@ -943,8 +945,11 @@  bridge_reconfigure(const struct ovsrec_open_vswitch *ovs_cfg)
 
     config_ofproto_types(&ovs_cfg->other_config);
 
+    bool need_retry = false;
     HMAP_FOR_EACH (br, node, &all_bridges) {
-        bridge_add_ports(br, &br->wanted_ports);
+        if (bridge_add_ports(br, &br->wanted_ports)) {
+            need_retry = true;
+        }
         shash_destroy(&br->wanted_ports);
     }
 
@@ -1003,6 +1008,7 @@  bridge_reconfigure(const struct ovsrec_open_vswitch *ovs_cfg)
      * narrow race window in which e.g. ofproto/trace will not recognize the
      * new configuration (sometimes this causes unit test failures). */
     bridge_run__();
+    return need_retry;
 }
 
 /* Delete ofprotos which aren't configured or have the wrong type.  Create
@@ -1197,11 +1203,14 @@  bridge_delete_or_reconfigure_ports(struct bridge *br)
     sset_destroy(&ofproto_ports);
 }
 
-static void
+/* Returns true if any ports addition failed and may need retry. Otherwise
+ * return false. */
+static bool
 bridge_add_ports__(struct bridge *br, const struct shash *wanted_ports,
                    bool with_requested_port)
 {
     struct shash_node *port_node;
+    bool need_retry = false;
 
     SHASH_FOR_EACH (port_node, wanted_ports) {
         const struct ovsrec_port *port_cfg = port_node->data;
@@ -1216,23 +1225,29 @@  bridge_add_ports__(struct bridge *br, const struct shash *wanted_ports,
                 struct iface *iface = iface_lookup(br, iface_cfg->name);
 
                 if (!iface) {
-                    iface_create(br, iface_cfg, port_cfg);
+                    if (EEXIST == iface_create(br, iface_cfg, port_cfg)) {
+                        need_retry = true;
+                    }
                 }
             }
         }
     }
+    return need_retry;
 }
 
-static void
+/* Returns true if any ports addition failed and may need retry. Otherwise
+ * return false. */
+static bool
 bridge_add_ports(struct bridge *br, const struct shash *wanted_ports)
 {
     /* First add interfaces that request a particular port number. */
-    bridge_add_ports__(br, wanted_ports, true);
+    bool ret1 = bridge_add_ports__(br, wanted_ports, true);
 
     /* Then add interfaces that want automatic port number assignment.
      * We add these afterward to avoid accidentally taking a specifically
      * requested port number. */
-    bridge_add_ports__(br, wanted_ports, false);
+    bool ret2 = bridge_add_ports__(br, wanted_ports, false);
+    return ret1 || ret2;
 }
 
 static void
@@ -2157,8 +2172,8 @@  error:
  * automatically allocated for the iface.  Takes ownership of and
  * deallocates 'if_cfg'.
  *
- * Return true if an iface is successfully created, false otherwise. */
-static bool
+ * Return 0 if an iface is successfully created, error otherwise. */
+static int
 iface_create(struct bridge *br, const struct ovsrec_interface *iface_cfg,
              const struct ovsrec_port *port_cfg)
 {
@@ -2175,7 +2190,7 @@  iface_create(struct bridge *br, const struct ovsrec_interface *iface_cfg,
     if (error) {
         iface_clear_db_record(iface_cfg, errp);
         free(errp);
-        return false;
+        return error;
     }
 
     /* Get or create the port structure. */
@@ -2223,7 +2238,7 @@  iface_create(struct bridge *br, const struct ovsrec_interface *iface_cfg,
         }
     }
 
-    return true;
+    return 0;
 }
 
 /* Set forward BPDU option. */
@@ -3364,7 +3379,9 @@  bridge_run(void)
 
         idl_seqno = ovsdb_idl_get_seqno(idl);
         txn = ovsdb_idl_txn_create(idl);
-        bridge_reconfigure(cfg ? cfg : &null_cfg);
+        if (bridge_reconfigure(cfg ? cfg : &null_cfg)) {
+            bridge_reconfigure(cfg ? cfg : &null_cfg);
+        }
 
         if (cfg) {
             ovsrec_open_vswitch_set_cur_cfg(cfg, cfg->next_cfg);