diff mbox series

[ovs-dev] ofproto-dpif: Let the dpif report when a port is a duplicate.

Message ID 20180621225353.9303-1-blp@ovn.org
State Accepted
Headers show
Series [ovs-dev] ofproto-dpif: Let the dpif report when a port is a duplicate. | expand

Commit Message

Ben Pfaff June 21, 2018, 10:53 p.m. UTC
The port_add() function checks whether the port about to be added to the
dpif is already present and adds it only if it is not.  This duplicates a
check also present (and necessary) in each dpif and races with it as well.
When a dpif has a large number of ports, the check can be expensive (it is
not efficiently implemented).  It would be nice to made the check cheaper,
but it also seems reasonable to do as done in this patch and just let the
dpif report the duplication.

Reported-by: Haifeng Lin <haifeng.lin@huawei.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
---
 lib/dpif.c             | 9 +++++++--
 ofproto/ofproto-dpif.c | 7 +++----
 2 files changed, 10 insertions(+), 6 deletions(-)

Comments

Justin Pettit June 28, 2018, 12:08 a.m. UTC | #1
> On Jun 21, 2018, at 3:53 PM, Ben Pfaff <blp@ovn.org> wrote:
> 
> The port_add() function checks whether the port about to be added to the
> dpif is already present and adds it only if it is not.  This duplicates a
> check also present (and necessary) in each dpif and races with it as well.
> When a dpif has a large number of ports, the check can be expensive (it is
> not efficiently implemented).  It would be nice to made the check cheaper,
> but it also seems reasonable to do as done in this patch and just let the
> dpif report the duplication.
> 
> Reported-by: Haifeng Lin <haifeng.lin@huawei.com>
> Signed-off-by: Ben Pfaff <blp@ovn.org>

Acked-by: Justin Pettit <jpettit@ovn.org>

--Justin
Ben Pfaff July 5, 2018, 10 p.m. UTC | #2
On Wed, Jun 27, 2018 at 05:08:31PM -0700, Justin Pettit wrote:
> 
> > On Jun 21, 2018, at 3:53 PM, Ben Pfaff <blp@ovn.org> wrote:
> > 
> > The port_add() function checks whether the port about to be added to the
> > dpif is already present and adds it only if it is not.  This duplicates a
> > check also present (and necessary) in each dpif and races with it as well.
> > When a dpif has a large number of ports, the check can be expensive (it is
> > not efficiently implemented).  It would be nice to made the check cheaper,
> > but it also seems reasonable to do as done in this patch and just let the
> > dpif report the duplication.
> > 
> > Reported-by: Haifeng Lin <haifeng.lin@huawei.com>
> > Signed-off-by: Ben Pfaff <blp@ovn.org>
> 
> Acked-by: Justin Pettit <jpettit@ovn.org>

Thanks, applied to master.
Flavio Leitner Dec. 5, 2018, 5:12 p.m. UTC | #3
Hi Ben,

This patch introduced a regression in OSP environments using internal
ports in other netns. Their networking configuration is lost when
the service is restarted because the ports are recreated now.

Before the patch it checked using netlink if the port with a specific
"name" was already there. I believe that's the check you referred as
expensive below. Anyways, the check is a lookup in all ports attached
to the DP regardless of the port's netns.

After the patch it relies on the kernel to identify that situation.
Unfortunately the only protection there is register_netdevice() which
fails only if the port with that name exists in the current netns.

If the port is in another netns, it will get a new dp_port and because
of that userspace will delete the old port. At this point the original
port is gone from the other netns and there a fresh port in the current
netns.

I think the optimization is a good idea, so I came up with this kernel
patch to make sure we are not adding another vport with the same name.
It resolved the issue in my small env (want to do more tests though).

diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index 252adfb6fc0b..291b4a71a910 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -2022,6 +2022,11 @@ static int ovs_vport_cmd_new(struct sk_buff *skb, struct genl_info *info)
 		return -ENOMEM;
 
 	ovs_lock();
+	vport = lookup_vport(sock_net(skb->sk), ovs_header, a);
+	err = -EEXIST;
+	if (!IS_ERR(vport) && vport)
+		goto exit_unlock_free;
+
 restart:
 	dp = get_dp(sock_net(skb->sk), ovs_header->dp_ifindex);
 	err = -ENODEV;


However, OSP users using unpatched kernel with OVS 2.10 might trigger
the bug, so I wonder if we should revert the patch in 2.10 and work
on an improved fix for 2.11. Perhaps we can detect if the kernel fix
is in there (or not) by trying to add the same port twice once and
use that as a hint. Perhaps there is something cheaper in dpif to
verify if the vport is there that is not vulnerable to races.

Thanks,
fbl


On Thu, Jun 21, 2018 at 03:53:53PM -0700, Ben Pfaff wrote:
> The port_add() function checks whether the port about to be added to the
> dpif is already present and adds it only if it is not.  This duplicates a
> check also present (and necessary) in each dpif and races with it as well.
> When a dpif has a large number of ports, the check can be expensive (it is
> not efficiently implemented).  It would be nice to made the check cheaper,
> but it also seems reasonable to do as done in this patch and just let the
> dpif report the duplication.
> 
> Reported-by: Haifeng Lin <haifeng.lin@huawei.com>
> Signed-off-by: Ben Pfaff <blp@ovn.org>
> ---
>  lib/dpif.c             | 9 +++++++--
>  ofproto/ofproto-dpif.c | 7 +++----
>  2 files changed, 10 insertions(+), 6 deletions(-)
> 
> diff --git a/lib/dpif.c b/lib/dpif.c
> index f6a7f6a72e18..d78330bef3b8 100644
> --- a/lib/dpif.c
> +++ b/lib/dpif.c
> @@ -591,8 +591,13 @@ dpif_port_add(struct dpif *dpif, struct netdev *netdev, odp_port_t *port_nop)
>              netdev_ports_insert(netdev, dpif->dpif_class, &dpif_port);
>          }
>      } else {
> -        VLOG_WARN_RL(&error_rl, "%s: failed to add %s as port: %s",
> -                     dpif_name(dpif), netdev_name, ovs_strerror(error));
> +        if (error != EEXIST) {
> +            VLOG_WARN_RL(&error_rl, "%s: failed to add %s as port: %s",
> +                         dpif_name(dpif), netdev_name, ovs_strerror(error));
> +        } else {
> +            /* It's fairly common for upper layers to try to add a duplicate
> +             * port, and they know how to handle it properly. */
> +        }
>          port_no = ODPP_NONE;
>      }
>      if (port_nop) {
> diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
> index ca4582cd5064..771be2fcc88a 100644
> --- a/ofproto/ofproto-dpif.c
> +++ b/ofproto/ofproto-dpif.c
> @@ -3683,11 +3683,10 @@ port_add(struct ofproto *ofproto_, struct netdev *netdev)
>      }
>  
>      dp_port_name = netdev_vport_get_dpif_port(netdev, namebuf, sizeof namebuf);
> -    if (!dpif_port_exists(ofproto->backer->dpif, dp_port_name)) {
> -        odp_port_t port_no = ODPP_NONE;
> -        int error;
>  
> -        error = dpif_port_add(ofproto->backer->dpif, netdev, &port_no);
> +    odp_port_t port_no = ODPP_NONE;
> +    int error = dpif_port_add(ofproto->backer->dpif, netdev, &port_no);
> +    if (error != EEXIST) {
>          if (error) {
>              return error;
>          }
> -- 
> 2.16.1
> 
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Ben Pfaff Dec. 12, 2018, 4:36 p.m. UTC | #4
On Wed, Dec 05, 2018 at 03:12:08PM -0200, Flavio Leitner wrote:
> 
> Hi Ben,
> 
> This patch introduced a regression in OSP environments using internal
> ports in other netns. Their networking configuration is lost when
> the service is restarted because the ports are recreated now.
> 
> Before the patch it checked using netlink if the port with a specific
> "name" was already there. I believe that's the check you referred as
> expensive below. Anyways, the check is a lookup in all ports attached
> to the DP regardless of the port's netns.
> 
> After the patch it relies on the kernel to identify that situation.
> Unfortunately the only protection there is register_netdevice() which
> fails only if the port with that name exists in the current netns.
> 
> If the port is in another netns, it will get a new dp_port and because
> of that userspace will delete the old port. At this point the original
> port is gone from the other netns and there a fresh port in the current
> netns.
> 
> I think the optimization is a good idea, so I came up with this kernel
> patch to make sure we are not adding another vport with the same name.
> It resolved the issue in my small env (want to do more tests though).
> 
> diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
> index 252adfb6fc0b..291b4a71a910 100644
> --- a/net/openvswitch/datapath.c
> +++ b/net/openvswitch/datapath.c
> @@ -2022,6 +2022,11 @@ static int ovs_vport_cmd_new(struct sk_buff *skb, struct genl_info *info)
>  		return -ENOMEM;
>  
>  	ovs_lock();
> +	vport = lookup_vport(sock_net(skb->sk), ovs_header, a);
> +	err = -EEXIST;
> +	if (!IS_ERR(vport) && vport)
> +		goto exit_unlock_free;
> +
>  restart:
>  	dp = get_dp(sock_net(skb->sk), ovs_header->dp_ifindex);
>  	err = -ENODEV;
> 
> 
> However, OSP users using unpatched kernel with OVS 2.10 might trigger
> the bug, so I wonder if we should revert the patch in 2.10 and work
> on an improved fix for 2.11. Perhaps we can detect if the kernel fix
> is in there (or not) by trying to add the same port twice once and
> use that as a hint. Perhaps there is something cheaper in dpif to
> verify if the vport is there that is not vulnerable to races.

This seems reasonable to me.

Would you mind coming up with a series that reverts the OVS 2.10 patch
and separately adds the improved fix?
Flavio Leitner Dec. 13, 2018, 1:03 p.m. UTC | #5
On Wed, Dec 12, 2018 at 08:36:09AM -0800, Ben Pfaff wrote:
> On Wed, Dec 05, 2018 at 03:12:08PM -0200, Flavio Leitner wrote:
> > 
> > Hi Ben,
> > 
> > This patch introduced a regression in OSP environments using internal
> > ports in other netns. Their networking configuration is lost when
> > the service is restarted because the ports are recreated now.
> > 
> > Before the patch it checked using netlink if the port with a specific
> > "name" was already there. I believe that's the check you referred as
> > expensive below. Anyways, the check is a lookup in all ports attached
> > to the DP regardless of the port's netns.
> > 
> > After the patch it relies on the kernel to identify that situation.
> > Unfortunately the only protection there is register_netdevice() which
> > fails only if the port with that name exists in the current netns.
> > 
> > If the port is in another netns, it will get a new dp_port and because
> > of that userspace will delete the old port. At this point the original
> > port is gone from the other netns and there a fresh port in the current
> > netns.
> > 
> > I think the optimization is a good idea, so I came up with this kernel
> > patch to make sure we are not adding another vport with the same name.
> > It resolved the issue in my small env (want to do more tests though).
> > 
> > diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
> > index 252adfb6fc0b..291b4a71a910 100644
> > --- a/net/openvswitch/datapath.c
> > +++ b/net/openvswitch/datapath.c
> > @@ -2022,6 +2022,11 @@ static int ovs_vport_cmd_new(struct sk_buff *skb, struct genl_info *info)
> >  		return -ENOMEM;
> >  
> >  	ovs_lock();
> > +	vport = lookup_vport(sock_net(skb->sk), ovs_header, a);
> > +	err = -EEXIST;
> > +	if (!IS_ERR(vport) && vport)
> > +		goto exit_unlock_free;
> > +
> >  restart:
> >  	dp = get_dp(sock_net(skb->sk), ovs_header->dp_ifindex);
> >  	err = -ENODEV;
> > 
> > 
> > However, OSP users using unpatched kernel with OVS 2.10 might trigger
> > the bug, so I wonder if we should revert the patch in 2.10 and work
> > on an improved fix for 2.11. Perhaps we can detect if the kernel fix
> > is in there (or not) by trying to add the same port twice once and
> > use that as a hint. Perhaps there is something cheaper in dpif to
> > verify if the vport is there that is not vulnerable to races.
> 
> This seems reasonable to me.
> 
> Would you mind coming up with a series that reverts the OVS 2.10 patch
> and separately adds the improved fix?

Sure, I will submit the revert ASAP, but I am going out in vacations and
I don't have the improved fix ready. I mean, I have the kernel part
above, but I am not sure yet if we could do better than that or not, so
it will take more time.

I will continue when I come back if nobody fixes that in the meantime.

Thanks,
Flavio Leitner Dec. 13, 2018, 2:32 p.m. UTC | #6
On Thu, Dec 13, 2018 at 11:03:40AM -0200, Flavio Leitner wrote:
> On Wed, Dec 12, 2018 at 08:36:09AM -0800, Ben Pfaff wrote:
> > Would you mind coming up with a series that reverts the OVS 2.10 patch
> > and separately adds the improved fix?
> 
> Sure, I will submit the revert ASAP, but I am going out in vacations and

Revert posted:
https://mail.openvswitch.org/pipermail/ovs-dev/2018-December/354496.html
Thanks,
diff mbox series

Patch

diff --git a/lib/dpif.c b/lib/dpif.c
index f6a7f6a72e18..d78330bef3b8 100644
--- a/lib/dpif.c
+++ b/lib/dpif.c
@@ -591,8 +591,13 @@  dpif_port_add(struct dpif *dpif, struct netdev *netdev, odp_port_t *port_nop)
             netdev_ports_insert(netdev, dpif->dpif_class, &dpif_port);
         }
     } else {
-        VLOG_WARN_RL(&error_rl, "%s: failed to add %s as port: %s",
-                     dpif_name(dpif), netdev_name, ovs_strerror(error));
+        if (error != EEXIST) {
+            VLOG_WARN_RL(&error_rl, "%s: failed to add %s as port: %s",
+                         dpif_name(dpif), netdev_name, ovs_strerror(error));
+        } else {
+            /* It's fairly common for upper layers to try to add a duplicate
+             * port, and they know how to handle it properly. */
+        }
         port_no = ODPP_NONE;
     }
     if (port_nop) {
diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index ca4582cd5064..771be2fcc88a 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -3683,11 +3683,10 @@  port_add(struct ofproto *ofproto_, struct netdev *netdev)
     }
 
     dp_port_name = netdev_vport_get_dpif_port(netdev, namebuf, sizeof namebuf);
-    if (!dpif_port_exists(ofproto->backer->dpif, dp_port_name)) {
-        odp_port_t port_no = ODPP_NONE;
-        int error;
 
-        error = dpif_port_add(ofproto->backer->dpif, netdev, &port_no);
+    odp_port_t port_no = ODPP_NONE;
+    int error = dpif_port_add(ofproto->backer->dpif, netdev, &port_no);
+    if (error != EEXIST) {
         if (error) {
             return error;
         }