mbox series

[ovs-dev,0/3] revert port duplicate checking optimization

Message ID 20181213142448.8504-1-fbl@redhat.com
Headers show
Series revert port duplicate checking optimization | expand

Message

Flavio Leitner Dec. 13, 2018, 2:24 p.m. UTC
The optimization introduced a regression in OSP environments using
internal ports in other netns. Their networking configuration is lost
when the service is restarted because the ports are recreated now.

Before the patch it checked using netlink if the port with a specific
"name" was already there. The check is a lookup in all ports attached
to the DP regardless of the port's netns.

After the patch it relies on the kernel to identify that situation.
Unfortunately the only protection there is register_netdevice() which
fails only if the port with that name exists in the current netns.

If the port is in another netns, it will get a new dp_port and because
of that userspace will delete the old port. At this point the original
port is gone from the other netns and there a fresh port in the current
netns.

This patchset reverts the original commit and the two other follow ups.

Flavio Leitner (3):
  Revert "dpif-netlink: Don't destroy and recreate port if it exists"
  Revert "ofproto-dpif: Check for EBUSY as well"
  Revert "ofproto-dpif: Let the dpif report when a port is a duplicate."

 lib/dpif-netlink.c     | 4 ++--
 lib/dpif.c             | 9 ++-------
 ofproto/ofproto-dpif.c | 7 ++++---
 3 files changed, 8 insertions(+), 12 deletions(-)

Comments

Flavio Leitner Dec. 13, 2018, 2:30 p.m. UTC | #1
This should be applied to branch-2.10 as well.

(BTW, I had CC few folks in the patchset, but I am only seeing Guru, so
 I am adding them to this email just in case)

Thanks,
fbl

On Thu, Dec 13, 2018 at 12:24:45PM -0200, Flavio Leitner wrote:
> The optimization introduced a regression in OSP environments using
> internal ports in other netns. Their networking configuration is lost
> when the service is restarted because the ports are recreated now.
> 
> Before the patch it checked using netlink if the port with a specific
> "name" was already there. The check is a lookup in all ports attached
> to the DP regardless of the port's netns.
> 
> After the patch it relies on the kernel to identify that situation.
> Unfortunately the only protection there is register_netdevice() which
> fails only if the port with that name exists in the current netns.
> 
> If the port is in another netns, it will get a new dp_port and because
> of that userspace will delete the old port. At this point the original
> port is gone from the other netns and there a fresh port in the current
> netns.
> 
> This patchset reverts the original commit and the two other follow ups.
> 
> Flavio Leitner (3):
>   Revert "dpif-netlink: Don't destroy and recreate port if it exists"
>   Revert "ofproto-dpif: Check for EBUSY as well"
>   Revert "ofproto-dpif: Let the dpif report when a port is a duplicate."
> 
>  lib/dpif-netlink.c     | 4 ++--
>  lib/dpif.c             | 9 ++-------
>  ofproto/ofproto-dpif.c | 7 ++++---
>  3 files changed, 8 insertions(+), 12 deletions(-)
> 
> -- 
> 2.17.2
Flavio Leitner Jan. 23, 2019, 9:55 a.m. UTC | #2
Hi,

Just a reminder about this revert.

On Thu, Dec 13, 2018 at 12:30:47PM -0200, Flavio Leitner wrote:
> 
> This should be applied to branch-2.10 as well.

And now branch-2.11

Thanks,
fbl

> 
> (BTW, I had CC few folks in the patchset, but I am only seeing Guru, so
>  I am adding them to this email just in case)
> 
> Thanks,
> fbl
> 
> On Thu, Dec 13, 2018 at 12:24:45PM -0200, Flavio Leitner wrote:
> > The optimization introduced a regression in OSP environments using
> > internal ports in other netns. Their networking configuration is lost
> > when the service is restarted because the ports are recreated now.
> > 
> > Before the patch it checked using netlink if the port with a specific
> > "name" was already there. The check is a lookup in all ports attached
> > to the DP regardless of the port's netns.
> > 
> > After the patch it relies on the kernel to identify that situation.
> > Unfortunately the only protection there is register_netdevice() which
> > fails only if the port with that name exists in the current netns.
> > 
> > If the port is in another netns, it will get a new dp_port and because
> > of that userspace will delete the old port. At this point the original
> > port is gone from the other netns and there a fresh port in the current
> > netns.
> > 
> > This patchset reverts the original commit and the two other follow ups.
> > 
> > Flavio Leitner (3):
> >   Revert "dpif-netlink: Don't destroy and recreate port if it exists"
> >   Revert "ofproto-dpif: Check for EBUSY as well"
> >   Revert "ofproto-dpif: Let the dpif report when a port is a duplicate."
> > 
> >  lib/dpif-netlink.c     | 4 ++--
> >  lib/dpif.c             | 9 ++-------
> >  ofproto/ofproto-dpif.c | 7 ++++---
> >  3 files changed, 8 insertions(+), 12 deletions(-)
> > 
> > -- 
> > 2.17.2
> 
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Ben Pfaff Jan. 23, 2019, 8:13 p.m. UTC | #3
I'm not so happy about reverting without having the followup ready.  How
close are we to having the followup?  Basically we've got two problems
here.  Without the revert, we have one of them; with the revert, we have
the other one.  I'd rather not trade one for the other, that's not
ideal.  It would be much better to fix both in one shot.

On Wed, Jan 23, 2019 at 07:55:36AM -0200, Flavio Leitner wrote:
> 
> Hi,
> 
> Just a reminder about this revert.
> 
> On Thu, Dec 13, 2018 at 12:30:47PM -0200, Flavio Leitner wrote:
> > 
> > This should be applied to branch-2.10 as well.
> 
> And now branch-2.11
> 
> Thanks,
> fbl
> 
> > 
> > (BTW, I had CC few folks in the patchset, but I am only seeing Guru, so
> >  I am adding them to this email just in case)
> > 
> > Thanks,
> > fbl
> > 
> > On Thu, Dec 13, 2018 at 12:24:45PM -0200, Flavio Leitner wrote:
> > > The optimization introduced a regression in OSP environments using
> > > internal ports in other netns. Their networking configuration is lost
> > > when the service is restarted because the ports are recreated now.
> > > 
> > > Before the patch it checked using netlink if the port with a specific
> > > "name" was already there. The check is a lookup in all ports attached
> > > to the DP regardless of the port's netns.
> > > 
> > > After the patch it relies on the kernel to identify that situation.
> > > Unfortunately the only protection there is register_netdevice() which
> > > fails only if the port with that name exists in the current netns.
> > > 
> > > If the port is in another netns, it will get a new dp_port and because
> > > of that userspace will delete the old port. At this point the original
> > > port is gone from the other netns and there a fresh port in the current
> > > netns.
> > > 
> > > This patchset reverts the original commit and the two other follow ups.
> > > 
> > > Flavio Leitner (3):
> > >   Revert "dpif-netlink: Don't destroy and recreate port if it exists"
> > >   Revert "ofproto-dpif: Check for EBUSY as well"
> > >   Revert "ofproto-dpif: Let the dpif report when a port is a duplicate."
> > > 
> > >  lib/dpif-netlink.c     | 4 ++--
> > >  lib/dpif.c             | 9 ++-------
> > >  ofproto/ofproto-dpif.c | 7 ++++---
> > >  3 files changed, 8 insertions(+), 12 deletions(-)
> > > 
> > > -- 
> > > 2.17.2
> > 
> > _______________________________________________
> > dev mailing list
> > dev@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
Flavio Leitner Jan. 24, 2019, 2:30 p.m. UTC | #4
On Wed, Jan 23, 2019 at 12:13:25PM -0800, Ben Pfaff wrote:
> I'm not so happy about reverting without having the followup ready.  How
> close are we to having the followup?  Basically we've got two problems
> here.  Without the revert, we have one of them; with the revert, we have
> the other one.  I'd rather not trade one for the other, that's not
> ideal.  It would be much better to fix both in one shot.

Hi Ben,

I guess one is a performance problem and the other is a broken
environment that has no workaround (to my knowledge).

The kernel fix and possibly a follow up in userspace is on my
ToDo list, but I haven't had a chance to get to it yet.

Thanks,
fbl

> 
> On Wed, Jan 23, 2019 at 07:55:36AM -0200, Flavio Leitner wrote:
> > 
> > Hi,
> > 
> > Just a reminder about this revert.
> > 
> > On Thu, Dec 13, 2018 at 12:30:47PM -0200, Flavio Leitner wrote:
> > > 
> > > This should be applied to branch-2.10 as well.
> > 
> > And now branch-2.11
> > 
> > Thanks,
> > fbl
> > 
> > > 
> > > (BTW, I had CC few folks in the patchset, but I am only seeing Guru, so
> > >  I am adding them to this email just in case)
> > > 
> > > Thanks,
> > > fbl
> > > 
> > > On Thu, Dec 13, 2018 at 12:24:45PM -0200, Flavio Leitner wrote:
> > > > The optimization introduced a regression in OSP environments using
> > > > internal ports in other netns. Their networking configuration is lost
> > > > when the service is restarted because the ports are recreated now.
> > > > 
> > > > Before the patch it checked using netlink if the port with a specific
> > > > "name" was already there. The check is a lookup in all ports attached
> > > > to the DP regardless of the port's netns.
> > > > 
> > > > After the patch it relies on the kernel to identify that situation.
> > > > Unfortunately the only protection there is register_netdevice() which
> > > > fails only if the port with that name exists in the current netns.
> > > > 
> > > > If the port is in another netns, it will get a new dp_port and because
> > > > of that userspace will delete the old port. At this point the original
> > > > port is gone from the other netns and there a fresh port in the current
> > > > netns.
> > > > 
> > > > This patchset reverts the original commit and the two other follow ups.
> > > > 
> > > > Flavio Leitner (3):
> > > >   Revert "dpif-netlink: Don't destroy and recreate port if it exists"
> > > >   Revert "ofproto-dpif: Check for EBUSY as well"
> > > >   Revert "ofproto-dpif: Let the dpif report when a port is a duplicate."
> > > > 
> > > >  lib/dpif-netlink.c     | 4 ++--
> > > >  lib/dpif.c             | 9 ++-------
> > > >  ofproto/ofproto-dpif.c | 7 ++++---
> > > >  3 files changed, 8 insertions(+), 12 deletions(-)
> > > > 
> > > > -- 
> > > > 2.17.2
> > > 
> > > _______________________________________________
> > > dev mailing list
> > > dev@openvswitch.org
> > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
Ben Pfaff Jan. 26, 2019, 3:11 a.m. UTC | #5
On Thu, Jan 24, 2019 at 12:30:08PM -0200, Flavio Leitner wrote:
> On Wed, Jan 23, 2019 at 12:13:25PM -0800, Ben Pfaff wrote:
> > I'm not so happy about reverting without having the followup ready.  How
> > close are we to having the followup?  Basically we've got two problems
> > here.  Without the revert, we have one of them; with the revert, we have
> > the other one.  I'd rather not trade one for the other, that's not
> > ideal.  It would be much better to fix both in one shot.
> 
> Hi Ben,
> 
> I guess one is a performance problem and the other is a broken
> environment that has no workaround (to my knowledge).

I guess you're right.  I'm not happy with it, but I reverted this on
master, branch-2.11, and branch-2.10.