Message ID | 3d191a3c51bd564da8b0c3ffe1e9c90fa7bd4d7b.1466590952.git.pabeni@redhat.com |
---|---|
State | Rejected, archived |
Delegated to: | David Miller |
Headers | show |
Wed, Jun 22, 2016 at 12:25:15PM CEST, pabeni@redhat.com wrote: >Before the commit 0dfe17823945 ("net: vlan: goto another_round >instead of calling __netif_receive_skb"), on tagged skb ingress, >ptype specific protocol matches were delivered only to the >related vlan device, if any. >After said commit, jumping back to the 'another_round' label, allows >the later ptype specific check to match both orig_dev and skb->dev, >delivering the skb to both the vlan device and the underlying >device. >This cause i.e. packet sockets bound to a specific protocol type on >one of said devices to receive also frames really targeting the >other device. >This commit resets orig_dev before performing another round due to >vlan processing, allowing the skb to be delivered once again only >to the vlan specific ptypes. I don't get why vlan should behave differently in this comparing to other stacked devices like bond/team/br etc. Could you please explain? > >Fixes: 0dfe17823945 ("net: vlan: goto another_round instead of calling __netif_receive_skb") >Reported-by: Ryan Liu <Ryan.Liu@alcatel-lucent.com> >Reported-by: Cliff Chen <Cliff.Chen@alcatel-lucent.com> >Signed-off-by: Paolo Abeni <pabeni@redhat.com> >--- > net/core/dev.c | 9 +++++++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > >diff --git a/net/core/dev.c b/net/core/dev.c >index 904ff43..9d08dd6 100644 >--- a/net/core/dev.c >+++ b/net/core/dev.c >@@ -4144,10 +4144,15 @@ ncls: > ret = deliver_skb(skb, pt_prev, orig_dev); > pt_prev = NULL; > } >- if (vlan_do_receive(&skb)) >+ if (vlan_do_receive(&skb)) { >+ /* avoid delivering to ptype registered on >+ * vlan's underlying device only >+ */ >+ orig_dev = skb->dev; > goto another_round; >- else if (unlikely(!skb)) >+ } else if (unlikely(!skb)) { > goto out; >+ } > } > > rx_handler = rcu_dereference(skb->dev->rx_handler); >-- >1.8.3.1 >
On Wed, 2016-06-22 at 12:55 +0200, Jiri Pirko wrote: > Wed, Jun 22, 2016 at 12:25:15PM CEST, pabeni@redhat.com wrote: > >Before the commit 0dfe17823945 ("net: vlan: goto another_round > >instead of calling __netif_receive_skb"), on tagged skb ingress, > >ptype specific protocol matches were delivered only to the > >related vlan device, if any. > >After said commit, jumping back to the 'another_round' label, allows > >the later ptype specific check to match both orig_dev and skb->dev, > >delivering the skb to both the vlan device and the underlying > >device. > >This cause i.e. packet sockets bound to a specific protocol type on > >one of said devices to receive also frames really targeting the > >other device. > >This commit resets orig_dev before performing another round due to > >vlan processing, allowing the skb to be delivered once again only > >to the vlan specific ptypes. > > I don't get why vlan should behave differently in this comparing to > other stacked devices like bond/team/br etc. > > Could you please explain? vlan behavior has changed with the commit 0dfe17823945 (which is old, but still...), this patch is trying to restore the old one. The new behavior fouls existing applications which where expecting ptype specific match to be delivered only on the relevant device. AFAIC jumping back to the 'another_round' is possible for team, bond, macvlan, macsec and ipvlan device, beyond vlan devices. I think the latter should be treated differently, beyond history reasons, because the vlan device processing, stripping the vlan tag, actually changes the ethernet type present into the ethernet header in respect to the wire packet layout; strict match on that value should give different results before and after vlan stripping. Paolo
Wed, Jun 22, 2016 at 02:21:29PM CEST, pabeni@redhat.com wrote: >On Wed, 2016-06-22 at 12:55 +0200, Jiri Pirko wrote: >> Wed, Jun 22, 2016 at 12:25:15PM CEST, pabeni@redhat.com wrote: >> >Before the commit 0dfe17823945 ("net: vlan: goto another_round >> >instead of calling __netif_receive_skb"), on tagged skb ingress, >> >ptype specific protocol matches were delivered only to the >> >related vlan device, if any. >> >After said commit, jumping back to the 'another_round' label, allows >> >the later ptype specific check to match both orig_dev and skb->dev, >> >delivering the skb to both the vlan device and the underlying >> >device. >> >This cause i.e. packet sockets bound to a specific protocol type on >> >one of said devices to receive also frames really targeting the >> >other device. >> >This commit resets orig_dev before performing another round due to >> >vlan processing, allowing the skb to be delivered once again only >> >to the vlan specific ptypes. >> >> I don't get why vlan should behave differently in this comparing to >> other stacked devices like bond/team/br etc. >> >> Could you please explain? > >vlan behavior has changed with the commit 0dfe17823945 (which is old, >but still...), this patch is trying to restore the old one. > >The new behavior fouls existing applications which where expecting ptype >specific match to be delivered only on the relevant device. The patch only unified the behaviour so everyone know what to expect regardless of the device type. > >AFAIC jumping back to the 'another_round' is possible for team, bond, >macvlan, macsec and ipvlan device, beyond vlan devices. I think the >latter should be treated differently, beyond history reasons, because >the vlan device processing, stripping the vlan tag, actually changes the >ethernet type present into the ethernet header in respect to the wire >packet layout; strict match on that value should give different results >before and after vlan stripping. The vlan header stripping happens before all of this, either in HW or skb_vlan_untag, so you can never match on that.
From: Jiri Pirko <jiri@resnulli.us> Date: Wed, 22 Jun 2016 14:40:50 +0200 > The patch only unified the behaviour so everyone know what to expect > regardless of the device type. +1 And reverting this behavior back is going to hurt more people than it will help, so I am not going to apply this, sorry.
On Sun, 2016-06-26 at 16:06 -0400, David Miller wrote: > From: Jiri Pirko <jiri@resnulli.us> > Date: Wed, 22 Jun 2016 14:40:50 +0200 > > > The patch only unified the behaviour so everyone know what to expect > > regardless of the device type. > > +1 > > And reverting this behavior back is going to hurt more people than it > will help, so I am not going to apply this, sorry. Would a /proc/net/vlan sysctl to control the switch from the new to the old behavior (with default to the new) make this acceptable ? Recent users will not be affected, and users of the old behavior could manage kernel upgrade without breaking their applications. Thank you, Paolo
Mon, Jun 27, 2016 at 11:28:32AM CEST, pabeni@redhat.com wrote: >On Sun, 2016-06-26 at 16:06 -0400, David Miller wrote: >> From: Jiri Pirko <jiri@resnulli.us> >> Date: Wed, 22 Jun 2016 14:40:50 +0200 >> >> > The patch only unified the behaviour so everyone know what to expect >> > regardless of the device type. >> >> +1 >> >> And reverting this behavior back is going to hurt more people than it >> will help, so I am not going to apply this, sorry. > >Would a /proc/net/vlan sysctl to control the switch from the new to the >old behavior (with default to the new) make this acceptable ? Ugh, please don't :( > >Recent users will not be affected, and users of the old behavior could >manage kernel upgrade without breaking their applications. > >Thank you, > >Paolo > >
From: Paolo Abeni <pabeni@redhat.com> Date: Mon, 27 Jun 2016 11:28:32 +0200 > On Sun, 2016-06-26 at 16:06 -0400, David Miller wrote: >> From: Jiri Pirko <jiri@resnulli.us> >> Date: Wed, 22 Jun 2016 14:40:50 +0200 >> >> > The patch only unified the behaviour so everyone know what to expect >> > regardless of the device type. >> >> +1 >> >> And reverting this behavior back is going to hurt more people than it >> will help, so I am not going to apply this, sorry. > > Would a /proc/net/vlan sysctl to control the switch from the new to the > old behavior (with default to the new) make this acceptable ? No, sorry. Please fix your applications while you have your attention on the issue and have the opportunity and impetus to do so.
diff --git a/net/core/dev.c b/net/core/dev.c index 904ff43..9d08dd6 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -4144,10 +4144,15 @@ ncls: ret = deliver_skb(skb, pt_prev, orig_dev); pt_prev = NULL; } - if (vlan_do_receive(&skb)) + if (vlan_do_receive(&skb)) { + /* avoid delivering to ptype registered on + * vlan's underlying device only + */ + orig_dev = skb->dev; goto another_round; - else if (unlikely(!skb)) + } else if (unlikely(!skb)) { goto out; + } } rx_handler = rcu_dereference(skb->dev->rx_handler);
Before the commit 0dfe17823945 ("net: vlan: goto another_round instead of calling __netif_receive_skb"), on tagged skb ingress, ptype specific protocol matches were delivered only to the related vlan device, if any. After said commit, jumping back to the 'another_round' label, allows the later ptype specific check to match both orig_dev and skb->dev, delivering the skb to both the vlan device and the underlying device. This cause i.e. packet sockets bound to a specific protocol type on one of said devices to receive also frames really targeting the other device. This commit resets orig_dev before performing another round due to vlan processing, allowing the skb to be delivered once again only to the vlan specific ptypes. Fixes: 0dfe17823945 ("net: vlan: goto another_round instead of calling __netif_receive_skb") Reported-by: Ryan Liu <Ryan.Liu@alcatel-lucent.com> Reported-by: Cliff Chen <Cliff.Chen@alcatel-lucent.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> --- net/core/dev.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)