From patchwork Sat Oct 29 14:59:01 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jiri Pirko X-Patchwork-Id: 122538 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 11A871007D8 for ; Sun, 30 Oct 2011 01:59:25 +1100 (EST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933299Ab1J2O7N (ORCPT ); Sat, 29 Oct 2011 10:59:13 -0400 Received: from mx1.redhat.com ([209.132.183.28]:58381 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932744Ab1J2O7M (ORCPT ); Sat, 29 Oct 2011 10:59:12 -0400 Received: from int-mx12.intmail.prod.int.phx2.redhat.com (int-mx12.intmail.prod.int.phx2.redhat.com [10.5.11.25]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id p9TEx3HM022644 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Sat, 29 Oct 2011 10:59:03 -0400 Received: from localhost (ovpn-116-18.ams2.redhat.com [10.36.116.18]) by int-mx12.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id p9TEx2aU013281; Sat, 29 Oct 2011 10:59:02 -0400 Date: Sat, 29 Oct 2011 16:59:01 +0200 From: Jiri Pirko To: Eric Dumazet Cc: John Fastabend , David Miller , "jesse@nicira.com" , "hans.schillstrom@ericsson.com" , "mbizon@freebox.fr" , "netdev@vger.kernel.org" , "fubar@us.ibm.com" Subject: Re: [net-next PATCH] net: allow vlan traffic to be received under bond Message-ID: <20111029145900.GA2053@minipsycho.orion> References: <4E972301.4030004@intel.com> <20111018.234723.1235424375699917420.davem@davemloft.net> <1319796053.23112.92.camel@edumazet-laptop> <1319799986.23112.101.camel@edumazet-laptop> <4EAB62F0.2040101@intel.com> <1319883746.2586.33.camel@edumazet-laptop> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <1319883746.2586.33.camel@edumazet-laptop> User-Agent: Mutt/1.5.21 (2010-09-15) X-Scanned-By: MIMEDefang 2.68 on 10.5.11.25 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Sat, Oct 29, 2011 at 12:22:26PM CEST, eric.dumazet@gmail.com wrote: >Le vendredi 28 octobre 2011 à 19:20 -0700, John Fastabend a écrit : > >> Thanks Eric! Thought about this some and I haven't come up >> with anything better yet. Even though this might be a slight >> hack I would prefer this to reverting the patch. >> >> I'll think about this more tomorrow. Would you be against >> submitting this patch? > >I cant submit this patch, because its a hack and partial fix. > >For Unicast packets, we still do the wrong thing : setting their >pkt_type to PACKET_OTHERHOST before the call to rx_handler : > >In this case, bond_handle_frame() wont handle this packet correctly in >some cases (BOND_MODE_ALB ...). I suppose bridge might be confused as >well. So other problems remain. > >We should delay the PACKET_OTHERHOST setting to the last moment, that is >the last time vlan_do_receive() is called. > >What about following patch instead ? > >[PATCH] vlan: allow nested vlan_do_receive() > >commit 2425717b27eb (net: allow vlan traffic to be received under bond) >broke ARP processing on vlan on top of bonding. > > +-------+ >eth0 --| bond0 |---bond0.103 >eth1 --| | > +-------+ > >52870.115435: skb_gro_reset_offset <-napi_gro_receive >52870.115435: dev_gro_receive <-napi_gro_receive >52870.115435: napi_skb_finish <-napi_gro_receive >52870.115435: netif_receive_skb <-napi_skb_finish >52870.115435: get_rps_cpu <-netif_receive_skb >52870.115435: __netif_receive_skb <-netif_receive_skb >52870.115436: vlan_do_receive <-__netif_receive_skb >52870.115436: bond_handle_frame <-__netif_receive_skb >52870.115436: vlan_do_receive <-__netif_receive_skb >52870.115436: arp_rcv <-__netif_receive_skb >52870.115436: kfree_skb <-arp_rcv > >Packet is dropped in arp_rcv() because its pkt_type was set to >PACKET_OTHERHOST in the first vlan_do_receive() call, since no eth0.103 >exists. > >We really need to change pkt_type only if no more rx_handler is about to >be called for the packet. > >Signed-off-by: Eric Dumazet >--- > include/linux/if_vlan.h | 8 +++++--- > net/8021q/vlan_core.c | 7 +++++-- > net/core/dev.c | 4 ++-- > 3 files changed, 12 insertions(+), 7 deletions(-) > >diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h >index 44da482..95874ff 100644 >--- a/include/linux/if_vlan.h >+++ b/include/linux/if_vlan.h >@@ -106,7 +106,8 @@ extern struct net_device *__vlan_find_dev_deep(struct net_device *real_dev, > extern struct net_device *vlan_dev_real_dev(const struct net_device *dev); > extern u16 vlan_dev_vlan_id(const struct net_device *dev); > >-extern bool vlan_do_receive(struct sk_buff **skb); >+extern bool vlan_do_receive(struct sk_buff **skb, >+ rx_handler_func_t *rx_handler); > extern struct sk_buff *vlan_untag(struct sk_buff *skb); > > #else >@@ -128,9 +129,10 @@ static inline u16 vlan_dev_vlan_id(const struct net_device *dev) > return 0; > } > >-static inline bool vlan_do_receive(struct sk_buff **skb) >+static inline bool vlan_do_receive(struct sk_buff **skb, >+ rx_handler_func_t *rx_handler) > { >- if ((*skb)->vlan_tci & VLAN_VID_MASK) >+ if (((*skb)->vlan_tci & VLAN_VID_MASK) && !rx_handler) > (*skb)->pkt_type = PACKET_OTHERHOST; > return false; > } >diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c >index f1f2f7b..3ec1ada 100644 >--- a/net/8021q/vlan_core.c >+++ b/net/8021q/vlan_core.c >@@ -4,7 +4,7 @@ > #include > #include "vlan.h" > >-bool vlan_do_receive(struct sk_buff **skbp) >+bool vlan_do_receive(struct sk_buff **skbp, rx_handler_func_t *rx_handler) > { > struct sk_buff *skb = *skbp; > u16 vlan_id = skb->vlan_tci & VLAN_VID_MASK; >@@ -13,7 +13,10 @@ bool vlan_do_receive(struct sk_buff **skbp) > > vlan_dev = vlan_find_dev(skb->dev, vlan_id); > if (!vlan_dev) { >- if (vlan_id) >+ /* Only the last call to vlan_do_receive() should change >+ * pkt_type to PACKET_OTHERHOST >+ */ >+ if (vlan_id && !rx_handler) > skb->pkt_type = PACKET_OTHERHOST; > return false; > } >diff --git a/net/core/dev.c b/net/core/dev.c >index edcf019..40976b4 100644 >--- a/net/core/dev.c >+++ b/net/core/dev.c >@@ -3283,18 +3283,18 @@ another_round: > ncls: > #endif > >+ rx_handler = rcu_dereference(skb->dev->rx_handler); > if (vlan_tx_tag_present(skb)) { > if (pt_prev) { > ret = deliver_skb(skb, pt_prev, orig_dev); > pt_prev = NULL; > } >- if (vlan_do_receive(&skb)) >+ if (vlan_do_receive(&skb, rx_handler)) I must say I do not like passing rx_handler out like this. Apart it's not nice, it might be misleading.... How about something like following instead? I must test it but I believe it should resolve the problem. --- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h index 44da482..165a487 100644 --- a/include/linux/if_vlan.h +++ b/include/linux/if_vlan.h @@ -130,8 +130,6 @@ static inline u16 vlan_dev_vlan_id(const struct net_device *dev) static inline bool vlan_do_receive(struct sk_buff **skb) { - if ((*skb)->vlan_tci & VLAN_VID_MASK) - (*skb)->pkt_type = PACKET_OTHERHOST; return false; } @@ -141,6 +139,14 @@ static inline struct sk_buff *vlan_untag(struct sk_buff *skb) } #endif +static inline void vlan_handle_leftover(struct sk_buff *skb) +{ + u16 vlan_id = skb->vlan_tci & VLAN_VID_MASK; + + if (vlan_id) + skb->pkt_type = PACKET_OTHERHOST; +} + /** * vlan_insert_tag - regular VLAN tag inserting * @skb: skbuff to tag diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c index f1f2f7b..540da12 100644 --- a/net/8021q/vlan_core.c +++ b/net/8021q/vlan_core.c @@ -12,11 +12,8 @@ bool vlan_do_receive(struct sk_buff **skbp) struct vlan_pcpu_stats *rx_stats; vlan_dev = vlan_find_dev(skb->dev, vlan_id); - if (!vlan_dev) { - if (vlan_id) - skb->pkt_type = PACKET_OTHERHOST; + if (!vlan_dev) return false; - } skb = *skbp = skb_share_check(skb, GFP_ATOMIC); if (unlikely(!skb)) diff --git a/net/core/dev.c b/net/core/dev.c index b7ba81a..6fdfcc9 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -3314,6 +3314,14 @@ ncls: } } + if (vlan_tx_tag_present(skb)) { + /* + * Tag is still present here. That means there's no device + * set up for this vlan id. So handle these leftovers here. + */ + vlan_handle_leftover(skb); + } + /* deliver only exact match when indicated */ null_or_dev = deliver_exact ? skb->dev : NULL;