diff mbox

[v2] net: deliver skbs on inactive slaves to exact matches

Message ID 20100603193011.4916.12354.stgit@jf-dev2-dcblab
State Accepted, archived
Delegated to: David Miller
Headers show

Commit Message

John Fastabend June 3, 2010, 7:30 p.m. UTC
Currently, the accelerated receive path for VLAN's will
drop packets if the real device is an inactive slave and
is not one of the special pkts tested for in
skb_bond_should_drop().  This behavior is different then
the non-accelerated path and for pkts over a bonded vlan.

For example,

vlanx -> bond0 -> ethx

will be dropped in the vlan path and not delivered to any
packet handlers at all.  However,

bond0 -> vlanx -> ethx

and

bond0 -> ethx

will be delivered to handlers that match the exact dev,
because the VLAN path checks the real_dev which is not a
slave and netif_recv_skb() doesn't drop frames but only
delivers them to exact matches.

This patch adds a sk_buff flag which is used for tagging
skbs that would previously been dropped and allows the
skb to continue to skb_netif_recv().  Here we add
logic to check for the deliver_no_wcard flag and if it
is set only deliver to handlers that match exactly.  This
makes both paths above consistent and gives pkt handlers
a way to identify skbs that come from inactive slaves.
Without this patch in some configurations skbs will be
delivered to handlers with exact matches and in others
be dropped out right in the vlan path.

I have tested the following 4 configurations in failover modes
and load balancing modes.

# bond0 -> ethx

# vlanx -> bond0 -> ethx

# bond0 -> vlanx -> ethx

# bond0 -> ethx
            |
  vlanx -> --

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---

 include/linux/skbuff.h |    5 ++++-
 net/8021q/vlan_core.c  |    4 ++--
 net/core/dev.c         |   17 ++++++++++++++---
 3 files changed, 20 insertions(+), 6 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

David Miller June 11, 2010, 5:31 a.m. UTC | #1
From: John Fastabend <john.r.fastabend@intel.com>
Date: Thu, 03 Jun 2010 12:30:11 -0700

> Currently, the accelerated receive path for VLAN's will
> drop packets if the real device is an inactive slave and
> is not one of the special pkts tested for in
> skb_bond_should_drop().  This behavior is different then
> the non-accelerated path and for pkts over a bonded vlan.
 ...
> This patch adds a sk_buff flag which is used for tagging
> skbs that would previously been dropped and allows the
> skb to continue to skb_netif_recv().  Here we add
> logic to check for the deliver_no_wcard flag and if it
> is set only deliver to handlers that match exactly.  This
> makes both paths above consistent and gives pkt handlers
> a way to identify skbs that come from inactive slaves.
> Without this patch in some configurations skbs will be
> delivered to handlers with exact matches and in others
> be dropped out right in the vlan path.
 ...
> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>

I can't figure out a better way to fix this myself so I've
applied your patch, thanks John!
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael S. Tsirkin June 14, 2010, 1:21 p.m. UTC | #2
On Thu, Jun 03, 2010 at 12:30:11PM -0700, John Fastabend wrote:
> Currently, the accelerated receive path for VLAN's will
> drop packets if the real device is an inactive slave and
> is not one of the special pkts tested for in
> skb_bond_should_drop().  This behavior is different then
> the non-accelerated path and for pkts over a bonded vlan.
> 
> For example,
> 
> vlanx -> bond0 -> ethx
> 
> will be dropped in the vlan path and not delivered to any
> packet handlers at all.  However,
> 
> bond0 -> vlanx -> ethx
> 
> and
> 
> bond0 -> ethx
> 
> will be delivered to handlers that match the exact dev,
> because the VLAN path checks the real_dev which is not a
> slave and netif_recv_skb() doesn't drop frames but only
> delivers them to exact matches.
> 
> This patch adds a sk_buff flag which is used for tagging
> skbs that would previously been dropped and allows the
> skb to continue to skb_netif_recv().  Here we add
> logic to check for the deliver_no_wcard flag and if it
> is set only deliver to handlers that match exactly.  This
> makes both paths above consistent and gives pkt handlers
> a way to identify skbs that come from inactive slaves.
> Without this patch in some configurations skbs will be
> delivered to handlers with exact matches and in others
> be dropped out right in the vlan path.
> 
> I have tested the following 4 configurations in failover modes
> and load balancing modes.
> 
> # bond0 -> ethx
> 
> # vlanx -> bond0 -> ethx
> 
> # bond0 -> vlanx -> ethx
> 
> # bond0 -> ethx
>             |
>   vlanx -> --
> 
> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>

I am using qemu with both tap and slirp (userspace) networking.
This works fine under 2.6.35-rc2 but breaks under 2.6.35-rc3:
ssh over slirp stops working sometimes right away
and sometimes after a bit of use, connection times out.

Git bisect gave me this commit:
597a264b1a9c7e36d1728f677c66c5c1f7e3b837.

Reverting 597a264b1a9c7e36d1728f677c66c5c1f7e3b837 fixes the issue
for me.

I'm short for time now so didn't debug this further.
I opened a bugzilla to track this issue:
https://bugzilla.kernel.org/show_bug.cgi?id=16204
Eric Dumazet June 14, 2010, 1:38 p.m. UTC | #3
Le lundi 14 juin 2010 à 15:34 +0200, Eric Dumazet a écrit :

> [PATCH] net: fix deliver_no_wcard regression on loopback device
> 
> deliver_no_wcard is not being set in skb_copy_header.
> In the skb_cloned case it is not being cleared and
> may cause the skb to be dropped when the loopback device
> pushes it back up the stack.
> 
> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
> Acked-by: Eric Dumazet <eric.dumazet@gmail.com>

Oh I forgot :

Tested-by: Markus Trippelsdorf <markus@trippelsdorf.de>

> ---
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 9f07e74..bcf2fa3 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -532,6 +532,7 @@ static void __copy_skb_header(struct sk_buff *new, const struct sk_buff *old)
>  	new->ip_summed		= old->ip_summed;
>  	skb_copy_queue_mapping(new, old);
>  	new->priority		= old->priority;
> +	new->deliver_no_wcard	= old->deliver_no_wcard;
>  #if defined(CONFIG_IP_VS) || defined(CONFIG_IP_VS_MODULE)
>  	new->ipvs_property	= old->ipvs_property;
>  #endif
> 


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael S. Tsirkin June 14, 2010, 2:10 p.m. UTC | #4
On Mon, Jun 14, 2010 at 03:38:28PM +0200, Eric Dumazet wrote:
> Le lundi 14 juin 2010 à 15:34 +0200, Eric Dumazet a écrit :
> 
> > [PATCH] net: fix deliver_no_wcard regression on loopback device
> > 
> > deliver_no_wcard is not being set in skb_copy_header.
> > In the skb_cloned case it is not being cleared and
> > may cause the skb to be dropped when the loopback device
> > pushes it back up the stack.
> > 
> > Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
> > Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
> 
> Oh I forgot :
> 
> Tested-by: Markus Trippelsdorf <markus@trippelsdorf.de>

Grr. Could have saved myself a bit of time if I guessed
it's related.

> > ---
> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > index 9f07e74..bcf2fa3 100644
> > --- a/net/core/skbuff.c
> > +++ b/net/core/skbuff.c
> > @@ -532,6 +532,7 @@ static void __copy_skb_header(struct sk_buff *new, const struct sk_buff *old)
> >  	new->ip_summed		= old->ip_summed;
> >  	skb_copy_queue_mapping(new, old);
> >  	new->priority		= old->priority;
> > +	new->deliver_no_wcard	= old->deliver_no_wcard;
> >  #if defined(CONFIG_IP_VS) || defined(CONFIG_IP_VS_MODULE)
> >  	new->ipvs_property	= old->ipvs_property;
> >  #endif
> > 
> 
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index bf243fc..f89e7fd 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -380,7 +380,10 @@  struct sk_buff {
 	kmemcheck_bitfield_begin(flags2);
 	__u16			queue_mapping:16;
 #ifdef CONFIG_IPV6_NDISC_NODETYPE
-	__u8			ndisc_nodetype:2;
+	__u8			ndisc_nodetype:2,
+				deliver_no_wcard:1;
+#else
+	__u8			deliver_no_wcard:1;
 #endif
 	kmemcheck_bitfield_end(flags2);
 
diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
index bd537fc..50f58f5 100644
--- a/net/8021q/vlan_core.c
+++ b/net/8021q/vlan_core.c
@@ -12,7 +12,7 @@  int __vlan_hwaccel_rx(struct sk_buff *skb, struct vlan_group *grp,
 		return NET_RX_DROP;
 
 	if (skb_bond_should_drop(skb, ACCESS_ONCE(skb->dev->master)))
-		goto drop;
+		skb->deliver_no_wcard = 1;
 
 	skb->skb_iif = skb->dev->ifindex;
 	__vlan_hwaccel_put_tag(skb, vlan_tci);
@@ -84,7 +84,7 @@  vlan_gro_common(struct napi_struct *napi, struct vlan_group *grp,
 	struct sk_buff *p;
 
 	if (skb_bond_should_drop(skb, ACCESS_ONCE(skb->dev->master)))
-		goto drop;
+		skb->deliver_no_wcard = 1;
 
 	skb->skb_iif = skb->dev->ifindex;
 	__vlan_hwaccel_put_tag(skb, vlan_tci);
diff --git a/net/core/dev.c b/net/core/dev.c
index ccba781..9c356fd 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2821,13 +2821,24 @@  static int __netif_receive_skb(struct sk_buff *skb)
 	if (!skb->skb_iif)
 		skb->skb_iif = skb->dev->ifindex;
 
+	/*
+	 * bonding note: skbs received on inactive slaves should only
+	 * be delivered to pkt handlers that are exact matches.  Also
+	 * the deliver_no_wcard flag will be set.  If packet handlers
+	 * are sensitive to duplicate packets these skbs will need to
+	 * be dropped at the handler.  The vlan accel path may have
+	 * already set the deliver_no_wcard flag.
+	 */
 	null_or_orig = NULL;
 	orig_dev = skb->dev;
 	master = ACCESS_ONCE(orig_dev->master);
-	if (master) {
-		if (skb_bond_should_drop(skb, master))
+	if (skb->deliver_no_wcard)
+		null_or_orig = orig_dev;
+	else if (master) {
+		if (skb_bond_should_drop(skb, master)) {
+			skb->deliver_no_wcard = 1;
 			null_or_orig = orig_dev; /* deliver only exact match */
-		else
+		} else
 			skb->dev = master;
 	}