diff mbox

problem with IPoA (CLIP), NAT, and VLANS

Message ID 20090216232016.GA4803@ami.dom.local
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Jarek Poplawski Feb. 16, 2009, 11:20 p.m. UTC
Karl Hiramoto wrote, On 02/16/2009 04:02 PM:
...
> The problem ended up being the packet being corrupted when the vlan tag
> was being added and skb_cow_head()  was being called.
> 
> Anyone know why skb_cow_head() would corrupt the packet?  Perhaps it was
> not allocated correctly?     I'm using a big-endian ARM IXP435 board.

Hi,
Very nice debugging, but I think your patch doesn't look like enough:
1) it skips copy for cloned skbs,
2) this skb_cow_head() is needed anyway, sometimes.
So the real bug should be found in skb_cow_head() or elsewhere.

I attach here 2 patches for testing:
1) skb->mac_header update: it looks like needed, but I don't know if
   it matters here,
2) an extention of your patch but with pskb_expand_head() called for
   one to one copy.

BTW, if it's not a big problem it would be nice to try this e.g. on
2.6.29-rc5.

Thanks,
Jarek P.

---> patch #1

 include/linux/if_vlan.h |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Patrick McHardy Feb. 17, 2009, 9:03 a.m. UTC | #1
Jarek Poplawski wrote:
> Very nice debugging, but I think your patch doesn't look like enough:
> 1) it skips copy for cloned skbs,
> 2) this skb_cow_head() is needed anyway, sometimes.
> So the real bug should be found in skb_cow_head() or elsewhere.
> 
> I attach here 2 patches for testing:
> 1) skb->mac_header update: it looks like needed, but I don't know if
>    it matters here,
> 2) an extention of your patch but with pskb_expand_head() called for
>    one to one copy.
> 
> BTW, if it's not a big problem it would be nice to try this e.g. on
> 2.6.29-rc5.

The first patch looks fine. As for the second one, I would like
to understand why we're seing these packets. The VLAN driver uses
the original headroom plus the space it needs itself, which suggests
that the underlying driver specifies an incorrect headroom.
The driver doesn't appear to be part of the mainline kernel though.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jarek Poplawski Feb. 17, 2009, 9:52 a.m. UTC | #2
On Tue, Feb 17, 2009 at 10:03:57AM +0100, Patrick McHardy wrote:
...
> The first patch looks fine. As for the second one, I would like
> to understand why we're seing these packets. The VLAN driver uses
> the original headroom plus the space it needs itself, which suggests
> that the underlying driver specifies an incorrect headroom.
> The driver doesn't appear to be part of the mainline kernel though.

On the other hand, if Karl added his patch it seems this change of
NET_SKB_PAD isn't enough, and if omitting skb_cow_head() works, then
the headroom isn't the main problem?

Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Karl Hiramoto Feb. 17, 2009, 11:49 a.m. UTC | #3
Jarek Poplawski wrote:
> Karl Hiramoto wrote, On 02/16/2009 04:02 PM:
> ...
>   
>> The problem ended up being the packet being corrupted when the vlan tag
>> was being added and skb_cow_head()  was being called.
>>
>> Anyone know why skb_cow_head() would corrupt the packet?  Perhaps it was
>> not allocated correctly?     I'm using a big-endian ARM IXP435 board.
>>     
>
> Hi,
> Very nice debugging, but I think your patch doesn't look like enough:
> 1) it skips copy for cloned skbs,
> 2) this skb_cow_head() is needed anyway, sometimes.
> So the real bug should be found in skb_cow_head() or elsewhere.
>
> I attach here 2 patches for testing:
> 2) an extention of your patch but with pskb_expand_head() called for
>    one to one copy.
>
>
>  include/linux/if_vlan.h |   13 ++++++++++---
>  1 files changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
> index f8ff918..e9a5eb1 100644
> --- a/include/linux/if_vlan.h
> +++ b/include/linux/if_vlan.h
> @@ -202,9 +202,16 @@ static inline struct sk_buff *__vlan_put_tag(struct sk_buff *skb, u16 vlan_tci)
>  {
>  	struct vlan_ethhdr *veth;
>  
> -	if (skb_cow_head(skb, VLAN_HLEN) < 0) {
> -		kfree_skb(skb);
> -		return NULL;
> +	if (skb_headroom(skb) < VLAN_HLEN) {
> +		if (skb_cow_head(skb, VLAN_HLEN) < 0) {
> +			kfree_skb(skb);
> +			return NULL;
> +		}
> +	} else {
> +		if (pskb_expand_head(skb, 0, 0, GFP_ATOMIC) < 0) {
> +			kfree_skb(skb);
> +			return NULL;
> +		}
>  	}
>  	veth = (struct vlan_ethhdr *)skb_push(skb, VLAN_HLEN);
>  
>   


This breaks IPoA -> NAT -->VLAN for me.  Before, with a virgin 2.6.28.4 
I had no traffic.  With this patch,  what was a 50ms ping, jumps to
1050ms  and randomly there are duplicate packets.

What Patrick,  said may be the case,  that the underlaying atm driver is
not specifying the headroom.   The driver is ported to 2.6.28 from
2.6.11 based on the BSD licensed intel libaries:
http://www.intel.com/design/network/products/npfamily/download_ixp400.htm

Where underneath a bunch of layers the skb is allocated with kmalloc like:

size = PAGE_ALIGN (size + CACHE_LINE_SIZE);
order = get_order (size);
page = alloc_pages (GFP_KERNEL, order);


Why though does the same driver work with IPoE  Bridged ATM   RFC2684  ?


The other module in use is CLIP  net/atm/clip.c      I don't have any
other hardware though to test the CLIP driver with though.


A side note:  so far the original patch i sent works in all cases i have
tested, but fails with tcpdump.   I suspect its because the skb gets cloned.

Thanks,


Karl.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jarek Poplawski Feb. 17, 2009, 12:20 p.m. UTC | #4
On Tue, Feb 17, 2009 at 12:49:07PM +0100, Karl Hiramoto wrote:
...
> A side note:  so far the original patch i sent works in all cases i have
> tested, but fails with tcpdump.   I suspect its because the skb gets cloned.

If there is something readable from this tcpdump, it should be helpful
to see a packet for working and non-working case during such ping
(with -nXX option).

Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Patrick McHardy Feb. 17, 2009, 12:28 p.m. UTC | #5
Karl Hiramoto wrote:
> What Patrick,  said may be the case,  that the underlaying atm driver is
> not specifying the headroom.   The driver is ported to 2.6.28 from
> 2.6.11 based on the BSD licensed intel libaries:
> http://www.intel.com/design/network/products/npfamily/download_ixp400.htm

If those are the ones I think they are, I'm not too surprised ...

> Where underneath a bunch of layers the skb is allocated with kmalloc like:
> 
> size = PAGE_ALIGN (size + CACHE_LINE_SIZE);
> order = get_order (size);
> page = alloc_pages (GFP_KERNEL, order);

How does it initialize the netdev struct?

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Karl Hiramoto Feb. 17, 2009, 12:53 p.m. UTC | #6
Jarek Poplawski wrote:
> On Tue, Feb 17, 2009 at 12:49:07PM +0100, Karl Hiramoto wrote:
> ...
>   
>> A side note:  so far the original patch i sent works in all cases i have
>> tested, but fails with tcpdump.   I suspect its because the skb gets cloned.
>>     
>
> If there is something readable from this tcpdump, it should be helpful
> to see a packet for working and non-working case during such ping
> (with -nXX option).
> Jarek P.
>   

Note:  I have the patches i sent applied,  plus the  "skb->mac_header -=
VLAN_HLEN;"   patch from Jarek on 2.6.28.4

Doing a tcpdump simultaneously  on the atm and eth0.1 on the linux router.


tcpdump -i atm0 -nvXX icmp
tcpdump: listening on atm0, link-type LINUX_SLL (Linux cooked), capture
size 68 bytes
12:47:15.431821 IP (tos 0x0, ttl  63, id 0, offset 0, flags [DF], proto:
ICMP (1), length: 84) 80.33.85.178 > 80.58.0.33: ICMP echo request, id
54787, seq 1, length 64
        0x0000:  0004 0013 0000 0000 0000 0000 0000 0800  ................
        0x0010:  4500 0054 0000 4000 3f01 457b 5021 55b2  E..T..@.?.E{P!U.
        0x0020:  503a 0021 0800 24cc d603 0001 d4b1 9a49  P:.!..$........I
        0x0030:  a130 0200 0809 0a0b 0c0d 0e0f 1011 1213  .0..............
        0x0040:  1415 1617                                ....
12:47:15.491209 IP (tos 0x0, ttl 126, id 51644, offset 0, flags [none],
proto: ICMP (1), length: 84) 80.58.0.33 > 80.33.85.178: ICMP echo reply,
id 54787, seq 1, length 64
        0x0000:  0000 0013 0000 0000 0000 0000 0000 0800  ................
        0x0010:  4500 0054 c9bc 0000 7e01 7cbe 503a 0021  E..T....~.|.P:.!
        0x0020:  5021 55b2 0000 2ccc d603 0001 d4b1 9a49  P!U...,........I
        0x0030:  a130 0200 0809 0a0b 0c0d 0e0f 1011 1213  .0..............
        0x0040:  1415 1617                                ....
12:47:16.442008 IP (tos 0x0, ttl  63, id 0, offset 0, flags [DF], proto:
ICMP (1), length: 84) 80.33.85.178 > 80.58.0.33: ICMP echo request, id
54787, seq 2, length 64
        0x0000:  0004 0013 0000 0000 0000 0000 0000 0800  ................
        0x0010:  4500 0054 0000 4000 3f01 457b 5021 55b2  E..T..@.?.E{P!U.
        0x0020:  503a 0021 0800 eda1 d603 0002 d5b1 9a49  P:.!...........I
        0x0030:  d759 0200 0809 0a0b 0c0d 0e0f 1011 1213  .Y..............
        0x0040:  1415 1617                                ....
12:47:16.498148 IP (tos 0x0, ttl 126, id 51784, offset 0, flags [none],
proto: ICMP (1), length: 84) 80.58.0.33 > 80.33.85.178: ICMP echo reply,
id 54787, seq 2, length 64
        0x0000:  0000 0013 0000 0000 0000 0000 0000 0800  ................
        0x0010:  4500 0054 ca48 0000 7e01 7c32 503a 0021  E..T.H..~.|2P:.!
        0x0020:  5021 55b2 0000 f5a1 d603 0002 d5b1 9a49  P!U............I
        0x0030:  d759 0200 0809 0a0b 0c0d 0e0f 1011 1213  .Y..............
        0x0040:  1415 1617                  



tcpdump -i eth0.1 -nvXX icmp
tcpdump: listening on eth0.1, link-type EN10MB (Ethernet), capture size
68 bytes
12:47:15.434163 IP (tos 0x0, ttl  64, id 0, offset 0, flags [DF], proto:
ICMP (1), length: 84) 192.168.88.2 > 80.58.0.33: ICMP echo request, id
54787, seq 1, length 64
        0x0000:  525e a930 50db 0015 c509 9b4a 0800 4500  R^.0P......J..E.
        0x0010:  0054 0000 4000 4001 d1a3 c0a8 5802 503a  .T..@.@.....X.P:
        0x0020:  0021 0800 24cc d603 0001 d4b1 9a49 a130  .!..$........I.0
        0x0030:  0200 0809 0a0b 0c0d 0e0f 1011 1213 1415  ................
        0x0040:  1617 1819                                ....
12:47:16.441748 IP (tos 0x0, ttl  64, id 0, offset 0, flags [DF], proto:
ICMP (1), length: 84) 192.168.88.2 > 80.58.0.33: ICMP echo request, id
54787, seq 2, length 64
        0x0000:  525e a930 50db 0015 c509 9b4a 0800 4500  R^.0P......J..E.
        0x0010:  0054 0000 4000 4001 d1a3 c0a8 5802 503a  .T..@.@.....X.P:
        0x0020:  0021 0800 eda1 d603 0002 d5b1 9a49 d759  .!...........I.Y
        0x0030:  0200 0809 0a0b 0c0d 0e0f 1011 1213 1415  ................
        0x0040:  1617 1819                                ....
12:47:16.498342 IP (tos 0x0, ttl 125, id 47253, offset 0, flags [none],
proto: ICMP (1), length: 84) 80.58.0.33 > 192.168.88.2: ICMP echo reply,
id 54787, seq 2, length 64
        0x0000:  9b4a 525e a930 50db 8100 0001 0800 4500  .JR^.0P.......E.
        0x0010:  0054 b895 0000 7d01 1c0e 503a 0021 c0a8  .T....}...P:.!..
        0x0020:  5802 0000 ca1b d603 0002 b5b1 9a49 24e0  X............I$.
        0x0030:  0000 0809 0a0b 0c0d 0e0f 1011 1213 1415  ................
        0x0040:  1617 1819                                ....



This last tcpdump is on the machine doing a ping.   Note the ID and the
time of day looks to be corrupt.

ping -c 2  80.58.0.33
PING 80.58.0.33 (80.58.0.33) 56(84) bytes of data.
64 bytes from 80.58.0.33: icmp_seq=2 ttl=125 time=32156 ms

--- 80.58.0.33 ping statistics ---
2 packets transmitted, 1 received, 50% packet loss, time 1010ms
rtt min/avg/max/mdev = 32156.693/32156.693/32156.693/0.000 ms


tcpdump -i eth0 icmp -vn -XX
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 96
bytes
13:47:16.143541 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto
ICMP (1), length 84) 192.168.88.2 > 80.58.0.33: ICMP echo request, id
54787, seq 1, length 64
        0x0000:  525e a930 50db 0015 c509 9b4a 0800 4500  R^.0P......J..E.
        0x0010:  0054 0000 4000 4001 d1a3 c0a8 5802 503a  .T..@.@.....X.P:
        0x0020:  0021 0800 24cc d603 0001 d4b1 9a49 a130  .!..$........I.0
        0x0030:  0200 0809 0a0b 0c0d 0e0f 1011 1213 1415  ................
        0x0040:  1617 1819 1a1b 1c1d 1e1f 2021 2223 2425  ...........!"#$%
        0x0050:  2627 2829 2a2b 2c2d 2e2f 3031 3233 3435  &'()*+,-./012345
13:47:17.154093 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto
ICMP (1), length 84) 192.168.88.2 > 80.58.0.33: ICMP echo request, id
54787, seq 2, length 64
        0x0000:  525e a930 50db 0015 c509 9b4a 0800 4500  R^.0P......J..E.
        0x0010:  0054 0000 4000 4001 d1a3 c0a8 5802 503a  .T..@.@.....X.P:
        0x0020:  0021 0800 eda1 d603 0002 d5b1 9a49 d759  .!...........I.Y
        0x0030:  0200 0809 0a0b 0c0d 0e0f 1011 1213 1415  ................
        0x0040:  1617 1819 1a1b 1c1d 1e1f 2021 2223 2425  ...........!"#$%
        0x0050:  2627 2829 2a2b 2c2d 2e2f 3031 3233 3435  &'()*+,-./012345
13:47:17.214073 IP (tos 0x0, ttl 125, id 47253, offset 0, flags [none],
proto ICMP (1), length 84) 80.58.0.33 > 192.168.88.2: ICMP echo reply,
id 54787, seq 2, length 64
        0x0000:  0015 c509 9b4a 525e a930 50db 0800 4500  .....JR^.0P...E.
        0x0010:  0054 b895 0000 7d01 1c0e 503a 0021 c0a8  .T....}...P:.!..
        0x0020:  5802 0000 ca1b d603 0002 b5b1 9a49 24e0  X............I$.
        0x0030:  0000 0809 0a0b 0c0d 0e0f 1011 1213 1415  ................
        0x0040:  1617 1819 1a1b 1c1d 1e1f 2021 2223 2425  ...........!"#$%
        0x0050:  2627 2829 2a2b 2c2d 2e2f 3031 3233 3435  &'()*+,-./012345





Thanks,

karl
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jarek Poplawski Feb. 17, 2009, 1:37 p.m. UTC | #7
On Tue, Feb 17, 2009 at 01:53:25PM +0100, Karl Hiramoto wrote:
> Jarek Poplawski wrote:
> > On Tue, Feb 17, 2009 at 12:49:07PM +0100, Karl Hiramoto wrote:
> > ...
> >   
> >> A side note:  so far the original patch i sent works in all cases i have
> >> tested, but fails with tcpdump.   I suspect its because the skb gets cloned.

Hmm... I would like to make sure: if tcpdump breaks it too, then
we don't need to blame skb_cow_head(), do we? If it's like this then
it looks like the driver's problem with copied/cloned skbs?! (Maybe
because some "private" offset is overwritten?)

Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index f8ff918..e1ff5b1 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -210,6 +210,7 @@  static inline struct sk_buff *__vlan_put_tag(struct sk_buff *skb, u16 vlan_tci)
 
 	/* Move the mac addresses to the beginning of the new header. */
 	memmove(skb->data, skb->data + VLAN_HLEN, 2 * VLAN_ETH_ALEN);
+	skb->mac_header -= VLAN_HLEN;
 
 	/* first, the ethernet type */
 	veth->h_vlan_proto = htons(ETH_P_8021Q);

---> patch #2

 include/linux/if_vlan.h |   13 ++++++++++---
 1 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index f8ff918..e9a5eb1 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -202,9 +202,16 @@  static inline struct sk_buff *__vlan_put_tag(struct sk_buff *skb, u16 vlan_tci)
 {
 	struct vlan_ethhdr *veth;
 
-	if (skb_cow_head(skb, VLAN_HLEN) < 0) {
-		kfree_skb(skb);
-		return NULL;
+	if (skb_headroom(skb) < VLAN_HLEN) {
+		if (skb_cow_head(skb, VLAN_HLEN) < 0) {
+			kfree_skb(skb);
+			return NULL;
+		}
+	} else {
+		if (pskb_expand_head(skb, 0, 0, GFP_ATOMIC) < 0) {
+			kfree_skb(skb);
+			return NULL;
+		}
 	}
 	veth = (struct vlan_ethhdr *)skb_push(skb, VLAN_HLEN);