[Bug,#16626] Machine hangs with EIP at skb_copy_and_csum_dev

Message ID 1283338251.2556.124.camel@edumazet-laptop
State Accepted, archived
Delegated to: David Miller

Commit Message

Eric Dumazet Sept. 1, 2010, 10:50 a.m. UTC
Plamen, could you test the following patch?

I reproduced the problem on a dev machine and the following patch cured it.

Thanks

[PATCH] gro: fix different skb headrooms

Packets entering GRO might have different headrooms, even for a given
flow (because of implementation details in drivers, like copybreak).
We can't force drivers to deliver packets with a fixed headroom.

1) fix skb_segment()

skb_segment() wrongly assumes that the fragments have the same headroom
as the head. When CHECKSUM_PARTIAL is used, this can give csum_start
errors and a later crash in skb_copy_and_csum_dev().

2) allocate a minimal skb for head of frag_list

skb_gro_receive() uses netdev_alloc_skb(headroom + skb_gro_offset(p)) to
allocate a fresh skb. This adds NET_SKB_PAD to a padding already
provided by netdevice, depending on various things, like copybreak.

Use alloc_skb() to allocate an exact padding, to reduce cache line
needs:
NET_SKB_PAD + NET_IP_ALIGN

bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=16626

Many thanks to Plamen Petrov, testing many debugging patches !
With help of Jarek Poplawski.

Reported-by: Plamen Petrov <pvp-lsts@fs.uni-ruse.bg>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Jarek Poplawski <jarkao2@gmail.com>
---
patch against linux-2.6 current tree

 net/core/skbuff.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Jarek Poplawski Sept. 1, 2010, 11:20 a.m. UTC | #1
On Wed, Sep 01, 2010 at 12:50:51PM +0200, Eric Dumazet wrote:
> Plamen, could you test following patch ?
> 
> I reproduced problem on a dev machine and following patch cured it.
> 
> Thanks
> 
> [PATCH] gro: fix different skb headrooms
> 
> packets entering GRO might have different headrooms, even for a given
> flow (because of implementation details in drivers, like copybreak).
> We cant force drivers to deliver packets with a fixed headroom.
> 
> 1) fix skb_segment()
> 
> skb_segment() makes the false assumption headrooms of fragments are same
> than the head. When CHECKSUM_PARTIAL is used, this can give csum_start
> errors, and crash later in skb_copy_and_csum_dev()

Eric, probably I missed something, but since the same test as in
skb_copy_and_csum_dev() gave a different result a bit earlier on exactly
the same skb, I suspected some sharing (or use-after-free)
problem, so I'm not sure your current diagnosis can explain this.
(Unless this old test was dismissed later.)

Thanks,
Jarek P.

> 
> 2) allocate a minimal skb for head of frag_list
> 
> skb_gro_receive() uses netdev_alloc_skb(headroom + skb_gro_offset(p)) to
> allocate a fresh skb. This adds NET_SKB_PAD to a padding already
> provided by netdevice, depending on various things, like copybreak.
> 
> Use alloc_skb() to allocate an exact padding, to reduce cache line
> needs:
> NET_SKB_PAD + NET_IP_ALIGN
> 
> bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=16626
> 
> Many thanks to Plamen Petrov, testing many debugging patches !
> With help of Jarek Poplawski.
> 
> Reported-by: Plamen Petrov <pvp-lsts@fs.uni-ruse.bg>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> CC: Jarek Poplawski <jarkao2@gmail.com>
> ---
> patch against linux-2.6 current tree
> 
>  net/core/skbuff.c |    8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 3a2513f..26396ff 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -2573,6 +2573,10 @@ struct sk_buff *skb_segment(struct sk_buff *skb, int features)
>  		__copy_skb_header(nskb, skb);
>  		nskb->mac_len = skb->mac_len;
>  
> +		/* nskb and skb might have different headroom */
> +		if (nskb->ip_summed == CHECKSUM_PARTIAL)
> +			nskb->csum_start += skb_headroom(nskb) - headroom;
> +
>  		skb_reset_mac_header(nskb);
>  		skb_set_network_header(nskb, skb->mac_len);
>  		nskb->transport_header = (nskb->network_header +
> @@ -2702,8 +2706,8 @@ int skb_gro_receive(struct sk_buff **head, struct sk_buff *skb)
>  	} else if (skb_gro_len(p) != pinfo->gso_size)
>  		return -E2BIG;
>  
> -	headroom = skb_headroom(p);
> -	nskb = netdev_alloc_skb(p->dev, headroom + skb_gro_offset(p));
> +	headroom = NET_SKB_PAD + NET_IP_ALIGN;
> +	nskb = alloc_skb(headroom + skb_gro_offset(p), GFP_ATOMIC);
>  	if (unlikely(!nskb))
>  		return -ENOMEM;
>  
> 
> 
Eric Dumazet Sept. 1, 2010, 1:57 p.m. UTC | #2
On Wednesday, 01 September 2010 at 11:20 +0000, Jarek Poplawski wrote:
> On Wed, Sep 01, 2010 at 12:50:51PM +0200, Eric Dumazet wrote:
> > Plamen, could you test following patch ?
> > 
> > I reproduced problem on a dev machine and following patch cured it.
> > 
> > Thanks
> > 
> > [PATCH] gro: fix different skb headrooms
> > 
> > packets entering GRO might have different headrooms, even for a given
> > flow (because of implementation details in drivers, like copybreak).
> > We cant force drivers to deliver packets with a fixed headroom.
> > 
> > 1) fix skb_segment()
> > 
> > skb_segment() makes the false assumption headrooms of fragments are same
> > than the head. When CHECKSUM_PARTIAL is used, this can give csum_start
> > errors, and crash later in skb_copy_and_csum_dev()
> 
> Eric, probably I missed something, but since the same test as in
> skb_copy_and_csum_dev() gave different result a bit earlier on exactly
> the same skb, I've suspected some sharing (or use after free)
> problems, so I'm not sure your current diagnose can explain this.
> (Unless this old test was dismissed later.)

Oh, this is because your patch had an error in the GSO part, which read:

-               rc = ops->ndo_start_xmit(nskb, dev);
+               if (skb_csum_start_bug(skb, 50)) {
+                       kfree_skb(skb);
+                       rc = NETDEV_TX_OK;
+               } else
+                       rc = ops->ndo_start_xmit(nskb, dev);
+
                if (unlikely(rc != NETDEV_TX_OK)) {
                        if (rc & ~NETDEV_TX_MASK)
                                goto out_kfree_gso_skb;

You called skb_csum_start_bug(skb, 50) instead of
skb_csum_start_bug(nskb, 50)

Hope this clarifies things a bit ;)

Thanks


Jarek Poplawski Sept. 1, 2010, 3:05 p.m. UTC | #3
On Wed, Sep 01, 2010 at 03:57:41PM +0200, Eric Dumazet wrote:
> On Wednesday, 01 September 2010 at 11:20 +0000, Jarek Poplawski wrote:
> > On Wed, Sep 01, 2010 at 12:50:51PM +0200, Eric Dumazet wrote:
> > > Plamen, could you test following patch ?
> > > 
> > > I reproduced problem on a dev machine and following patch cured it.
> > > 
> > > Thanks
> > > 
> > > [PATCH] gro: fix different skb headrooms
> > > 
> > > packets entering GRO might have different headrooms, even for a given
> > > flow (because of implementation details in drivers, like copybreak).
> > > We cant force drivers to deliver packets with a fixed headroom.
> > > 
> > > 1) fix skb_segment()
> > > 
> > > skb_segment() makes the false assumption headrooms of fragments are same
> > > than the head. When CHECKSUM_PARTIAL is used, this can give csum_start
> > > errors, and crash later in skb_copy_and_csum_dev()
> > 
> > Eric, probably I missed something, but since the same test as in
> > skb_copy_and_csum_dev() gave different result a bit earlier on exactly
> > the same skb, I've suspected some sharing (or use after free)
> > problems, so I'm not sure your current diagnose can explain this.
> > (Unless this old test was dismissed later.)
> 
> Oh, this is because your patch had an error for the gso part that read :
> 
> -               rc = ops->ndo_start_xmit(nskb, dev);
> +               if (skb_csum_start_bug(skb, 50)) {
> +                       kfree_skb(skb);
> +                       rc = NETDEV_TX_OK;
> +               } else
> +                       rc = ops->ndo_start_xmit(nskb, dev);
> +
>                 if (unlikely(rc != NETDEV_TX_OK)) {
>                         if (rc & ~NETDEV_TX_MASK)
>                                 goto out_kfree_gso_skb;
> 
> You called skb_csum_start_bug(skb, 50) instead of
> skb_csum_start_bug(nskb, 50)
> 
> Hope this clarify a bit ;)

All clear! Sorry for the false alarm!

Thanks,
Jarek P.
David Miller Sept. 2, 2010, 1:23 a.m. UTC | #4
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 01 Sep 2010 12:50:51 +0200

> [PATCH] gro: fix different skb headrooms
> 
> packets entering GRO might have different headrooms, even for a given
> flow (because of implementation details in drivers, like copybreak).
> We cant force drivers to deliver packets with a fixed headroom.
> 
> 1) fix skb_segment()
> 
> skb_segment() makes the false assumption headrooms of fragments are same
> than the head. When CHECKSUM_PARTIAL is used, this can give csum_start
> errors, and crash later in skb_copy_and_csum_dev()
> 
> 2) allocate a minimal skb for head of frag_list
> 
> skb_gro_receive() uses netdev_alloc_skb(headroom + skb_gro_offset(p)) to
> allocate a fresh skb. This adds NET_SKB_PAD to a padding already
> provided by netdevice, depending on various things, like copybreak.
> 
> Use alloc_skb() to allocate an exact padding, to reduce cache line
> needs:
> NET_SKB_PAD + NET_IP_ALIGN
> 
> bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=16626
> 
> Many thanks to Plamen Petrov, testing many debugging patches !
> With help of Jarek Poplawski.
> 
> Reported-by: Plamen Petrov <pvp-lsts@fs.uni-ruse.bg>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> CC: Jarek Poplawski <jarkao2@gmail.com>

Good spotting, applied, thanks!
Plamen Petrov Sept. 3, 2010, 8 a.m. UTC | #5
On 01.9.2010 at 13:50, Eric Dumazet wrote:
> Plamen, could you test following patch ?
>
> I reproduced problem on a dev machine and following patch cured it.
>
> Thanks
>
> [PATCH] gro: fix different skb headrooms
>
> packets entering GRO might have different headrooms, even for a given
> flow (because of implementation details in drivers, like copybreak).
> We cant force drivers to deliver packets with a fixed headroom.
>
> 1) fix skb_segment()
>
> skb_segment() makes the false assumption headrooms of fragments are same
> than the head. When CHECKSUM_PARTIAL is used, this can give csum_start
> errors, and crash later in skb_copy_and_csum_dev()
>
> 2) allocate a minimal skb for head of frag_list
>
> skb_gro_receive() uses netdev_alloc_skb(headroom + skb_gro_offset(p)) to
> allocate a fresh skb. This adds NET_SKB_PAD to a padding already
> provided by netdevice, depending on various things, like copybreak.
>
> Use alloc_skb() to allocate an exact padding, to reduce cache line
> needs:
> NET_SKB_PAD + NET_IP_ALIGN
>
> bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=16626
>
> Many thanks to Plamen Petrov, testing many debugging patches !
> With help of Jarek Poplawski.
>
> Reported-by: Plamen Petrov<pvp-lsts@fs.uni-ruse.bg>
> Signed-off-by: Eric Dumazet<eric.dumazet@gmail.com>
> CC: Jarek Poplawski<jarkao2@gmail.com>
> ---
> patch against linux-2.6 current tree
>
>   net/core/skbuff.c |    8 ++++++--
>   1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 3a2513f..26396ff 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -2573,6 +2573,10 @@ struct sk_buff *skb_segment(struct sk_buff *skb, int features)
>   		__copy_skb_header(nskb, skb);
>   		nskb->mac_len = skb->mac_len;
>
> +		/* nskb and skb might have different headroom */
> +		if (nskb->ip_summed == CHECKSUM_PARTIAL)
> +			nskb->csum_start += skb_headroom(nskb) - headroom;
> +
>   		skb_reset_mac_header(nskb);
>   		skb_set_network_header(nskb, skb->mac_len);
>   		nskb->transport_header = (nskb->network_header +
> @@ -2702,8 +2706,8 @@ int skb_gro_receive(struct sk_buff **head, struct sk_buff *skb)
>   	} else if (skb_gro_len(p) != pinfo->gso_size)
>   		return -E2BIG;
>
> -	headroom = skb_headroom(p);
> -	nskb = netdev_alloc_skb(p->dev, headroom + skb_gro_offset(p));
> +	headroom = NET_SKB_PAD + NET_IP_ALIGN;
> +	nskb = alloc_skb(headroom + skb_gro_offset(p), GFP_ATOMIC);
>   	if (unlikely(!nskb))
>   		return -ENOMEM;
>
>
>

I confirm that the above patch applied on top of v2.6.36-rc3 does not
show the problems that all the kernels since v2.6.35 (both stable
and Linus' tree) had.

My problematic machine has been running the patched 36-rc3 for 36 hours,
and counting, with "generic receive offload" enabled only on my tg3 NIC.

Thank you very much for the wonderful job, Eric!
Thanks to you, too, Jarek!

Plamen Petrov
Herbert Xu Sept. 3, 2010, 8:30 a.m. UTC | #6
On Wed, Sep 01, 2010 at 12:50:51PM +0200, Eric Dumazet wrote:
> Plamen, could you test following patch ?
> 
> I reproduced problem on a dev machine and following patch cured it.
> 
> Thanks
> 
> [PATCH] gro: fix different skb headrooms
> 
> packets entering GRO might have different headrooms, even for a given
> flow (because of implementation details in drivers, like copybreak).
> We cant force drivers to deliver packets with a fixed headroom.
> 
> 1) fix skb_segment()
> 
> skb_segment() makes the false assumption headrooms of fragments are same
> than the head. When CHECKSUM_PARTIAL is used, this can give csum_start
> errors, and crash later in skb_copy_and_csum_dev()
> 
> 2) allocate a minimal skb for head of frag_list
> 
> skb_gro_receive() uses netdev_alloc_skb(headroom + skb_gro_offset(p)) to
> allocate a fresh skb. This adds NET_SKB_PAD to a padding already
> provided by netdevice, depending on various things, like copybreak.
> 
> Use alloc_skb() to allocate an exact padding, to reduce cache line
> needs:
> NET_SKB_PAD + NET_IP_ALIGN
> 
> bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=16626
> 
> Many thanks to Plamen Petrov, testing many debugging patches !
> With help of Jarek Poplawski.
> 
> Reported-by: Plamen Petrov <pvp-lsts@fs.uni-ruse.bg>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> CC: Jarek Poplawski <jarkao2@gmail.com>

Thanks for diagnosing and fixing this!

> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 3a2513f..26396ff 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -2573,6 +2573,10 @@ struct sk_buff *skb_segment(struct sk_buff *skb, int features)
>  		__copy_skb_header(nskb, skb);
>  		nskb->mac_len = skb->mac_len;
>  
> +		/* nskb and skb might have different headroom */
> +		if (nskb->ip_summed == CHECKSUM_PARTIAL)
> +			nskb->csum_start += skb_headroom(nskb) - headroom;

This test is redundant since we require CHECKSUM_PARTIAL for
GSO packets.

Cheers,
Jarek Poplawski Sept. 3, 2010, 9:06 a.m. UTC | #7
On Fri, Sep 03, 2010 at 11:00:52AM +0300, Plamen Petrov wrote:
> On 01.9.2010 at 13:50, Eric Dumazet wrote:
>
> I confirm that the above patch applied on top of v2.6.36-rc3 does not
> show the problems that all the kernels since v2.6.35 (both stable
> and Linus' tree) had.
>
> My problematic machine has been running the patched 36-rc3 for 36 hours,
> and counting, with "generic receive offload" enabled only on my tg3 NIC.
>
> Thank you very much for the wonderful job, Eric!
> Thanks to you, too, Jarek!

Not at all! I only messed up a bit :-(

All credits to Eric and Plamen!

Thanks again,
Jarek P.

Patch

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 3a2513f..26396ff 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2573,6 +2573,10 @@  struct sk_buff *skb_segment(struct sk_buff *skb, int features)
 		__copy_skb_header(nskb, skb);
 		nskb->mac_len = skb->mac_len;
 
+		/* nskb and skb might have different headroom */
+		if (nskb->ip_summed == CHECKSUM_PARTIAL)
+			nskb->csum_start += skb_headroom(nskb) - headroom;
+
 		skb_reset_mac_header(nskb);
 		skb_set_network_header(nskb, skb->mac_len);
 		nskb->transport_header = (nskb->network_header +
@@ -2702,8 +2706,8 @@  int skb_gro_receive(struct sk_buff **head, struct sk_buff *skb)
 	} else if (skb_gro_len(p) != pinfo->gso_size)
 		return -E2BIG;
 
-	headroom = skb_headroom(p);
-	nskb = netdev_alloc_skb(p->dev, headroom + skb_gro_offset(p));
+	headroom = NET_SKB_PAD + NET_IP_ALIGN;
+	nskb = alloc_skb(headroom + skb_gro_offset(p), GFP_ATOMIC);
 	if (unlikely(!nskb))
 		return -ENOMEM;