
tcp: fix setting csum_start in tcp_gso_segment

Message ID alpine.DEB.2.02.1406242056330.29780@tomh.mtv.corp.google.com
State Changes Requested, archived
Delegated to: David Miller

Commit Message

Tom Herbert June 25, 2014, 4:03 a.m. UTC
Dave Jones reported that a crash is occurring in

csum_partial
tcp_gso_segment
inet_gso_segment
? update_dl_migration
skb_mac_gso_segment
__skb_gso_segment
dev_hard_start_xmit
sch_direct_xmit
__dev_queue_xmit
? dev_hard_start_xmit
dev_queue_xmit
ip_finish_output
? ip_output
ip_output
ip_forward_finish
ip_forward
ip_rcv_finish
ip_rcv
__netif_receive_skb_core
? __netif_receive_skb_core
? trace_hardirqs_on
__netif_receive_skb
netif_receive_skb_internal
napi_gro_complete
? napi_gro_complete
dev_gro_receive
? dev_gro_receive
napi_gro_receive

It looks like a likely culprit is that SKB_GSO_CB()->csum_start is
not set correctly when doing non-scatter gather. We are using
offset as opposed to doffset.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Tom Herbert <therbert@google.com>
---
 net/core/skbuff.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
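
To illustrate why the choice of variable matters, here is a minimal
standalone model of the segment loop. It is not the kernel code. It
assumes that doffset is the fixed header length, and that offset starts
at doffset and advances by the segment length on every iteration. The
function name count_bad_csum_start and its parameters are hypothetical,
introduced only for this sketch.

```c
#include <assert.h>

/*
 * Simplified model (an assumption, not kernel code): walk a packet of
 * total_len bytes whose first doffset bytes are headers, cutting the
 * payload into segments of at most mss bytes. For each segment, compare
 * the checksum start computed from the running offset (old behaviour)
 * with the one computed from the fixed doffset (patched behaviour).
 *
 * Returns the number of segments for which the two values differ.
 */
static unsigned int count_bad_csum_start(unsigned int total_len,
					 unsigned int doffset,
					 unsigned int mss,
					 unsigned int headroom)
{
	unsigned int offset = doffset;	/* position in the original packet */
	unsigned int bad = 0;

	assert(mss > 0);		/* otherwise the loop cannot advance */

	while (offset < total_len) {
		unsigned int len = total_len - offset;

		if (len > mss)
			len = mss;

		/* Each new segment carries its own copy of the headers, so
		 * its payload always begins doffset bytes into the data. */
		unsigned int fixed = headroom + doffset;
		unsigned int buggy = headroom + offset;

		if (buggy != fixed)
			bad++;

		offset += len;
	}

	return bad;
}
```

In this model, a 54-byte header followed by 3000 bytes of payload with
an mss of 1000 yields three segments. The first one agrees, because
offset still equals doffset, and the other two do not, because offset
has moved on to 1054 and 2054 while the payload of every new segment
still starts 54 bytes into its own data.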

Comments

Linus Torvalds June 25, 2014, 4:17 a.m. UTC | #1
On Tue, Jun 24, 2014 at 9:03 PM, Tom Herbert <therbert@google.com> wrote:
>
> It looks like a likely culprit is that SKB_GSO_CB()->csum_start is
> not set correctly when doing non-scatter gather. We are using
> offset as opposed to doffset.
>
> Reported-by: Dave Jones <davej@redhat.com>

DaveJ, I think you triggered this in five minutes on your box, and I
don't recall seeing anybody else reporting the oops (and google
doesn't find anything in the last month). So it's presumably somewhat
hw-specific. Does this fix the problem?

               Linus
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet June 25, 2014, 7:19 a.m. UTC | #2
On Tue, 2014-06-24 at 21:03 -0700, Tom Herbert wrote:

> 
> It looks like a likely culprit is that SKB_GSO_CB()->csum_start is
> not set correctly when doing non-scatter gather. We are using
> offset as opposed to doffset.
> 
> Reported-by: Dave Jones <davej@redhat.com>
> Signed-off-by: Tom Herbert <therbert@google.com>
> ---
>  net/core/skbuff.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 9cd5344..c1a3303 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -2993,7 +2993,7 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
>  							    skb_put(nskb, len),
>  							    len, 0);
>  			SKB_GSO_CB(nskb)->csum_start =
> -			    skb_headroom(nskb) + offset;
> +			    skb_headroom(nskb) + doffset;
>  			continue;
>  		}
>  

Yes, seems an obvious typo, but please change patch title.

This is not "tcp: fix setting csum_start in tcp_gso_segment"

Maybe "net: fix setting csum_start in skb_segment()"


Dave Jones June 25, 2014, 2:10 p.m. UTC | #3
On Tue, Jun 24, 2014 at 09:17:25PM -0700, Linus Torvalds wrote:
 > On Tue, Jun 24, 2014 at 9:03 PM, Tom Herbert <therbert@google.com> wrote:
 > >
 > > It looks like a likely culprit is that SKB_GSO_CB()->csum_start is
 > > not set correctly when doing non-scatter gather. We are using
 > > offset as opposed to doffset.
 > >
 > > Reported-by: Dave Jones <davej@redhat.com>
 > 
 > DaveJ, I think you triggered this in five minutes on your box, and I
 > don't recall seeing anybody else reporting the oops (and google
 > doesn't find anything in the last month). So it's presumably somewhat
 > hw-specific. Does this fix the problem?

It's survived routing ~1GB of packets overnight, so I'd call this good.

thanks Tom.

	Dave

David Miller June 25, 2014, 6:52 p.m. UTC | #4
From: Dave Jones <davej@redhat.com>
Date: Wed, 25 Jun 2014 10:10:52 -0400

> On Tue, Jun 24, 2014 at 09:17:25PM -0700, Linus Torvalds wrote:
>  > On Tue, Jun 24, 2014 at 9:03 PM, Tom Herbert <therbert@google.com> wrote:
>  > >
>  > > It looks like a likely culprit is that SKB_GSO_CB()->csum_start is
>  > > not set correctly when doing non-scatter gather. We are using
>  > > offset as opposed to doffset.
>  > >
>  > > Reported-by: Dave Jones <davej@redhat.com>
>  > 
>  > DaveJ, I think you triggered this in five minutes on your box, and I
>  > don't recall seeing anybody else reporting the oops (and google
>  > doesn't find anything in the last month). So it's presumably somewhat
>  > hw-specific. Does this fix the problem?
> 
> It's survived routing ~1GB of packets overnight, so I'd call this good.

Tom, please adjust the Subject line as suggested by Eric Dumazet and add
a Tested-by: for Dave.

Thanks!

Patch

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 9cd5344..c1a3303 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2993,7 +2993,7 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
 							    skb_put(nskb, len),
 							    len, 0);
 			SKB_GSO_CB(nskb)->csum_start =
-			    skb_headroom(nskb) + offset;
+			    skb_headroom(nskb) + doffset;
 			continue;
 		}