Patchwork [1/2] tcp: must unclone packets before mangling them

Submitter Eric Dumazet
Date Oct. 15, 2013, 6:54 p.m.
Message ID <1381863270.2045.62.camel@edumazet-glaptop.roam.corp.google.com>
Permalink /patch/283769/
State Accepted
Delegated to: David Miller

Comments

Eric Dumazet - Oct. 15, 2013, 6:54 p.m.
From: Eric Dumazet <edumazet@google.com>

TCP stack should make sure it owns skbs before mangling them.

We had various crashes using bnx2x, and it turned out gso_size
was cleared right before the bnx2x driver populated the TC descriptor
of the _previous_ packet send. The TCP stack can sometimes retransmit
packets that are still in the Qdisc.

Of course we could make the bnx2x driver more robust (using
ACCESS_ONCE(shinfo->gso_size) for example), but the bug is in the
TCP stack.

We have identified two points where skb_unclone() was needed.

This patch adds a WARN_ON_ONCE() to warn us if we missed another
fix of this kind.

Kudos to Neal for finding the root cause of this bug. It's visible
when using a small MSS.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
---
 net/ipv4/tcp_output.c |    9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)
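
The ownership problem described in the message above can be sketched in
user space. This is a toy model, not kernel code: every toy_* name is
hypothetical, the reference count stands in for the clone tracking that
skb_cloned() performs, and toy_set_tso_segs() is only a simplified
imitation of what tcp_set_skb_tso_segs() does to gso_size/gso_segs. It
shows that a write through one handle is visible through its clone
unless a private copy is taken first.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for the shared info block that all clones point at. */
struct toy_shinfo {
	int refcnt;            /* number of handles sharing this block */
	unsigned int gso_size;
	unsigned int gso_segs;
};

/* Toy stand-in for a packet handle (hypothetical, not struct sk_buff). */
struct toy_skb {
	struct toy_shinfo *shinfo;
};

struct toy_skb *toy_alloc(unsigned int gso_size, unsigned int gso_segs)
{
	struct toy_skb *skb = malloc(sizeof(*skb));

	if (!skb)
		return NULL;
	skb->shinfo = malloc(sizeof(*skb->shinfo));
	if (!skb->shinfo) {
		free(skb);
		return NULL;
	}
	skb->shinfo->refcnt = 1;
	skb->shinfo->gso_size = gso_size;
	skb->shinfo->gso_segs = gso_segs;
	return skb;
}

/* A clone is a new handle that shares the same shinfo block. */
struct toy_skb *toy_clone(struct toy_skb *skb)
{
	struct toy_skb *c = malloc(sizeof(*c));

	if (!c)
		return NULL;
	c->shinfo = skb->shinfo;
	c->shinfo->refcnt++;
	return c;
}

int toy_cloned(const struct toy_skb *skb)
{
	return skb->shinfo->refcnt > 1;
}

/* Give skb a private shinfo copy if it is shared; -1 on allocation failure. */
int toy_unclone(struct toy_skb *skb)
{
	struct toy_shinfo *priv;

	if (!toy_cloned(skb))
		return 0;
	priv = malloc(sizeof(*priv));
	if (!priv)
		return -1;
	*priv = *skb->shinfo;
	priv->refcnt = 1;
	skb->shinfo->refcnt--;
	skb->shinfo = priv;
	return 0;
}

void toy_free(struct toy_skb *skb)
{
	if (--skb->shinfo->refcnt == 0)
		free(skb->shinfo);
	free(skb);
}

/* Simplified segment bookkeeping: writes gso fields with no ownership check. */
void toy_set_tso_segs(struct toy_skb *skb, unsigned int len, unsigned int mss)
{
	if (len <= mss) {
		skb->shinfo->gso_segs = 1;
		skb->shinfo->gso_size = 0;
	} else {
		skb->shinfo->gso_segs = (len + mss - 1) / mss;
		skb->shinfo->gso_size = mss;
	}
}

/* Retransmit path in the style of the fix: take ownership, then write. */
int toy_retransmit_fixed(struct toy_skb *skb, unsigned int len, unsigned int mss)
{
	if (toy_unclone(skb))
		return -1;
	toy_set_tso_segs(skb, len, mss);
	return 0;
}
```

Calling toy_set_tso_segs() directly on a handle that has a live clone
changes the values the clone reads, which is the situation the driver
hit. Going through toy_retransmit_fixed() leaves the clone's view
untouched.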



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller - Oct. 17, 2013, 8:08 p.m.
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 15 Oct 2013 11:54:30 -0700

> From: Eric Dumazet <edumazet@google.com>
> 
> TCP stack should make sure it owns skbs before mangling them.
> 
> We had various crashes using bnx2x, and it turned out gso_size
> was cleared right before the bnx2x driver populated the TC descriptor
> of the _previous_ packet send. The TCP stack can sometimes retransmit
> packets that are still in the Qdisc.
> 
> Of course we could make the bnx2x driver more robust (using
> ACCESS_ONCE(shinfo->gso_size) for example), but the bug is in the
> TCP stack.
> 
> We have identified two points where skb_unclone() was needed.
> 
> This patch adds a WARN_ON_ONCE() to warn us if we missed another
> fix of this kind.
> 
> Kudos to Neal for finding the root cause of this bug. It's visible
> when using a small MSS.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Neal Cardwell <ncardwell@google.com>
> Cc: Yuchung Cheng <ycheng@google.com>

Applied and queued up for -stable.

Patch

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index c6f01f2..8fad1c1 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -986,6 +986,9 @@  static void tcp_queue_skb(struct sock *sk, struct sk_buff *skb)
 static void tcp_set_skb_tso_segs(const struct sock *sk, struct sk_buff *skb,
 				 unsigned int mss_now)
 {
+	/* Make sure we own this skb before messing gso_size/gso_segs */
+	WARN_ON_ONCE(skb_cloned(skb));
+
 	if (skb->len <= mss_now || !sk_can_gso(sk) ||
 	    skb->ip_summed == CHECKSUM_NONE) {
 		/* Avoid the costly divide in the normal
@@ -1067,9 +1070,7 @@  int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len,
 	if (nsize < 0)
 		nsize = 0;
 
-	if (skb_cloned(skb) &&
-	    skb_is_nonlinear(skb) &&
-	    pskb_expand_head(skb, 0, 0, GFP_ATOMIC))
+	if (skb_unclone(skb, GFP_ATOMIC))
 		return -ENOMEM;
 
 	/* Get a new skb... force flag on. */
@@ -2344,6 +2345,8 @@  int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
 		int oldpcount = tcp_skb_pcount(skb);
 
 		if (unlikely(oldpcount > 1)) {
+			if (skb_unclone(skb, GFP_ATOMIC))
+				return -ENOMEM;
 			tcp_init_tso_segs(sk, skb, cur_mss);
 			tcp_adjust_pcount(sk, skb, oldpcount - tcp_skb_pcount(skb));
 		}