diff mbox

tcp: do not forget FIN in tcp_shifted_skb()

Message ID 1380907901.3564.24.camel@edumazet-glaptop.roam.corp.google.com
State Accepted, archived
Delegated to: David Miller
Headers show

Commit Message

Eric Dumazet Oct. 4, 2013, 5:31 p.m. UTC
From: Eric Dumazet <edumazet@google.com>

Yuchung found following problem :

 There are bugs in the SACK processing code, merging part in
 tcp_shift_skb_data(), that incorrectly resets or ignores the sacked
 skbs FIN flag. When a receiver first SACK the FIN sequence, and later
 throw away ofo queue (e.g., sack-reneging), the sender will stop
 retransmitting the FIN flag, and hangs forever.

Following packetdrill test can be used to reproduce the bug.

$ cat sack-merge-bug.pkt
`sysctl -q net.ipv4.tcp_fack=0`

// Establish a connection and send 10 MSS.
0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+.000 bind(3, ..., ...) = 0
+.000 listen(3, 1) = 0

+.050 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
+.000 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 6>
+.001 < . 1:1(0) ack 1 win 1024
+.000 accept(3, ..., ...) = 4

+.100 write(4, ..., 12000) = 12000
+.000 shutdown(4, SHUT_WR) = 0
+.000 > . 1:10001(10000) ack 1
+.050 < . 1:1(0) ack 2001 win 257
+.000 > FP. 10001:12001(2000) ack 1
+.050 < . 1:1(0) ack 2001 win 257 <sack 10001:11001,nop,nop>
+.050 < . 1:1(0) ack 2001 win 257 <sack 10001:12002,nop,nop>
// SACK reneg
+.050 < . 1:1(0) ack 12001 win 257
+0 %{ print "unacked: ",tcpi_unacked }%
+5 %{ print "" }%


First, a typo inverted left/right of one OR operation, then
code forgot to advance end_seq if the merged skb carried FIN.

Bug was added in 2.6.29 by commit 832d11c5cd076ab
("tcp: Try to restore large SKBs while SACK processing")

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
---
 net/ipv4/tcp_input.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Ilpo Järvinen Oct. 4, 2013, 6:03 p.m. UTC | #1
On Fri, 4 Oct 2013, Eric Dumazet wrote:

> From: Eric Dumazet <edumazet@google.com>
> 
> Yuchung found following problem :
> 
>  There are bugs in the SACK processing code, merging part in
>  tcp_shift_skb_data(), that incorrectly resets or ignores the sacked
>  skbs FIN flag. When a receiver first SACK the FIN sequence, and later
>  throw away ofo queue (e.g., sack-reneging), the sender will stop
>  retransmitting the FIN flag, and hangs forever.
> 
> Following packetdrill test can be used to reproduce the bug.
> 
> [...snip...]
> 
> First, a typo inverted left/right of one OR operation, then
> code forgot to advance end_seq if the merged skb carried FIN.
> 
> Bug was added in 2.6.29 by commit 832d11c5cd076ab
> ("tcp: Try to restore large SKBs while SACK processing")
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Yuchung Cheng <ycheng@google.com>
> Acked-by: Neal Cardwell <ncardwell@google.com>
> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
> ---
>  net/ipv4/tcp_input.c |    5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 25a89ea..113dc5f 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -1284,7 +1284,10 @@ static bool tcp_shifted_skb(struct sock *sk, struct sk_buff *skb,
>  		tp->lost_cnt_hint -= tcp_skb_pcount(prev);
>  	}
>  
> -	TCP_SKB_CB(skb)->tcp_flags |= TCP_SKB_CB(prev)->tcp_flags;
> +	TCP_SKB_CB(prev)->tcp_flags |= TCP_SKB_CB(skb)->tcp_flags;
> +	if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN)
> +		TCP_SKB_CB(prev)->end_seq++;
> +
>  	if (skb == tcp_highest_sack(sk))
>  		tcp_advance_highest_sack(sk, skb);

Nice that it was finally found. For some reason my memory tries to say 
that it wouldn't have not even tried to merge skbs with FIN but either
I changed that at some point while deving or I just remember wrong.

Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
David Miller Oct. 4, 2013, 6:17 p.m. UTC | #2
From: "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi>
Date: Fri, 4 Oct 2013 21:03:53 +0300 (EEST)

> On Fri, 4 Oct 2013, Eric Dumazet wrote:
> 
>> From: Eric Dumazet <edumazet@google.com>
>> 
>> Yuchung found following problem :
>> 
>>  There are bugs in the SACK processing code, merging part in
>>  tcp_shift_skb_data(), that incorrectly resets or ignores the sacked
>>  skbs FIN flag. When a receiver first SACK the FIN sequence, and later
>>  throw away ofo queue (e.g., sack-reneging), the sender will stop
>>  retransmitting the FIN flag, and hangs forever.
>> 
>> Following packetdrill test can be used to reproduce the bug.
 ...
> Nice that it was finally found. For some reason my memory tries to say 
> that it wouldn't have not even tried to merge skbs with FIN but either
> I changed that at some point while deving or I just remember wrong.
> 
> Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>

Applied, thanks everyone.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 25a89ea..113dc5f 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1284,7 +1284,10 @@  static bool tcp_shifted_skb(struct sock *sk, struct sk_buff *skb,
 		tp->lost_cnt_hint -= tcp_skb_pcount(prev);
 	}
 
-	TCP_SKB_CB(skb)->tcp_flags |= TCP_SKB_CB(prev)->tcp_flags;
+	TCP_SKB_CB(prev)->tcp_flags |= TCP_SKB_CB(skb)->tcp_flags;
+	if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN)
+		TCP_SKB_CB(prev)->end_seq++;
+
 	if (skb == tcp_highest_sack(sk))
 		tcp_advance_highest_sack(sk, skb);