tcp: Fix for stalling connections

Message ID: 4B1C4C25.1070104@tvk.rwth-aachen.de
State: Superseded, archived
Delegated to: David Miller

Commit Message

Damian Lukowski Dec. 7, 2009, 12:28 a.m. UTC
This patch fixes a problem in the TCP connection timeout calculation.
Currently, timeout decisions are based on the difference between the
current tcp_time_stamp and retrans_stamp, which is usually set at the
first retransmission.
However, if that retransmission fails in tcp_retransmit_skb(),
retrans_stamp is not updated and remains zero. retransmits_timed_out()
then compares the full value of tcp_time_stamp with the timeout and
decides wrongly whenever tcp_time_stamp is larger than the specified
timeout, which is very likely.
In this case, the TCP connection dies after the first attempted
(and unsuccessful) retransmission.

With this patch, tcp_skb_cb->when of the write queue head is used as
the start time whenever retrans_stamp is not available.

Thanks to Ilpo Järvinen for code suggestions.

Signed-off-by: Damian Lukowski <damian@tvk.rwth-aachen.de>
---
 include/net/tcp.h |    9 ++++++++-
 1 files changed, 8 insertions(+), 1 deletions(-)
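
The failure mode described in the commit message can be reproduced with a
small user-space model. This is only a sketch under stated assumptions:
HZ = 1000 (one tick per millisecond), TCP_RTO_MIN = HZ/5, TCP_RTO_MAX = 120*HZ,
and a reconstructed "boundary <= linear_backoff_thresh" branch, which the diff
does not show. The helper names (ilog2_u32, rto_timeout, timed_out_old,
timed_out_new, demo) are hypothetical and are not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values for illustration; with HZ = 1000 one tick is 1 ms. */
#define HZ          1000u
#define TCP_RTO_MIN (HZ / 5)      /* 200 ticks */
#define TCP_RTO_MAX (120u * HZ)   /* 120000 ticks */

/* Integer log2 (floor), standing in for the kernel's ilog2(). */
static unsigned int ilog2_u32(unsigned int v)
{
	unsigned int r = 0;

	while (v >>= 1)
		r++;
	return r;
}

/* Timeout in ticks for a given boundary. The first branch is an
 * assumption modelled on the else branch visible in the diff. */
static unsigned int rto_timeout(unsigned int boundary)
{
	unsigned int thresh = ilog2_u32(TCP_RTO_MAX / TCP_RTO_MIN);

	if (boundary <= thresh)
		return ((2u << boundary) - 1) * TCP_RTO_MIN;
	return ((2u << thresh) - 1) * TCP_RTO_MIN +
	       (boundary - thresh) * TCP_RTO_MAX;
}

/* Pre-patch check: always measures from retrans_stamp, even when it is 0.
 * The icsk_retransmits early return is left out of this model. */
static bool timed_out_old(unsigned int now, unsigned int retrans_stamp,
			  unsigned int boundary)
{
	return (now - retrans_stamp) >= rto_timeout(boundary);
}

/* Patched check: falls back to the send time of the queue head. */
static bool timed_out_new(unsigned int now, unsigned int retrans_stamp,
			  unsigned int skb_when, unsigned int boundary)
{
	unsigned int start_ts = retrans_stamp ? retrans_stamp : skb_when;

	return (now - start_ts) >= rto_timeout(boundary);
}

void demo(void)
{
	unsigned int now = 3600u * HZ;       /* one hour of uptime, in ticks */
	unsigned int skb_when = now - 300u;  /* head segment sent 300 ticks ago */
	unsigned int boundary = 15;          /* default value of tcp_retries2 */

	/* Failed first retransmission: retrans_stamp is still zero. */
	assert(timed_out_old(now, 0, boundary));            /* connection dies */
	assert(!timed_out_new(now, 0, skb_when, boundary)); /* connection survives */

	/* With a valid retrans_stamp both checks agree. */
	assert(timed_out_old(now, now - 1000u, boundary) ==
	       timed_out_new(now, now - 1000u, skb_when, boundary));
}
```

With one hour of uptime and a boundary of 15, the old check reports a timeout
as soon as retrans_stamp is zero, while the patched check measures the elapsed
time from the segment's send time and lets the connection continue.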

Comments

Damian Lukowski Dec. 7, 2009, 12:32 a.m. UTC | #1
Damian Lukowski wrote:
> This patch fixes a problem in the TCP connection timeout calculation.
> Currently, timeout decisions are made on the basis of the current
> tcp_time_stamp and retrans_stamp, which is usually set at the first
> retransmission.
> However, if the retransmission fails in tcp_retransmit_skb(),
> retrans_stamp is not updated and remains zero. This leads to wrong
> decisions in retransmits_timed_out() if tcp_time_stamp is larger than
> the specified timeout, which is very likely.
> In this case, the TCP connection dies after the first attempted
> (and unsuccessful) retransmission.
> 
> With this patch, tcp_skb_cb->when is used instead, when retrans_stamp
> is not available.
> 
> Thanks to Ilpo Järvinen for code suggestions.

... and Frederic Leroy for testing. ;)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Dec. 7, 2009, 6:15 a.m. UTC | #2
Damian Lukowski wrote:
> This patch fixes a problem in the TCP connection timeout calculation.
> Currently, timeout decisions are made on the basis of the current
> tcp_time_stamp and retrans_stamp, which is usually set at the first
> retransmission.
> However, if the retransmission fails in tcp_retransmit_skb(),
> retrans_stamp is not updated and remains zero. This leads to wrong
> decisions in retransmits_timed_out() if tcp_time_stamp is larger than
> the specified timeout, which is very likely.
> In this case, the TCP connection dies after the first attempted
> (and unsuccessful) retransmission.
> 
> With this patch, tcp_skb_cb->when is used instead, when retrans_stamp
> is not available.
> 
> Thanks to Ilpo Järvinen for code suggestions.
> 
> Signed-off-by: Damian Lukowski <damian@tvk.rwth-aachen.de>

Hmm, how old is this bug?

You should add a hint about the faulty commit so that the stable team
can apply this patch to 2.6.32 & 2.6.31

git describe 6fa12c85031485dff38ce550c24f10da23b0adaa

v2.6.31-rc5-1853-g6fa12c8

Or maybe David handles this for us, I don't know...

Minor note: retransmits_timed_out() is used from net/ipv4/tcp_timer.c.
I wonder why it's a "static inline" in include/net/tcp.h

Thanks

Damian Lukowski Dec. 7, 2009, 11:17 a.m. UTC | #3
Eric Dumazet wrote:
> Damian Lukowski wrote:
>> This patch fixes a problem in the TCP connection timeout calculation.
>> Currently, timeout decisions are made on the basis of the current
>> tcp_time_stamp and retrans_stamp, which is usually set at the first
>> retransmission.
>> However, if the retransmission fails in tcp_retransmit_skb(),
>> retrans_stamp is not updated and remains zero. This leads to wrong
>> decisions in retransmits_timed_out() if tcp_time_stamp is larger than
>> the specified timeout, which is very likely.
>> In this case, the TCP connection dies after the first attempted
>> (and unsuccessful) retransmission.
>>
>> With this patch, tcp_skb_cb->when is used instead, when retrans_stamp
>> is not available.
>>
>> Thanks to Ilpo Järvinen for code suggestions.
>>
>> Signed-off-by: Damian Lukowski <damian@tvk.rwth-aachen.de>
> 
> Hmm, how old is this bug?
> 
> You should add a hint about the faulty commit so that the stable team
> can apply this patch to 2.6.32 & 2.6.31
> 
> git describe 6fa12c85031485dff38ce550c24f10da23b0adaa
> 
> v2.6.31-rc5-1853-g6fa12c8

It was introduced with retransmits_timed_out() in 2.6.32;
before that, timeout calculations were based on the number of
retransmissions. The patch is needed for 2.6.32
and 2.6.33, but 2.6.31 does not use retransmits_timed_out().

> Or maybe David handles this for us, I don't know...
> 
> Minor note: retransmits_timed_out() is used from net/ipv4/tcp_timer.c.
> I wonder why it's a "static inline" in include/net/tcp.h

I can move it to tcp_timer.c, if that is what you mean.

Regards.

> Thanks
> 
David Miller Dec. 7, 2009, 11:22 a.m. UTC | #4
From: Damian Lukowski <damian@tvk.rwth-aachen.de>
Date: Mon, 07 Dec 2009 12:17:46 +0100

> Eric Dumazet wrote:
>> You should add a hint about the faulty commit so that the stable team
>> can apply this patch to 2.6.32 & 2.6.31
>> 
>> git describe 6fa12c85031485dff38ce550c24f10da23b0adaa
>> 
>> v2.6.31-rc5-1853-g6fa12c8
> 
> It was introduced with retransmits_timed_out() in 2.6.32;
> before that, timeout calculations were based on the number of
> retransmissions. The patch is needed for 2.6.32
> and 2.6.33, but 2.6.31 does not use retransmits_timed_out().

I also think the commit introducing the regression should be
explicitly referenced in the commit message of the fix.

>> Or maybe David handles this for us, I don't know...
>> 
>> Minor note: retransmits_timed_out() is used from net/ipv4/tcp_timer.c.
>> I wonder why it's a "static inline" in include/net/tcp.h
> 
> I can move it to tcp_timer.c, if that is what you mean.

I'm pretty sure that's what he means :-)
Damian Lukowski Dec. 7, 2009, 11:41 a.m. UTC | #5
This series of patches fixes a problem with the new RTO
timeout calculation as introduced for 2.6.32. Under some
circumstances, if a retransmission fails, the connection
is likely to die silently.

Changelog since v1:
Moved retransmits_timed_out from include/net/tcp.h
to net/ipv4/tcp_timer.c, where it is used.

Signed-off-by: Damian Lukowski <damian@tvk.rwth-aachen.de>

Thanks.
Damian Lukowski Dec. 7, 2009, 4:06 p.m. UTC | #6
This series of patches fixes a problem with the new RTO
timeout calculation as introduced for 2.6.32 in
commit 6fa12c85031485dff38ce550c24f10da23b0adaa.
Under some circumstances, if a retransmission fails, the connection
is likely to die silently.

Changes since v2:
The bugfix has to be applied first, so that the code cleanup
is optional.

Thanks

Signed-off-by: Damian Lukowski <damian@tvk.rwth-aachen.de>
---
Patch

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 03a49c7..46f06e0 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1267,10 +1267,17 @@  static inline bool retransmits_timed_out(const struct sock *sk,
 					 unsigned int boundary)
 {
 	unsigned int timeout, linear_backoff_thresh;
+	unsigned int start_ts;
 
 	if (!inet_csk(sk)->icsk_retransmits)
 		return false;
 
+	if (unlikely(!tcp_sk(sk)->retrans_stamp))
+		start_ts = TCP_SKB_CB(tcp_write_queue_head(
+					(struct sock *)sk))->when;
+	else
+		start_ts = tcp_sk(sk)->retrans_stamp;
+
 	linear_backoff_thresh = ilog2(TCP_RTO_MAX/TCP_RTO_MIN);
 
 	if (boundary <= linear_backoff_thresh)
@@ -1279,7 +1286,7 @@  static inline bool retransmits_timed_out(const struct sock *sk,
 		timeout = ((2 << linear_backoff_thresh) - 1) * TCP_RTO_MIN +
 			  (boundary - linear_backoff_thresh) * TCP_RTO_MAX;
 
-	return (tcp_time_stamp - tcp_sk(sk)->retrans_stamp) >= timeout;
+	return (tcp_time_stamp - start_ts) >= timeout;
 }
 
 static inline struct sk_buff *tcp_send_head(struct sock *sk)
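
The timeout expression in the hunk above can be read as the total time that
"boundary" exponentially backed-off retransmissions would take, starting at
TCP_RTO_MIN and capped at TCP_RTO_MAX. The sketch below checks that reading by
comparing the closed form with an explicit sum. It assumes HZ = 1000,
TCP_RTO_MIN = HZ/5 and TCP_RTO_MAX = 120*HZ; the first branch of the closed
form is not visible in the diff and is reconstructed here, and all function
names are hypothetical.

```c
#include <assert.h>

/* Assumed values for illustration; with HZ = 1000 one tick is 1 ms. */
#define HZ          1000u
#define TCP_RTO_MIN (HZ / 5)      /* 200 ticks */
#define TCP_RTO_MAX (120u * HZ)   /* 120000 ticks */

/* Integer log2 (floor), standing in for the kernel's ilog2(). */
static unsigned int ilog2_u32(unsigned int v)
{
	unsigned int r = 0;

	while (v >>= 1)
		r++;
	return r;
}

/* Closed form as in retransmits_timed_out(); the first branch is an
 * assumption modelled on the else branch shown in the diff. */
unsigned int closed_form_timeout(unsigned int boundary)
{
	unsigned int thresh = ilog2_u32(TCP_RTO_MAX / TCP_RTO_MIN);

	if (boundary <= thresh)
		return ((2u << boundary) - 1) * TCP_RTO_MIN;
	return ((2u << thresh) - 1) * TCP_RTO_MIN +
	       (boundary - thresh) * TCP_RTO_MAX;
}

/* Explicit sum of boundary + 1 intervals: the RTO doubles after each
 * interval and is replaced by TCP_RTO_MAX once doubling would exceed it. */
unsigned int summed_timeout(unsigned int boundary)
{
	unsigned int rto = TCP_RTO_MIN;
	unsigned int total = 0;
	unsigned int i;

	for (i = 0; i <= boundary; i++) {
		total += rto;
		rto = (rto > TCP_RTO_MAX / 2) ? TCP_RTO_MAX : rto * 2;
	}
	return total;
}

/* Both computations agree for every boundary from 0 to 30. */
void check_closed_form(void)
{
	unsigned int b;

	for (b = 0; b <= 30; b++)
		assert(closed_form_timeout(b) == summed_timeout(b));
}
```

Under these assumptions a boundary of 3 gives 3000 ticks (3 seconds) and a
boundary of 15 gives 924600 ticks (about 15.4 minutes). Any uptime beyond that
already exceeds the timeout when the start time is taken to be zero.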