diff mbox

[net-next] tcp: fix bogus RTT for CC when retransmissions are acked

Message ID 1428711469-5328-1-git-send-email-kennetkl@ifi.uio.no
State Accepted, archived
Delegated to: David Miller
Headers show

Commit Message

Kenneth Klette Jonassen April 11, 2015, 12:17 a.m. UTC
Since retransmitted segments are not used for RTT estimation, previously
SACKed segments present in the rtx queue are used. This estimation can be
several times larger than the actual RTT. When a cumulative ack covers both
previously SACKed and retransmitted segments, CC may thus get a bogus RTT.

Such segments previously had an RTT estimation in tcp_sacktag_one(), so it
seems reasonable to not reuse them in tcp_clean_rtx_queue() at all.

Afaik, this has had no effect on SRTT/RTO because of Karn's check.

Signed-off-by: Kenneth Klette Jonassen <kennetkl@ifi.uio.no>
---
 net/ipv4/tcp_input.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

Comments

Eric Dumazet April 11, 2015, 12:50 a.m. UTC | #1
On Sat, 2015-04-11 at 02:17 +0200, Kenneth Klette Jonassen wrote:
> Since retransmitted segments are not used for RTT estimation, previously
> SACKed segments present in the rtx queue are used. This estimation can be
> several times larger than the actual RTT. When a cumulative ack covers both
> previously SACKed and retransmitted segments, CC may thus get a bogus RTT.
> 
> Such segments previously had an RTT estimation in tcp_sacktag_one(), so it
> seems reasonable to not reuse them in tcp_clean_rtx_queue() at all.
> 
> Afaik, this has had no effect on SRTT/RTO because of Karn's check.
> 
> Signed-off-by: Kenneth Klette Jonassen <kennetkl@ifi.uio.no>
> ---
>  net/ipv4/tcp_input.c | 10 ++++------
>  1 file changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 031cf72..a7ef679 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -3099,17 +3099,15 @@ static int tcp_clean_rtx_queue(struct sock *sk, int prior_fackets,
>  			if (sacked & TCPCB_SACKED_RETRANS)
>  				tp->retrans_out -= acked_pcount;
>  			flag |= FLAG_RETRANS_DATA_ACKED;
> -		} else {
> +		} else if (!(sacked & TCPCB_SACKED_ACKED)) {
>  			last_ackt = skb->skb_mstamp;
>  			WARN_ON_ONCE(last_ackt.v64 == 0);
>  			if (!first_ackt.v64)
>  				first_ackt = last_ackt;
>  
> -			if (!(sacked & TCPCB_SACKED_ACKED)) {
> -				reord = min(pkts_acked, reord);
> -				if (!after(scb->end_seq, tp->high_seq))
> -					flag |= FLAG_ORIG_SACK_ACKED;
> -			}
> +			reord = min(pkts_acked, reord);
> +			if (!after(scb->end_seq, tp->high_seq))
> +				flag |= FLAG_ORIG_SACK_ACKED;
>  		}
>  
>  		if (sacked & TCPCB_SACKED_ACKED)

Any particular reason you did not CC authors of this recent patch ?

http://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/?id=666b805150efd62f05810ff0db08f44a2370c937

Thanks.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Neal Cardwell April 12, 2015, 2:13 p.m. UTC | #2
On Fri, Apr 10, 2015 at 8:17 PM, Kenneth Klette Jonassen
<kennetkl@ifi.uio.no> wrote:
> Since retransmitted segments are not used for RTT estimation, previously
> SACKed segments present in the rtx queue are used. This estimation can be
> several times larger than the actual RTT. When a cumulative ack covers both
> previously SACKed and retransmitted segments, CC may thus get a bogus RTT.
>
> Such segments previously had an RTT estimation in tcp_sacktag_one(), so it
> seems reasonable to not reuse them in tcp_clean_rtx_queue() at all.
>
> Afaik, this has had no effect on SRTT/RTO because of Karn's check.
>
> Signed-off-by: Kenneth Klette Jonassen <kennetkl@ifi.uio.no>

Acked-by: Neal Cardwell <ncardwell@google.com>
Tested-by: Neal Cardwell <ncardwell@google.com>

neal
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yuchung Cheng April 12, 2015, 4 p.m. UTC | #3
On Sun, Apr 12, 2015 at 7:13 AM, Neal Cardwell <ncardwell@google.com> wrote:
> On Fri, Apr 10, 2015 at 8:17 PM, Kenneth Klette Jonassen
> <kennetkl@ifi.uio.no> wrote:
>> Since retransmitted segments are not used for RTT estimation, previously
>> SACKed segments present in the rtx queue are used. This estimation can be
>> several times larger than the actual RTT. When a cumulative ack covers both
>> previously SACKed and retransmitted segments, CC may thus get a bogus RTT.
>>
>> Such segments previously had an RTT estimation in tcp_sacktag_one(), so it
>> seems reasonable to not reuse them in tcp_clean_rtx_queue() at all.
>>
>> Afaik, this has had no effect on SRTT/RTO because of Karn's check.
>>
>> Signed-off-by: Kenneth Klette Jonassen <kennetkl@ifi.uio.no>
>
> Acked-by: Neal Cardwell <ncardwell@google.com>
> Tested-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
>
> neal
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller April 13, 2015, 5:55 p.m. UTC | #4
From: Kenneth Klette Jonassen <kennetkl@ifi.uio.no>
Date: Sat, 11 Apr 2015 02:17:49 +0200

> Since retransmitted segments are not used for RTT estimation, previously
> SACKed segments present in the rtx queue are used. This estimation can be
> several times larger than the actual RTT. When a cumulative ack covers both
> previously SACKed and retransmitted segments, CC may thus get a bogus RTT.
> 
> Such segments previously had an RTT estimation in tcp_sacktag_one(), so it
> seems reasonable to not reuse them in tcp_clean_rtx_queue() at all.
> 
> Afaik, this has had no effect on SRTT/RTO because of Karn's check.
> 
> Signed-off-by: Kenneth Klette Jonassen <kennetkl@ifi.uio.no>

Applied, thanks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 031cf72..a7ef679 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3099,17 +3099,15 @@  static int tcp_clean_rtx_queue(struct sock *sk, int prior_fackets,
 			if (sacked & TCPCB_SACKED_RETRANS)
 				tp->retrans_out -= acked_pcount;
 			flag |= FLAG_RETRANS_DATA_ACKED;
-		} else {
+		} else if (!(sacked & TCPCB_SACKED_ACKED)) {
 			last_ackt = skb->skb_mstamp;
 			WARN_ON_ONCE(last_ackt.v64 == 0);
 			if (!first_ackt.v64)
 				first_ackt = last_ackt;
 
-			if (!(sacked & TCPCB_SACKED_ACKED)) {
-				reord = min(pkts_acked, reord);
-				if (!after(scb->end_seq, tp->high_seq))
-					flag |= FLAG_ORIG_SACK_ACKED;
-			}
+			reord = min(pkts_acked, reord);
+			if (!after(scb->end_seq, tp->high_seq))
+				flag |= FLAG_ORIG_SACK_ACKED;
 		}
 
 		if (sacked & TCPCB_SACKED_ACKED)