tcp: fix tcp_trim_head() to adjust segment count with skb MSS

Message ID 1327807786-27185-1-git-send-email-ncardwell@google.com
State Accepted, archived
Delegated to: David Miller

Commit Message

Neal Cardwell Jan. 29, 2012, 3:29 a.m. UTC
This commit fixes tcp_trim_head() to recalculate the number of
segments in the skb with the skb's existing MSS, so trimming the head
causes the skb segment count to be monotonically non-increasing - it
should stay the same or go down, but not increase.

Previously tcp_trim_head() used the current MSS of the connection. But
if there was a decrease in MSS between original transmission and ACK
(e.g. due to PMTUD), this could cause tcp_trim_head() to
counter-intuitively increase the segment count when trimming bytes off
the head of an skb. This violated assumptions in tcp_tso_acked() that
tcp_trim_head() only decreases the packet count, so that packets_acked
in tcp_tso_acked() could underflow, leading tcp_clean_rtx_queue() to
pass u32 pkts_acked values as large as 0xffffffff to
ca_ops->pkts_acked().

As an aside, if tcp_trim_head() had really wanted the skb to reflect
the current MSS, it should have called tcp_set_skb_tso_segs()
unconditionally, since a decrease in MSS would mean that a
single-packet skb should now be sliced into multiple segments.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
---
 net/ipv4/tcp_output.c |    6 ++----
 1 files changed, 2 insertions(+), 4 deletions(-)
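
A worked illustration of the underflow described above, as a userspace sketch rather than kernel code. It assumes the segment count is a ceiling division of the remaining length by the MSS. The names segs_for() and model_packets_acked() are hypothetical, and the numbers used in the note after the block (a 2920-byte skb built at MSS 1460, the MSS later reduced to 512, 100 bytes ACKed) are made up for illustration.

```c
#include <stdint.h>

/* Segments needed to carry len bytes at a given MSS (ceiling division). */
static uint32_t segs_for(uint32_t len, uint32_t mss)
{
	return (len + mss - 1) / mss;
}

/*
 * Hypothetical model of the accounting around a partial ACK: the count
 * before the trim uses the MSS the skb was built with, and the count
 * after the trim uses recalc_mss. The subtraction is unsigned, so a
 * larger "after" count wraps around instead of going negative.
 */
static uint32_t model_packets_acked(uint32_t len, uint32_t acked_bytes,
				    uint32_t skb_mss, uint32_t recalc_mss)
{
	uint32_t before = segs_for(len, skb_mss);
	uint32_t after = segs_for(len - acked_bytes, recalc_mss);

	return before - after;
}
```

With those numbers, model_packets_acked(2920, 100, 1460, 512) computes 2 - 6 in u32 arithmetic and wraps to 0xfffffffc, which mirrors the old behaviour of recalculating with the connection's current MSS. model_packets_acked(2920, 100, 1460, 1460) yields 0, which mirrors recalculating with the skb's own MSS.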

Comments

Nandita Dukkipati Jan. 30, 2012, 6:49 a.m. UTC | #1
On Sat, Jan 28, 2012 at 7:29 PM, Neal Cardwell <ncardwell@google.com> wrote:
>
> This commit fixes tcp_trim_head() to recalculate the number of
> segments in the skb with the skb's existing MSS, so trimming the head
> causes the skb segment count to be monotonically non-increasing - it
> should stay the same or go down, but not increase.
>
> Previously tcp_trim_head() used the current MSS of the connection. But
> if there was a decrease in MSS between original transmission and ACK
> (e.g. due to PMTUD), this could cause tcp_trim_head() to
> counter-intuitively increase the segment count when trimming bytes off
> the head of an skb. This violated assumptions in tcp_tso_acked() that
> tcp_trim_head() only decreases the packet count, so that packets_acked
> in tcp_tso_acked() could underflow, leading tcp_clean_rtx_queue() to
> pass u32 pkts_acked values as large as 0xffffffff to
> ca_ops->pkts_acked().
>
> As an aside, if tcp_trim_head() had really wanted the skb to reflect
> the current MSS, it should have called tcp_set_skb_tso_segs()
> unconditionally, since a decrease in MSS would mean that a
> single-packet skb should now be sliced into multiple segments.
>
> Signed-off-by: Neal Cardwell <ncardwell@google.com>

Acked-by: Nandita Dukkipati <nanditad@google.com>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ilpo Järvinen Jan. 30, 2012, 7:08 a.m. UTC | #2
On Sat, 28 Jan 2012, Neal Cardwell wrote:

> This commit fixes tcp_trim_head() to recalculate the number of
> segments in the skb with the skb's existing MSS, so trimming the head
> causes the skb segment count to be monotonically non-increasing - it
> should stay the same or go down, but not increase.
> 
> Previously tcp_trim_head() used the current MSS of the connection. But
> if there was a decrease in MSS between original transmission and ACK
> (e.g. due to PMTUD), this could cause tcp_trim_head() to
> counter-intuitively increase the segment count when trimming bytes off
> the head of an skb. This violated assumptions in tcp_tso_acked() that
> tcp_trim_head() only decreases the packet count, so that packets_acked
> in tcp_tso_acked() could underflow, leading tcp_clean_rtx_queue() to
> pass u32 pkts_acked values as large as 0xffffffff to
> ca_ops->pkts_acked().
> 
> As an aside, if tcp_trim_head() had really wanted the skb to reflect
> the current MSS, it should have called tcp_set_skb_tso_segs()
> unconditionally, since a decrease in MSS would mean that a
> single-packet skb should now be sliced into multiple segments.
> 
> Signed-off-by: Neal Cardwell <ncardwell@google.com>
> ---
>  net/ipv4/tcp_output.c |    6 ++----
>  1 files changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 8c8de27..4ff3b6d 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -1141,11 +1141,9 @@ int tcp_trim_head(struct sock *sk, struct sk_buff *skb, u32 len)
>  	sk_mem_uncharge(sk, len);
>  	sock_set_flag(sk, SOCK_QUEUE_SHRUNK);
>  
> -	/* Any change of skb->len requires recalculation of tso
> -	 * factor and mss.
> -	 */
> +	/* Any change of skb->len requires recalculation of tso factor. */
>  	if (tcp_skb_pcount(skb) > 1)
> -		tcp_set_skb_tso_segs(sk, skb, tcp_current_mss(sk));
> +		tcp_set_skb_tso_segs(sk, skb, tcp_skb_mss(skb));
>  
>  	return 0;
>  }

Nice catch... this could solve some non-fatal counter inconsistencies too 
that have been occurring very rarely.

Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
David Miller Jan. 30, 2012, 5:43 p.m. UTC | #3
From: "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi>
Date: Mon, 30 Jan 2012 09:08:51 +0200 (EET)

> On Sat, 28 Jan 2012, Neal Cardwell wrote:
> 
>> This commit fixes tcp_trim_head() to recalculate the number of
>> segments in the skb with the skb's existing MSS, so trimming the head
>> causes the skb segment count to be monotonically non-increasing - it
>> should stay the same or go down, but not increase.
>> 
>> Previously tcp_trim_head() used the current MSS of the connection. But
>> if there was a decrease in MSS between original transmission and ACK
>> (e.g. due to PMTUD), this could cause tcp_trim_head() to
>> counter-intuitively increase the segment count when trimming bytes off
>> the head of an skb. This violated assumptions in tcp_tso_acked() that
>> tcp_trim_head() only decreases the packet count, so that packets_acked
>> in tcp_tso_acked() could underflow, leading tcp_clean_rtx_queue() to
>> pass u32 pkts_acked values as large as 0xffffffff to
>> ca_ops->pkts_acked().
>> 
>> As an aside, if tcp_trim_head() had really wanted the skb to reflect
>> the current MSS, it should have called tcp_set_skb_tso_segs()
>> unconditionally, since a decrease in MSS would mean that a
>> single-packet skb should now be sliced into multiple segments.
>> 
>> Signed-off-by: Neal Cardwell <ncardwell@google.com>
 ...
> Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>

Applied, thanks everyone.

Patch

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 8c8de27..4ff3b6d 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1141,11 +1141,9 @@  int tcp_trim_head(struct sock *sk, struct sk_buff *skb, u32 len)
 	sk_mem_uncharge(sk, len);
 	sock_set_flag(sk, SOCK_QUEUE_SHRUNK);
 
-	/* Any change of skb->len requires recalculation of tso
-	 * factor and mss.
-	 */
+	/* Any change of skb->len requires recalculation of tso factor. */
 	if (tcp_skb_pcount(skb) > 1)
-		tcp_set_skb_tso_segs(sk, skb, tcp_current_mss(sk));
+		tcp_set_skb_tso_segs(sk, skb, tcp_skb_mss(skb));
 
 	return 0;
 }
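
The property the patch aims for (trimming never increases the segment count) can be checked with a small userspace model of the patched branch. This is only a sketch under stated assumptions: struct model_skb and model_trim_head() are hypothetical stand-ins, the count is modelled as a ceiling division by the skb's own MSS, the trim length is assumed to be smaller than the skb length, and everything else tcp_trim_head() does is ignored.

```c
#include <stdint.h>

/* Hypothetical stand-in for the few skb fields this model needs. */
struct model_skb {
	uint32_t len;     /* payload bytes still held by the skb */
	uint32_t mss;     /* MSS the skb was segmented with */
	uint32_t pcount;  /* current segment count */
};

/*
 * Model of the patched logic: drop len bytes from the head, then
 * recalculate the segment count with the skb's own MSS, but only
 * for skbs that currently span more than one segment.
 */
static void model_trim_head(struct model_skb *skb, uint32_t len)
{
	skb->len -= len;

	if (skb->pcount > 1)
		skb->pcount = (skb->len + skb->mss - 1) / skb->mss;
}
```

Because both the old and the new count divide by the same MSS and the length only shrinks, the recalculated count cannot exceed the previous one in this model. A single-segment skb skips the recalculation entirely, which is the case the commit message's aside refers to.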