Message ID | 1327807786-27185-1-git-send-email-ncardwell@google.com |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
On Sat, Jan 28, 2012 at 7:29 PM, Neal Cardwell <ncardwell@google.com> wrote:
>
> This commit fixes tcp_trim_head() to recalculate the number of
> segments in the skb with the skb's existing MSS, so trimming the head
> causes the skb segment count to be monotonically non-increasing - it
> should stay the same or go down, but not increase.
>
> Previously tcp_trim_head() used the current MSS of the connection. But
> if there was a decrease in MSS between original transmission and ACK
> (e.g. due to PMTUD), this could cause tcp_trim_head() to
> counter-intuitively increase the segment count when trimming bytes off
> the head of an skb. This violated assumptions in tcp_tso_acked() that
> tcp_trim_head() only decreases the packet count, so that packets_acked
> in tcp_tso_acked() could underflow, leading tcp_clean_rtx_queue() to
> pass u32 pkts_acked values as large as 0xffffffff to
> ca_ops->pkts_acked().
>
> As an aside, if tcp_trim_head() had really wanted the skb to reflect
> the current MSS, it should have called tcp_set_skb_tso_segs()
> unconditionally, since a decrease in MSS would mean that a
> single-packet skb should now be sliced into multiple segments.
>
> Signed-off-by: Neal Cardwell <ncardwell@google.com>

Acked-by: Nandita Dukkipati <nanditad@google.com>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sat, 28 Jan 2012, Neal Cardwell wrote:

> This commit fixes tcp_trim_head() to recalculate the number of
> segments in the skb with the skb's existing MSS, so trimming the head
> causes the skb segment count to be monotonically non-increasing - it
> should stay the same or go down, but not increase.
>
> Previously tcp_trim_head() used the current MSS of the connection. But
> if there was a decrease in MSS between original transmission and ACK
> (e.g. due to PMTUD), this could cause tcp_trim_head() to
> counter-intuitively increase the segment count when trimming bytes off
> the head of an skb. This violated assumptions in tcp_tso_acked() that
> tcp_trim_head() only decreases the packet count, so that packets_acked
> in tcp_tso_acked() could underflow, leading tcp_clean_rtx_queue() to
> pass u32 pkts_acked values as large as 0xffffffff to
> ca_ops->pkts_acked().
>
> As an aside, if tcp_trim_head() had really wanted the skb to reflect
> the current MSS, it should have called tcp_set_skb_tso_segs()
> unconditionally, since a decrease in MSS would mean that a
> single-packet skb should now be sliced into multiple segments.
>
> Signed-off-by: Neal Cardwell <ncardwell@google.com>
> ---
>  net/ipv4/tcp_output.c |    6 ++----
>  1 files changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 8c8de27..4ff3b6d 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -1141,11 +1141,9 @@ int tcp_trim_head(struct sock *sk, struct sk_buff *skb, u32 len)
>  	sk_mem_uncharge(sk, len);
>  	sock_set_flag(sk, SOCK_QUEUE_SHRUNK);
>
> -	/* Any change of skb->len requires recalculation of tso
> -	 * factor and mss.
> -	 */
> +	/* Any change of skb->len requires recalculation of tso factor. */
>  	if (tcp_skb_pcount(skb) > 1)
> -		tcp_set_skb_tso_segs(sk, skb, tcp_current_mss(sk));
> +		tcp_set_skb_tso_segs(sk, skb, tcp_skb_mss(skb));
>
>  	return 0;
>  }

Nice catch... this could solve some non-fatal counter inconsistencies
too that have been occurring very rarely.

Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
From: "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi>
Date: Mon, 30 Jan 2012 09:08:51 +0200 (EET)

> On Sat, 28 Jan 2012, Neal Cardwell wrote:
>
>> This commit fixes tcp_trim_head() to recalculate the number of
>> segments in the skb with the skb's existing MSS, so trimming the head
>> causes the skb segment count to be monotonically non-increasing - it
>> should stay the same or go down, but not increase.
>>
>> Previously tcp_trim_head() used the current MSS of the connection. But
>> if there was a decrease in MSS between original transmission and ACK
>> (e.g. due to PMTUD), this could cause tcp_trim_head() to
>> counter-intuitively increase the segment count when trimming bytes off
>> the head of an skb. This violated assumptions in tcp_tso_acked() that
>> tcp_trim_head() only decreases the packet count, so that packets_acked
>> in tcp_tso_acked() could underflow, leading tcp_clean_rtx_queue() to
>> pass u32 pkts_acked values as large as 0xffffffff to
>> ca_ops->pkts_acked().
>>
>> As an aside, if tcp_trim_head() had really wanted the skb to reflect
>> the current MSS, it should have called tcp_set_skb_tso_segs()
>> unconditionally, since a decrease in MSS would mean that a
>> single-packet skb should now be sliced into multiple segments.
>>
>> Signed-off-by: Neal Cardwell <ncardwell@google.com>
 ...
> Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>

Applied, thanks everyone.
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 8c8de27..4ff3b6d 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1141,11 +1141,9 @@ int tcp_trim_head(struct sock *sk, struct sk_buff *skb, u32 len)
 	sk_mem_uncharge(sk, len);
 	sock_set_flag(sk, SOCK_QUEUE_SHRUNK);
 
-	/* Any change of skb->len requires recalculation of tso
-	 * factor and mss.
-	 */
+	/* Any change of skb->len requires recalculation of tso factor. */
 	if (tcp_skb_pcount(skb) > 1)
-		tcp_set_skb_tso_segs(sk, skb, tcp_current_mss(sk));
+		tcp_set_skb_tso_segs(sk, skb, tcp_skb_mss(skb));
 
 	return 0;
 }
This commit fixes tcp_trim_head() to recalculate the number of
segments in the skb with the skb's existing MSS, so trimming the head
causes the skb segment count to be monotonically non-increasing - it
should stay the same or go down, but not increase.

Previously tcp_trim_head() used the current MSS of the connection. But
if there was a decrease in MSS between original transmission and ACK
(e.g. due to PMTUD), this could cause tcp_trim_head() to
counter-intuitively increase the segment count when trimming bytes off
the head of an skb. This violated assumptions in tcp_tso_acked() that
tcp_trim_head() only decreases the packet count, so that packets_acked
in tcp_tso_acked() could underflow, leading tcp_clean_rtx_queue() to
pass u32 pkts_acked values as large as 0xffffffff to
ca_ops->pkts_acked().

As an aside, if tcp_trim_head() had really wanted the skb to reflect
the current MSS, it should have called tcp_set_skb_tso_segs()
unconditionally, since a decrease in MSS would mean that a
single-packet skb should now be sliced into multiple segments.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
---
 net/ipv4/tcp_output.c |    6 ++----
 1 files changed, 2 insertions(+), 4 deletions(-)
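The underflow described in the commit message can be illustrated with a small user-space model. This is only a sketch, not kernel code: seg_count() and packets_acked_model() are hypothetical names, and the sketch assumes the segment count is the ceiling of the remaining length divided by the MSS and that the acked-packet count is the segment count before trimming minus the count after trimming.

```c
#include <stdint.h>

/* Hypothetical helper: number of segments needed to carry `len` payload
 * bytes at `mss` bytes per segment (ceiling division). Assumed to
 * approximate what the kernel stores as the skb's segment count.
 */
static uint32_t seg_count(uint32_t len, uint32_t mss)
{
	return (len + mss - 1) / mss;
}

/* Hypothetical model of the accounting around a head trim: segments
 * before the trim minus segments after the trim. `recalc_mss` is the MSS
 * used to recount the remaining bytes. The subtraction is unsigned, so
 * it wraps if the count grows.
 */
static uint32_t packets_acked_model(uint32_t len, uint32_t trimmed,
				    uint32_t pcount_before,
				    uint32_t recalc_mss)
{
	uint32_t pcount_after = seg_count(len - trimmed, recalc_mss);

	return pcount_before - pcount_after;
}
```

For example, take a 2920-byte skb originally sent as two 1460-byte segments, with 1460 bytes acked and trimmed. Recounting with the skb's own MSS of 1460 leaves one segment, so the model reports 1 packet acked. Recounting with a connection MSS that has since dropped to 512 (an assumed value) yields three segments, and 2 - 3 wraps to 0xffffffff, the kind of value the commit message says could reach ca_ops->pkts_acked().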