Message ID | 1378488958.31445.47.camel@edumazet-glaptop |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 06 Sep 2013 10:35:58 -0700

> From: Eric Dumazet <edumazet@google.com>
>
> TCP receive window handling is multi staged.
>
> A socket has a memory budget, static or dynamic, in sk_rcvbuf.
>
> Because we do not really know how this memory budget translates to
> a TCP window (payload), TCP announces a small initial window
> (about 20 MSS).
>
> When a packet is received, we increase TCP rcv_win depending
> on the payload/truesize ratio of this packet. Good citizen
> packets give a hint that it's reasonable to have rcv_win = sk_rcvbuf/2
>
> This heuristic takes place in tcp_grow_window()
>
> Problem is : We currently call tcp_grow_window() only for in-order
> packets.
>
> This means that reorders or packet losses stop proper grow of
> rcv_win, and senders are unable to benefit from fast recovery,
> or proper reordering level detection.
>
> Really, a packet being stored in OFO queue is not a bad citizen.
> It should be part of the game as in-order packets.
>
> In our traces, we very often see sender is limited by linux small
> receive windows, even if linux hosts use autotuning (DRS) and should
> allow rcv_win to grow to ~3MB.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Acked-by: Neal Cardwell <ncardwell@google.com>

Applied.
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 1969e16..28708d3 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4141,6 +4141,7 @@ static void tcp_data_queue_ofo(struct sock *sk, struct sk_buff *skb)
 		if (!tcp_try_coalesce(sk, skb1, skb, &fragstolen)) {
 			__skb_queue_after(&tp->out_of_order_queue, skb1, skb);
 		} else {
+			tcp_grow_window(sk, skb);
 			kfree_skb_partial(skb, fragstolen);
 			skb = NULL;
 		}
@@ -4216,8 +4217,10 @@ add_sack:
 	if (tcp_is_sack(tp))
 		tcp_sack_new_ofo_skb(sk, seq, end_seq);
 end:
-	if (skb)
+	if (skb) {
+		tcp_grow_window(sk, skb);
 		skb_set_owner_r(skb, sk);
+	}
 }
 
 static int __must_check tcp_queue_rcv(struct sock *sk, struct sk_buff *skb, int hdrlen,
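
The effect described in the commit message can be illustrated with a small userspace model. This is only a sketch and not the kernel's tcp_grow_window(): the `toy_*` names, the "payload is at least half of truesize" test, the fixed 2*MSS increment, and the 6 MB buffer are all assumptions chosen for illustration. It models a receiver whose packets from index `first_ofo` onward land in the out-of-order queue, and compares growing the window for in-order packets only against growing it for every queued packet.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy receiver state. Field names echo the commit message, but this is
 * a hypothetical model, not kernel code. */
struct toy_rcv {
    unsigned int rcv_win; /* currently announced window, in bytes */
    unsigned int rcvbuf;  /* memory budget, stands in for sk_rcvbuf */
    unsigned int mss;     /* sender MSS, in bytes */
};

/* Assumed "good citizen" rule: payload fills at least half of truesize. */
static bool toy_good_citizen(unsigned int payload, unsigned int truesize)
{
    assert(truesize > 0);
    return 2 * payload >= truesize;
}

/* Grow the window by 2*MSS per good packet, capped at rcvbuf/2. */
static void toy_grow_window(struct toy_rcv *r, unsigned int payload,
                            unsigned int truesize)
{
    unsigned int cap = r->rcvbuf / 2;

    if (r->rcv_win >= cap || !toy_good_citizen(payload, truesize))
        return;
    r->rcv_win += 2 * r->mss;
    if (r->rcv_win > cap)
        r->rcv_win = cap;
}

/* Receive n packets; packets numbered first_ofo and later are treated as
 * out of order (a hole that is never filled during the run).
 * grow_on_ofo = false models the old behaviour, true the patched one. */
static unsigned int toy_simulate(unsigned int n, unsigned int first_ofo,
                                 bool grow_on_ofo)
{
    struct toy_rcv r = {
        .rcv_win = 20 * 1460,        /* initial window of about 20 MSS */
        .rcvbuf  = 6u * 1024 * 1024, /* assumed budget, so rcvbuf/2 = 3 MB */
        .mss     = 1460,
    };

    for (unsigned int i = 1; i <= n; i++) {
        bool ofo = i >= first_ofo;

        if (!ofo || grow_on_ofo)
            toy_grow_window(&r, 1460, 2304); /* assumed payload/truesize */
    }
    return r.rcv_win;
}
```

In this model, `toy_simulate(100, 11, false)` stalls at 58400 bytes because growth stops at the first hole, while `toy_simulate(100, 11, true)` reaches 321200 bytes and keeps growing toward the rcvbuf/2 cap as more packets arrive.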