Message ID | 1440090487.6610.59.camel@edumazet-glaptop2.roam.corp.google.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
On Thu, Aug 20, 2015 at 1:08 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> slow start after idle might reduce cwnd, but we perform this
> after first packet was cooked and sent.
>
> With TSO/GSO, it means that we might send a full TSO packet
> even if cwnd should have been reduced to IW10.
>
> Moving the SSAI check in skb_entail() makes sense, because
> we slightly reduce number of times this check is done,
> especially for large send() and TCP Small queue callbacks from
> softirq context.

Very nice catch, and this fix seems like a definite improvement.

One potential issue is that the connection can restart from idle not
just because new data has been written (which this patch addresses),
but also because the receive window opens and so now packets can be
sent again. The old version of the code implicitly fired the restart
code path in the "receive window opens" case as well, since it fired
every time new data was sent. We might want to check if we need to
call tcp_cwnd_restart() in tcp_ack_update_window(), next to the call
for tcp_fast_path_check()?

neal
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

On Fri, 2015-08-21 at 11:10 -0400, Neal Cardwell wrote:
> Very nice catch, and this fix seems like a definite improvement.
>
> One potential issue is that the connection can restart from idle not
> just because new data has been written (which this patch addresses),
> but also because the receive window opens and so now packets can be
> sent again. The old version of the code implicitly fired the restart
> code path in the "receive window opens" case as well, since it fired
> every time new data was sent. We might want to check if we need to
> call tcp_cwnd_restart() in tcp_ack_update_window(), next to the call
> for tcp_fast_path_check()?

Excellent, I wrote a 2nd packetdrill test to exercise this path, will
submit a v2 soon.

Thanks Neal

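The "receive window opens" case raised above can be illustrated with a toy userspace model. This is only a sketch under loud assumptions: `struct model_conn`, `enum check_site`, and every `model_*` function are hypothetical names invented here, time is an abstract tick counter, window probes are ignored, and a "restart" is merely counted rather than applied to a congestion window. It is not kernel code and not the packetdrill test mentioned in the reply.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
enum check_site { CHECK_ON_SEND, CHECK_ON_ENQUEUE };

struct model_conn {
	int32_t  now;         /* current time, abstract ticks */
	int32_t  lsndtime;    /* time of the last transmission */
	int32_t  rto;         /* retransmission timeout, same ticks */
	uint32_t queued;      /* segments waiting in the write queue */
	uint32_t packets_out; /* segments in flight */
	uint32_t rwnd;        /* peer receive window, in segments */
	int      restarts;    /* how often the idle restart fired */
};

/* Count a restart if the connection was idle for longer than one RTO. */
static void model_maybe_restart(struct model_conn *c)
{
	assert(c->rto > 0);
	if (c->now - c->lsndtime > c->rto)
		c->restarts++;
}

/* Send whatever the receive window allows. */
static void model_send_pending(struct model_conn *c, enum check_site site)
{
	while (c->queued > 0 && c->rwnd > 0) {
		/* Old placement: check at send time when nothing is in flight. */
		if (site == CHECK_ON_SEND && c->packets_out == 0)
			model_maybe_restart(c);
		c->lsndtime = c->now;
		c->queued--;
		c->rwnd--;
		c->packets_out++;
	}
}

/* Application writes one segment. */
static void model_write(struct model_conn *c, enum check_site site)
{
	/* New placement: check only when queueing onto an empty write queue. */
	if (site == CHECK_ON_ENQUEUE && c->queued == 0)
		model_maybe_restart(c);
	c->queued++;
	model_send_pending(c, site);
}

/* An ACK arrives that acknowledges everything and opens the window. */
static void model_window_open(struct model_conn *c, uint32_t rwnd,
			      enum check_site site)
{
	c->packets_out = 0;
	c->rwnd = rwnd;
	model_send_pending(c, site);
}

/* Data is queued behind a zero window, then the window opens much later. */
static int model_zero_window_scenario(enum check_site site)
{
	struct model_conn c = { .now = 0, .lsndtime = 0, .rto = 200, .rwnd = 1 };

	model_write(&c, site);           /* t=0: sent, window now closed */
	c.now = 10;
	model_write(&c, site);           /* t=10: queued, cannot be sent */
	c.now = 1000;
	model_window_open(&c, 4, site);  /* t=1000: idle far longer than RTO */
	return c.restarts;
}
```

In this model, `model_zero_window_scenario(CHECK_ON_SEND)` counts one restart and `model_zero_window_scenario(CHECK_ON_ENQUEUE)` counts none, because no write happens at the moment the window opens. That is the gap the suggested check next to the window-update path would be meant to cover.
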
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 364426a..639f64e 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1165,6 +1165,7 @@ static inline void tcp_sack_reset(struct tcp_options_received *rx_opt)
 }
 
 u32 tcp_default_init_rwnd(u32 mss);
+void tcp_cwnd_restart(struct sock *sk, s32 delta);
 
 /* Determine a window scaling and initial window to offer. */
 void tcp_select_initial_window(int __space, __u32 mss, __u32 *rcv_wnd,
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 45534a5..e228433 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -627,6 +627,14 @@ static void skb_entail(struct sock *sk, struct sk_buff *skb)
 	sk_mem_charge(sk, skb->truesize);
 	if (tp->nonagle & TCP_NAGLE_PUSH)
 		tp->nonagle &= ~TCP_NAGLE_PUSH;
+
+	if (sysctl_tcp_slow_start_after_idle &&
+	    sk->sk_write_queue.next == skb) {
+		s32 delta = tcp_time_stamp - tp->lsndtime;
+
+		if (delta > inet_csk(sk)->icsk_rto)
+			tcp_cwnd_restart(sk, delta);
+	}
 }
 
 static inline void tcp_mark_urg(struct tcp_sock *tp, int flags)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 444ab5b..1188e4f 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -137,12 +137,12 @@ static __u16 tcp_advertise_mss(struct sock *sk)
 }
 
 /* RFC2861. Reset CWND after idle period longer RTO to "restart window".
- * This is the first part of cwnd validation mechanism. */
-static void tcp_cwnd_restart(struct sock *sk, const struct dst_entry *dst)
+ * This is the first part of cwnd validation mechanism.
+ */
+void tcp_cwnd_restart(struct sock *sk, s32 delta)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
-	s32 delta = tcp_time_stamp - tp->lsndtime;
-	u32 restart_cwnd = tcp_init_cwnd(tp, dst);
+	u32 restart_cwnd = tcp_init_cwnd(tp, __sk_dst_get(sk));
 	u32 cwnd = tp->snd_cwnd;
 
 	tcp_ca_event(sk, CA_EVENT_CWND_RESTART);
@@ -164,10 +164,6 @@ static void tcp_event_data_sent(struct tcp_sock *tp,
 	struct inet_connection_sock *icsk = inet_csk(sk);
 	const u32 now = tcp_time_stamp;
 
-	if (sysctl_tcp_slow_start_after_idle &&
-	    (!tp->packets_out && (s32)(now - tp->lsndtime) > icsk->icsk_rto))
-		tcp_cwnd_restart(sk, __sk_dst_get(sk));
-
 	tp->lsndtime = now;
 
 	/* If it is a reply for ato after last received
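
For readers unfamiliar with the restart itself, the following is a minimal userspace sketch of the kind of decay tcp_cwnd_restart() performs once the caller has passed in `delta`. The body of the function beyond the lines in the hunk is not part of this patch, so the loop below is an assumption based on the RFC 2861 idea named in the comment (roughly one halving of cwnd per RTO of idle time, bounded below by the restart window). `model_cwnd_restart` and its parameters are hypothetical names, not kernel API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model of an RFC 2861 style idle restart.
 * cwnd and restart_cwnd are in segments; delta and rto share one time unit.
 * Assumption: the decay halves cwnd once per RTO of idle time.
 */
static uint32_t model_cwnd_restart(uint32_t cwnd, uint32_t restart_cwnd,
				   int32_t delta, int32_t rto)
{
	assert(rto > 0);

	/* Never let the restart raise cwnd above its current value. */
	if (restart_cwnd > cwnd)
		restart_cwnd = cwnd;

	/* Halve once per elapsed RTO, stopping at the restart window. */
	while ((delta -= rto) > 0 && cwnd > restart_cwnd)
		cwnd >>= 1;

	return cwnd > restart_cwnd ? cwnd : restart_cwnd;
}
```

With cwnd = 100, a restart window of 10, and an RTO of 200 ticks, an idle time of 650 ticks leaves 12 segments in this model, while 5000 ticks bottoms out at the restart window of 10. A full TSO packet built before such a reduction could therefore exceed what the reduced window would have allowed, which is the ordering problem the commit message describes.
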