Message ID | 1384267141.28458.24.camel@edumazet-glaptop2.roam.corp.google.com |
---|---|
State | Superseded, archived |
Delegated to: | David Miller |
Headers | show |
Eric Dumazet wrote: > From: Eric Dumazet <edumazet@google.com> > > After commit c9eeec26e32e ("tcp: TSQ can use a dynamic limit"), several > users reported throughput regressions, notably on mvneta and wifi > adapters. > > 802.11 AMPDU requires a fair amount of queueing to be effective. > > This patch partially reverts the change done in tcp_write_xmit() > so that the minimal amount is sysctl_tcp_limit_output_bytes. > > It also remove the use of this sysctl while building skb stored > in write queue, as TSO autosizing does the right thing anyway. > > Users with well behaving NICS and correct qdisc (like sch_fq), > can then lower the default sysctl_tcp_limit_output_bytes value from > 128KB to 8KB. > > The new usage of sysctl_tcp_limit_output_bytes permits each driver > author to check how driver performs when/if the value is set > to a minimum of 4KB : > > Normally, line rate for a single TCP flow should be possible, > but some drivers rely on timers to perform TX completion and > too long delays prevent reaching full throughput. I tested the patch with ath9k and performance with a 2-stream card is normal again, about 195 Mbps in open air. Thanks for the fix ! Also, I think this needs to be marked as a stable candidate, since 3.12 needs this fix. Sujith -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2013-11-12 at 06:39 -0800, Eric Dumazet wrote: > - limit = max(skb->truesize, sk->sk_pacing_rate >> 10); > + limit = max(sysctl_tcp_limit_output_bytes, > + sk->sk_pacing_rate >> 10); > I'll send a v2, a max_t(unsigned int, ..., ...) is needed here, as reported by kbuild bot. net/ipv4/tcp_output.c:1882:177: warning: comparison of distinct pointer types lacks a cast [enabled by default] -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt index a46d785..7d8dc93 100644 --- a/Documentation/networking/ip-sysctl.txt +++ b/Documentation/networking/ip-sysctl.txt @@ -588,9 +588,6 @@ tcp_limit_output_bytes - INTEGER typical pfifo_fast qdiscs. tcp_limit_output_bytes limits the number of bytes on qdisc or device to reduce artificial RTT/cwnd and reduce bufferbloat. - Note: For GSO/TSO enabled flows, we try to have at least two - packets in flight. Reducing tcp_limit_output_bytes might also - reduce the size of individual GSO packet (64KB being the max) Default: 131072 tcp_challenge_ack_limit - INTEGER diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 6e5617b..be5246e 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -806,12 +806,6 @@ static unsigned int tcp_xmit_size_goal(struct sock *sk, u32 mss_now, xmit_size_goal = min_t(u32, gso_size, sk->sk_gso_max_size - 1 - hlen); - /* TSQ : try to have at least two segments in flight - * (one in NIC TX ring, another in Qdisc) - */ - xmit_size_goal = min_t(u32, xmit_size_goal, - sysctl_tcp_limit_output_bytes >> 1); - xmit_size_goal = tcp_bound_to_half_wnd(tp, xmit_size_goal); /* We try hard to avoid divides here */ diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index d46f214..9f0b338 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -1875,8 +1875,12 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle, * - better RTT estimation and ACK scheduling * - faster recovery * - high rates + * Alas, some drivers / subsystems require a fair amount + * of queued bytes to ensure line rate. + * One example is wifi aggregation (802.11 AMPDU) */ - limit = max(skb->truesize, sk->sk_pacing_rate >> 10); + limit = max(sysctl_tcp_limit_output_bytes, + sk->sk_pacing_rate >> 10); if (atomic_read(&sk->sk_wmem_alloc) > limit) { set_bit(TSQ_THROTTLED, &tp->tsq_flags);