v3.19.8: tcp: re-enable high throughput for low pacing rate

Message ID: 20190528183425.31691-1-sergej.benilov@googlemail.com
State: Rejected
Delegated to: David Miller

Commit Message

Sergej Benilov May 28, 2019, 6:34 p.m. UTC
Since commit 605ad7f184b60cfaacbc038aa6c55ee68dee3c89 "tcp: refine TSO autosizing",
the TSQ limit is computed as the smaller of
sysctl_tcp_limit_output_bytes and max(2 * skb->truesize, sk->sk_pacing_rate >> 10).
For low pacing rates, this approach sets a low limit, reducing throughput dramatically.

Compute the limit as the greater of sysctl_tcp_limit_output_bytes and
max(2 * skb->truesize, sk->sk_pacing_rate >> 10).

Test:
netperf -H remote -l -2000000 -- -s 1000000

before patch:

MIGRATED TCP STREAM TEST from 0.0.0.0 () port 0 AF_INET to remote () port 0 AF_INET : demo
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380 327680 327680    250.17      0.06

after patch:

MIGRATED TCP STREAM TEST from 0.0.0.0 () port 0 AF_INET to remote () port 0 AF_INET : demo
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380 327680 327680    1.29       12.54

Signed-off-by: Sergej Benilov <sergej.benilov@googlemail.com>
---
 net/ipv4/tcp_output.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Eric Dumazet May 28, 2019, 6:42 p.m. UTC | #1
On 5/28/19 11:34 AM, Sergej Benilov wrote:
> Since commit 605ad7f184b60cfaacbc038aa6c55ee68dee3c89 "tcp: refine TSO autosizing",
> the TSQ limit is computed as the smaller of
> sysctl_tcp_limit_output_bytes and max(2 * skb->truesize, sk->sk_pacing_rate >> 10).
> For low pacing rates, this approach sets a low limit, reducing throughput dramatically.

...

> 
> Signed-off-by: Sergej Benilov <sergej.benilov@googlemail.com>
> ---
>  net/ipv4/tcp_output.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index e625be56..71efca72 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -2054,7 +2054,7 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
>  		 * One example is wifi aggregation (802.11 AMPDU)
>  		 */
>  		limit = max(2 * skb->truesize, sk->sk_pacing_rate >> 10);
> -		limit = min_t(u32, limit, sysctl_tcp_limit_output_bytes);
> +		limit = max_t(u32, limit, sysctl_tcp_limit_output_bytes);
>  
>  		if (atomic_read(&sk->sk_wmem_alloc) > limit) {
>  			set_bit(TSQ_THROTTLED, &tp->tsq_flags);
> 

NACK again, for the same reasons.

Patch

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index e625be56..71efca72 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2054,7 +2054,7 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
 		 * One example is wifi aggregation (802.11 AMPDU)
 		 */
 		limit = max(2 * skb->truesize, sk->sk_pacing_rate >> 10);
-		limit = min_t(u32, limit, sysctl_tcp_limit_output_bytes);
+		limit = max_t(u32, limit, sysctl_tcp_limit_output_bytes);
 
 		if (atomic_read(&sk->sk_wmem_alloc) > limit) {
 			set_bit(TSQ_THROTTLED, &tp->tsq_flags);