[net-next] tcp: refine TSO autosizing

From: Eric Dumazet <edumazet@google.com>

Commit 95bd09eb2750 ("tcp: TSO packets automatic sizing") tried to
control TSO size, but did this at the wrong place (sendmsg() time).

At sendmsg() time, we might have a pessimistic view of flow rate,
and we end up building very small skbs (with 2 MSS per skb).

This is bad because:

 - It sends small TSO packets even in Slow Start where rate quickly
   increases.
 - It tends to make socket write queue very big, increasing tcp_ack()
   processing time, but also increasing memory needs, not necessarily
   accounted for, as fast clones overhead is currently ignored.
 - It lowers GRO efficiency and generates more ACK packets.

Servers with a lot of short-lived connections suffer from this.

Let's instead fill skbs as much as possible (64KB of payload), but split
them at xmit time, when we have a precise idea of the flow rate.
Splitting an skb is actually quite efficient.
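
As a rough illustration of the xmit-time sizing described above, here is a
userspace sketch. It is not the code from this patch. It assumes a budget of
roughly one millisecond of data at the current rate estimate (rate >> 10), a
floor of two segments, and a 64KB payload cap; the function names, parameters
and constants are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cap on payload per skb, from the "64KB of payload" above. */
#define SKETCH_MAX_PAYLOAD 65536u
/* Hypothetical floor, matching the "2 MSS per skb" case mentioned above. */
#define SKETCH_MIN_SEGS 2u

/*
 * Return how many MSS-sized segments to put in one TSO packet, given the
 * current rate estimate in bytes per second.
 * Assumed budget: about one millisecond of data (rate / 1024).
 */
static uint32_t sketch_tso_segs(uint64_t pacing_rate, uint32_t mss)
{
	uint64_t bytes;
	uint32_t segs;

	assert(mss > 0 && mss <= SKETCH_MAX_PAYLOAD);

	bytes = pacing_rate >> 10;
	if (bytes > SKETCH_MAX_PAYLOAD)
		bytes = SKETCH_MAX_PAYLOAD;

	segs = (uint32_t)(bytes / mss);
	if (segs < SKETCH_MIN_SEGS)
		segs = SKETCH_MIN_SEGS;
	return segs;
}

/*
 * For an skb holding skb_len bytes, return how many bytes to send now.
 * A result smaller than skb_len means the skb would be split at xmit time.
 */
static uint32_t sketch_xmit_len(uint32_t skb_len, uint64_t pacing_rate,
				uint32_t mss)
{
	uint32_t limit = sketch_tso_segs(pacing_rate, mss) * mss;

	return skb_len > limit ? limit : skb_len;
}
```

With an MSS of 1448 bytes and a rate estimate of 5500000 bytes per second
(roughly the 44 Mbit/s seen in the test below), this sketch sends 3 segments
(4344 bytes) from a full skb and leaves the rest queued. Because the decision
is taken at xmit time, a later, higher rate estimate yields larger packets
without rebuilding the write queue.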

The patch looks bigger than necessary, because the TCP Small Queue
decision now has to take place after the possible split.

Tested:

40 ms rtt link

nstat >/dev/null
netperf -H remote -Cc -l -2000000 -- -s 1000000
nstat | egrep "IpInReceives|IpOutRequests|TcpOutSegs|IpExtOutOctets"

Before patch:

Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380 2000000 2000000    0.36         44.22   0.00     0.06     0.000   5.007  
IpInReceives                    600                0.0
IpOutRequests                   599                0.0
TcpOutSegs                      1397               0.0
IpExtOutOctets                  2033249            0.0

After patch:

Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380 2000000 2000000    0.36         44.09   0.00     0.00     0.000   0.000  
IpInReceives                    257                0.0
IpOutRequests                   226                0.0
TcpOutSegs                      1399               0.0
IpExtOutOctets                  2013777            0.0
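
One way to read these counters is as average segments per transmitted packet.
The helper below is a hypothetical sketch, assuming IpOutRequests counts each
TSO packet once and that nearly all outgoing packets on the sender carry data.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Average TCP segments carried per outgoing IP packet, scaled by 100
 * to stay in integer arithmetic (a result of 233 means 2.33).
 */
static uint32_t sketch_segs_per_packet_x100(uint32_t tcp_out_segs,
					    uint32_t ip_out_requests)
{
	assert(ip_out_requests > 0);
	return (uint32_t)(((uint64_t)tcp_out_segs * 100) / ip_out_requests);
}
```

With the counters above, this gives 233 (about 2.33 segments per packet)
before the patch and 619 (about 6.19) after it, while IpInReceives on the
sender drops from 600 to 257.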

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/tcp.c        |   14 ++------------
 net/ipv4/tcp_output.c |   38 ++++++++++++++++++++++++--------------
 2 files changed, 26 insertions(+), 26 deletions(-)


Message ID	1417788937.4322.21.camel@edumazet-glaptop2.roam.corp.google.com
State	Superseded, archived
Delegated to:	David Miller
Headers
Subject: [PATCH net-next] tcp: refine TSO autosizing
From: Eric Dumazet <eric.dumazet@gmail.com>
To: David Miller <davem@davemloft.net>
Cc: netdev <netdev@vger.kernel.org>, Neal Cardwell <ncardwell@google.com>,
    Yuchung Cheng <ycheng@google.com>, Nandita Dukkipati <nanditad@google.com>
Date: Fri, 05 Dec 2014 06:15:37 -0800
