From patchwork Mon Dec 11 01:55:04 2017
X-Patchwork-Submitter: Eric Dumazet
X-Patchwork-Id: 846765
X-Patchwork-Delegate: davem@davemloft.net
From: Eric Dumazet
To: "David S. Miller", Neal Cardwell, Yuchung Cheng,
    Soheil Hassas Yeganeh, Wei Wang, Priyaranjan Jha
Cc: netdev, Eric Dumazet
Subject: [PATCH net-next 3/3] tcp: smoother receiver autotuning
Date: Sun, 10 Dec 2017 17:55:04 -0800
Message-Id: <20171211015504.26551-4-edumazet@google.com>
In-Reply-To: <20171211015504.26551-1-edumazet@google.com>
References: <20171211015504.26551-1-edumazet@google.com>

Back in linux-3.13 (commit b0983d3c9b13 ("tcp: fix dynamic right sizing")),
I addressed the pressing issues we had with receiver autotuning.

But DRS suffers from extra latencies caused by rcv_rtt_est.rtt_us drifts.
One common problem happens during slow start, since the apparent RTT
measured by the receiver can be inflated by ~50% at the end of one
packet train.
Also, a single drop can delay read() calls by one RTT, meaning
tcp_rcv_space_adjust() can be called one RTT too late.

By replacing the tri-modal heuristic with a continuous function, we can
offset the effects of not growing 'at the optimal time'.

The curve of the function matches prior behavior if the space increased
by exactly 25% or 50%.

The cost of the added multiply/divide is small, considering a TCP flow
typically runs this part of the code only a few times in its life.

I tested this patch with a 100 ms RTT / 1% loss link, 100 runs of
(netperf -l 5), and got an average throughput of 4600 Mbit instead of
1700 Mbit.

Signed-off-by: Eric Dumazet
Acked-by: Soheil Hassas Yeganeh
Acked-by: Wei Wang
Acked-by: Neal Cardwell
---
 net/ipv4/tcp_input.c | 19 +++++--------------
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 2900e58738cde0ad1ab4a034b6300876ac276edb..fefb46c16de7b1da76443f714a3f42faacca708d 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -601,26 +601,17 @@ void tcp_rcv_space_adjust(struct sock *sk)
 	if (sock_net(sk)->ipv4.sysctl_tcp_moderate_rcvbuf &&
 	    !(sk->sk_userlocks & SOCK_RCVBUF_LOCK)) {
 		int rcvmem, rcvbuf;
-		u64 rcvwin;
+		u64 rcvwin, grow;
 
 		/* minimal window to cope with packet losses, assuming
 		 * steady state. Add some cushion because of small variations.
 		 */
 		rcvwin = ((u64)copied << 1) + 16 * tp->advmss;
 
-		/* If rate increased by 25%,
-		 *	assume slow start, rcvwin = 3 * copied
-		 * If rate increased by 50%,
-		 *	assume sender can use 2x growth, rcvwin = 4 * copied
-		 */
-		if (copied >=
-		    tp->rcvq_space.space + (tp->rcvq_space.space >> 2)) {
-			if (copied >=
-			    tp->rcvq_space.space + (tp->rcvq_space.space >> 1))
-				rcvwin <<= 1;
-			else
-				rcvwin += (rcvwin >> 1);
-		}
+		/* Accommodate for sender rate increase (eg. slow start) */
+		grow = rcvwin * (copied - tp->rcvq_space.space);
+		do_div(grow, tp->rcvq_space.space);
+		rcvwin += (grow << 1);
 
 		rcvmem = SKB_TRUESIZE(tp->advmss + MAX_TCP_HEADER);
 		while (tcp_win_from_space(sk, rcvmem) < tp->advmss)
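---

Not part of the patch — a quick illustrative sketch (in Python, function
names hypothetical) of the two heuristics side by side. It checks the
claim above: the continuous formula coincides with the old tri-modal
step function when the measured space grew by exactly 25% or 50%, and
interpolates smoothly in between:

```python
def rcvwin_old(rcvwin, copied, space):
    # Old heuristic: step function at +25% and +50% rate growth.
    if copied >= space + (space >> 2):          # grew >= 25% -> assume slow start
        if copied >= space + (space >> 1):      # grew >= 50% -> assume 2x growth
            rcvwin <<= 1                        # rcvwin *= 2
        else:
            rcvwin += rcvwin >> 1               # rcvwin *= 1.5
    return rcvwin

def rcvwin_new(rcvwin, copied, space):
    # New heuristic from this patch: continuous in (copied - space) / space,
    # mirroring grow = rcvwin * (copied - space); do_div(grow, space);
    # rcvwin += grow << 1;
    grow = rcvwin * (copied - space) // space
    return rcvwin + (grow << 1)

space = 1 << 20            # bytes delivered to the app last RTT
base = 3 * space           # some base window estimate

# At exactly +25% and +50% growth the two heuristics agree:
for pct in (25, 50):
    copied = space + space * pct // 100
    assert rcvwin_old(base, copied, space) == rcvwin_new(base, copied, space)

# In between (e.g. +37%), the old code still sits on the +25% step while
# the new formula grows proportionally:
copied = space + space * 37 // 100
assert rcvwin_old(base, copied, space) < rcvwin_new(base, copied, space)
```

The in-kernel version uses do_div() because 64-bit division is not
available directly on all 32-bit architectures.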