From patchwork Sat Nov 11 23:54:12 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Eric Dumazet
X-Patchwork-Id: 837094
X-Patchwork-Delegate: davem@davemloft.net
Return-Path:
X-Original-To: patchwork-incoming@ozlabs.org
Delivered-To: patchwork-incoming@ozlabs.org
Authentication-Results: ozlabs.org; spf=none (mailfrom)
	smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67;
	helo=vger.kernel.org;
	envelope-from=netdev-owner@vger.kernel.org; receiver=)
Authentication-Results: ozlabs.org; dkim=pass (2048-bit key;
	unprotected) header.d=gmail.com header.i=@gmail.com
	header.b="pLtPPvd0"; dkim-atps=neutral
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by ozlabs.org (Postfix) with ESMTP id 3yZDJr6HHLz9t2d
	for ; Sun, 12 Nov 2017 10:54:20 +1100 (AEDT)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752275AbdKKXyQ (ORCPT );
	Sat, 11 Nov 2017 18:54:16 -0500
Received: from mail-pg0-f67.google.com ([74.125.83.67]:50154 "EHLO
	mail-pg0-f67.google.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1752062AbdKKXyP (ORCPT );
	Sat, 11 Nov 2017 18:54:15 -0500
Received: by mail-pg0-f67.google.com with SMTP id g6so10060769pgn.6
	for ; Sat, 11 Nov 2017 15:54:15 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
	h=message-id:subject:from:to:cc:date:in-reply-to:references
	:mime-version:content-transfer-encoding;
	bh=3W0ruL5BRXXvgtLRu49mbNbz1/jyzb5mny/zcJwZXIs=;
	b=pLtPPvd0P+swnvPZjaZbYWPIRfEnX5o6QWSL466HF/gg10CefRTlC4ltZIhs3vXXGY
	8s/Jznr4RmwfakiYs2UIjwzofOysUz/sM+Ge8sGzQpCF/rUpppCHGf6reGBKLsMoulEf
	/AONqlLJb1L59w27b4q7XZZEbf+B6E3z2QESaotjWv+SRp1qAuGYqdJMe4Qj30GVhOhT
	AqkJLrR9DwtB6rzii7H5wnQ+R61FmJarcxfiu1v8nlnjvvUYFQsq5qOyYPNHy7yl6Z52
	PKeEXavWUCmdVKGcn0wm1ARaJTUzsIav4euKdy/FRmeMcyNo4uBj0xXqGbfz64Cnc1Dl
	PdHg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20161025;
	h=x-gm-message-state:message-id:subject:from:to:cc:date:in-reply-to
	:references:mime-version:content-transfer-encoding;
	bh=3W0ruL5BRXXvgtLRu49mbNbz1/jyzb5mny/zcJwZXIs=;
	b=bN3JjzVnffpctmrW07wVYQKctGX/ULzp1B77dEMsJyCSJ41pwPzwnPnXzNzJxp3uF6
	ZApgdIBYOsJT+j5bSIQavm1KlQd9riLMDjuEhD2yRaNuGEHAtxeZiBy5kBeItDNkheab
	eBzm53n5w0Vny5mcKxPt/cGIvvZHp2V4S0hRDgr9EoC9qtnUAUuie97nW7syTxJy3O2p
	KpRfnFO15dQ7rXa9OfBRuS15cMcv4X9F5DU87MYTrR+1BgrIyS1nVP79Ln/xJ3DIddZv
	MGec4pif657bDQJMREq3PSTZEtgjSDxYwWz7FXRZcO9fy0XhVI8C7WqL/gZio4hXpgZe
	ZwqA==
X-Gm-Message-State: AJaThX7c5D1kqTqntsy75NN27eTEXWA+vRoNJmD/xnBG5JbyiLgv1c8m
	r79Rr4NwxT/ltUuLYE5234M=
X-Google-Smtp-Source: AGs4zMamSyJHUd83nmHMlzKAP/gyCUfEydyOu04EoP///e/H+a0GS/75eI5pyWScnjGhMxLIfAurpg==
X-Received: by 10.159.207.149 with SMTP id
	z21mr4686606plo.258.1510444454884;
	Sat, 11 Nov 2017 15:54:14 -0800 (PST)
Received: from [192.168.86.171] (c-67-180-167-114.hsd1.ca.comcast.net.
	[67.180.167.114]) by smtp.googlemail.com with ESMTPSA id
	e70sm21373552pgc.15.2017.11.11.15.54.13
	(version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
	Sat, 11 Nov 2017 15:54:14 -0800 (PST)
Message-ID: <1510444452.2849.149.camel@edumazet-glaptop3.roam.corp.google.com>
Subject: [PATCH v2 net-next] tcp: allow drivers to tweak TSQ logic
From: Eric Dumazet
To: David Miller
Cc: netdev, Johannes Berg, Toke Høiland-Jørgensen, Kir Kolyshkin
Date: Sat, 11 Nov 2017 15:54:12 -0800
In-Reply-To: <1510281664.2849.143.camel@edumazet-glaptop3.roam.corp.google.com>
References: <1510281664.2849.143.camel@edumazet-glaptop3.roam.corp.google.com>
X-Mailer: Evolution 3.10.4-0ubuntu2
Mime-Version: 1.0
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

From: Eric Dumazet

I had many reports that TSQ logic breaks wifi aggregation.

Current logic is to allow up to 1 ms of bytes to be queued into qdisc
and driver queues.

But Wifi aggregation needs a bigger budget to allow bigger rates to
be discovered by various TCP congestion control algorithms.

This patch adds an extra socket field, allowing wifi drivers to select
another log scale to derive TCP Small Queue credit from current pacing
rate.

Initial value is 10, meaning that this patch does not change current
behavior.

We expect wifi drivers to set this field to smaller values (tests have
been done with values from 6 to 9).

They would have to use the following template:

if (skb->sk && skb->sk->sk_pacing_shift != MY_PACING_SHIFT)
     skb->sk->sk_pacing_shift = MY_PACING_SHIFT;

Ref: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1670041
Signed-off-by: Eric Dumazet
Cc: Johannes Berg
Cc: Toke Høiland-Jørgensen
Cc: Kir Kolyshkin
Acked-by: Neal Cardwell
---
v2: added kernel-doc comment, based on Johannes' feedback.

 include/net/sock.h    |    2 ++
 net/core/sock.c       |    1 +
 net/ipv4/tcp_output.c |    4 ++--
 3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 688a823dccc306bd21f47da167c6922161af5a6a..f8715c5af37d4e598770dbe5c5f83246241f18d5 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -267,6 +267,7 @@ struct sock_common {
   *	@sk_gso_type: GSO type (e.g. %SKB_GSO_TCPV4)
   *	@sk_gso_max_size: Maximum GSO segment size to build
   *	@sk_gso_max_segs: Maximum number of GSO segments
+  *	@sk_pacing_shift: scaling factor for TCP Small Queues
   *	@sk_lingertime: %SO_LINGER l_linger setting
   *	@sk_backlog: always used with the per-socket spinlock held
   *	@sk_callback_lock: used with the callbacks in the end of this struct
@@ -451,6 +452,7 @@ struct sock {
 	kmemcheck_bitfield_end(flags);
 
 	u16			sk_gso_max_segs;
+	u8			sk_pacing_shift;
 	unsigned long		sk_lingertime;
 	struct proto		*sk_prot_creator;
 	rwlock_t		sk_callback_lock;
diff --git a/net/core/sock.c b/net/core/sock.c
index 57bbd6040eb6a3c072ce4e024687786079552ddf..13719af7b4e35d2050ccba51d44c7f691a889b37 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2746,6 +2746,7 @@ void sock_init_data(struct socket *sock, struct sock *sk)
 
 	sk->sk_max_pacing_rate = ~0U;
 	sk->sk_pacing_rate = ~0U;
+	sk->sk_pacing_shift = 10;
 	sk->sk_incoming_cpu = -1;
 	/*
 	 * Before updating sk_refcnt, we must commit prior changes to memory
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 0256f7a410417d93c9edab9d25a3ce5a81c2b296..76dbe884f2469660028684a46fc19afa000a1353 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1720,7 +1720,7 @@ u32 tcp_tso_autosize(const struct sock *sk, unsigned int mss_now,
 {
 	u32 bytes, segs;
 
-	bytes = min(sk->sk_pacing_rate >> 10,
+	bytes = min(sk->sk_pacing_rate >> sk->sk_pacing_shift,
 		    sk->sk_gso_max_size - 1 - MAX_TCP_HEADER);
 
 	/* Goal is to send at least one packet per ms,
@@ -2198,7 +2198,7 @@ static bool tcp_small_queue_check(struct sock *sk, const struct sk_buff *skb,
 {
 	unsigned int limit;
 
-	limit = max(2 * skb->truesize, sk->sk_pacing_rate >> 10);
+	limit = max(2 * skb->truesize, sk->sk_pacing_rate >> sk->sk_pacing_shift);
 	limit = min_t(u32, limit,
 		      sock_net(sk)->ipv4.sysctl_tcp_limit_output_bytes);
 	limit <<= factor;
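
To make the effect of the shift concrete, here is a minimal userspace sketch of the limit computed in tcp_small_queue_check() after this patch. It is not kernel code: the function name tsq_limit() and its plain-integer parameters are hypothetical stand-ins for sk->sk_pacing_rate, sk->sk_pacing_shift, skb->truesize, sysctl_tcp_limit_output_bytes and factor, and the numbers quoted afterwards are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of the TSQ limit from
 * tcp_small_queue_check(); parameters are plain integers standing in
 * for the kernel fields named in the patch.
 */
static uint32_t tsq_limit(uint32_t pacing_rate,        /* bytes per second */
			  uint8_t pacing_shift,        /* 10 is the patch default */
			  uint32_t truesize,           /* stand-in for skb->truesize */
			  uint32_t limit_output_bytes, /* stand-in for the sysctl cap */
			  unsigned int factor)
{
	uint32_t limit;

	/* Shifting a 32-bit value by 32 or more is undefined in C. */
	assert(pacing_shift < 32);

	/* rate >> shift is the byte budget for 1/2^shift seconds of pacing. */
	limit = pacing_rate >> pacing_shift;

	/* max(2 * truesize, rate >> shift) */
	if (limit < 2 * truesize)
		limit = 2 * truesize;

	/* min_t(u32, limit, sysctl_tcp_limit_output_bytes) */
	if (limit > limit_output_bytes)
		limit = limit_output_bytes;

	return limit << factor;
}
```

Assuming a pacing rate of 12500000 bytes per second (100 Mbit/s), a truesize of 2304, a cap of 262144 and factor 0, this model yields 12207 bytes for shift 10 (about 1 ms of data), 48828 bytes for shift 8 (about 4 ms) and 195312 bytes for shift 6 (about 16 ms). Each step down in shift doubles the queueing budget, which is the extra room wifi aggregation needs.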