From patchwork Wed Feb 28 22:40:46 2018
X-Patchwork-Submitter: Eric Dumazet
X-Patchwork-Id: 879473
X-Patchwork-Delegate: davem@davemloft.net
From: Eric Dumazet
To: "David S. Miller"
Cc: netdev, Neal Cardwell, Yuchung Cheng, Soheil Hassas Yeganeh, Eric Dumazet
Subject: [PATCH net-next 1/2] tcp_bbr: better deal with suboptimal GSO (II)
Date: Wed, 28 Feb 2018 14:40:46 -0800
Message-Id: <20180228224047.123054-2-edumazet@google.com>
In-Reply-To: <20180228224047.123054-1-edumazet@google.com>
References: <20180228224047.123054-1-edumazet@google.com>

This is the second part of dealing with suboptimal device GSO parameters.
In the first patch (350c9f484bde "tcp_bbr: better deal with suboptimal GSO")
we dealt with devices having a low gso_max_segs.

Some devices lower gso_max_size from 64KB to 16KB (the r8152 is an example).

In order to probe an optimal cwnd, we want BBR not to be sensitive to
whatever GSO constraints a device might have.

This patch removes the tso_segs_goal() congestion control callback in favor
of min_tso_segs(), for congestion control modules that want to override
sysctl_tcp_min_tso_segs.

The next patch will remove bbr->tso_segs_goal, since it does not have to
be persistent.

Signed-off-by: Eric Dumazet
Acked-by: Neal Cardwell
---
 include/net/tcp.h     |  6 ++----
 net/ipv4/tcp_bbr.c    | 23 +++++++++++++----------
 net/ipv4/tcp_output.c | 15 ++++++++-------
 3 files changed, 23 insertions(+), 21 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 92b06c6e7732ad7c61b580427fc085fa0dff1063..9c9b3768b350abfd51776563d220d5e97ca9da69 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -511,8 +511,6 @@ __u32 cookie_v6_init_sequence(const struct sk_buff *skb, __u16 *mss);
 #endif
 
 /* tcp_output.c */
-u32 tcp_tso_autosize(const struct sock *sk, unsigned int mss_now,
-		     int min_tso_segs);
 void __tcp_push_pending_frames(struct sock *sk, unsigned int cur_mss,
 			       int nonagle);
 int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb, int segs);
@@ -981,8 +979,8 @@ struct tcp_congestion_ops {
 	u32 (*undo_cwnd)(struct sock *sk);
 	/* hook for packet ack accounting (optional) */
 	void (*pkts_acked)(struct sock *sk, const struct ack_sample *sample);
-	/* suggest number of segments for each skb to transmit (optional) */
-	u32 (*tso_segs_goal)(struct sock *sk);
+	/* override sysctl_tcp_min_tso_segs */
+	u32 (*min_tso_segs)(struct sock *sk);
 	/* returns the multiplier used in tcp_sndbuf_expand (optional) */
 	u32 (*sndbuf_expand)(struct sock *sk);
 	/* call when packets are delivered to update cwnd and pacing rate,
diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c
index a471f696e13c82cddd11633fd4bfdbc6d84f4bcc..afc0567b8a98fbb718ba04505053aa3f62ab5784 100644
--- a/net/ipv4/tcp_bbr.c
+++ b/net/ipv4/tcp_bbr.c
@@ -261,23 +261,26 @@ static void bbr_set_pacing_rate(struct sock *sk, u32 bw, int gain)
 	sk->sk_pacing_rate = rate;
 }
 
-/* Return count of segments we want in the skbs we send, or 0 for default. */
-static u32 bbr_tso_segs_goal(struct sock *sk)
+/* override sysctl_tcp_min_tso_segs */
+static u32 bbr_min_tso_segs(struct sock *sk)
 {
-	struct bbr *bbr = inet_csk_ca(sk);
-
-	return bbr->tso_segs_goal;
+	return sk->sk_pacing_rate < (bbr_min_tso_rate >> 3) ? 1 : 2;
 }
 
 static void bbr_set_tso_segs_goal(struct sock *sk)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct bbr *bbr = inet_csk_ca(sk);
-	u32 min_segs;
+	u32 segs, bytes;
+
+	/* Sort of tcp_tso_autosize() but ignoring
+	 * driver provided sk_gso_max_size.
+	 */
+	bytes = min_t(u32, sk->sk_pacing_rate >> sk->sk_pacing_shift,
+		      GSO_MAX_SIZE - 1 - MAX_TCP_HEADER);
+	segs = max_t(u32, bytes / tp->mss_cache, bbr_min_tso_segs(sk));
 
-	min_segs = sk->sk_pacing_rate < (bbr_min_tso_rate >> 3) ? 1 : 2;
-	bbr->tso_segs_goal = min(tcp_tso_autosize(sk, tp->mss_cache, min_segs),
-				 0x7FU);
+	bbr->tso_segs_goal = min(segs, 0x7FU);
 }
 
 /* Save "last known good" cwnd so we can restore it after losses or PROBE_RTT */
@@ -936,7 +939,7 @@ static struct tcp_congestion_ops tcp_bbr_cong_ops __read_mostly = {
 	.undo_cwnd	= bbr_undo_cwnd,
 	.cwnd_event	= bbr_cwnd_event,
 	.ssthresh	= bbr_ssthresh,
-	.tso_segs_goal	= bbr_tso_segs_goal,
+	.min_tso_segs	= bbr_min_tso_segs,
 	.get_info	= bbr_get_info,
 	.set_state	= bbr_set_state,
 };
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 49d043de3476bdfcaf6e9a606d0da0f2094373a8..383cac0ff0ec059ca7dbc1a6304cc7f8183e008d 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1703,8 +1703,8 @@ static bool tcp_nagle_check(bool partial, const struct tcp_sock *tp,
 /* Return how many segs we'd like on a TSO packet,
  * to send one TSO packet per ms
  */
-u32 tcp_tso_autosize(const struct sock *sk, unsigned int mss_now,
-		     int min_tso_segs)
+static u32 tcp_tso_autosize(const struct sock *sk, unsigned int mss_now,
+			    int min_tso_segs)
 {
 	u32 bytes, segs;
 
@@ -1720,7 +1720,6 @@ u32 tcp_tso_autosize(const struct sock *sk, unsigned int mss_now,
 	return segs;
 }
-EXPORT_SYMBOL(tcp_tso_autosize);
 
 /* Return the number of segments we want in the skb we are transmitting.
  * See if congestion control module wants to decide; otherwise, autosize.
  */
@@ -1728,11 +1727,13 @@ EXPORT_SYMBOL(tcp_tso_autosize);
 static u32 tcp_tso_segs(struct sock *sk, unsigned int mss_now)
 {
 	const struct tcp_congestion_ops *ca_ops = inet_csk(sk)->icsk_ca_ops;
-	u32 tso_segs = ca_ops->tso_segs_goal ? ca_ops->tso_segs_goal(sk) : 0;
+	u32 min_tso, tso_segs;
 
-	if (!tso_segs)
-		tso_segs = tcp_tso_autosize(sk, mss_now,
-					    sock_net(sk)->ipv4.sysctl_tcp_min_tso_segs);
+	min_tso = ca_ops->min_tso_segs ?
+			ca_ops->min_tso_segs(sk) :
+			sock_net(sk)->ipv4.sysctl_tcp_min_tso_segs;
+
+	tso_segs = tcp_tso_autosize(sk, mss_now, min_tso);
 
 	return min_t(u32, tso_segs, sk->sk_gso_max_segs);
 }
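
To make the effect of the two tcp_output.c hunks concrete, here is a small
standalone userspace sketch of the resulting flow: a congestion control
module can now only override the minimum segment count, while the burst
size itself is always derived from the pacing rate. The constants, the
pacing shift, and the function names below are illustrative stand-ins (the
kernel additionally clamps against driver limits such as sk_gso_max_segs);
this is not the kernel code itself.

#include <stdint.h>
#include <stdio.h>

#define GSO_MAX_SIZE	65536u
#define MAX_TCP_HEADER	320u	/* config-dependent in the kernel; assumed here */
#define PACING_SHIFT	10	/* ~1ms worth of data at the pacing rate */

static uint32_t min_u32(uint32_t a, uint32_t b) { return a < b ? a : b; }
static uint32_t max_u32(uint32_t a, uint32_t b) { return a > b ? a : b; }

/* Simplified take on tcp_tso_autosize(): derive a per-skb segment budget
 * from the pacing rate (driver-provided clamps omitted for clarity).
 */
static uint32_t tso_autosize(uint64_t pacing_rate, uint32_t mss,
			     uint32_t min_tso_segs)
{
	uint32_t bytes = min_u32((uint32_t)(pacing_rate >> PACING_SHIFT),
				 GSO_MAX_SIZE - 1 - MAX_TCP_HEADER);

	return max_u32(bytes / mss, min_tso_segs);
}

/* Mirrors the reworked tcp_tso_segs() flow: ask the CC module for a floor,
 * fall back to the sysctl, then always autosize.
 */
static uint32_t tso_segs(uint64_t pacing_rate, uint32_t mss,
			 uint32_t (*ca_min_tso_segs)(uint64_t),
			 uint32_t sysctl_min_tso_segs)
{
	uint32_t min_tso = ca_min_tso_segs ? ca_min_tso_segs(pacing_rate)
					   : sysctl_min_tso_segs;

	return tso_autosize(pacing_rate, mss, min_tso);
}

/* BBR's floor, as in bbr_min_tso_segs(): one segment below ~1.2 Mbit/s
 * (150000 bytes/sec of pacing rate), two otherwise.
 */
static uint32_t bbr_floor(uint64_t pacing_rate)
{
	return pacing_rate < 150000 ? 1 : 2;
}

int main(void)
{
	/* 100 Mbit/s pacing rate (12,500,000 bytes/sec), MSS 1448: 8 segments */
	printf("%u\n", (unsigned)tso_segs(12500000, 1448, bbr_floor, 2));
	return 0;
}

The point of the split is that a CC module can no longer pin an absolute
segment goal computed from possibly stale or undersized device limits; it
only nudges the floor, and the pacing-rate-based autosizing stays
authoritative.
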
From patchwork Wed Feb 28 22:40:47 2018
X-Patchwork-Submitter: Eric Dumazet
X-Patchwork-Id: 879474
X-Patchwork-Delegate: davem@davemloft.net
From: Eric Dumazet
To: "David S. Miller"
Cc: netdev, Neal Cardwell, Yuchung Cheng, Soheil Hassas Yeganeh, Eric Dumazet
Subject: [PATCH net-next 2/2] tcp_bbr: remove bbr->tso_segs_goal
Date: Wed, 28 Feb 2018 14:40:47 -0800
Message-Id: <20180228224047.123054-3-edumazet@google.com>
In-Reply-To: <20180228224047.123054-1-edumazet@google.com>
References: <20180228224047.123054-1-edumazet@google.com>

Its value is computed and then immediately used; there is no need to
store it.

Signed-off-by: Eric Dumazet
Acked-by: Neal Cardwell
---
 net/ipv4/tcp_bbr.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c
index afc0567b8a98fbb718ba04505053aa3f62ab5784..c92014cb1e165df05651eb05f2133c2e9473d54f 100644
--- a/net/ipv4/tcp_bbr.c
+++ b/net/ipv4/tcp_bbr.c
@@ -97,10 +97,9 @@ struct bbr {
 		packet_conservation:1,  /* use packet conservation? */
 		restore_cwnd:1,	     /* decided to revert cwnd to old value */
 		round_start:1,	     /* start of packet-timed tx->ack round? */
-		tso_segs_goal:7,     /* segments we want in each skb we send */
 		idle_restart:1,	     /* restarting after idle? */
 		probe_rtt_round_done:1,  /* a BBR_PROBE_RTT round at 4 pkts? */
-		unused:5,
+		unused:12,
 		lt_is_sampling:1,    /* taking long-term ("LT") samples now? */
 		lt_rtt_cnt:7,	     /* round trips in long-term interval */
 		lt_use_bw:1;	     /* use lt_bw as our bw estimate? */
@@ -267,10 +266,9 @@ static u32 bbr_min_tso_segs(struct sock *sk)
 	return sk->sk_pacing_rate < (bbr_min_tso_rate >> 3) ? 1 : 2;
 }
 
-static void bbr_set_tso_segs_goal(struct sock *sk)
+static u32 bbr_tso_segs_goal(struct sock *sk)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
-	struct bbr *bbr = inet_csk_ca(sk);
 	u32 segs, bytes;
 
 	/* Sort of tcp_tso_autosize() but ignoring
@@ -280,7 +278,7 @@ static void bbr_set_tso_segs_goal(struct sock *sk)
 		      GSO_MAX_SIZE - 1 - MAX_TCP_HEADER);
 	segs = max_t(u32, bytes / tp->mss_cache, bbr_min_tso_segs(sk));
 
-	bbr->tso_segs_goal = min(segs, 0x7FU);
+	return min(segs, 0x7FU);
 }
 
 /* Save "last known good" cwnd so we can restore it after losses or PROBE_RTT */
@@ -351,7 +349,7 @@ static u32 bbr_target_cwnd(struct sock *sk, u32 bw, int gain)
 	cwnd = (((w * gain) >> BBR_SCALE) + BW_UNIT - 1) / BW_UNIT;
 
 	/* Allow enough full-sized skbs in flight to utilize end systems. */
-	cwnd += 3 * bbr->tso_segs_goal;
+	cwnd += 3 * bbr_tso_segs_goal(sk);
 
 	/* Reduce delayed ACKs by rounding up cwnd to the next even number. */
 	cwnd = (cwnd + 1) & ~1U;
@@ -827,7 +825,6 @@ static void bbr_main(struct sock *sk, const struct rate_sample *rs)
 
 	bw = bbr_bw(sk);
 	bbr_set_pacing_rate(sk, bw, bbr->pacing_gain);
-	bbr_set_tso_segs_goal(sk);
 	bbr_set_cwnd(sk, rs, rs->acked_sacked, bw, bbr->cwnd_gain);
 }
 
@@ -837,7 +834,6 @@ static void bbr_init(struct sock *sk)
 	struct bbr *bbr = inet_csk_ca(sk);
 
 	bbr->prior_cwnd = 0;
-	bbr->tso_segs_goal = 0;	 /* default segs per skb until first ACK */
 	bbr->rtt_cnt = 0;
 	bbr->next_rtt_delivered = 0;
 	bbr->prev_ca_state = TCP_CA_Open;
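
A footnote on the struct bbr hunk above: dropping the 7-bit tso_segs_goal
field while widening unused from 5 to 12 bits keeps the bitfield block at
the same total width, so the congestion control private area stays the
same size. A minimal standalone sketch (surrounding struct bbr fields
elided; the names mirror the kernel struct purely for illustration):

#include <stdio.h>

struct bbr_bits_before {
	unsigned int packet_conservation:1,
		     restore_cwnd:1,
		     round_start:1,
		     tso_segs_goal:7,	/* removed by this patch */
		     idle_restart:1,
		     probe_rtt_round_done:1,
		     unused:5,
		     lt_is_sampling:1,
		     lt_rtt_cnt:7,
		     lt_use_bw:1;
};

struct bbr_bits_after {
	unsigned int packet_conservation:1,
		     restore_cwnd:1,
		     round_start:1,
		     idle_restart:1,
		     probe_rtt_round_done:1,
		     unused:12,		/* absorbed the 7 freed bits */
		     lt_is_sampling:1,
		     lt_rtt_cnt:7,
		     lt_use_bw:1;
};

int main(void)
{
	/* Both variants pack 26 bits and occupy the same storage. */
	printf("before=%zu after=%zu\n",
	       sizeof(struct bbr_bits_before), sizeof(struct bbr_bits_after));
	return 0;
}

Computing bbr_tso_segs_goal() on demand in bbr_target_cwnd() also means
the value always reflects the current pacing rate and MSS, rather than
whatever was cached on the previous bbr_main() pass.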