From patchwork Thu Mar 12 15:49:01 2020
X-Patchwork-Submitter: Florian Westphal
X-Patchwork-Id: 1253677
X-Patchwork-Delegate: pabeni@redhat.com
From: Florian Westphal
Cc: Florian Westphal
Date: Thu, 12 Mar 2020 16:49:01 +0100
Message-Id: <20200312154901.540-1-fw@strlen.de>
List-Id: Discussions regarding MPTCP upstreaming
Subject: [MPTCP] [PATCH] tcp: mptcp: use mptcp receive buffer space to select rcv window

In MPTCP, the receive window is shared across all subflows, because it
refers to the mptcp-level sequence space.

Update __tcp_select_window() to use the receive buffer space of the
mptcp parent socket, when the subflow has one, instead of the space of
the subflow socket itself.

This commit doesn't change the choice of initial window for passive or
active connections. While it would be possible to change those as well,
this adds complexity (especially when handling MP_JOIN requests).
Moreover, the MPTCP RFC specifically says that an MPTCP sender 'MUST NOT
use the RCV.WND field of a TCP segment at the connection level if it
does not also carry a DSS option with a Data ACK field.'

SYN/SYNACK packets do not carry a DSS option with a Data ACK field.

Signed-off-by: Florian Westphal
---
 Unless there are comments, I will post this as RFC to netdev and
 will CC Eric Dumazet.

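 For illustration only (not part of the patch): a minimal userspace
 sketch of the selection logic described above. The toy_sock type and
 all toy_* helpers are hypothetical stand-ins, and the space
 computation is deliberately simplified; the kernel's tcp_space() and
 tcp_full_space() do additional accounting that this model omits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct sock; not a kernel type. */
struct toy_sock {
	int rcvbuf;                  /* receive buffer size, in bytes */
	int rmem_alloc;              /* bytes currently charged to the buffer */
	const struct toy_sock *conn; /* mptcp parent socket, or NULL */
};

/* Simplified model: total space is the whole receive buffer. */
static int toy_full_space(const struct toy_sock *sk)
{
	return sk->rcvbuf;
}

/* Simplified model: free space is the buffer minus what is queued. */
static int toy_space(const struct toy_sock *sk)
{
	return sk->rcvbuf - sk->rmem_alloc;
}

/* Prefer the parent's buffer size; fall back to the subflow's own. */
int toy_mptcp_full_space(const struct toy_sock *ssk)
{
	assert(ssk != NULL);
	return ssk->conn ? toy_full_space(ssk->conn) : toy_full_space(ssk);
}

/* Prefer the parent's free space; fall back to the subflow's own. */
int toy_mptcp_space(const struct toy_sock *ssk)
{
	assert(ssk != NULL);
	return ssk->conn ? toy_space(ssk->conn) : toy_space(ssk);
}
```

 With a parent whose rcvbuf is 1000 and which has 400 bytes queued, a
 subflow with its own 500-byte buffer reports 600 bytes of free space.
 With conn set to NULL, the subflow falls back to its own values.
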
 include/net/mptcp.h   | 13 +++++++++++++
 net/ipv4/tcp_output.c |  6 ++++++
 net/mptcp/subflow.c   | 24 ++++++++++++++++++++++++
 3 files changed, 43 insertions(+)

diff --git a/include/net/mptcp.h b/include/net/mptcp.h
index 7489f9267640..611a4d959470 100644
--- a/include/net/mptcp.h
+++ b/include/net/mptcp.h
@@ -66,6 +66,9 @@ static inline bool rsk_is_mptcp(const struct request_sock *req)
 	return tcp_rsk(req)->is_mptcp;
 }
 
+int mptcp_full_space(const struct sock *ssk);
+int mptcp_space(const struct sock *ssk);
+
 void mptcp_parse_option(const struct sk_buff *skb, const unsigned char *ptr,
 			int opsize, struct tcp_options_received *opt_rx);
 bool mptcp_syn_options(struct sock *sk, const struct sk_buff *skb,
@@ -195,6 +198,16 @@ static inline bool mptcp_sk_is_subflow(const struct sock *sk)
 	return false;
 }
 
+static inline int mptcp_space(struct sock *ssk)
+{
+	return 0;
+}
+
+static inline int mptcp_full_space(struct sock *ssk)
+{
+	return 0;
+}
+
 static inline void mptcp_seq_show(struct seq_file *seq) { }
 #endif /* CONFIG_MPTCP */
 
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 306e25d743e8..b786da1db76a 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2771,6 +2771,12 @@ u32 __tcp_select_window(struct sock *sk)
 	int full_space = min_t(int, tp->window_clamp, allowed_space);
 	int window;
 
+	if (sk_is_mptcp(sk)) {
+		allowed_space = mptcp_full_space(sk);
+		free_space = mptcp_space(sk);
+		full_space = min_t(int, tp->window_clamp, allowed_space);
+	}
+
 	if (unlikely(mss > full_space)) {
 		mss = full_space;
 		if (mss <= 0)
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index f9fe60599f79..52d76af75b7f 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -738,6 +738,30 @@ bool mptcp_subflow_data_available(struct sock *sk)
 	return subflow->data_avail;
 }
 
+/* If ssk has an mptcp parent socket, use the mptcp rcvbuf occupancy,
+ * not the ssk one.
+ *
+ * In mptcp, rwin is about the mptcp-level connection data.
+ *
+ * Data that is still on the ssk rx queue can thus be ignored,
+ * as far as mptcp peer is concerned that data is still inflight.
+ */
+int mptcp_full_space(const struct sock *ssk)
+{
+	const struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
+	const struct sock *sk = READ_ONCE(subflow->conn);
+
+	return sk ? tcp_full_space(sk) : tcp_full_space(ssk);
+}
+
+int mptcp_space(const struct sock *ssk)
+{
+	const struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
+	const struct sock *sk = READ_ONCE(subflow->conn);
+
+	return sk ? tcp_space(sk) : tcp_space(ssk);
+}
+
 static void subflow_data_ready(struct sock *sk)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);