From patchwork Thu Aug 20 13:12:09 2020
From: Florian Westphal <fw@strlen.de>
Date: Thu, 20 Aug 2020 15:12:09 +0200
Message-Id: <20200820131209.23430-1-fw@strlen.de>
Subject: [MPTCP] [PATCH mptcp-next v3] mptcp: adjust mptcp receive buffer limit if subflow has larger one

In addition to tcp autotuning during read, TCP may also increase the
receive buffer in tcp_clamp_window().  In this case, mptcp should adjust
its receive buffer size as well, so it can move all pending skbs from
the subflow socket to the mptcp socket.

At this point, TCP can have more skbs ready for processing than the
mptcp receive buffer size allows.

In the mptcp case, the announced receive window is based on the free
space of the mptcp parent socket rather than that of the individual
subflows.  Following the subflow therefore allows mptcp to grow its
receive buffer.

This is especially noticeable for loopback traffic, where two skbs are
enough to fill the initial receive window.

In mptcp_data_ready() we do not hold the mptcp socket lock, so modifying
mptcp_sk->sk_rcvbuf there is racy.  Do it when moving skbs from the
subflow to the mptcp socket instead; both sockets are locked in that
case.
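
As background for readers unfamiliar with receive-buffer autotuning, here
is a minimal userspace sketch (not part of this patch; the helper name
print_rcvbuf is hypothetical) that reads the kernel-reported receive
buffer of a socket via getsockopt(SO_RCVBUF), i.e. the sk->sk_rcvbuf
value that autotuning may grow past its initial size:

#include <stdio.h>
#include <sys/socket.h>

/* Print the kernel-reported receive buffer (sk->sk_rcvbuf) of @fd. */
static void print_rcvbuf(int fd, const char *tag)
{
	int val;
	socklen_t len = sizeof(val);

	if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &val, &len) == 0)
		printf("%s: SO_RCVBUF=%d\n", tag, val);
	else
		perror("getsockopt");
}

int main(void)
{
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	if (fd < 0) {
		perror("socket");
		return 1;
	}
	/* Before any data: the initial, untuned sk_rcvbuf.  Re-reading
	 * after a bulk transfer would typically show a larger value;
	 * that growth is what the mptcp parent socket must follow. */
	print_rcvbuf(fd, "initial");
	return 0;
}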
v2: move rcvbuf update to __mptcp_move_skbs_from_subflow where both
    mptcp and subflow sk locks are held (pointed out by Mat Martineau).
v3: only adjust if receive buffer size isn't fixed (Mat Martineau).

Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Mat Martineau
---
 net/mptcp/protocol.c | 27 ++++++++++++++++++++++-----
 1 file changed, 22 insertions(+), 5 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 77d655b0650c..91718dbc89e1 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -457,6 +457,18 @@ static bool __mptcp_move_skbs_from_subflow(struct mptcp_sock *msk,
 	bool more_data_avail;
 	struct tcp_sock *tp;
 	bool done = false;
+	int sk_rbuf;
+
+	sk_rbuf = READ_ONCE(sk->sk_rcvbuf);
+
+	if (!(sk->sk_userlocks & SOCK_RCVBUF_LOCK)) {
+		int ssk_rbuf = READ_ONCE(ssk->sk_rcvbuf);
+
+		if (unlikely(ssk_rbuf > sk_rbuf)) {
+			WRITE_ONCE(sk->sk_rcvbuf, ssk_rbuf);
+			sk_rbuf = ssk_rbuf;
+		}
+	}
 
 	pr_debug("msk=%p ssk=%p", msk, ssk);
 	tp = tcp_sk(ssk);
@@ -511,7 +523,7 @@ static bool __mptcp_move_skbs_from_subflow(struct mptcp_sock *msk,
 		WRITE_ONCE(tp->copied_seq, seq);
 		more_data_avail = mptcp_subflow_data_available(ssk);
 
-		if (atomic_read(&sk->sk_rmem_alloc) > READ_ONCE(sk->sk_rcvbuf)) {
+		if (atomic_read(&sk->sk_rmem_alloc) > sk_rbuf) {
 			done = true;
 			break;
 		}
@@ -603,6 +615,7 @@ void mptcp_data_ready(struct sock *sk, struct sock *ssk)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
 	struct mptcp_sock *msk = mptcp_sk(sk);
+	int sk_rbuf, ssk_rbuf;
 	bool wake;
 
 	/* move_skbs_to_msk below can legitly clear the data_avail flag,
@@ -613,12 +626,16 @@ void mptcp_data_ready(struct sock *sk, struct sock *ssk)
 	if (wake)
 		set_bit(MPTCP_DATA_READY, &msk->flags);
 
-	if (atomic_read(&sk->sk_rmem_alloc) < READ_ONCE(sk->sk_rcvbuf) &&
-	    move_skbs_to_msk(msk, ssk))
+	ssk_rbuf = READ_ONCE(ssk->sk_rcvbuf);
+	sk_rbuf = READ_ONCE(sk->sk_rcvbuf);
+	if (unlikely(ssk_rbuf > sk_rbuf))
+		sk_rbuf = ssk_rbuf;
+
+	/* over limit? can't append more skbs to msk */
+	if (atomic_read(&sk->sk_rmem_alloc) > sk_rbuf)
 		goto wake;
 
-	/* don't schedule if mptcp sk is (still) over limit */
-	if (atomic_read(&sk->sk_rmem_alloc) > READ_ONCE(sk->sk_rcvbuf))
+	if (move_skbs_to_msk(msk, ssk))
 		goto wake;
 
 	/* mptcp socket is owned, release_cb should retry */
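
One hypothetical way to exercise the scenario this patch addresses (the
loopback setup and all names below are illustrative, not taken from the
patch or the selftests): create an MPTCP socket pair over loopback, push
bulk data, and watch the receiver's kernel-reported SO_RCVBUF grow.
IPPROTO_MPTCP (protocol number 262) is only available on MPTCP-capable
kernels:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262
#endif

static void print_rcvbuf(int fd, const char *tag)
{
	int val;
	socklen_t len = sizeof(val);

	if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &val, &len) == 0)
		printf("%s SO_RCVBUF: %d\n", tag, val);
}

int main(void)
{
	struct sockaddr_in addr = { .sin_family = AF_INET };
	socklen_t alen = sizeof(addr);
	char buf[8192];
	int srv, cli, acc;

	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

	/* Both ends use MPTCP; socket() fails with EPROTONOSUPPORT on
	 * kernels built without CONFIG_MPTCP. */
	srv = socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP);
	cli = socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP);
	if (srv < 0 || cli < 0) {
		perror("socket(IPPROTO_MPTCP)");
		return 1;
	}
	if (bind(srv, (struct sockaddr *)&addr, alen) < 0 ||
	    listen(srv, 1) < 0 ||
	    getsockname(srv, (struct sockaddr *)&addr, &alen) < 0 ||
	    connect(cli, (struct sockaddr *)&addr, alen) < 0) {
		perror("setup");
		return 1;
	}
	acc = accept(srv, NULL, NULL);
	if (acc < 0) {
		perror("accept");
		return 1;
	}

	print_rcvbuf(acc, "before transfer");

	/* Bulk loopback transfer: two large skbs already fill the
	 * initial receive window, so autotuning kicks in quickly.
	 * Drain the receiver between sends to keep the data moving. */
	memset(buf, 'x', sizeof(buf));
	for (int i = 0; i < 256; i++) {
		if (send(cli, buf, sizeof(buf), 0) < 0)
			break;
		while (recv(acc, buf, sizeof(buf), MSG_DONTWAIT) > 0)
			;
	}

	print_rcvbuf(acc, "after transfer");
	close(cli);
	close(acc);
	close(srv);
	return 0;
}

With this patch applied, the mptcp-level socket should follow the
subflow's larger receive buffer, so the transfer no longer throttles on
the initial mptcp-level limit.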