From patchwork Thu Nov 14 17:32:21 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Florian Westphal X-Patchwork-Id: 1194992 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=lists.01.org (client-ip=198.145.21.10; helo=ml01.01.org; envelope-from=mptcp-bounces@lists.01.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=strlen.de Received: from ml01.01.org (ml01.01.org [198.145.21.10]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 47DSwz6wj1z9sP4 for ; Fri, 15 Nov 2019 04:22:59 +1100 (AEDT) Received: from new-ml01.vlan13.01.org (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id 283E1100DC3D2; Thu, 14 Nov 2019 09:24:27 -0800 (PST) Received-SPF: Pass (mailfrom) identity=mailfrom; client-ip=2a0a:51c0:0:12e:520::1; helo=chamillionaire.breakpoint.cc; envelope-from=fw@breakpoint.cc; receiver= Received: from Chamillionaire.breakpoint.cc (Chamillionaire.breakpoint.cc [IPv6:2a0a:51c0:0:12e:520::1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id C27C9100EE8CD for ; Thu, 14 Nov 2019 09:24:24 -0800 (PST) Received: from fw by Chamillionaire.breakpoint.cc with local (Exim 4.92) (envelope-from ) id 1iVIpa-00062a-5S; Thu, 14 Nov 2019 18:22:54 +0100 From: Florian Westphal To: Cc: Florian Westphal Date: Thu, 14 Nov 2019 18:32:21 +0100 Message-Id: <20191114173225.21199-11-fw@strlen.de> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20191114173225.21199-1-fw@strlen.de> References: <20191114173225.21199-1-fw@strlen.de> MIME-Version: 1.0 Message-ID-Hash: 2TYBNL54CG3F542IJS6TJ4NVPQHVGSY3 X-Message-ID-Hash: 2TYBNL54CG3F542IJS6TJ4NVPQHVGSY3 X-MailFrom: fw@breakpoint.cc X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; suspicious-header X-Mailman-Version: 3.1.1 Precedence: list Subject: [MPTCP] [RFC 10/14] recv: make DATA_READY reflect ssk in-sequence state List-Id: Discussions regarding MPTCP upstreaming Archived-At: List-Archive: List-Help: List-Post: List-Subscribe: List-Unsubscribe: In order to make mptcp_poll independent of the subflows, we need to keep the mptcp DATA_READY flag in sync, i.e., if it is set, at least one ssk has in-sequence data. If it is cleared, no further data is available. Avoid the unconditional clearing on recv entry. Instead make sure the flag is cleared on exit if there is no more in-sequence data available. Signed-off-by: Florian Westphal --- net/mptcp/protocol.c | 51 +++++++++++++++++++++++++++++--------------- 1 file changed, 34 insertions(+), 17 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 6fb178067a4a..b8f936c78ed3 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -556,8 +556,10 @@ static int mptcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, { struct mptcp_sock *msk = mptcp_sk(sk); struct mptcp_subflow_context *subflow; + bool more_data_avail = false; struct mptcp_read_arg arg; read_descriptor_t desc; + bool wait_data = false; struct socket *ssock; struct tcp_sock *tp; bool done = false; @@ -590,10 +592,6 @@ static int mptcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, u32 map_remaining; int bytes_read; - smp_mb__before_atomic(); - clear_bit(MPTCP_DATA_READY, &msk->flags); - smp_mb__after_atomic(); - ssk = mptcp_subflow_recv_lookup(msk); pr_debug("msk=%p ssk=%p", msk, ssk); if (!ssk) @@ -603,7 +601,7 @@ static int mptcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, tp = tcp_sk(ssk); lock_sock(ssk); - while (mptcp_subflow_data_available(ssk) && !done) { + do { /* try to read as much data as available */ map_remaining = subflow->map_data_len - mptcp_subflow_get_map_offset(subflow); @@ -614,8 +612,7 @@ static int mptcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, if (bytes_read < 0) { if (!copied) copied = bytes_read; - done = true; - continue; + goto next; } pr_debug("msk ack_seq=%llx -> %llx", msk->ack_seq, @@ -624,23 +621,27 @@ static int mptcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, copied += bytes_read; if (copied >= len) { done = true; - continue; + goto next; } if (tp->urg_data && tp->urg_seq == tp->copied_seq) { pr_err("Urgent data present, cannot proceed"); done = true; - continue; + goto next; } - } +next: + more_data_avail = mptcp_subflow_data_available(ssk); + } while (more_data_avail && !done); release_sock(ssk); continue; wait_for_data: + more_data_avail = false; + /* only the master socket status is relevant here. The exit * conditions mirror closely tcp_recvmsg() */ if (copied >= target) - break; + goto out; if (copied) { if (sk->sk_err || @@ -648,36 +649,52 @@ static int mptcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, (sk->sk_shutdown & RCV_SHUTDOWN) || !timeo || signal_pending(current)) - break; + goto out; } else { if (sk->sk_err) { copied = sock_error(sk); - break; + goto out; } if (sk->sk_shutdown & RCV_SHUTDOWN) - break; + goto out; if (sk->sk_state == TCP_CLOSE) { copied = -ENOTCONN; - break; + goto out; } if (!timeo) { copied = -EAGAIN; - break; + goto out; } if (signal_pending(current)) { copied = sock_intr_errno(timeo); - break; + goto out; } } pr_debug("block timeout %ld", timeo); + wait_data = true; mptcp_wait_data(sk, &timeo); } +out: + if (more_data_avail) { + if (!test_bit(MPTCP_DATA_READY, &msk->flags)) + set_bit(MPTCP_DATA_READY, &msk->flags); + } else if (!wait_data) { + clear_bit(MPTCP_DATA_READY, &msk->flags); + + /* .. race-breaker: ssk might get new data after last + * data_available() returns false. + */ + ssk = mptcp_subflow_recv_lookup(msk); + if (unlikely(ssk)) + set_bit(MPTCP_DATA_READY, &msk->flags); + } + release_sock(sk); return copied; }