From patchwork Thu Oct 1 14:08:13 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Abeni X-Patchwork-Id: 1375097 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=lists.01.org (client-ip=2001:19d0:306:5::1; helo=ml01.01.org; envelope-from=mptcp-bounces@lists.01.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=byP/Egrn; dkim-atps=neutral Received: from ml01.01.org (ml01.01.org [IPv6:2001:19d0:306:5::1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4C2FNF6BlRz9sPB for ; Fri, 2 Oct 2020 00:08:45 +1000 (AEST) Received: from ml01.vlan13.01.org (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id F13A615470B83; Thu, 1 Oct 2020 07:08:43 -0700 (PDT) Received-SPF: Pass (mailfrom) identity=mailfrom; client-ip=63.128.21.124; helo=us-smtp-delivery-124.mimecast.com; envelope-from=pabeni@redhat.com; receiver= Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [63.128.21.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id A732415470B69 for ; Thu, 1 Oct 2020 07:08:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1601561320; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=DhWUAZRjTPrkrJz0NDoXm5JG80m0dk2u6vLTX+oShtw=; b=byP/EgrnTNZIwiTW6Cq41uSkU1HRHh0TONA+/YQuPawYI4myWDXN/Z/p1okn7CC/GCpP88 mxBTFM7RoDYWkC1LYYRWjJcXP5+t6eGuHjopU8ZNbxebhKqoHO6oaALwRMKq/pwymHpkns US/VGt4M1Ge9ayGYyMqKheyU4IG8UlE= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-202-eAGia9bUNdmeSJUiYrwnag-1; Thu, 01 Oct 2020 10:08:38 -0400 X-MC-Unique: eAGia9bUNdmeSJUiYrwnag-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 530CB101FFA8 for ; Thu, 1 Oct 2020 14:08:37 +0000 (UTC) Received: from gerbillo.redhat.com (ovpn-115-22.ams2.redhat.com [10.36.115.22]) by smtp.corp.redhat.com (Postfix) with ESMTP id A6218702E7 for ; Thu, 1 Oct 2020 14:08:36 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.01.org Date: Thu, 1 Oct 2020 16:08:13 +0200 Message-Id: In-Reply-To: References: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=pabeni@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Message-ID-Hash: 5FNOVGIENFM3WM5UXP42XW4W3LNPY2ZU X-Message-ID-Hash: 5FNOVGIENFM3WM5UXP42XW4W3LNPY2ZU X-MailFrom: pabeni@redhat.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; suspicious-header X-Mailman-Version: 3.1.1 Precedence: list Subject: [MPTCP] [PATCH v2 12/13] mptcp: keep track of advertised windows right edge List-Id: Discussions regarding MPTCP upstreaming Archived-At: List-Archive: List-Help: List-Post: List-Subscribe: List-Unsubscribe: From: Florian Westphal Before sending 'x' new bytes also check that the new snd_una would be within the permitted receive window. For every ACK that also contains a DSS ack, check wheter its tcp-level receive window would advance the current mptcp window right edge and update it if so. Co-developed-by: Paolo Abeni Signed-off-by: Paolo Abeni Signed-off-by: Florian Westphal --- net/mptcp/options.c | 24 ++++++++++++++++++---- net/mptcp/protocol.c | 47 +++++++++++++++++++++++++++++++++++++++++++- net/mptcp/protocol.h | 1 + 3 files changed, 67 insertions(+), 5 deletions(-) diff --git a/net/mptcp/options.c b/net/mptcp/options.c index 81581cc84796..b6f3f0c4fdbd 100644 --- a/net/mptcp/options.c +++ b/net/mptcp/options.c @@ -790,11 +790,14 @@ static u64 expand_ack(u64 old_ack, u64 cur_ack, bool use_64bit) return cur_ack; } -static void update_una(struct mptcp_sock *msk, - struct mptcp_options_received *mp_opt) +static void ack_update_msk(struct mptcp_sock *msk, + const struct sock *ssk, + struct mptcp_options_received *mp_opt) { u64 new_snd_una, snd_una, old_snd_una = atomic64_read(&msk->snd_una); + u64 new_wnd_end, wnd_end, old_wnd_end = atomic64_read(&msk->wnd_end); u64 snd_nxt = READ_ONCE(msk->snd_nxt); + struct sock *sk = (struct sock *)msk; /* avoid ack expansion on update conflict, to reduce the risk of * wrongly expanding to a future ack sequence number, which is way @@ -806,12 +809,25 @@ static void update_una(struct mptcp_sock *msk, if (after64(new_snd_una, snd_nxt)) new_snd_una = old_snd_una; + new_wnd_end = new_snd_una + tcp_sk(ssk)->snd_wnd; + + while (after64(new_wnd_end, old_wnd_end)) { + wnd_end = old_wnd_end; + old_wnd_end = atomic64_cmpxchg(&msk->wnd_end, wnd_end, + new_wnd_end); + if (old_wnd_end == wnd_end) { + if (mptcp_send_head(sk)) + mptcp_schedule_work(sk); + break; + } + } + while (after64(new_snd_una, old_snd_una)) { snd_una = old_snd_una; old_snd_una = atomic64_cmpxchg(&msk->snd_una, snd_una, new_snd_una); if (old_snd_una == snd_una) { - mptcp_data_acked((struct sock *)msk); + mptcp_data_acked(sk); break; } } @@ -911,7 +927,7 @@ void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb) * monodirectional flows will stuck */ if (mp_opt.use_ack) - update_una(msk, &mp_opt); + ack_update_msk(msk, sk, &mp_opt); /* Zero-data-length packets are dropped by the caller and not * propagated to the MPTCP layer, so the skb extension does not diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index ebdee8798f08..76d5fd11d2f5 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -57,6 +57,12 @@ static struct socket *__mptcp_nmpc_socket(const struct mptcp_sock *msk) return msk->subflow; } +/* Returns end sequence number of the receiver's advertised window */ +static u64 mptcp_wnd_end(const struct mptcp_sock *msk) +{ + return atomic64_read(&msk->wnd_end); +} + static bool mptcp_is_tcpsk(struct sock *sk) { struct socket *sock = sk->sk_socket; @@ -174,6 +180,7 @@ static void mptcp_data_queue_ofo(struct mptcp_sock *msk, struct sk_buff *skb) if (after64(seq, max_seq)) { /* out of window */ mptcp_drop(sk, skb); + pr_info("oow by %ld\n", (unsigned long)seq - (unsigned long)max_seq); MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_NODSSWINDOW); return; } @@ -818,6 +825,7 @@ static void mptcp_clean_una(struct sock *sk) */ if (__mptcp_check_fallback(msk)) atomic64_set(&msk->snd_una, msk->snd_nxt); + snd_una = atomic64_read(&msk->snd_una); list_for_each_entry_safe(dfrag, dtmp, &msk->rtx_queue, list) { @@ -915,12 +923,30 @@ struct mptcp_sendmsg_info { unsigned int flags; }; +static int mptcp_check_allowed_size(struct mptcp_sock *msk, u64 data_seq, + int avail_size) +{ + u64 window_end = mptcp_wnd_end(msk); + + if (__mptcp_check_fallback(msk)) + return avail_size; + + if (!before64(data_seq + avail_size, window_end)) { + u64 allowed_size = window_end - data_seq; + + return min_t(unsigned int, allowed_size, avail_size); + } + + return avail_size; +} + static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk, struct mptcp_data_frag *dfrag, struct mptcp_sendmsg_info *info) { u64 data_seq = dfrag->data_seq + info->sent; struct mptcp_sock *msk = mptcp_sk(sk); + bool zero_window_probe = false; struct mptcp_ext *mpext = NULL; struct sk_buff *skb, *tail; bool can_collapse = false; @@ -950,6 +976,14 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk, avail_size = info->size_goal - skb->len; } + /* Zero window and all data acked? Probe. */ + avail_size = mptcp_check_allowed_size(msk, data_seq, avail_size); + if (avail_size == 0 && atomic64_read(&msk->snd_una) == msk->snd_nxt) { + zero_window_probe = true; + data_seq = atomic64_read(&msk->snd_una) - 1; + avail_size = 1; + } + if (WARN_ON_ONCE(info->sent > info->limit || info->limit > dfrag->data_len)) return 0; @@ -973,6 +1007,7 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk, if (mpext && tail && mpext == skb_ext_find(tail, SKB_EXT_MPTCP)) { WARN_ON_ONCE(!can_collapse); mpext->data_len += ret; + WARN_ON_ONCE(zero_window_probe); goto out; } @@ -991,6 +1026,12 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk, mpext->data_seq, mpext->subflow_seq, mpext->data_len, mpext->dsn64); + if (zero_window_probe) { + mptcp_subflow_ctx(ssk)->rel_write_seq += ret; + mpext->frozen = 1; + ret = 0; + tcp_push_pending_frames(ssk); + } out: mptcp_subflow_ctx(ssk)->rel_write_seq += ret; return ret; @@ -1810,7 +1851,7 @@ static void mptcp_worker(struct work_struct *work) info.limit = dfrag->already_sent; while (info.sent < dfrag->already_sent) { ret = mptcp_sendmsg_frag(sk, ssk, dfrag, &info); - if (ret < 0) + if (ret <= 0) break; MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_RETRANSSEGS); @@ -2161,6 +2202,8 @@ struct sock *mptcp_sk_clone(const struct sock *sk, msk->write_seq = subflow_req->idsn + 1; msk->snd_nxt = msk->write_seq; atomic64_set(&msk->snd_una, msk->write_seq); + atomic64_set(&msk->wnd_end, msk->snd_nxt + req->rsk_rcv_wnd); + if (mp_opt->mp_capable) { msk->can_ack = true; msk->remote_key = mp_opt->sndr_key; @@ -2195,6 +2238,8 @@ void mptcp_rcv_space_init(struct mptcp_sock *msk, const struct sock *ssk) TCP_INIT_CWND * tp->advmss); if (msk->rcvq_space.space == 0) msk->rcvq_space.space = TCP_INIT_CWND * TCP_MSS_DEFAULT; + + atomic64_set(&msk->wnd_end, msk->snd_nxt + tcp_sk(ssk)->snd_wnd); } static struct sock *mptcp_accept(struct sock *sk, int flags, int *err, diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index 482e982a706d..d33e9676a1a3 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -214,6 +214,7 @@ struct mptcp_sock { struct sock *last_snd; int snd_burst; atomic64_t snd_una; + atomic64_t wnd_end; unsigned long timer_ival; u32 token; unsigned long flags;