[v3] mptcp: let MPTCP create max size skbs

Message ID d5c7935a21f20d37900cd51fc0f1f88327a0dfdd.1606406237.git.pabeni@redhat.com
State Accepted, archived
Commit 31859991dc0f08e814d3f0f5c6b2cd1d56f9b051
Delegated to: Matthieu Baerts

Commit Message

Paolo Abeni Nov. 26, 2020, 3:58 p.m. UTC
Currently the xmit path of the MPTCP protocol creates
smaller-than-max-size skbs, which is suboptimal for performance.

There are a few things to improve:
- when coalescing to an existing skb, we must clear the PUSH flag
- tcp_build_frag() expects the available space as an argument.
  When coalescing is enabled, MPTCP has already subtracted the
  to-be-coalesced skb len. We must increment said argument
  accordingly.
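
The size accounting in the second point can be modelled in userspace.
This is only a sketch: it assumes the callee treats its size argument as
the goal for the whole tail skb and subtracts the bytes already queued
there. The function names are hypothetical and none of this is kernel code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical userspace model, not kernel code.
 *
 * ASSUMPTION: the callee (standing in for tcp_build_frag()) receives the
 * size goal for the whole tail skb and subtracts the bytes already queued
 * in that skb to find the room left for coalescing.
 */
size_t model_room_in_tail(size_t size_arg, size_t tail_len)
{
	return size_arg > tail_len ? size_arg - tail_len : 0;
}

/* Old behaviour: tail_len is subtracted by the caller and again by the callee. */
size_t room_before_patch(size_t size_goal, size_t tail_len)
{
	size_t avail_size;

	assert(tail_len < size_goal);	/* mirrors the can_collapse precondition */
	avail_size = size_goal - tail_len;
	return model_room_in_tail(avail_size, tail_len);
}

/* New behaviour: size_bias adds tail_len back before the call. */
size_t room_after_patch(size_t size_goal, size_t tail_len)
{
	size_t avail_size, size_bias;

	assert(tail_len < size_goal);
	avail_size = size_goal - tail_len;
	size_bias = tail_len;
	return model_room_in_tail(avail_size + size_bias, tail_len);
}
```

Under that assumption, a 65536-byte size goal with 16384 bytes already in
the tail leaves 32768 bytes of room in the old model and 49152 in the new
one, so the coalesced skb can grow up to the size goal.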

Before:
./use_mptcp.sh netperf -H 127.0.0.1 -t TCP_STREAM
[...]
131072  16384  16384    30.00    24414.86

After:
./use_mptcp.sh netperf -H 127.0.0.1 -t TCP_STREAM
[...]
131072  16384  16384    30.05    28357.69

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
use_mptcp.sh forces existing apps to create MPTCP sockets instead of
TCP ones via LD_PRELOAD of a crafted socket() implementation.

https://github.com/pabeni/mptcp-tools/tree/master/use_mptcp
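
For readers unfamiliar with the technique, a socket() override of this
kind could look roughly like the sketch below. It is not the code from
the repository above: mptcp_proto_for() is a hypothetical helper, and
calling the kernel through syscall(SYS_socket, ...) is an assumption that
holds on architectures exposing that syscall number (x86_64, arm64).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for syscall() */
#endif
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262	/* value from the Linux UAPI headers */
#endif

/* Hypothetical helper: pick the protocol the wrapper should request. */
int mptcp_proto_for(int domain, int type, int protocol)
{
	/* Ignore the flag bits that may be OR-ed into the socket type. */
	int base_type = type & ~(SOCK_NONBLOCK | SOCK_CLOEXEC);

	if ((domain == AF_INET || domain == AF_INET6) &&
	    base_type == SOCK_STREAM &&
	    (protocol == 0 || protocol == IPPROTO_TCP))
		return IPPROTO_MPTCP;
	return protocol;	/* leave every other socket kind alone */
}

/* Overrides libc's socket() when this object is loaded via LD_PRELOAD. */
int socket(int domain, int type, int protocol)
{
	return (int)syscall(SYS_socket, domain, type,
			    mptcp_proto_for(domain, type, protocol));
}
```

Built as a shared object and listed in LD_PRELOAD, such a wrapper makes an
unmodified application open MPTCP sockets wherever it asked for TCP ones.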
---
v2 -> v3:
 - drop the tcp bits. They caused stream corruption that
   could not be easily fixed, and dropping them does not affect
   the performance in a visible way, since that code path is
   hit only in corner cases
v1 -> v2:
 - prevent splitting if from_ext is frozen: should never happen
   but is cheap
 - provide dummy mptcp_skb_split() for non MPTCP build
---
 net/mptcp/protocol.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

Comments

Mat Martineau Dec. 1, 2020, 8:19 p.m. UTC | #1
On Thu, 26 Nov 2020, Paolo Abeni wrote:

> [...]

Thanks for the simplified version, looks good!

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>

--
Mat Martineau
Intel
Matthieu Baerts Dec. 4, 2020, 6 p.m. UTC | #2
Hi Paolo, Mat,

On 26/11/2020 16:58, Paolo Abeni wrote:
> [...]

Thank you for your nice improvement and review!

- 31859991dc0f: mptcp: let MPTCP create max size skbs
- Results: f8649f5ba21e..54981c30b018

Tests + export have been scheduled!

Cheers,
Matt

Patch

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 2a8174a7e630..82525d454c5e 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1256,6 +1256,7 @@  static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
 	struct mptcp_ext *mpext = NULL;
 	struct sk_buff *skb, *tail;
 	bool can_collapse = false;
+	int size_bias = 0;
 	int avail_size;
 	size_t ret = 0;
 
@@ -1277,10 +1278,12 @@  static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
 		mpext = skb_ext_find(skb, SKB_EXT_MPTCP);
 		can_collapse = (info->size_goal - skb->len > 0) &&
 			 mptcp_skb_can_collapse_to(data_seq, skb, mpext);
-		if (!can_collapse)
+		if (!can_collapse) {
 			TCP_SKB_CB(skb)->eor = 1;
-		else
+		} else {
+			size_bias = skb->len;
 			avail_size = info->size_goal - skb->len;
+		}
 	}
 
 	/* Zero window and all data acked? Probe. */
@@ -1300,8 +1303,8 @@  static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
 		return 0;
 
 	ret = info->limit - info->sent;
-	tail = tcp_build_frag(ssk, avail_size, info->flags, dfrag->page,
-			      dfrag->offset + info->sent, &ret);
+	tail = tcp_build_frag(ssk, avail_size + size_bias, info->flags,
+			      dfrag->page, dfrag->offset + info->sent, &ret);
 	if (!tail) {
 		tcp_remove_empty_skb(sk, tcp_write_queue_tail(ssk));
 		return -ENOMEM;
@@ -1310,8 +1313,9 @@  static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
 	/* if the tail skb is still the cached one, collapsing really happened.
 	 */
 	if (skb == tail) {
-		WARN_ON_ONCE(!can_collapse);
+		TCP_SKB_CB(tail)->tcp_flags &= ~TCPHDR_PSH;
 		mpext->data_len += ret;
+		WARN_ON_ONCE(!can_collapse);
 		WARN_ON_ONCE(zero_window_probe);
 		goto out;
 	}
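
The PUSH-flag handling added in the last hunk is a plain bit clear on the
flags byte. A minimal userspace model follows, with clear_psh() as a
hypothetical helper. It only shows the operation; the commit message
states that the flag must be cleared when coalescing but does not spell
out the rationale.

```c
#include <stdint.h>

/* Values of the TCP header flag bits as they appear on the wire. */
#define TCPHDR_PSH 0x08
#define TCPHDR_ACK 0x10

/* Hypothetical helper: clear PSH, leave every other flag untouched. */
uint8_t clear_psh(uint8_t tcp_flags)
{
	return tcp_flags & (uint8_t)~TCPHDR_PSH;
}
```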