[RFC,v3,net-next,2/3] tcp: Handle eor bit when coalescing skb

Message ID 1461133497-1515104-3-git-send-email-kafai@fb.com
State RFC, archived
Delegated to: David Miller

Commit Message

Martin KaFai Lau April 20, 2016, 6:24 a.m. UTC
This patch:
1. Prevent next_skb from coalescing to the prev_skb if
   TCP_SKB_CB(prev_skb)->eor is set
2. Update the TCP_SKB_CB(prev_skb)->eor if coalescing is
   allowed

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
---
 net/ipv4/tcp_input.c  | 4 ++++
 net/ipv4/tcp_output.c | 4 ++++
 2 files changed, 8 insertions(+)

Comments

Soheil Hassas Yeganeh April 20, 2016, 8:04 p.m. UTC | #1
On Wed, Apr 20, 2016 at 2:24 AM, Martin KaFai Lau <kafai@fb.com> wrote:
> This patch:
> 1. Prevent next_skb from coalescing to the prev_skb if
>    TCP_SKB_CB(prev_skb)->eor is set
> 2. Update the TCP_SKB_CB(prev_skb)->eor if coalescing is
>    allowed
>
> Signed-off-by: Martin KaFai Lau <kafai@fb.com>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Neal Cardwell <ncardwell@google.com>
> Cc: Soheil Hassas Yeganeh <soheil@google.com>
> Cc: Willem de Bruijn <willemb@google.com>
> Cc: Yuchung Cheng <ycheng@google.com>
> ---
>  net/ipv4/tcp_input.c  | 4 ++++
>  net/ipv4/tcp_output.c | 4 ++++
>  2 files changed, 8 insertions(+)
>
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 75e8336..68c55e5 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -1303,6 +1303,7 @@ static bool tcp_shifted_skb(struct sock *sk, struct sk_buff *skb,
>         }
>
>         TCP_SKB_CB(prev)->tcp_flags |= TCP_SKB_CB(skb)->tcp_flags;
> +       TCP_SKB_CB(prev)->eor = TCP_SKB_CB(skb)->eor;
>         if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN)
>                 TCP_SKB_CB(prev)->end_seq++;
>
> @@ -1368,6 +1369,9 @@ static struct sk_buff *tcp_shift_skb_data(struct sock *sk, struct sk_buff *skb,
>         if ((TCP_SKB_CB(prev)->sacked & TCPCB_TAGBITS) != TCPCB_SACKED_ACKED)
>                 goto fallback;
>
> +       if (TCP_SKB_CB(prev)->eor)
> +               goto fallback;
> +

nit: You might want to wrap all the "TCP_SKB_CB(...)->eor" checks in unlikely().

>         in_sack = !after(start_seq, TCP_SKB_CB(skb)->seq) &&
>                   !before(end_seq, TCP_SKB_CB(skb)->end_seq);
>
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index a6e4a83..96bdf98 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -2494,6 +2494,7 @@ static void tcp_collapse_retrans(struct sock *sk, struct sk_buff *skb)
>          * packet counting does not break.
>          */
>         TCP_SKB_CB(skb)->sacked |= TCP_SKB_CB(next_skb)->sacked & TCPCB_EVER_RETRANS;
> +       TCP_SKB_CB(skb)->eor = TCP_SKB_CB(next_skb)->eor;
>
>         /* changed transmit queue under us so clear hints */
>         tcp_clear_retrans_hints_partial(tp);
> @@ -2545,6 +2546,9 @@ static void tcp_retrans_try_collapse(struct sock *sk, struct sk_buff *to,
>                 if (!tcp_can_collapse(sk, skb))
>                         break;
>
> +               if (TCP_SKB_CB(to)->eor)
> +                       break;
> +

nit: Perhaps a better place to check for eor is right after entering
the loop, to skip a few instructions and the tcp_can_collapse() call
in the unlikely case that eor is set.

>                 space -= skb->len;
>
>                 if (first) {
> --
> 2.5.1
>
Martin KaFai Lau April 21, 2016, 4:56 p.m. UTC | #2
On Wed, Apr 20, 2016 at 04:04:54PM -0400, Soheil Hassas Yeganeh wrote:
> > diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> > index a6e4a83..96bdf98 100644
> > --- a/net/ipv4/tcp_output.c
> > +++ b/net/ipv4/tcp_output.c
> > @@ -2494,6 +2494,7 @@ static void tcp_collapse_retrans(struct sock *sk, struct sk_buff *skb)
> >          * packet counting does not break.
> >          */
> >         TCP_SKB_CB(skb)->sacked |= TCP_SKB_CB(next_skb)->sacked & TCPCB_EVER_RETRANS;
> > +       TCP_SKB_CB(skb)->eor = TCP_SKB_CB(next_skb)->eor;
> >
> >         /* changed transmit queue under us so clear hints */
> >         tcp_clear_retrans_hints_partial(tp);
> > @@ -2545,6 +2546,9 @@ static void tcp_retrans_try_collapse(struct sock *sk, struct sk_buff *to,
> >                 if (!tcp_can_collapse(sk, skb))
> >                         break;
> >
> > +               if (TCP_SKB_CB(to)->eor)
> > +                       break;
> > +
>
> nit: Perhaps a better place to check for eor is right after entering
> the loop, to skip a few instructions and the tcp_can_collapse() call
> in the unlikely case that eor is set.
hmm... Not sure I understand it.
You meant moving the unlikely case before (or after?) the more likely
cases which may have a better chance to break the loop sooner?
Soheil Hassas Yeganeh April 21, 2016, 9:14 p.m. UTC | #3
On Thu, Apr 21, 2016 at 12:56 PM, Martin KaFai Lau <kafai@fb.com> wrote:
> On Wed, Apr 20, 2016 at 04:04:54PM -0400, Soheil Hassas Yeganeh wrote:
>> > diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
>> > index a6e4a83..96bdf98 100644
>> > --- a/net/ipv4/tcp_output.c
>> > +++ b/net/ipv4/tcp_output.c
>> > @@ -2494,6 +2494,7 @@ static void tcp_collapse_retrans(struct sock *sk, struct sk_buff *skb)
>> >          * packet counting does not break.
>> >          */
>> >         TCP_SKB_CB(skb)->sacked |= TCP_SKB_CB(next_skb)->sacked & TCPCB_EVER_RETRANS;
>> > +       TCP_SKB_CB(skb)->eor = TCP_SKB_CB(next_skb)->eor;
>> >
>> >         /* changed transmit queue under us so clear hints */
>> >         tcp_clear_retrans_hints_partial(tp);
>> > @@ -2545,6 +2546,9 @@ static void tcp_retrans_try_collapse(struct sock *sk, struct sk_buff *to,
>> >                 if (!tcp_can_collapse(sk, skb))
>> >                         break;
>> >
>> > +               if (TCP_SKB_CB(to)->eor)
>> > +                       break;
>> > +
>>
>> nit: Perhaps a better place to check for eor is right after entering
>> the loop, to skip a few instructions and the tcp_can_collapse() call
>> in the unlikely case that eor is set.
> hmm... Not sure I understand it.
> You meant moving the unlikely case before (or after?) the more likely
> cases which may have a better chance to break the loop sooner?

Well, I don't have a strong preference here, so feel free to ignore.
Though I'm not sure how "likely" the checks in tcp_can_collapse() are.

On another note, do you think putting this in a self-documenting
helper function, say tcp_can_collapse_to(), would help readability?

Thanks.
Martin KaFai Lau April 22, 2016, 4:30 a.m. UTC | #4
On Thu, Apr 21, 2016 at 05:14:37PM -0400, Soheil Hassas Yeganeh wrote:
> On another note, do you think putting this in a self-documenting
> helper function, say tcp_can_collapse_to(), would help readability?
Sure.  I will move unlikely(TCP_SKB_CB(to)->eor) to a new helper
function tcp_skb_can_collapse_to() in the next spin.
Patch

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 75e8336..68c55e5 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1303,6 +1303,7 @@  static bool tcp_shifted_skb(struct sock *sk, struct sk_buff *skb,
 	}
 
 	TCP_SKB_CB(prev)->tcp_flags |= TCP_SKB_CB(skb)->tcp_flags;
+	TCP_SKB_CB(prev)->eor = TCP_SKB_CB(skb)->eor;
 	if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN)
 		TCP_SKB_CB(prev)->end_seq++;
 
@@ -1368,6 +1369,9 @@  static struct sk_buff *tcp_shift_skb_data(struct sock *sk, struct sk_buff *skb,
 	if ((TCP_SKB_CB(prev)->sacked & TCPCB_TAGBITS) != TCPCB_SACKED_ACKED)
 		goto fallback;
 
+	if (TCP_SKB_CB(prev)->eor)
+		goto fallback;
+
 	in_sack = !after(start_seq, TCP_SKB_CB(skb)->seq) &&
 		  !before(end_seq, TCP_SKB_CB(skb)->end_seq);
 
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index a6e4a83..96bdf98 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2494,6 +2494,7 @@  static void tcp_collapse_retrans(struct sock *sk, struct sk_buff *skb)
 	 * packet counting does not break.
 	 */
 	TCP_SKB_CB(skb)->sacked |= TCP_SKB_CB(next_skb)->sacked & TCPCB_EVER_RETRANS;
+	TCP_SKB_CB(skb)->eor = TCP_SKB_CB(next_skb)->eor;
 
 	/* changed transmit queue under us so clear hints */
 	tcp_clear_retrans_hints_partial(tp);
@@ -2545,6 +2546,9 @@  static void tcp_retrans_try_collapse(struct sock *sk, struct sk_buff *to,
 		if (!tcp_can_collapse(sk, skb))
 			break;
 
+		if (TCP_SKB_CB(to)->eor)
+			break;
+
 		space -= skb->len;
 
 		if (first) {