| Message ID | 1461133497-1515104-3-git-send-email-kafai@fb.com |
|---|---|
| State | RFC, archived |
| Delegated to | David Miller |
On Wed, Apr 20, 2016 at 2:24 AM, Martin KaFai Lau <kafai@fb.com> wrote:
> This patch:
> 1. Prevent next_skb from coalescing to the prev_skb if
>    TCP_SKB_CB(prev_skb)->eor is set
> 2. Update the TCP_SKB_CB(prev_skb)->eor if coalescing is
>    allowed
>
> Signed-off-by: Martin KaFai Lau <kafai@fb.com>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Neal Cardwell <ncardwell@google.com>
> Cc: Soheil Hassas Yeganeh <soheil@google.com>
> Cc: Willem de Bruijn <willemb@google.com>
> Cc: Yuchung Cheng <ycheng@google.com>
> ---
>  net/ipv4/tcp_input.c  | 4 ++++
>  net/ipv4/tcp_output.c | 4 ++++
>  2 files changed, 8 insertions(+)
>
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 75e8336..68c55e5 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -1303,6 +1303,7 @@ static bool tcp_shifted_skb(struct sock *sk, struct sk_buff *skb,
>  	}
>
>  	TCP_SKB_CB(prev)->tcp_flags |= TCP_SKB_CB(skb)->tcp_flags;
> +	TCP_SKB_CB(prev)->eor = TCP_SKB_CB(skb)->eor;
>  	if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN)
>  		TCP_SKB_CB(prev)->end_seq++;
>
> @@ -1368,6 +1369,9 @@ static struct sk_buff *tcp_shift_skb_data(struct sock *sk, struct sk_buff *skb,
>  	if ((TCP_SKB_CB(prev)->sacked & TCPCB_TAGBITS) != TCPCB_SACKED_ACKED)
>  		goto fallback;
>
> +	if (TCP_SKB_CB(prev)->eor)
> +		goto fallback;
> +

nit: You might want to add unlikely() around all the checks for
TCP_SKB_CB(...)->eor.

>  	in_sack = !after(start_seq, TCP_SKB_CB(skb)->seq) &&
>  		  !before(end_seq, TCP_SKB_CB(skb)->end_seq);
>
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index a6e4a83..96bdf98 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -2494,6 +2494,7 @@ static void tcp_collapse_retrans(struct sock *sk, struct sk_buff *skb)
>  	 * packet counting does not break.
>  	 */
>  	TCP_SKB_CB(skb)->sacked |= TCP_SKB_CB(next_skb)->sacked & TCPCB_EVER_RETRANS;
> +	TCP_SKB_CB(skb)->eor = TCP_SKB_CB(next_skb)->eor;
>
>  	/* changed transmit queue under us so clear hints */
>  	tcp_clear_retrans_hints_partial(tp);
> @@ -2545,6 +2546,9 @@ static void tcp_retrans_try_collapse(struct sock *sk, struct sk_buff *to,
>  		if (!tcp_can_collapse(sk, skb))
>  			break;
>
> +		if (TCP_SKB_CB(to)->eor)
> +			break;
> +

nit: Perhaps a better place to check for eor is right after entering the
loop, to skip a few instructions and tcp_can_collapse() in the unlikely
case eor is set.

>  		space -= skb->len;
>
>  		if (first) {
> --
> 2.5.1
On Wed, Apr 20, 2016 at 04:04:54PM -0400, Soheil Hassas Yeganeh wrote:
> > diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> > index a6e4a83..96bdf98 100644
> > --- a/net/ipv4/tcp_output.c
> > +++ b/net/ipv4/tcp_output.c
> > @@ -2494,6 +2494,7 @@ static void tcp_collapse_retrans(struct sock *sk, struct sk_buff *skb)
> >  	 * packet counting does not break.
> >  	 */
> >  	TCP_SKB_CB(skb)->sacked |= TCP_SKB_CB(next_skb)->sacked & TCPCB_EVER_RETRANS;
> > +	TCP_SKB_CB(skb)->eor = TCP_SKB_CB(next_skb)->eor;
> >
> >  	/* changed transmit queue under us so clear hints */
> >  	tcp_clear_retrans_hints_partial(tp);
> > @@ -2545,6 +2546,9 @@ static void tcp_retrans_try_collapse(struct sock *sk, struct sk_buff *to,
> >  		if (!tcp_can_collapse(sk, skb))
> >  			break;
> >
> > +		if (TCP_SKB_CB(to)->eor)
> > +			break;
> > +
>
> nit: Perhaps a better place to check for eor is right after entering
> the loop? to skip a few instructions and tcp_can_collapse, in an
> unlikely case eor is set.
hmm... Not sure I understand it.
You meant moving the unlikely case before (or after?) the more likely
cases, which may have a better chance to break the loop sooner?
On Thu, Apr 21, 2016 at 12:56 PM, Martin KaFai Lau <kafai@fb.com> wrote:
> On Wed, Apr 20, 2016 at 04:04:54PM -0400, Soheil Hassas Yeganeh wrote:
>> > diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
>> > index a6e4a83..96bdf98 100644
>> > --- a/net/ipv4/tcp_output.c
>> > +++ b/net/ipv4/tcp_output.c
>> > @@ -2494,6 +2494,7 @@ static void tcp_collapse_retrans(struct sock *sk, struct sk_buff *skb)
>> >  	 * packet counting does not break.
>> >  	 */
>> >  	TCP_SKB_CB(skb)->sacked |= TCP_SKB_CB(next_skb)->sacked & TCPCB_EVER_RETRANS;
>> > +	TCP_SKB_CB(skb)->eor = TCP_SKB_CB(next_skb)->eor;
>> >
>> >  	/* changed transmit queue under us so clear hints */
>> >  	tcp_clear_retrans_hints_partial(tp);
>> > @@ -2545,6 +2546,9 @@ static void tcp_retrans_try_collapse(struct sock *sk, struct sk_buff *to,
>> >  		if (!tcp_can_collapse(sk, skb))
>> >  			break;
>> >
>> > +		if (TCP_SKB_CB(to)->eor)
>> > +			break;
>> > +
>>
>> nit: Perhaps a better place to check for eor is right after entering
>> the loop? to skip a few instructions and tcp_can_collapse, in an
>> unlikely case eor is set.
> hmm... Not sure I understand it.
> You meant moving the unlikely case before (or after?) the more likely
> cases which may have a better chance to break the loop sooner?

Well, I don't have a strong preference here, so feel free to ignore.
Though I'm not sure how "likely" the checks in tcp_can_collapse are.

On another note, do you think putting this in a self-documenting helper
function, say tcp_can_collapse_to(), would help readability?

Thanks.
On Thu, Apr 21, 2016 at 05:14:37PM -0400, Soheil Hassas Yeganeh wrote:
> On another note, do you think putting this in a self-documenting
> helper function, say tcp_can_collapse_to(), would help readability?
Sure. I will move unlikely(TCP_SKB_CB(to)->eor) to a new helper function
tcp_skb_can_collapse_to() in the next spin.
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 75e8336..68c55e5 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1303,6 +1303,7 @@ static bool tcp_shifted_skb(struct sock *sk, struct sk_buff *skb,
 	}

 	TCP_SKB_CB(prev)->tcp_flags |= TCP_SKB_CB(skb)->tcp_flags;
+	TCP_SKB_CB(prev)->eor = TCP_SKB_CB(skb)->eor;
 	if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN)
 		TCP_SKB_CB(prev)->end_seq++;

@@ -1368,6 +1369,9 @@ static struct sk_buff *tcp_shift_skb_data(struct sock *sk, struct sk_buff *skb,
 	if ((TCP_SKB_CB(prev)->sacked & TCPCB_TAGBITS) != TCPCB_SACKED_ACKED)
 		goto fallback;

+	if (TCP_SKB_CB(prev)->eor)
+		goto fallback;
+
 	in_sack = !after(start_seq, TCP_SKB_CB(skb)->seq) &&
 		  !before(end_seq, TCP_SKB_CB(skb)->end_seq);

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index a6e4a83..96bdf98 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2494,6 +2494,7 @@ static void tcp_collapse_retrans(struct sock *sk, struct sk_buff *skb)
 	 * packet counting does not break.
 	 */
 	TCP_SKB_CB(skb)->sacked |= TCP_SKB_CB(next_skb)->sacked & TCPCB_EVER_RETRANS;
+	TCP_SKB_CB(skb)->eor = TCP_SKB_CB(next_skb)->eor;

 	/* changed transmit queue under us so clear hints */
 	tcp_clear_retrans_hints_partial(tp);
@@ -2545,6 +2546,9 @@ static void tcp_retrans_try_collapse(struct sock *sk, struct sk_buff *to,
 		if (!tcp_can_collapse(sk, skb))
 			break;

+		if (TCP_SKB_CB(to)->eor)
+			break;
+
 		space -= skb->len;

 		if (first) {
This patch:
1. Prevent next_skb from coalescing to the prev_skb if
   TCP_SKB_CB(prev_skb)->eor is set
2. Update the TCP_SKB_CB(prev_skb)->eor if coalescing is
   allowed

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
---
 net/ipv4/tcp_input.c  | 4 ++++
 net/ipv4/tcp_output.c | 4 ++++
 2 files changed, 8 insertions(+)