Recurring trace from tcp_fragment()

Message ID 20150604205629.GB2951343@devbig242.prn2.facebook.com
State RFC, archived
Delegated to: David Miller

Commit Message

Martin KaFai Lau June 4, 2015, 8:56 p.m. UTC
On Thu, Jun 04, 2015 at 01:10:26PM -0700, Grant Zhang wrote:
> Hi Martin,
> 
> Thank you! My net.ipv4.tcp_mtu_probing is 1. After turning it off,
> the WARN_ON stack is gone.
Thanks for confirming it.

> Could you elaborate a bit on why this setting relates to the WARN_ON
> trace? 
The WARN_ON is complaining that tcp_fragment() is trying to slice
an skb whose skb->len is too short.

When doing MTU probing, tcp_mtu_probe() may slice the skb.  In some cases
(which I also failed to reproduce in packetdrill), it does not
update some related skb values, which then confuses tcp_fragment()
later on.

> And what are the pros/cons for disabling mtu_probing?
It depends on your traffic, I guess.  However, turning it off is not the
right fix.

FYI, here is the change I am trying.

Thanks,
--Martin

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Yuchung Cheng June 4, 2015, 9:29 p.m. UTC | #1
On Thu, Jun 4, 2015 at 1:56 PM, Martin KaFai Lau <kafai@fb.com> wrote:
>
> On Thu, Jun 04, 2015 at 01:10:26PM -0700, Grant Zhang wrote:
> > Hi Martin,
> >
> > Thank you! My net.ipv4.tcp_mtu_probing is 1. After turning it off,
> > the WARN_ON stack is gone.
> Thanks for confirming it.
>
> > Could you elaborate a bit on why this setting relates to the WARN_ON
> > trace?
> The WARN_ON is complaining that tcp_fragment() is trying to slice
> an skb whose skb->len is too short.
>
> When doing MTU probing, tcp_mtu_probe() may slice the skb.  In some cases
> (which I also failed to reproduce in packetdrill), it does not
> update some related skb values, which then confuses tcp_fragment()
> later on.
>
> > And what are the pros/cons for disabling mtu_probing?
> It depends on your traffic, I guess.  However, turning it off is not the
> right fix.
There might be two bugs. We saw the warning with mtu_probing=0. This
would explain why Neal's fix did not work.


>
> FYI, here is the change I am trying.
>
> Thanks,
> --Martin
>
> diff --git i/net/ipv4/tcp_output.c w/net/ipv4/tcp_output.c
> index acec745..e767e53 100644
> --- i/net/ipv4/tcp_output.c
> +++ w/net/ipv4/tcp_output.c
> @@ -1920,6 +1920,8 @@ static int tcp_mtu_probe(struct sock *sk)
>                                                    ~(TCPHDR_FIN|TCPHDR_PSH);
>                         if (!skb_shinfo(skb)->nr_frags) {
>                                 skb_pull(skb, copy);
> +                               if (tcp_skb_pcount(skb) > 1)
> +                                       tcp_set_skb_tso_segs(sk, skb, mss_now);
>                                 if (skb->ip_summed != CHECKSUM_PARTIAL)
>                                         skb->csum = csum_partial(skb->data,
>                                                                  skb->len, 0);

Patch

diff --git i/net/ipv4/tcp_output.c w/net/ipv4/tcp_output.c
index acec745..e767e53 100644
--- i/net/ipv4/tcp_output.c
+++ w/net/ipv4/tcp_output.c
@@ -1920,6 +1920,8 @@  static int tcp_mtu_probe(struct sock *sk)
 						   ~(TCPHDR_FIN|TCPHDR_PSH);
 			if (!skb_shinfo(skb)->nr_frags) {
 				skb_pull(skb, copy);
+				if (tcp_skb_pcount(skb) > 1)
+					tcp_set_skb_tso_segs(sk, skb, mss_now);
 				if (skb->ip_summed != CHECKSUM_PARTIAL)
 					skb->csum = csum_partial(skb->data,
 								 skb->len, 0);
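
For readers unfamiliar with the code path, the following is a rough userspace sketch of the suspected inconsistency, not kernel code. Everything named toy_* is hypothetical and made up for illustration: struct toy_skb stands in for the two relevant pieces of skb state (skb->len and the cached segment count read by tcp_skb_pcount()), toy_pull() imitates skb_pull() shrinking the length, and toy_pull_fixed() imitates the two lines the patch adds. The segment count here is assumed to be a simple round-up of len / mss; the real tcp_set_skb_tso_segs() handles additional cases.

```c
#include <assert.h>

/* Hypothetical stand-in for the skb fields that matter in this thread. */
struct toy_skb {
	unsigned int len;	/* payload bytes, like skb->len */
	unsigned int pcount;	/* cached segment count, like tcp_skb_pcount(skb) */
};

/* Segments needed to carry len bytes at a given MSS (round up). */
static unsigned int toy_segs(unsigned int len, unsigned int mss)
{
	return (len + mss - 1) / mss;
}

/* Drop copy bytes from the front, like skb_pull(); pcount is not touched. */
static void toy_pull(struct toy_skb *skb, unsigned int copy)
{
	assert(copy <= skb->len);
	skb->len -= copy;
}

/* Same pull, followed by the recount that the patch adds. */
static void toy_pull_fixed(struct toy_skb *skb, unsigned int copy,
			   unsigned int mss)
{
	toy_pull(skb, copy);
	if (skb->pcount > 1)
		skb->pcount = toy_segs(skb->len, mss);
}
```

With an MSS of 1000, a 3000-byte toy_skb starts with pcount 3. After toy_pull() removes 1500 bytes, len is 1500 but pcount still reads 3, which is the kind of stale value that could later confuse a function like tcp_fragment(). After toy_pull_fixed(), pcount is recomputed to 2 and matches the remaining length.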