Message ID | 20170320192714.GB23552@localhost.localdomain |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
On Mon, 2017-03-20 at 16:27 -0300, Marcelo Ricardo Leitner wrote:
> This warning is a hint, and can not assume senders are not dumb.
>
> Agreed. But we can make it consider such cases. What about the following
> patch? (untested)
>
> I think we can directly account for the size of the timestamps in there,
> as that won't make a difference to congestion control in case it's
> wrong, and also validate against MTU if we have it. I didn't subtract
> the headers from MTU on purpose, as dealing with ipv4/ipv6 there is
> not worth for the same reason.
>
> This should silent this false-positive.

Note that the problem could have its origin on a middle box,
not on the host terminating the TCP flow.

So we can try hard, but we can't eliminate false positives.

Maybe replace the 12 by MAX_TCP_OPTION_SPACE ?

On Wed, Mar 22, 2017 at 06:47:42AM -0700, Eric Dumazet wrote:
> On Mon, 2017-03-20 at 16:27 -0300, Marcelo Ricardo Leitner wrote:
> > This warning is a hint, and can not assume senders are not dumb.
> >
> > Agreed. But we can make it consider such cases. What about the following
> > patch? (untested)
> >
> > I think we can directly account for the size of the timestamps in there,
> > as that won't make a difference to congestion control in case it's
> > wrong, and also validate against MTU if we have it. I didn't subtract
> > the headers from MTU on purpose, as dealing with ipv4/ipv6 there is
> > not worth for the same reason.
> >
> > This should silent this false-positive.
>
>
> Note that the problem could have its origin on a middle box,
> not on the host terminating the TCP flow.
>
> So we can try hard, but we can't eliminate false positives.

Agreed both.

>
> Maybe replace the 12 by MAX_TCP_OPTION_SPACE ?

Yes, can be. Thanks.

Marcelo

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 96b67a8b18c3..96a99446ddce 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -126,7 +126,8 @@ int sysctl_tcp_invalid_ratelimit __read_mostly = HZ/2;
 #define REXMIT_LOST	1 /* retransmit packets marked lost */
 #define REXMIT_NEW	2 /* FRTO-style transmit of unsent/new packets */
 
-static void tcp_gro_dev_warn(struct sock *sk, const struct sk_buff *skb)
+static void tcp_gro_dev_warn(struct sock *sk, const struct sk_buff *skb,
+			     unsigned int len)
 {
 	static bool __once __read_mostly;
 
@@ -137,8 +138,9 @@ static void tcp_gro_dev_warn(struct sock *sk, const struct sk_buff *skb)
 
 		rcu_read_lock();
 		dev = dev_get_by_index_rcu(sock_net(sk), skb->skb_iif);
-		pr_warn("%s: Driver has suspect GRO implementation, TCP performance may be compromised.\n",
-			dev ? dev->name : "Unknown driver");
+		if (!dev || len >= dev->mtu)
+			pr_warn("%s: Driver has suspect GRO implementation, TCP performance may be compromised.\n",
+				dev ? dev->name : "Unknown driver");
 		rcu_read_unlock();
 	}
 }
@@ -161,8 +163,9 @@ static void tcp_measure_rcv_mss(struct sock *sk, const struct sk_buff *skb)
 	if (len >= icsk->icsk_ack.rcv_mss) {
 		icsk->icsk_ack.rcv_mss = min_t(unsigned int, len,
 					       tcp_sk(sk)->advmss);
-		if (unlikely(icsk->icsk_ack.rcv_mss != len))
-			tcp_gro_dev_warn(sk, skb);
+		/* The + 12 accounts for the possible lack of timestamps */
+		if (unlikely(icsk->icsk_ack.rcv_mss + 12 < len))
+			tcp_gro_dev_warn(sk, skb, len);
 	} else {
 		/* Otherwise, we make more careful check taking into account,
 		 * that SACKs block is variable.
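
For readers who want to play with the heuristic outside the kernel, the two conditions in the patch can be modelled in plain userspace C. This is only a rough sketch, not kernel code: the function name `gro_warn_model`, its parameters, and the two slack macros are hypothetical names chosen for illustration. The value 12 is the slack used in the patch, and 40 is assumed to match MAX_TCP_OPTION_SPACE, the alternative suggested in the thread. An MTU of 0 stands in for the "device not found" case.

```c
#include <stdbool.h>

/* Assumed values for illustration only:
 * 12 is the slack used in the patch (aligned TCP timestamp option size),
 * 40 is the maximum TCP option space (MAX_TCP_OPTION_SPACE in the kernel).
 */
#define TS_OPTION_SLACK		12u
#define MAX_OPTION_SLACK	40u

/* Hypothetical userspace model of the patched check.
 * len:    payload length of the received segment
 * advmss: MSS advertised to the peer
 * mtu:    MTU of the receiving device, or 0 if the device is unknown
 * slack:  bytes tolerated above the clamped rcv_mss before warning
 */
static bool gro_warn_model(unsigned int len, unsigned int advmss,
			   unsigned int mtu, unsigned int slack)
{
	/* Same clamp as min_t(unsigned int, len, advmss) in the patch. */
	unsigned int rcv_mss = len < advmss ? len : advmss;

	/* Within the slack: for example, the peer sent no timestamps. */
	if (rcv_mss + slack >= len)
		return false;

	/* Mirrors "!dev || len >= dev->mtu" from the patch. */
	return mtu == 0 || len >= mtu;
}
```

With an advertised MSS of 1448 and a 1500-byte MTU, `gro_warn_model(1460, 1448, 1500, TS_OPTION_SLACK)` stays quiet, while a 2896-byte segment (two coalesced MSS-sized payloads) trips the warning. Passing `MAX_OPTION_SLACK` instead widens the tolerated gap, which trades a few missed hints for fewer false positives.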