[net-next] tcp: introduce TCPSpuriousRtxHostQueues SNMP counter

Message ID 1366303971.3205.62.camel@edumazet-glaptop
State Accepted, archived
Delegated to: David Miller

Commit Message

Eric Dumazet April 18, 2013, 4:52 p.m. UTC
From: Eric Dumazet <edumazet@google.com>

Host queues (Qdisc + NIC) can hold packets so long that TCP can
eventually retransmit a packet before the first transmit even left
the host.

It's not clear right now whether we could avoid this in the first place:

- We could arm RTO timer not at the time we enqueue packets, but
  at the time we TX complete them (tcp_wfree())

- Cancel the sending of the new copy of the packet if prior one
  is still in queue.

This patch adds instrumentation so that we can at least see how
often this problem happens.

TCPSpuriousRtxHostQueues SNMP counter is incremented every time
we detect the fast clone is not yet freed in tcp_transmit_skb()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Willem de Bruijn <willemb@google.com>
---
Google-Bug-Id: 8584703

 include/uapi/linux/snmp.h |    1 +
 net/ipv4/proc.c           |    1 +
 net/ipv4/tcp_output.c     |    7 +++++++
 3 files changed, 9 insertions(+)
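
Since the patch registers the new counter in snmp4_net_list, it should show
up among the "TcpExt:" name/value line pairs of /proc/net/netstat. The
following is only a minimal user-space sketch for reading it, not part of the
patch: the helper names netstat_lookup() and netstat_read_counter() are
hypothetical, and the parser assumes the file consists of well-formed
header/value line pairs.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up one counter in text laid out like /proc/net/netstat:
 *   "TcpExt: NameA NameB ...\n"
 *   "TcpExt: 1 2 ...\n"
 * Assumes lines come in header/value pairs (hypothetical helper).
 * Returns 0 and stores the value in *out, or -1 if the name is not found.
 */
int netstat_lookup(const char *text, const char *prefix,
		   const char *name, unsigned long long *out)
{
	size_t plen = strlen(prefix);
	size_t nlen = strlen(name);
	const char *hdr = text;

	while (*hdr) {
		const char *hdr_end = strchr(hdr, '\n');
		const char *val, *val_end;

		if (!hdr_end)
			break;
		val = hdr_end + 1;
		val_end = strchr(val, '\n');
		if (!val_end)
			val_end = val + strlen(val);

		if (!strncmp(hdr, prefix, plen) && !strncmp(val, prefix, plen)) {
			const char *h = hdr + plen, *v = val + plen;

			/* Walk the name line and the value line in lockstep. */
			for (;;) {
				const char *h_tok, *v_tok;

				while (h < hdr_end && *h == ' ')
					h++;
				while (v < val_end && *v == ' ')
					v++;
				if (h >= hdr_end || v >= val_end)
					break;
				h_tok = h;
				while (h < hdr_end && *h != ' ')
					h++;
				v_tok = v;
				while (v < val_end && *v != ' ')
					v++;
				if ((size_t)(h - h_tok) == nlen &&
				    !strncmp(h_tok, name, nlen)) {
					*out = strtoull(v_tok, NULL, 10);
					return 0;
				}
			}
		}
		/* Skip both lines of this pair. */
		hdr = *val_end ? val_end + 1 : val_end;
	}
	return -1;
}

/* Read /proc/net/netstat and look up a TcpExt counter by name. */
int netstat_read_counter(const char *name, unsigned long long *out)
{
	static char buf[65536];	/* assumed large enough for the whole file */
	FILE *f = fopen("/proc/net/netstat", "r");
	size_t n;

	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';
	return netstat_lookup(buf, "TcpExt:", name, out);
}
```

Calling netstat_read_counter("TCPSpuriousRtxHostQueues", &v) on a kernel
carrying this patch should fill v with the current count; on a kernel without
the counter it returns -1.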



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Stephen Hemminger April 18, 2013, 5:45 p.m. UTC | #1
On Thu, 18 Apr 2013 09:52:51 -0700
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> From: Eric Dumazet <edumazet@google.com>
> 
> Host queues (Qdisc + NIC) can hold packets so long that TCP can
> eventually retransmit a packet before the first transmit even left
> the host.

I thought you were using fq_codel ;-) and that wouldn't happen.

Eric Dumazet April 18, 2013, 6:06 p.m. UTC | #2
On Thu, 2013-04-18 at 10:45 -0700, Stephen Hemminger wrote:
> On Thu, 18 Apr 2013 09:52:51 -0700
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
> > From: Eric Dumazet <edumazet@google.com>
> > 
> > Host queues (Qdisc + NIC) can hold packets so long that TCP can
> > eventually retransmit a packet before the first transmit even left
> > the host.
> 
> I thought you were using fq_codel ;-) and that wouldn't happen.

Remember that fq_codel drops packets at dequeue time only ;)

So TCP stack could retransmit while prior packet is still in fq_codel
queue.
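
As a rough illustration of the timing involved (a sketch under stated
assumptions, not a model of fq_codel): treat the host queue as a plain FIFO
draining at a fixed rate, and compare the time a packet waits behind the
existing backlog with the RTO. The function names and parameters below are
hypothetical, and per-flow scheduling and CoDel drops are deliberately
ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Milliseconds a packet waits behind backlog_bytes in a FIFO that drains
 * at rate_bps (bits per second). Hypothetical simplification: constant
 * drain rate, no drops, no per-flow scheduling.
 */
uint64_t model_queue_delay_ms(uint64_t backlog_bytes, uint64_t rate_bps)
{
	if (rate_bps == 0)
		return UINT64_MAX;	/* a stalled queue never drains */
	return backlog_bytes * 8u * 1000u / rate_bps;
}

/* True if the RTO would expire while the first copy is still queued. */
bool model_rto_fires_in_queue(uint64_t backlog_bytes, uint64_t rate_bps,
			      uint64_t rto_ms)
{
	return model_queue_delay_ms(backlog_bytes, rate_bps) > rto_ms;
}
```

With 1,000,000 bytes of backlog on a 1 Mbit/s link the packet waits 8000 ms,
so a 1000 ms RTO expires long before the first copy leaves the host.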



Hagen Paul Pfeifer April 18, 2013, 6:19 p.m. UTC | #3
* Eric Dumazet | 2013-04-18 11:06:21 [-0700]:

>On Thu, 2013-04-18 at 10:45 -0700, Stephen Hemminger wrote:
>> On Thu, 18 Apr 2013 09:52:51 -0700
>> Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> 
>> > From: Eric Dumazet <edumazet@google.com>
>> > 
>> > Host queues (Qdisc + NIC) can hold packets so long that TCP can
>> > eventually retransmit a packet before the first transmit even left
>> > the host.

Just out of curiosity: do you see effects of commit 9ad7c049 (initRTO from
3secs to 1sec) and a long standing queue? (with no path metric, rtt in
particular)

Hagen
David Miller April 18, 2013, 6:57 p.m. UTC | #4
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 18 Apr 2013 09:52:51 -0700

> From: Eric Dumazet <edumazet@google.com>
> 
> Host queues (Qdisc + NIC) can hold packets so long that TCP can
> eventually retransmit a packet before the first transmit even left
> the host.
> 
> It's not clear right now whether we could avoid this in the first place:
> 
> - We could arm RTO timer not at the time we enqueue packets, but
>   at the time we TX complete them (tcp_wfree())
> 
> - Cancel the sending of the new copy of the packet if prior one
>   is still in queue.
> 
> This patch adds instrumentation so that we can at least see how
> often this problem happens.
> 
> TCPSpuriousRtxHostQueues SNMP counter is incremented every time
> we detect the fast clone is not yet freed in tcp_transmit_skb()
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied, thanks.
Eric Dumazet April 18, 2013, 7:29 p.m. UTC | #5
On Thu, 2013-04-18 at 20:19 +0200, Hagen Paul Pfeifer wrote:
> * Eric Dumazet | 2013-04-18 11:06:21 [-0700]:
> 
> >On Thu, 2013-04-18 at 10:45 -0700, Stephen Hemminger wrote:
> >> On Thu, 18 Apr 2013 09:52:51 -0700
> >> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >> 
> >> > From: Eric Dumazet <edumazet@google.com>
> >> > 
> >> > Host queues (Qdisc + NIC) can hold packets so long that TCP can
> >> > eventually retransmit a packet before the first transmit even left
> >> > the host.
> 
> Just out of curiosity: do you see effects of commit 9ad7c049 (initRTO from
> 3secs to 1sec) and a long standing queue? (with no path metric, rtt in
> particular)
> 

I have no particular data for the initRTO change.

Interestingly, we send a SYN-ACK for every SYN we receive.
So if a client sends 4 SYNs, we are going to send 4 SYN-ACKs, regardless of
the time the last SYN-ACK was sent...

Here is an interesting case study.

12:02:43.484175 IP 7.7.7.83.49489 > 7.7.7.84.51407: Flags [S], seq 3414870221, win 14600, options [mss 1460,sackOK,TS val 4294775698 ecr 0,nop,wscale 6], length 0
12:02:43.484201 IP 7.7.7.84.51407 > 7.7.7.83.49489: Flags [S.], seq 3173332093, ack 3414870222, win 14480, options [mss 1460,sackOK,TS val 14218683 ecr 4294775698,nop,wscale 6], length 0
12:02:44.884382 IP 7.7.7.84.51407 > 7.7.7.83.49489: Flags [S.], seq 3173332093, ack 3414870222, win 14480, options [mss 1460,sackOK,TS val 14220084 ecr 4294775698,nop,wscale 6], length 0
12:02:45.470224 IP 7.7.7.83.49489 > 7.7.7.84.51407: Flags [S], seq 3414870221, win 14600, options [mss 1460,sackOK,TS val 4294776700 ecr 0,nop,wscale 6], length 0
12:02:45.470252 IP 7.7.7.84.51407 > 7.7.7.83.49489: Flags [S.], seq 3173332093, ack 3414870222, win 14480, options [mss 1460,sackOK,TS val 14220669 ecr 4294775698,nop,wscale 6], length 0
12:02:46.762462 IP 7.7.7.83.49489 > 7.7.7.84.51407: Flags [.], ack 1, win 229, options [nop,nop,TS val 4294777094 ecr 14218683], length 0
12:02:46.762989 IP 7.7.7.83.49489 > 7.7.7.84.51407: Flags [.], seq 1:1449, ack 1, win 229, options [nop,nop,TS val 4294777094 ecr 14218683], length 1448
12:02:46.763019 IP 7.7.7.84.51407 > 7.7.7.83.49489: Flags [.], ack 1449, win 272, options [nop,nop,TS val 14221962 ecr 4294777094], length 0
12:02:46.775104 IP 7.7.7.83.49489 > 7.7.7.84.51407: Flags [.], seq 1449:2897, ack 1, win 229, options [nop,nop,TS val 4294777094 ecr 14218683], length 1448
12:02:46.775138 IP 7.7.7.84.51407 > 7.7.7.83.49489: Flags [.], ack 2897, win 317, options [nop,nop,TS val 14221974 ecr 4294777094], length 0
12:02:46.787215 IP 7.7.7.83.49489 > 7.7.7.84.51407: Flags [.], seq 2897:4345, ack 1, win 229, options [nop,nop,TS val 4294777094 ecr 14218683], length 1448
12:02:46.787244 IP 7.7.7.84.51407 > 7.7.7.83.49489: Flags [.], ack 4345, win 362, options [nop,nop,TS val 14221986 ecr 4294777094], length 0
12:02:46.799326 IP 7.7.7.83.49489 > 7.7.7.84.51407: Flags [.], seq 4345:5793, ack 1, win 229, options [nop,nop,TS val 4294777094 ecr 14218683], length 1448
12:02:46.799357 IP 7.7.7.84.51407 > 7.7.7.83.49489: Flags [.], ack 5793, win 408, options [nop,nop,TS val 14221998 ecr 4294777094], length 0
12:02:46.811438 IP 7.7.7.83.49489 > 7.7.7.84.51407: Flags [.], seq 5793:7241, ack 1, win 229, options [nop,nop,TS val 4294777094 ecr 14218683], length 1448
12:02:46.811465 IP 7.7.7.84.51407 > 7.7.7.83.49489: Flags [.], ack 7241, win 453, options [nop,nop,TS val 14222010 ecr 4294777094], length 0
12:02:46.823549 IP 7.7.7.83.49489 > 7.7.7.84.51407: Flags [P.], seq 7241:8689, ack 1, win 229, options [nop,nop,TS val 4294777094 ecr 14218683], length 1448
12:02:46.823575 IP 7.7.7.84.51407 > 7.7.7.83.49489: Flags [.], ack 8689, win 498, options [nop,nop,TS val 14222023 ecr 4294777094], length 0
12:02:46.835662 IP 7.7.7.83.49489 > 7.7.7.84.51407: Flags [.], seq 8689:10137, ack 1, win 229, options [nop,nop,TS val 4294777094 ecr 14218683], length 1448
12:02:46.835694 IP 7.7.7.84.51407 > 7.7.7.83.49489: Flags [.], ack 10137, win 543, options [nop,nop,TS val 14222035 ecr 4294777094], length 0
12:02:46.847775 IP 7.7.7.83.49489 > 7.7.7.84.51407: Flags [.], seq 10137:11585, ack 1, win 229, options [nop,nop,TS val 4294777094 ecr 14218683], length 1448
12:02:46.847801 IP 7.7.7.84.51407 > 7.7.7.83.49489: Flags [.], ack 11585, win 589, options [nop,nop,TS val 14222047 ecr 4294777094], length 0
12:02:46.859889 IP 7.7.7.83.49489 > 7.7.7.84.51407: Flags [.], seq 11585:13033, ack 1, win 229, options [nop,nop,TS val 4294777094 ecr 14218683], length 1448
12:02:46.859916 IP 7.7.7.84.51407 > 7.7.7.83.49489: Flags [.], ack 13033, win 634, options [nop,nop,TS val 14222059 ecr 4294777094], length 0
12:02:46.871998 IP 7.7.7.83.49489 > 7.7.7.84.51407: Flags [.], seq 13033:14481, ack 1, win 229, options [nop,nop,TS val 4294777094 ecr 14218683], length 1448
12:02:46.872025 IP 7.7.7.84.51407 > 7.7.7.83.49489: Flags [.], ack 14481, win 660, options [nop,nop,TS val 14222071 ecr 4294777094], length 0
12:02:49.362204 IP 7.7.7.83.49489 > 7.7.7.84.51407: Flags [.], ack 1, win 229, options [nop,nop,TS val 4294778499 ecr 14218683], length 0
12:02:50.415265 IP 7.7.7.83.49489 > 7.7.7.84.51407: Flags [.], ack 1, win 229, options [nop,nop,TS val 4294779084 ecr 14218683], length 0

We can see the last two packets in this trace being flagged as
TCPSYNChallenge/TCPChallengeACK, because we react to the two extra
SYN-ACK we receive, after queueing the first 10 packets of data.



Yuchung Cheng April 18, 2013, 7:54 p.m. UTC | #6
On Thu, Apr 18, 2013 at 12:29 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2013-04-18 at 20:19 +0200, Hagen Paul Pfeifer wrote:
>> * Eric Dumazet | 2013-04-18 11:06:21 [-0700]:
>>
>> >On Thu, 2013-04-18 at 10:45 -0700, Stephen Hemminger wrote:
>> >> On Thu, 18 Apr 2013 09:52:51 -0700
>> >> Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> >>
>> >> > From: Eric Dumazet <edumazet@google.com>
>> >> >
>> >> > Host queues (Qdisc + NIC) can hold packets so long that TCP can
>> >> > eventually retransmit a packet before the first transmit even left
>> >> > the host.
>>
>> Just out of curiosity: do you see effects of commit 9ad7c049 (initRTO from
>> 3secs to 1sec) and a long standing queue? (with no path metric, rtt in
>> particular)
>>
>
> I have no particular data for the initRTO change.
>
> Interesting thing is that we send a SYN-ACK for every SYN we receive.
> So if a client sends 4 SYN, we are going to send 4 SYN-ACK, regardless of
> the time of last sent SYN-ACK...
I am testing a patch to mitigate this exact issue now. Will post soon.

Hagen Paul Pfeifer April 18, 2013, 8:26 p.m. UTC | #7
* Eric Dumazet | 2013-04-18 12:29:38 [-0700]:

>I have no particular data for the initRTO change.
>
>Interesting thing is that we send a SYN-ACK for every SYN we receive.
>So if a client sends 4 SYN, we are going to send 4 SYN-ACK, regardless of
>the time of last sent SYN-ACK...

I see. One question: should that not be addressed on the sender side? The trace
smells like a datacenter setup with a low RTT and a small initial RTO, where the
last-time-syn-ack-timestamp guard makes sense. Hopefully Yuchung's patch keeps
networks with a bandwidth of only a few kbit/s and an RTT > 1 second in mind.

Patch

diff --git a/include/uapi/linux/snmp.h b/include/uapi/linux/snmp.h
index e00013a..fefdec91 100644
--- a/include/uapi/linux/snmp.h
+++ b/include/uapi/linux/snmp.h
@@ -247,6 +247,7 @@  enum
 	LINUX_MIB_TCPFASTOPENPASSIVEFAIL,	/* TCPFastOpenPassiveFail */
 	LINUX_MIB_TCPFASTOPENLISTENOVERFLOW,	/* TCPFastOpenListenOverflow */
 	LINUX_MIB_TCPFASTOPENCOOKIEREQD,	/* TCPFastOpenCookieReqd */
+	LINUX_MIB_TCPSPURIOUS_RTX_HOSTQUEUES, /* TCPSpuriousRtxHostQueues */
 	__LINUX_MIB_MAX
 };
 
diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index b6f2ea1..6da51d5 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -269,6 +269,7 @@  static const struct snmp_mib snmp4_net_list[] = {
 	SNMP_MIB_ITEM("TCPFastOpenPassiveFail", LINUX_MIB_TCPFASTOPENPASSIVEFAIL),
 	SNMP_MIB_ITEM("TCPFastOpenListenOverflow", LINUX_MIB_TCPFASTOPENLISTENOVERFLOW),
 	SNMP_MIB_ITEM("TCPFastOpenCookieReqd", LINUX_MIB_TCPFASTOPENCOOKIEREQD),
+	SNMP_MIB_ITEM("TCPSpuriousRtxHostQueues", LINUX_MIB_TCPSPURIOUS_RTX_HOSTQUEUES),
 	SNMP_MIB_SENTINEL
 };
 
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index d126943..1dc9ccc 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -846,6 +846,13 @@  static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
 		__net_timestamp(skb);
 
 	if (likely(clone_it)) {
+		const struct sk_buff *fclone = skb + 1;
+
+		if (unlikely(skb->fclone == SKB_FCLONE_ORIG &&
+			     fclone->fclone == SKB_FCLONE_CLONE))
+			NET_INC_STATS_BH(sock_net(sk),
+					 LINUX_MIB_TCPSPURIOUS_RTX_HOSTQUEUES);
+
 		if (unlikely(skb_cloned(skb)))
 			skb = pskb_copy(skb, gfp_mask);
 		else
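
The check added to tcp_transmit_skb() above relies on fast-clone skbs being
allocated as adjacent pairs, so that the companion buffer is reachable as
skb + 1 and its fclone field tells whether a previously handed-out clone is
still alive. The following user-space sketch models that state machine only;
it is not kernel code, all model_* names are hypothetical, and it simplifies
by returning NULL where the kernel would fall back to another allocation or
to pskb_copy().

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the three fclone states (hypothetical names). */
enum model_fclone_state {
	MODEL_FCLONE_UNAVAILABLE,	/* companion slot is free */
	MODEL_FCLONE_ORIG,		/* first slot of a fast-clone pair */
	MODEL_FCLONE_CLONE,		/* companion is handed out as a clone */
};

struct model_skb {
	enum model_fclone_state fclone;
};

/* Two adjacent buffers, so the companion of skb[0] is skb[0] + 1. */
struct model_fclone_pair {
	struct model_skb skb[2];
};

void model_pair_init(struct model_fclone_pair *p)
{
	p->skb[0].fclone = MODEL_FCLONE_ORIG;
	p->skb[1].fclone = MODEL_FCLONE_UNAVAILABLE;
}

/* Hand out the companion slot as a clone; NULL if it is still in use. */
struct model_skb *model_clone(struct model_skb *orig)
{
	struct model_skb *companion = orig + 1;

	if (orig->fclone != MODEL_FCLONE_ORIG ||
	    companion->fclone != MODEL_FCLONE_UNAVAILABLE)
		return NULL;
	companion->fclone = MODEL_FCLONE_CLONE;
	return companion;
}

/* Called when the lower layers (qdisc, NIC) are done with the clone. */
void model_free_clone(struct model_skb *clone)
{
	clone->fclone = MODEL_FCLONE_UNAVAILABLE;
}

/*
 * Same condition as the patch: the original belongs to a fast-clone pair
 * and its companion has not been freed yet, i.e. the previous transmit
 * is presumably still sitting in a host queue.
 */
bool model_prior_copy_in_flight(const struct model_skb *orig)
{
	const struct model_skb *companion = orig + 1;

	return orig->fclone == MODEL_FCLONE_ORIG &&
	       companion->fclone == MODEL_FCLONE_CLONE;
}
```

In this model, a retransmit attempted between model_clone() and
model_free_clone() makes model_prior_copy_in_flight() return true, which
corresponds to the case where the patch increments the new counter.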