[net] tcp: refresh skb timestamp at retransmit time

Message ID: 1462852516.23934.46.camel@edumazet-glaptop3.roam.corp.google.com
State: Accepted, archived
Delegated to: David Miller

Commit Message

Eric Dumazet May 10, 2016, 3:55 a.m. UTC
From: Eric Dumazet <edumazet@google.com>

In the very unlikely case __tcp_retransmit_skb() can not use the cloning
done in tcp_transmit_skb(), we need to refresh skb_mstamp before doing
the copy and transmit, otherwise TCP TS val will be an exact copy of
original transmit.

Fixes: 7faee5c0d514 ("tcp: remove TCP_SKB_CB(skb)->when")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
---
 net/ipv4/tcp_output.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
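
The ordering issue described in the commit message can be sketched in user-space C. This is a toy model, not kernel code: `struct toy_skb`, `toy_copy()`, `toy_tsval()` and both `retransmit_*` helpers are hypothetical names, and the sketch assumes that the TS val of a copied skb is taken directly from the timestamp it inherited at copy time.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct sk_buff; only the timestamp is modelled. */
struct toy_skb {
	uint32_t mstamp;	/* plays the role of skb->skb_mstamp */
};

/* Models __pskb_copy(): the copy inherits the source timestamp verbatim. */
static struct toy_skb toy_copy(const struct toy_skb *src)
{
	return *src;
}

/* Models the TS val that would be written into the TCP timestamp option. */
static uint32_t toy_tsval(const struct toy_skb *skb)
{
	return skb->mstamp;
}

/* Copy path before the patch: the stamp is never refreshed. */
static uint32_t retransmit_stale(struct toy_skb *skb, uint32_t now)
{
	struct toy_skb nskb;

	(void)now;		/* the current time is ignored */
	nskb = toy_copy(skb);
	return toy_tsval(&nskb);
}

/* Copy path after the patch: refresh the original first, then copy. */
static uint32_t retransmit_fresh(struct toy_skb *skb, uint32_t now)
{
	struct toy_skb nskb;

	skb->mstamp = now;	/* models skb_mstamp_get(&skb->skb_mstamp) */
	nskb = toy_copy(skb);
	return toy_tsval(&nskb);
}
```

With an skb first stamped at 100 and a retransmit at time 250, the stale variant would still report 100 while the refreshed variant reports 250, and the original skb also carries the new stamp afterwards.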

Comments

Yuchung Cheng May 10, 2016, 3:01 p.m. UTC | #1
On Mon, May 9, 2016 at 8:55 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> In the very unlikely case __tcp_retransmit_skb() can not use the cloning
> done in tcp_transmit_skb(), we need to refresh skb_mstamp before doing
> the copy and transmit, otherwise TCP TS val will be an exact copy of
> original transmit.
>
> Fixes: 7faee5c0d514 ("tcp: remove TCP_SKB_CB(skb)->when")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>

Nice catch, Eric. Recovery algorithms like RACK definitely require this
patch because they rely on skb mstamps.
Does the failure usually occur under memory stress?

> ---
>  net/ipv4/tcp_output.c |    6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 441ae9da3a23..79a03b87a771 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -2640,8 +2640,10 @@ int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
>          */
>         if (unlikely((NET_IP_ALIGN && ((unsigned long)skb->data & 3)) ||
>                      skb_headroom(skb) >= 0xFFFF)) {
> -               struct sk_buff *nskb = __pskb_copy(skb, MAX_TCP_HEADER,
> -                                                  GFP_ATOMIC);
> +               struct sk_buff *nskb;
> +
> +               skb_mstamp_get(&skb->skb_mstamp);
> +               nskb = __pskb_copy(skb, MAX_TCP_HEADER, GFP_ATOMIC);
>                 err = nskb ? tcp_transmit_skb(sk, nskb, 0, GFP_ATOMIC) :
>                              -ENOBUFS;
>         } else {
>
>

Eric Dumazet May 10, 2016, 3:51 p.m. UTC | #2
On Tue, 2016-05-10 at 08:01 -0700, Yuchung Cheng wrote:
> On Mon, May 9, 2016 at 8:55 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > From: Eric Dumazet <edumazet@google.com>
> >
> > In the very unlikely case __tcp_retransmit_skb() can not use the cloning
> > done in tcp_transmit_skb(), we need to refresh skb_mstamp before doing
> > the copy and transmit, otherwise TCP TS val will be an exact copy of
> > original transmit.
> >
> > Fixes: 7faee5c0d514 ("tcp: remove TCP_SKB_CB(skb)->when")
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > Cc: Yuchung Cheng <ycheng@google.com>
> Acked-by: Yuchung Cheng <ycheng@google.com>
> 
> Nice catch, Eric. Recovery algorithms like RACK definitely require this
> patch because they rely on skb mstamps.
> Does the failure usually occur under memory stress?

On x86, NET_IP_ALIGN is 0, so the only 'problem' would happen
for devices with a big MTU but no SG (scatter-gather) support.

In the normal case, we allocate small skb->head skbs
(SKB_WITH_OVERHEAD(2048 - MAX_TCP_HEADER) in select_size()).

So this bug should not happen for most devices.

RACK will be better, but I was also wondering whether PAWS checks on the
receiver could drop all subsequent retransmits, leaving us with a stalled
TCP connection. That would be a more serious bug.
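
To make the PAWS concern concrete, here is a minimal sketch of the timestamp comparison in the style of RFC 7323. It is a simplification and not the Linux implementation: `ts_before()` and `paws_reject()` are hypothetical names, and real receivers apply further conditions (idle-time expiry of ts_recent, rules for when ts_recent is updated) that are left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe comparison of 32-bit timestamps: true if a is older than b. */
static bool ts_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/*
 * Simplified PAWS test: reject a segment whose TS val is older than
 * the most recent TS val the receiver has recorded (ts_recent).
 */
static bool paws_reject(uint32_t ts_recent, uint32_t seg_tsval)
{
	return ts_before(seg_tsval, ts_recent);
}
```

Under this simplified test, a retransmit that reuses the original TS val of 900 is rejected once the receiver has recorded a ts_recent of 1000, whereas a retransmit carrying a refreshed TS val of 1000 or later passes.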


For arches with NET_IP_ALIGN==2, the bug would be possible if the
receiver is playing games by partially acking the packets we send.
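
The condition guarding the copy path can be restated as a standalone predicate, which makes the two cases above easier to check. This is only a sketch: `needs_copy()` is a hypothetical helper, NET_IP_ALIGN is passed as a parameter rather than being a compile-time constant, and the data pointer and headroom are plain integers instead of being read from an skb. My reading of the partial-ACK remark is that trimming the head can leave skb->data on an address that is not 4-byte aligned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Mirrors the test in __tcp_retransmit_skb():
 *   (NET_IP_ALIGN && ((unsigned long)skb->data & 3)) ||
 *   skb_headroom(skb) >= 0xFFFF
 */
static bool needs_copy(int net_ip_align, uintptr_t data_addr, size_t headroom)
{
	/* Misalignment only matters on arches where NET_IP_ALIGN is non-zero. */
	if (net_ip_align && (data_addr & 3))
		return true;

	/* Very large headroom forces a copy regardless of alignment. */
	return headroom >= 0xFFFF;
}
```

With `net_ip_align` set to 0, as on x86, a misaligned address such as 0x1002 does not trigger the copy and only the headroom test can. With `net_ip_align` set to 2, the same address does.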

David Miller May 10, 2016, 7:59 p.m. UTC | #3
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 09 May 2016 20:55:16 -0700

> From: Eric Dumazet <edumazet@google.com>
> 
> In the very unlikely case __tcp_retransmit_skb() can not use the cloning
> done in tcp_transmit_skb(), we need to refresh skb_mstamp before doing
> the copy and transmit, otherwise TCP TS val will be an exact copy of
> original transmit.
> 
> Fixes: 7faee5c0d514 ("tcp: remove TCP_SKB_CB(skb)->when")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Yuchung Cheng <ycheng@google.com>

Applied and queued up for -stable, thanks.

Patch

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 441ae9da3a23..79a03b87a771 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2640,8 +2640,10 @@  int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
 	 */
 	if (unlikely((NET_IP_ALIGN && ((unsigned long)skb->data & 3)) ||
 		     skb_headroom(skb) >= 0xFFFF)) {
-		struct sk_buff *nskb = __pskb_copy(skb, MAX_TCP_HEADER,
-						   GFP_ATOMIC);
+		struct sk_buff *nskb;
+
+		skb_mstamp_get(&skb->skb_mstamp);
+		nskb = __pskb_copy(skb, MAX_TCP_HEADER, GFP_ATOMIC);
 		err = nskb ? tcp_transmit_skb(sk, nskb, 0, GFP_ATOMIC) :
 			     -ENOBUFS;
 	} else {