[net] tcp: do not mangle skb->cb[] in tcp_make_synack()

Message ID 1509651025.2849.23.camel@edumazet-glaptop3.roam.corp.google.com
State Accepted, archived
Delegated to: David Miller

Commit Message

Eric Dumazet Nov. 2, 2017, 7:30 p.m. UTC
From: Eric Dumazet <edumazet@google.com>

Christoph Paasch sent a patch to address the following issue :

tcp_make_synack() leaves some TCP private info in skb->cb[],
then sends the packet by means other than tcp_transmit_skb().

tcp_transmit_skb() makes sure to clear skb->cb[] so as not to confuse
the IPv4/IPv6 stacks, but we have no such cleanup for SYNACK.

tcp_make_synack() should not use tcp_init_nondata_skb() :

tcp_init_nondata_skb() really should be limited to skbs put in write/rtx
queues (the ones that are only sent via tcp_transmit_skb())

This patch fixes the issue and should even save a few CPU cycles ;)

Fixes: 971f10eca186 ("tcp: better TCP_SKB_CB layout to reduce cache line misses")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Christoph Paasch <cpaasch@apple.com>
---
 net/ipv4/tcp_output.c |    9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)
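
Illustration (not part of the patch): a minimal userspace sketch of why
stale TCP data in skb->cb[] can confuse the IP layer. It assumes a
simplified model in which both layers overlay their own struct on the same
scratch bytes. Every mock_* name and both struct layouts are hypothetical
stand-ins; they are not the kernel's struct tcp_skb_cb or
struct inet_skb_parm.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MOCK_CB_SIZE 48 /* assumed size of the per-packet scratch area */

/* Hypothetical stand-in for struct sk_buff: only the scratch area. */
struct mock_skb {
	unsigned char cb[MOCK_CB_SIZE];
};

/* Hypothetical TCP-layer view of cb[]. */
struct mock_tcp_cb {
	uint32_t seq;
	uint32_t end_seq;
	uint8_t  tcp_flags;
};

/* Hypothetical IP-layer view of the same bytes. */
struct mock_ip_cb {
	int32_t  iif;
	uint16_t flags;
};

/* Models a helper that stores TCP private state in cb[]. */
static void mock_tcp_init_cb(struct mock_skb *skb, uint32_t isn, uint8_t flags)
{
	struct mock_tcp_cb tcb = {
		.seq = isn, .end_seq = isn + 1, .tcp_flags = flags,
	};

	memcpy(skb->cb, &tcb, sizeof(tcb));
}

/* Models the IP layer interpreting cb[] with its own layout. */
static struct mock_ip_cb mock_ip_read_cb(const struct mock_skb *skb)
{
	struct mock_ip_cb icb;

	memcpy(&icb, skb->cb, sizeof(icb));
	return icb;
}

/* Old behaviour: TCP state is written to cb[] and never cleared. */
static int32_t mock_synack_old(uint32_t isn)
{
	struct mock_skb skb;

	memset(&skb, 0, sizeof(skb));
	mock_tcp_init_cb(&skb, isn, 0x12 /* SYN|ACK */);
	return mock_ip_read_cb(&skb).iif; /* sees the sequence number */
}

/* New behaviour: the sequence number is used directly, cb[] stays zeroed. */
static int32_t mock_synack_new(uint32_t isn)
{
	struct mock_skb skb;
	uint32_t hdr_seq = isn; /* would go straight into the TCP header */

	memset(&skb, 0, sizeof(skb));
	(void)hdr_seq;
	return mock_ip_read_cb(&skb).iif; /* still zero */
}
```

In this model, mock_synack_old() hands the IP view a nonzero iif that is
really the TCP sequence number, while mock_synack_new() leaves the IP view
clean. The actual fields affected in the kernel depend on the real struct
layouts, which this sketch does not reproduce.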

Comments

Christoph Paasch Nov. 2, 2017, 7:49 p.m. UTC | #1
On 02/11/17 - 12:30:25, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> Christoph Paasch sent a patch to address the following issue :
> 
> tcp_make_synack() is leaving some TCP private info in skb->cb[],
> then send the packet by other means than tcp_transmit_skb()
> 
> tcp_transmit_skb() makes sure to clear skb->cb[] to not confuse
> IPv4/IPV6 stacks, but we have no such cleanup for SYNACK.
> 
> tcp_make_synack() should not use tcp_init_nondata_skb() :
> 
> tcp_init_nondata_skb() really should be limited to skbs put in write/rtx
> queues (the ones that are only sent via tcp_transmit_skb())
> 
> This patch fixes the issue and should even save few cpu cycles ;)
> 
> Fixes: 971f10eca186 ("tcp: better TCP_SKB_CB layout to reduce cache line misses")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Christoph Paasch <cpaasch@apple.com>
> ---
>  net/ipv4/tcp_output.c |    9 ++-------
>  1 file changed, 2 insertions(+), 7 deletions(-)

Reviewed-by: Christoph Paasch <cpaasch@apple.com>

David Miller Nov. 3, 2017, 5:32 a.m. UTC | #2
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 02 Nov 2017 12:30:25 -0700

> From: Eric Dumazet <edumazet@google.com>
> 
> Christoph Paasch sent a patch to address the following issue :
> 
> tcp_make_synack() is leaving some TCP private info in skb->cb[],
> then send the packet by other means than tcp_transmit_skb()
> 
> tcp_transmit_skb() makes sure to clear skb->cb[] to not confuse
> IPv4/IPV6 stacks, but we have no such cleanup for SYNACK.
> 
> tcp_make_synack() should not use tcp_init_nondata_skb() :
> 
> tcp_init_nondata_skb() really should be limited to skbs put in write/rtx
> queues (the ones that are only sent via tcp_transmit_skb())
> 
> This patch fixes the issue and should even save few cpu cycles ;)
> 
> Fixes: 971f10eca186 ("tcp: better TCP_SKB_CB layout to reduce cache line misses")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Christoph Paasch <cpaasch@apple.com>

Applied and queued up for -stable.
Patch

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 823003eef3a21a5cc5c27e0be9f46159afa060df..478909f4694d00076c96b7a3be1eda62b6be8bef 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3180,13 +3180,8 @@  struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
 	th->source = htons(ireq->ir_num);
 	th->dest = ireq->ir_rmt_port;
 	skb->mark = ireq->ir_mark;
-	/* Setting of flags are superfluous here for callers (and ECE is
-	 * not even correctly set)
-	 */
-	tcp_init_nondata_skb(skb, tcp_rsk(req)->snt_isn,
-			     TCPHDR_SYN | TCPHDR_ACK);
-
-	th->seq = htonl(TCP_SKB_CB(skb)->seq);
+	skb->ip_summed = CHECKSUM_PARTIAL;
+	th->seq = htonl(tcp_rsk(req)->snt_isn);
 	/* XXX data is queued and acked as is. No buffer/window check */
 	th->ack_seq = htonl(tcp_rsk(req)->rcv_nxt);