[net] ipv6: update skb->csum when CE mark is propagated

Message ID 1452862616.1223.165.camel@edumazet-glaptop2.roam.corp.google.com
State Accepted, archived
Delegated to: David Miller

Commit Message

Eric Dumazet Jan. 15, 2016, 12:56 p.m. UTC
From: Eric Dumazet <edumazet@google.com>

When a tunnel decapsulates the outer header, it has to comply
with RFC 6040 and eventually propagate CE mark into inner header.

It turns out IP6_ECN_set_ce() does not correctly update skb->csum
for CHECKSUM_COMPLETE packets, triggering infamous "hw csum failure"
messages and stack traces.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/inet_ecn.h       |   19 ++++++++++++++++---
 net/ipv6/xfrm6_mode_tunnel.c |    2 +-
 2 files changed, 17 insertions(+), 4 deletions(-)
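
As a rough user-space illustration of the failure described in the commit message (this is not kernel code), the sketch below models a CHECKSUM_COMPLETE value as a 16-bit ones' complement sum over a sample IPv6 header. It sets CE in the first 32-bit word and compares the stale sum, a full recomputation, and an incremental update. The helper names (fold16, sum_bytes, sum_replace_word, ce_update_demo) and the 40-byte sample header are assumptions made for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ECN_CE 3u	/* same value as INET_ECN_CE */

/* Fold a 32-bit accumulator into 16 bits with end-around carry. */
static uint16_t fold16(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Ones' complement sum over len bytes (len even), read as big-endian words. */
static uint16_t sum_bytes(const uint8_t *p, size_t len)
{
	uint32_t sum = 0;

	for (size_t i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)p[i] << 8) | p[i + 1];
	return fold16(sum);
}

/* Incremental update: remove the old 32-bit word, add the new one. */
static uint16_t sum_replace_word(uint16_t sum, uint32_t from, uint32_t to)
{
	uint32_t acc = sum;

	acc += (uint16_t)~(from >> 16);		/* subtract = add complement */
	acc += (uint16_t)~(from & 0xffff);
	acc += to >> 16;
	acc += to & 0xffff;
	return fold16(acc);
}

/* Returns 1 if the stale sum is wrong and the incremental update is right. */
static int ce_update_demo(void)
{
	/* Hypothetical header: version 6, ECT(0), next header 59, hop limit 64. */
	uint8_t pkt[40] = { 0x60, 0x20, 0x00, 0x00, 0x00, 0x00, 0x3b, 0x40 };
	uint16_t stale = sum_bytes(pkt, sizeof(pkt));
	uint32_t from = ((uint32_t)pkt[0] << 24) | ((uint32_t)pkt[1] << 16) |
			((uint32_t)pkt[2] << 8) | pkt[3];
	uint32_t to = from | (ECN_CE << 20);	/* set both ECN bits */

	pkt[1] = (uint8_t)(to >> 16);		/* only this byte changes */
	uint16_t fresh = sum_bytes(pkt, sizeof(pkt));
	uint16_t patched = sum_replace_word(stale, from, to);

	assert(from != to);
	return stale != fresh && patched == fresh;
}
```

Calling ce_update_demo() from a small main() should return 1: the sum taken before the CE mark no longer matches the data, while the incrementally updated sum does. In the kernel the sum is kept as a 32-bit __wsum and the header word stays in network byte order, so this only mirrors the arithmetic, not the types.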

Comments

Herbert Xu Jan. 15, 2016, 1:45 p.m. UTC | #1
On Fri, Jan 15, 2016 at 04:56:56AM -0800, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> When a tunnel decapsulates the outer header, it has to comply
> with RFC 6040 and eventually propagate CE mark into inner header.
> 
> It turns out IP6_ECN_set_ce() does not correctly update skb->csum
> for CHECKSUM_COMPLETE packets, triggering infamous "hw csum failure"
> messages and stack traces.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Good catch!

Acked-by: Herbert Xu <herbert@gondor.apana.org.au>

Thanks,
Eric Dumazet Jan. 15, 2016, 3:44 p.m. UTC | #2
On Fri, 2016-01-15 at 21:45 +0800, Herbert Xu wrote:
> On Fri, Jan 15, 2016 at 04:56:56AM -0800, Eric Dumazet wrote:
> > From: Eric Dumazet <edumazet@google.com>
> > 
> > When a tunnel decapsulates the outer header, it has to comply
> > with RFC 6040 and eventually propagate CE mark into inner header.
> > 
> > It turns out IP6_ECN_set_ce() does not correctly update skb->csum
> > for CHECKSUM_COMPLETE packets, triggering infamous "hw csum failure"
> > messages and stack traces.
> > 
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> 
> Good catch!

Thanks Herbert

Note that I considered using

skb->csum = csum_add(skb->csum, from ^ to);

instead of the more generic

skb->csum = csum_add(csum_sub(skb->csum, from), to);

I can spin a v2 if you guys prefer the optimized version (but a little
more hacky...)
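
For readers comparing the two forms above, here is a small user-space sketch (not kernel code) of why they agree in this patch. Because `to` only sets bits that were clear in `from`, `from ^ to` equals `to - from`, so adding it has the same effect as subtracting `from` and adding `to`. The functions csum_add32 and csum_sub32 are stand-ins modelled on the generic kernel helpers, and fold16, norm16 and variants_agree are hypothetical names used only here.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit ones' complement add with end-around carry. */
static uint32_t csum_add32(uint32_t csum, uint32_t addend)
{
	uint32_t res = csum + addend;

	return res + (res < addend);
}

/* Subtracting is adding the bitwise complement. */
static uint32_t csum_sub32(uint32_t csum, uint32_t addend)
{
	return csum_add32(csum, ~addend);
}

/* Fold to 16 bits with end-around carry. */
static uint16_t fold16(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Ones' complement has two encodings of zero; map 0xffff to 0. */
static uint16_t norm16(uint16_t v)
{
	return v == 0xffff ? 0 : v;
}

/* Returns 1 if both update forms yield the same checksum value. */
static int variants_agree(uint32_t csum, uint32_t from, uint32_t to)
{
	uint32_t generic = csum_add32(csum_sub32(csum, from), to);
	uint32_t optimized = csum_add32(csum, from ^ to);
	int same = norm16(fold16(generic)) == norm16(fold16(optimized));

	/* The raw 32-bit values can differ only in how zero is encoded. */
	assert(!same || generic == optimized ||
	       norm16(fold16(generic)) == 0);
	return same;
}
```

With `to = from | (3 << 20)` the two forms agree for any starting sum, for example variants_agree(0x12345678, 0x60200000, 0x60300000). The shortcut depends on bits only being set: if a bit were cleared instead (say from 0x60300000 to 0x60000000), the XOR form would add where it should subtract, which is presumably part of what makes it feel more hacky.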
David Miller Jan. 15, 2016, 8:07 p.m. UTC | #3
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 15 Jan 2016 04:56:56 -0800

> From: Eric Dumazet <edumazet@google.com>
> 
> When a tunnel decapsulates the outer header, it has to comply
> with RFC 6040 and eventually propagate CE mark into inner header.
> 
> It turns out IP6_ECN_set_ce() does not correctly update skb->csum
> for CHECKSUM_COMPLETE packets, triggering infamous "hw csum failure"
> messages and stack traces.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied and queued up for -stable, thanks Eric.

Patch

diff --git a/include/net/inet_ecn.h b/include/net/inet_ecn.h
index 84b20835b736..0dc0a51da38f 100644
--- a/include/net/inet_ecn.h
+++ b/include/net/inet_ecn.h
@@ -111,11 +111,24 @@  static inline void ipv4_copy_dscp(unsigned int dscp, struct iphdr *inner)
 
 struct ipv6hdr;
 
-static inline int IP6_ECN_set_ce(struct ipv6hdr *iph)
+/* Note:
+ * IP_ECN_set_ce() has to tweak IPV4 checksum when setting CE,
+ * meaning both changes have no effect on skb->csum if/when CHECKSUM_COMPLETE
+ * In IPv6 case, no checksum compensates the change in IPv6 header,
+ * so we have to update skb->csum.
+ */
+static inline int IP6_ECN_set_ce(struct sk_buff *skb, struct ipv6hdr *iph)
 {
+	__be32 from, to;
+
 	if (INET_ECN_is_not_ect(ipv6_get_dsfield(iph)))
 		return 0;
-	*(__be32*)iph |= htonl(INET_ECN_CE << 20);
+
+	from = *(__be32 *)iph;
+	to = from | htonl(INET_ECN_CE << 20);
+	*(__be32 *)iph = to;
+	if (skb->ip_summed == CHECKSUM_COMPLETE)
+		skb->csum = csum_add(csum_sub(skb->csum, from), to);
 	return 1;
 }
 
@@ -142,7 +155,7 @@  static inline int INET_ECN_set_ce(struct sk_buff *skb)
 	case cpu_to_be16(ETH_P_IPV6):
 		if (skb_network_header(skb) + sizeof(struct ipv6hdr) <=
 		    skb_tail_pointer(skb))
-			return IP6_ECN_set_ce(ipv6_hdr(skb));
+			return IP6_ECN_set_ce(skb, ipv6_hdr(skb));
 		break;
 	}
 
diff --git a/net/ipv6/xfrm6_mode_tunnel.c b/net/ipv6/xfrm6_mode_tunnel.c
index f7fbdbabe50e..372855eeaf42 100644
--- a/net/ipv6/xfrm6_mode_tunnel.c
+++ b/net/ipv6/xfrm6_mode_tunnel.c
@@ -23,7 +23,7 @@  static inline void ipip6_ecn_decapsulate(struct sk_buff *skb)
 	struct ipv6hdr *inner_iph = ipipv6_hdr(skb);
 
 	if (INET_ECN_is_ce(XFRM_MODE_SKB_CB(skb)->tos))
-		IP6_ECN_set_ce(inner_iph);
+		IP6_ECN_set_ce(skb, inner_iph);
 }
 
 /* Add encapsulation header.