From patchwork Thu Mar 7 23:21:51 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pravin B Shelar X-Patchwork-Id: 225998 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id C70FA2C03AA for ; Fri, 8 Mar 2013 10:20:34 +1100 (EST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1760025Ab3CGXU1 (ORCPT ); Thu, 7 Mar 2013 18:20:27 -0500 Received: from na3sys009aog117.obsmtp.com ([74.125.149.242]:52901 "HELO na3sys009aog117.obsmtp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1757597Ab3CGXUY (ORCPT ); Thu, 7 Mar 2013 18:20:24 -0500 Received: from mail-ye0-f198.google.com ([209.85.213.198]) (using TLSv1) by na3sys009aob117.postini.com ([74.125.148.12]) with SMTP ID DSNKUTkgtsyBhWNXzEQXIzy5Acz40JWvL6rN@postini.com; Thu, 07 Mar 2013 15:20:24 PST Received: by mail-ye0-f198.google.com with SMTP id m13so1435019yen.1 for ; Thu, 07 Mar 2013 15:20:22 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=x-received:x-received:from:to:cc:subject:date:message-id:x-mailer :x-gm-message-state; bh=uSDZeV6Kyqea6iWXMzmvCZ11WRB8jjhjA+SDltPoj/M=; b=Gl3w4PdrNaurXf6eDvqX/tC7gA4bbOCjwuYYr82UgXhp6xZsaZelDLIG3rPE7ucNdY xVDpUL/pldyvJHNhBPTxcKp15KKnf0Pmyr76qee/PL5JzwVeBUxGeDdFWMqCq/C/SWip Bab58M9cHsQY+r86l3nmV5w+MdTjkESmFDScX68cyf6h4t+zpn9bBorIwj67JVQujeDM eyHg1FWMHMBbIGSXeEn7zmvHnhjUXi1LLOJsmaPxDp2a71H6jaOtSCqUk+huSF2UCqQB ePWuOr+JvidRxHolD6X6vpFXGlbdii1UxKEqi93baJlMZ4oktKBR334fI1veTyI4vOT0 dprA== X-Received: by 10.236.160.7 with SMTP id t7mr17249yhk.173.1362698422911; Thu, 07 Mar 2013 15:20:22 -0800 (PST) X-Received: by 10.236.160.7 with SMTP id t7mr17244yhk.173.1362698422797; Thu, 07 Mar 2013 15:20:22 -0800 (PST) Received: from localhost ([64.125.181.92]) by mx.google.com with ESMTPS id x16sm4764033yhj.6.2013.03.07.15.20.22 (version=TLSv1.1 cipher=RC4-SHA bits=128/128); Thu, 07 Mar 2013 15:20:22 -0800 (PST) From: Pravin B Shelar To: davem@davemloft.net, netdev@vger.kernel.org Cc: jesse@nicira.com, Pravin B Shelar Subject: [PATCH net-next 3/4] tunneling: Add generic Tunnel segmentation. Date: Thu, 7 Mar 2013 15:21:51 -0800 Message-Id: <1362698511-2477-1-git-send-email-pshelar@nicira.com> X-Mailer: git-send-email 1.7.12.315.g682ce8b X-Gm-Message-State: ALoCoQmWRRQlMpn0RQRf/PX3qVqoBo8hcsaRdQv19iCMrn7ZwMmHM0DVndNT99LrOKdOjuelxoavHjM8yvJGIhs//+rlHzmQGa0jQy/kTbUompcgEFIvSm6KHNzYuKV/euv5DIcgxA0SaqlZfZk0NEgCpZVZzck30CGrHrMZXeywylYXEI7Qvao= Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Adds generic tunneling offloading support for IPv4-UDP based tunnels. GSO type is added to request this offload for a skb. netdev feature NETIF_F_UDP_TUNNEL is added for hardware offloaded udp-tunnel support. Currently no device supports this feature, software offload is used. This can be used by tunneling protocols like VXLAN. CC: Jesse Gross Signed-off-by: Pravin B Shelar --- include/linux/netdev_features.h | 7 ++- include/linux/skbuff.h | 2 + net/core/ethtool.c | 1 + net/ipv4/af_inet.c | 6 ++- net/ipv4/tcp.c | 1 + net/ipv4/udp.c | 115 ++++++++++++++++++++++++++++++--------- net/ipv6/ip6_offload.c | 1 + net/ipv6/udp_offload.c | 10 +++- 8 files changed, 112 insertions(+), 31 deletions(-) diff --git a/include/linux/netdev_features.h b/include/linux/netdev_features.h index 3dd3934..9980e8d 100644 --- a/include/linux/netdev_features.h +++ b/include/linux/netdev_features.h @@ -42,9 +42,9 @@ enum { NETIF_F_TSO6_BIT, /* ... TCPv6 segmentation */ NETIF_F_FSO_BIT, /* ... FCoE segmentation */ NETIF_F_GSO_GRE_BIT, /* ... GRE with TSO */ - /**/NETIF_F_GSO_LAST, /* [can't be last bit, see GSO_MASK] */ - NETIF_F_GSO_RESERVED2 /* ... free (fill GSO_MASK to 8 bits) */ - = NETIF_F_GSO_LAST, + NETIF_F_GSO_UDP_TUNNEL_BIT, /* ... UDP TUNNEL with TSO */ + /**/NETIF_F_GSO_LAST = /* last bit, see GSO_MASK */ + NETIF_F_GSO_UDP_TUNNEL_BIT, NETIF_F_FCOE_CRC_BIT, /* FCoE CRC32 */ NETIF_F_SCTP_CSUM_BIT, /* SCTP checksum offload */ @@ -103,6 +103,7 @@ enum { #define NETIF_F_RXFCS __NETIF_F(RXFCS) #define NETIF_F_RXALL __NETIF_F(RXALL) #define NETIF_F_GRE_GSO __NETIF_F(GSO_GRE) +#define NETIF_F_UDP_TUNNEL __NETIF_F(UDP_TUNNEL) /* Features valid for ethtool to change */ /* = all defined minus driver/device-class-related */ diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index d7f96ff..eb2106f 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -316,6 +316,8 @@ enum { SKB_GSO_FCOE = 1 << 5, SKB_GSO_GRE = 1 << 6, + + SKB_GSO_UDP_TUNNEL = 1 << 7, }; #if BITS_PER_LONG > 32 diff --git a/net/core/ethtool.c b/net/core/ethtool.c index 3e9b2c3..adc1351 100644 --- a/net/core/ethtool.c +++ b/net/core/ethtool.c @@ -78,6 +78,7 @@ static const char netdev_features_strings[NETDEV_FEATURE_COUNT][ETH_GSTRING_LEN] [NETIF_F_TSO6_BIT] = "tx-tcp6-segmentation", [NETIF_F_FSO_BIT] = "tx-fcoe-segmentation", [NETIF_F_GSO_GRE_BIT] = "tx-gre-segmentation", + [NETIF_F_GSO_UDP_TUNNEL_BIT] = "tx-udp_tnl-segmentation", [NETIF_F_FCOE_CRC_BIT] = "tx-checksum-fcoe-crc", [NETIF_F_SCTP_CSUM_BIT] = "tx-checksum-sctp", diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c index 68f6a94..1924069 100644 --- a/net/ipv4/af_inet.c +++ b/net/ipv4/af_inet.c @@ -1283,6 +1283,7 @@ static struct sk_buff *inet_gso_segment(struct sk_buff *skb, int ihl; int id; unsigned int offset = 0; + bool tunnel; if (!(features & NETIF_F_V4_CSUM)) features &= ~NETIF_F_SG; @@ -1293,6 +1294,7 @@ static struct sk_buff *inet_gso_segment(struct sk_buff *skb, SKB_GSO_DODGY | SKB_GSO_TCP_ECN | SKB_GSO_GRE | + SKB_GSO_UDP_TUNNEL | 0))) goto out; @@ -1307,6 +1309,8 @@ static struct sk_buff *inet_gso_segment(struct sk_buff *skb, if (unlikely(!pskb_may_pull(skb, ihl))) goto out; + tunnel = !!skb->encapsulation; + __skb_pull(skb, ihl); skb_reset_transport_header(skb); iph = ip_hdr(skb); @@ -1326,7 +1330,7 @@ static struct sk_buff *inet_gso_segment(struct sk_buff *skb, skb = segs; do { iph = ip_hdr(skb); - if (proto == IPPROTO_UDP) { + if (!tunnel && proto == IPPROTO_UDP) { iph->id = htons(id); iph->frag_off = htons(offset >> 3); if (skb->next != NULL) diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 47e854f..8d14573 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -3044,6 +3044,7 @@ struct sk_buff *tcp_tso_segment(struct sk_buff *skb, SKB_GSO_TCP_ECN | SKB_GSO_TCPV6 | SKB_GSO_GRE | + SKB_GSO_UDP_TUNNEL | 0) || !(type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6)))) goto out; diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 265c42c..41760e0 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -2272,31 +2272,88 @@ void __init udp_init(void) int udp4_ufo_send_check(struct sk_buff *skb) { - const struct iphdr *iph; - struct udphdr *uh; - - if (!pskb_may_pull(skb, sizeof(*uh))) + if (!pskb_may_pull(skb, sizeof(struct udphdr))) return -EINVAL; - iph = ip_hdr(skb); - uh = udp_hdr(skb); + if (likely(!skb->encapsulation)) { + const struct iphdr *iph; + struct udphdr *uh; - uh->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, skb->len, - IPPROTO_UDP, 0); - skb->csum_start = skb_transport_header(skb) - skb->head; - skb->csum_offset = offsetof(struct udphdr, check); - skb->ip_summed = CHECKSUM_PARTIAL; + iph = ip_hdr(skb); + uh = udp_hdr(skb); + + uh->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, skb->len, + IPPROTO_UDP, 0); + skb->csum_start = skb_transport_header(skb) - skb->head; + skb->csum_offset = offsetof(struct udphdr, check); + skb->ip_summed = CHECKSUM_PARTIAL; + } return 0; } +static struct sk_buff *skb_udp_tunnel_segment(struct sk_buff *skb, + netdev_features_t features) +{ + struct sk_buff *segs = ERR_PTR(-EINVAL); + int mac_len = skb->mac_len; + int tnl_hlen = skb_inner_mac_header(skb) - skb_transport_header(skb); + int outer_hlen; + netdev_features_t enc_features; + + if (unlikely(!pskb_may_pull(skb, tnl_hlen))) + goto out; + + skb->encapsulation = 0; + __skb_pull(skb, tnl_hlen); + skb_reset_mac_header(skb); + skb_set_network_header(skb, skb_inner_network_offset(skb)); + skb->mac_len = skb_inner_network_offset(skb); + + /* segment inner packet. */ + enc_features = skb->dev->hw_enc_features & netif_skb_features(skb); + segs = skb_mac_gso_segment(skb, enc_features); + if (!segs || IS_ERR(segs)) + goto out; + + outer_hlen = skb_tnl_header_len(skb); + skb = segs; + do { + struct udphdr *uh; + int udp_offset = outer_hlen - tnl_hlen; + + skb->mac_len = mac_len; + + skb_push(skb, outer_hlen); + skb_reset_mac_header(skb); + skb_set_network_header(skb, mac_len); + skb_set_transport_header(skb, udp_offset); + uh = udp_hdr(skb); + uh->len = htons(skb->len - udp_offset); + + /* csum segment if tunnel sets skb with csum. */ + if (unlikely(uh->check)) { + struct iphdr *iph = ip_hdr(skb); + + uh->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, + skb->len - udp_offset, + IPPROTO_UDP, 0); + uh->check = csum_fold(skb_checksum(skb, udp_offset, + skb->len - udp_offset, 0)); + if (uh->check == 0) + uh->check = CSUM_MANGLED_0; + + } + skb->ip_summed = CHECKSUM_NONE; + } while ((skb = skb->next)); +out: + return segs; +} + struct sk_buff *udp4_ufo_fragment(struct sk_buff *skb, netdev_features_t features) { struct sk_buff *segs = ERR_PTR(-EINVAL); unsigned int mss; - int offset; - __wsum csum; - mss = skb_shinfo(skb)->gso_size; if (unlikely(skb->len <= mss)) goto out; @@ -2306,6 +2363,7 @@ struct sk_buff *udp4_ufo_fragment(struct sk_buff *skb, int type = skb_shinfo(skb)->gso_type; if (unlikely(type & ~(SKB_GSO_UDP | SKB_GSO_DODGY | + SKB_GSO_UDP_TUNNEL | SKB_GSO_GRE) || !(type & (SKB_GSO_UDP)))) goto out; @@ -2316,20 +2374,27 @@ struct sk_buff *udp4_ufo_fragment(struct sk_buff *skb, goto out; } - /* Do software UFO. Complete and fill in the UDP checksum as HW cannot - * do checksum of UDP packets sent as multiple IP fragments. - */ - offset = skb_checksum_start_offset(skb); - csum = skb_checksum(skb, offset, skb->len - offset, 0); - offset += skb->csum_offset; - *(__sum16 *)(skb->data + offset) = csum_fold(csum); - skb->ip_summed = CHECKSUM_NONE; - /* Fragment the skb. IP headers of the fragments are updated in * inet_gso_segment() */ - segs = skb_segment(skb, features); + if (skb->encapsulation && skb_shinfo(skb)->gso_type & SKB_GSO_UDP_TUNNEL) + segs = skb_udp_tunnel_segment(skb, features); + else { + int offset; + __wsum csum; + + /* Do software UFO. Complete and fill in the UDP checksum as + * HW cannot do checksum of UDP packets sent as multiple + * IP fragments. + */ + offset = skb_checksum_start_offset(skb); + csum = skb_checksum(skb, offset, skb->len - offset, 0); + offset += skb->csum_offset; + *(__sum16 *)(skb->data + offset) = csum_fold(csum); + skb->ip_summed = CHECKSUM_NONE; + + segs = skb_segment(skb, features); + } out: return segs; } - diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c index 8234c1d..e02a65a 100644 --- a/net/ipv6/ip6_offload.c +++ b/net/ipv6/ip6_offload.c @@ -100,6 +100,7 @@ static struct sk_buff *ipv6_gso_segment(struct sk_buff *skb, SKB_GSO_DODGY | SKB_GSO_TCP_ECN | SKB_GSO_GRE | + SKB_GSO_UDP_TUNNEL | SKB_GSO_TCPV6 | 0))) goto out; diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c index cf05cf0..3ced14e 100644 --- a/net/ipv6/udp_offload.c +++ b/net/ipv6/udp_offload.c @@ -21,6 +21,10 @@ static int udp6_ufo_send_check(struct sk_buff *skb) const struct ipv6hdr *ipv6h; struct udphdr *uh; + /* UDP Tunnel offload on ipv6 is not yet supported. */ + if (skb->encapsulation) + return -EINVAL; + if (!pskb_may_pull(skb, sizeof(*uh))) return -EINVAL; @@ -56,8 +60,10 @@ static struct sk_buff *udp6_ufo_fragment(struct sk_buff *skb, /* Packet is from an untrusted source, reset gso_segs. */ int type = skb_shinfo(skb)->gso_type; - if (unlikely(type & ~(SKB_GSO_UDP | SKB_GSO_DODGY | - SKB_GSO_GRE) || + if (unlikely(type & ~(SKB_GSO_UDP | + SKB_GSO_DODGY | + SKB_GSO_UDP_TUNNEL | + SKB_GSO_GRE) || !(type & (SKB_GSO_UDP)))) goto out;