From patchwork Fri Mar 22 15:15:00 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Willem de Bruijn X-Patchwork-Id: 1061304 Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="hU2CPwXc"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 44QnK92zvRz9sRm for ; Sat, 23 Mar 2019 02:15:25 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727092AbfCVPPX (ORCPT ); Fri, 22 Mar 2019 11:15:23 -0400 Received: from mail-qt1-f194.google.com ([209.85.160.194]:35075 "EHLO mail-qt1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726681AbfCVPPU (ORCPT ); Fri, 22 Mar 2019 11:15:20 -0400 Received: by mail-qt1-f194.google.com with SMTP id h39so2916175qte.2 for ; Fri, 22 Mar 2019 08:15:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=dYDUyrmySvIu7IN8aocM6MTGfzr672SlfeFaQPagAyA=; b=hU2CPwXc0KtfdEK1pW6IxICOQXKES5eBZDOhlTW/7hHmaFFZymWG+YeXYTkyYTcZqI mn52JZTkrW/qmGnbaFFbUQxJhkVNPbh+ntuuxpjvXS30G1tdtpS77X42/3SNL0DTe6G9 LC4gt4/SidH7xjuF9x9RKyWwv6/3jHpKJlIpVPQ2OACtbBf6n2fR3ivJocKnOl5GSryg j4nPYz2dcVW+aG/lgwRS6rTtDwb0LOHPGb33EYHGkRmSNUUOzwL5DyuBWIHWv32KEN/1 3Gb+JjWcTpvjbvLY5xdeh/pmy5IZXB5GGruKPhV4CT21xmLuj94hDczXaeHTmoeaNYrt aNHA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=dYDUyrmySvIu7IN8aocM6MTGfzr672SlfeFaQPagAyA=; b=bOWEebW2hqGc8VwTxY1G+clQC7SY4LJ1vvBA4yNI87OOWEDvyZOvPAdTB/TlZ8jnzf 9srcvzQe3YkBwRJWx6GTNQ2Pw8YcNNOCeqYDHLqAJVZfr9gRPqB5+5+Un5UzRCKF5m6u Jr7LCuyXZAhmiWZWJ9CtdJWbIPb4cF+iWGCXyaOw5Y73Ig+2S77w18q+yhEPjSZPVNZu cyvwCSX2FZE7iAfQT3nQGSy+y2J/N/vEmCI1gozWjfdoYgzr3bw4M+xjRDR+t9AohHMz +Gct7dif4kG2Ru7vn3ycxY1vUCQ1soTBnGyK/GxmZ2u8LP4qwMJdYm64/fy2u6KgORBW XrhQ== X-Gm-Message-State: APjAAAWqKGlRxNt3CitdqIT/i+khpbKBQe1QtmjreRiXtTv7cpHUjNIJ LviBXgoS7oQte6GctOtIeWh34wFO X-Google-Smtp-Source: APXvYqyL6yM+GqTIzCRstqY2FNc1CgI4FKRtBHAwxEX0ArpCboGu9+uZWz8ScxRVDnM8jixCUF3r6A== X-Received: by 2002:ac8:1884:: with SMTP id s4mr8776822qtj.339.1553267719189; Fri, 22 Mar 2019 08:15:19 -0700 (PDT) Received: from willemb1.nyc.corp.google.com ([2620:0:1003:315:3fa1:a34c:1128:1d39]) by smtp.gmail.com with ESMTPSA id v4sm4631317qtq.94.2019.03.22.08.15.17 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 22 Mar 2019 08:15:18 -0700 (PDT) From: Willem de Bruijn To: netdev@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, alan.maguire@oracle.com, Willem de Bruijn Subject: [PATCH bpf-next v2 09/13] bpf: add bpf_skb_adjust_room encap flags Date: Fri, 22 Mar 2019 11:15:00 -0400 Message-Id: <20190322151504.89983-10-willemdebruijn.kernel@gmail.com> X-Mailer: git-send-email 2.21.0.392.gf8f6787159e-goog In-Reply-To: <20190322151504.89983-1-willemdebruijn.kernel@gmail.com> References: <20190322151504.89983-1-willemdebruijn.kernel@gmail.com> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Willem de Bruijn When pushing tunnel headers, annotate skbs in the same way as tunnel devices. For GSO packets, the network stack requires certain fields set to segment packets with tunnel headers. gro_gse_segment depends on transport and inner mac header, for instance. Add an option to pass this information. Remove the restriction on len_diff to network header length, which is too short, e.g., for GRE protocols. Changes v1->v2: - document new flags - BPF_F_ADJ_ROOM_MASK moved Signed-off-by: Willem de Bruijn --- include/uapi/linux/bpf.h | 19 +++++++++++- net/core/filter.c | 63 ++++++++++++++++++++++++++++++++++++---- 2 files changed, 76 insertions(+), 6 deletions(-) diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 4f157d0ec571..f770f0de5b9c 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -1486,11 +1486,20 @@ union bpf_attr { * * **BPF_ADJ_ROOM_NET**: Adjust room at the network layer * (room space is added or removed below the layer 3 header). * - * There is one supported flag at this time: + * The following flags are supported at this time: * * * **BPF_F_ADJ_ROOM_FIXED_GSO**: Do not adjust gso_size. * Adjusting mss in this way is not allowed for datagrams. * + * * **BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 **: + * * **BPF_F_ADJ_ROOM_ENCAP_L3_IPV6 **: + * Any new space is reserved to hold a tunnel header. + * Configure skb offsets and other fields accordingly. + * + * * **BPF_F_ADJ_ROOM_ENCAP_L4_GRE **: + * * **BPF_F_ADJ_ROOM_ENCAP_L4_UDP **: + * Use with ENCAP_L3 flags to further specify the tunnel type. + * * A call to this helper is susceptible to change the underlaying * packet buffer. Therefore, at load time, all checks on pointers * previously done by the verifier are invalidated and must be @@ -2632,6 +2641,14 @@ enum bpf_func_id { /* BPF_FUNC_skb_adjust_room flags. */ #define BPF_F_ADJ_ROOM_FIXED_GSO (1ULL << 0) +#define BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 (1ULL << 1) +#define BPF_F_ADJ_ROOM_ENCAP_L3_IPV6 (1ULL << 2) +#define BPF_F_ADJ_ROOM_ENCAP_L3_MASK (BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 | \ + BPF_F_ADJ_ROOM_ENCAP_L3_IPV6) + +#define BPF_F_ADJ_ROOM_ENCAP_L4_GRE (1ULL << 3) +#define BPF_F_ADJ_ROOM_ENCAP_L4_UDP (1ULL << 4) + /* Mode for BPF_FUNC_skb_adjust_room helper. */ enum bpf_adj_room_mode { BPF_ADJ_ROOM_NET, diff --git a/net/core/filter.c b/net/core/filter.c index 393d1e4903b5..d0ebeb0147bc 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -2966,6 +2966,9 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb) static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff, u64 flags) { + bool encap = flags & BPF_F_ADJ_ROOM_ENCAP_L3_MASK; + unsigned int gso_type = SKB_GSO_DODGY; + u16 mac_len, inner_net, inner_trans; int ret; if (skb_is_gso(skb) && !skb_is_gso_tcp(skb)) { @@ -2979,10 +2982,60 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff, if (unlikely(ret < 0)) return ret; + if (encap) { + if (skb->protocol != htons(ETH_P_IP) && + skb->protocol != htons(ETH_P_IPV6)) + return -ENOTSUPP; + + if (flags & BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 && + flags & BPF_F_ADJ_ROOM_ENCAP_L3_IPV6) + return -EINVAL; + + if (flags & BPF_F_ADJ_ROOM_ENCAP_L4_GRE && + flags & BPF_F_ADJ_ROOM_ENCAP_L4_UDP) + return -EINVAL; + + if (skb->encapsulation) + return -EALREADY; + + mac_len = skb->network_header - skb->mac_header; + inner_net = skb->network_header; + inner_trans = skb->transport_header; + } + ret = bpf_skb_net_hdr_push(skb, off, len_diff); if (unlikely(ret < 0)) return ret; + if (encap) { + /* inner mac == inner_net on l3 encap */ + skb->inner_mac_header = inner_net; + skb->inner_network_header = inner_net; + skb->inner_transport_header = inner_trans; + skb_set_inner_protocol(skb, skb->protocol); + + skb->encapsulation = 1; + skb_set_network_header(skb, mac_len); + + if (flags & BPF_F_ADJ_ROOM_ENCAP_L4_UDP) + gso_type |= SKB_GSO_UDP_TUNNEL; + else if (flags & BPF_F_ADJ_ROOM_ENCAP_L4_GRE) + gso_type |= SKB_GSO_GRE; + else if (flags & BPF_F_ADJ_ROOM_ENCAP_L3_IPV6) + gso_type |= SKB_GSO_IPXIP6; + else + gso_type |= SKB_GSO_IPXIP4; + + if (flags & BPF_F_ADJ_ROOM_ENCAP_L4_GRE || + flags & BPF_F_ADJ_ROOM_ENCAP_L4_UDP) { + int nh_len = flags & BPF_F_ADJ_ROOM_ENCAP_L3_IPV6 ? + sizeof(struct ipv6hdr) : + sizeof(struct iphdr); + + skb_set_transport_header(skb, mac_len + nh_len); + } + } + if (skb_is_gso(skb)) { struct skb_shared_info *shinfo = skb_shinfo(skb); @@ -2991,7 +3044,7 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff, skb_decrease_gso_size(shinfo, len_diff); /* Header must be checked, and gso_segs recomputed. */ - shinfo->gso_type |= SKB_GSO_DODGY; + shinfo->gso_type |= gso_type; shinfo->gso_segs = 0; } @@ -3039,12 +3092,14 @@ static u32 __bpf_skb_max_len(const struct sk_buff *skb) SKB_MAX_ALLOC; } -#define BPF_F_ADJ_ROOM_MASK (BPF_F_ADJ_ROOM_FIXED_GSO) +#define BPF_F_ADJ_ROOM_MASK (BPF_F_ADJ_ROOM_FIXED_GSO | \ + BPF_F_ADJ_ROOM_ENCAP_L3_MASK | \ + BPF_F_ADJ_ROOM_ENCAP_L4_GRE | \ + BPF_F_ADJ_ROOM_ENCAP_L4_UDP) BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff, u32, mode, u64, flags) { - bool trans_same = skb->transport_header == skb->network_header; u32 len_cur, len_diff_abs = abs(len_diff); u32 len_min = bpf_skb_net_base_len(skb); u32 len_max = __bpf_skb_max_len(skb); @@ -3073,8 +3128,6 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff, } len_cur = skb->len - skb_network_offset(skb); - if (skb_transport_header_was_set(skb) && !trans_same) - len_cur = skb_network_header_len(skb); if ((shrink && (len_diff_abs >= len_cur || len_cur - len_diff_abs < len_min)) || (!shrink && (skb->len + len_diff_abs > len_max &&