From patchwork Fri Jul 31 04:44:18 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yoshiki Komachi
X-Patchwork-Id: 1339217
X-Patchwork-Delegate: bpf@iogearbox.net
From: Yoshiki Komachi
To: "David S. Miller", Alexei Starovoitov, Daniel Borkmann,
 Jesper Dangaard Brouer, John Fastabend, Jakub Kicinski,
 Martin KaFai Lau, Song Liu, Yonghong Song, Andrii Nakryiko, KP Singh,
 Roopa Prabhu, Nikolay Aleksandrov, David Ahern
Cc: Yoshiki Komachi, netdev@vger.kernel.org,
 bridge@lists.linux-foundation.org, bpf@vger.kernel.org
Subject: [RFC PATCH bpf-next 1/3] net/bridge: Add new function to access FDB from XDP programs
Date: Fri, 31 Jul 2020 13:44:18 +0900
Message-Id: <1596170660-5582-2-git-send-email-komachi.yoshiki@gmail.com>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1596170660-5582-1-git-send-email-komachi.yoshiki@gmail.com>
References: <1596170660-5582-1-git-send-email-komachi.yoshiki@gmail.com>
Sender: bpf-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

This patch adds a function that finds the destination port in the
kernel FDB, mainly so that XDP programs can access the FDB through a
bpf helper. Note that, unlike the existing br_fdb_find_port(), this
function takes an ingress device as an argument.

br_fdb_find_port() also provides access to the kernel FDB, but it has
to call rcu_read_lock()/rcu_read_unlock() internally. That is
unnecessary in this case because XDP programs already run under
rcu_read_lock()/rcu_read_unlock(). The proposed function therefore
performs the lookup without taking these locks itself.

Signed-off-by: Yoshiki Komachi
---
 include/linux/if_bridge.h | 11 +++++++++++
 net/bridge/br_fdb.c       | 25 +++++++++++++++++++++++++
 2 files changed, 36 insertions(+)

diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
index 6479a38e52fa..24d72d115d0b 100644
--- a/include/linux/if_bridge.h
+++ b/include/linux/if_bridge.h
@@ -127,6 +127,9 @@ static inline int br_vlan_get_info(const struct net_device *dev, u16 vid,
 struct net_device *br_fdb_find_port(const struct net_device *br_dev,
 				    const unsigned char *addr,
 				    __u16 vid);
+struct net_device *br_fdb_find_port_xdp(const struct net_device *dev,
+					const unsigned char *addr,
+					__u16 vid);
 void br_fdb_clear_offload(const struct net_device *dev, u16 vid);
 bool br_port_flag_is_set(const struct net_device *dev, unsigned long flag);
 #else
@@ -138,6 +141,14 @@ br_fdb_find_port(const struct net_device *br_dev,
 	return NULL;
 }
 
+static inline struct net_device *
+br_fdb_find_port_xdp(const struct net_device *dev,
+		     const unsigned char *addr,
+		     __u16 vid)
+{
+	return NULL;
+}
+
 static inline void br_fdb_clear_offload(const struct net_device *dev, u16 vid)
 {
 }
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 9db504baa094..79bc3c2da668 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -141,6 +141,31 @@ struct net_device *br_fdb_find_port(const struct net_device *br_dev,
 }
 EXPORT_SYMBOL_GPL(br_fdb_find_port);
 
+struct net_device *br_fdb_find_port_xdp(const struct net_device *dev,
+					const unsigned char *addr,
+					__u16 vid)
+{
+	struct net_bridge_fdb_entry *f;
+	struct net_device *dst = NULL;
+	struct net_bridge *br = NULL;
+	struct net_bridge_port *p;
+
+	p = br_port_get_check_rcu(dev);
+	if (!p)
+		return NULL;
+
+	br = p->br;
+	if (!br)
+		return NULL;
+
+	f = br_fdb_find_rcu(br, addr, vid);
+	if (f && f->dst)
+		dst = f->dst->dev;
+
+	return dst;
+}
+EXPORT_SYMBOL_GPL(br_fdb_find_port_xdp);
+
 struct net_bridge_fdb_entry *br_fdb_find_rcu(struct net_bridge *br,
 					     const unsigned char *addr,
 					     __u16 vid)

From patchwork Fri Jul 31 04:44:19 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yoshiki Komachi
X-Patchwork-Id: 1339219
X-Patchwork-Delegate: bpf@iogearbox.net
From: Yoshiki Komachi
To: "David S. Miller", Alexei Starovoitov, Daniel Borkmann,
 Jesper Dangaard Brouer, John Fastabend, Jakub Kicinski,
 Martin KaFai Lau, Song Liu, Yonghong Song, Andrii Nakryiko, KP Singh,
 Roopa Prabhu, Nikolay Aleksandrov, David Ahern
Cc: Yoshiki Komachi, netdev@vger.kernel.org,
 bridge@lists.linux-foundation.org, bpf@vger.kernel.org
Subject: [RFC PATCH bpf-next 2/3] bpf: Add helper to do forwarding lookups in kernel FDB table
Date: Fri, 31 Jul 2020 13:44:19 +0900
Message-Id: <1596170660-5582-3-git-send-email-komachi.yoshiki@gmail.com>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1596170660-5582-1-git-send-email-komachi.yoshiki@gmail.com>
References: <1596170660-5582-1-git-send-email-komachi.yoshiki@gmail.com>
Sender: bpf-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

This patch adds a new bpf helper that lets XDP programs look up the
kernel FDB. With the helper, an XDP program can quickly find the
destination port on the master bridge in the XDP layer. If a matching
entry is found, the egress device index is returned in the ifindex
field of the lookup parameters. On failure, the XDP program either
drops the packet or passes it to the upper networking stack in the
kernel.

Multicast and broadcast packets are currently not supported, so they
need to be passed to the upper layer with the XDP_PASS action.

The API uses the destination MAC address and VLAN ID as keys, so XDP
programs need to extract these from the packets to be forwarded.

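As a rough illustration of the key extraction described above, the following userspace C sketch pulls the destination MAC and VLAN ID out of a raw Ethernet frame. It is not part of this series: struct fdb_key and fdb_key_from_frame() are hypothetical names, and the struct only mirrors the addr and vlan_id fields of struct bpf_fdb_lookup. An XDP program would do the equivalent parsing on ctx->data, with bounds checks against ctx->data_end.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN      6       /* bytes in a MAC address */
#define ETH_HLEN      14      /* dst MAC + src MAC + EtherType/TPID */
#define VLAN_HLEN     4       /* TCI + encapsulated EtherType */
#define VLAN_VID_MASK 0x0fff  /* low 12 bits of the TCI */

/* Hypothetical mirror of the key fields of struct bpf_fdb_lookup. */
struct fdb_key {
	uint8_t  addr[ETH_ALEN];
	uint16_t vlan_id;
};

/* Read a big-endian (network order) 16-bit value. */
static uint16_t be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Fill *key from a raw Ethernet frame (hypothetical helper).
 * vlan_proto is the TPID to treat as a VLAN tag, in host byte order
 * (0x8100 for 802.1Q, 0x88a8 for 802.1ad); 0 disables tag parsing.
 * Returns 0 on success, -1 if the frame is too short.
 */
static int fdb_key_from_frame(const uint8_t *frame, size_t len,
			      uint16_t vlan_proto, struct fdb_key *key)
{
	if (len < ETH_HLEN)
		return -1;

	memset(key, 0, sizeof(*key));
	/* The destination MAC is the first field of the Ethernet header. */
	memcpy(key->addr, frame, ETH_ALEN);

	if (vlan_proto && be16(frame + 2 * ETH_ALEN) == vlan_proto) {
		if (len < ETH_HLEN + VLAN_HLEN)
			return -1;
		/* The TCI follows the TPID; mask off PCP and DEI bits. */
		key->vlan_id = be16(frame + ETH_HLEN) & VLAN_VID_MASK;
	}

	return 0;
}
```

For an untagged frame, or when vlan_proto is 0, vlan_id stays 0 in this sketch, which matches how the sample in patch 3/3 zeroes the lookup parameters before filling them in.
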
Signed-off-by: Yoshiki Komachi
---
 include/uapi/linux/bpf.h       | 28 +++++++++++++++++++++
 net/core/filter.c              | 45 ++++++++++++++++++++++++++++++++++
 scripts/bpf_helpers_doc.py     |  1 +
 tools/include/uapi/linux/bpf.h | 28 +++++++++++++++++++++
 4 files changed, 102 insertions(+)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 54d0c886e3ba..f2e729dd1721 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2149,6 +2149,22 @@ union bpf_attr {
  *		* > 0 one of **BPF_FIB_LKUP_RET_** codes explaining why the
  *		  packet is not forwarded or needs assist from full stack
  *
+ * long bpf_fdb_lookup(void *ctx, struct bpf_fdb_lookup *params, int plen, u32 flags)
+ *	Description
+ *		Do FDB lookup in kernel tables using parameters in *params*.
+ *		If lookup is successful (i.e., FDB lookup finds a destination entry),
+ *		ifindex is set to the egress device index from the FDB lookup.
+ *		Both multicast and broadcast packets are currently unsupported
+ *		in XDP layer.
+ *
+ *		*plen* argument is the size of the passed **struct bpf_fdb_lookup**.
+ *		*ctx* is only **struct xdp_md** for XDP programs.
+ *
+ *	Return
+ *		* < 0 if any input argument is invalid
+ *		* 0 on success (destination port is found)
+ *		* > 0 on failure (there is no entry)
+ *
  * long bpf_sock_hash_update(struct bpf_sock_ops *skops, struct bpf_map *map, void *key, u64 flags)
  *	Description
  *		Add an entry to, or update a sockhash *map* referencing sockets.
@@ -3449,6 +3465,7 @@ union bpf_attr {
 	FN(get_stack),			\
 	FN(skb_load_bytes_relative),	\
 	FN(fib_lookup),			\
+	FN(fdb_lookup),			\
 	FN(sock_hash_update),		\
 	FN(msg_redirect_hash),		\
 	FN(sk_redirect_hash),		\
@@ -4328,6 +4345,17 @@ struct bpf_fib_lookup {
 	__u8	dmac[6];     /* ETH_ALEN */
 };
 
+enum {
+	BPF_FDB_LKUP_RET_SUCCESS,	/* lookup successful */
+	BPF_FDB_LKUP_RET_NOENT,		/* entry is not found */
+};
+
+struct bpf_fdb_lookup {
+	unsigned char addr[6];	/* ETH_ALEN */
+	__u16 vlan_id;
+	__u32 ifindex;
+};
+
 enum bpf_task_fd_type {
 	BPF_FD_TYPE_RAW_TRACEPOINT,	/* tp name */
 	BPF_FD_TYPE_TRACEPOINT,		/* tp name */
diff --git a/net/core/filter.c b/net/core/filter.c
index 654c346b7d91..68800d1b8cd5 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -45,6 +45,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -5084,6 +5085,46 @@ static const struct bpf_func_proto bpf_skb_fib_lookup_proto = {
 	.arg4_type	= ARG_ANYTHING,
 };
 
+#if IS_ENABLED(CONFIG_BRIDGE)
+BPF_CALL_4(bpf_xdp_fdb_lookup, struct xdp_buff *, ctx,
+	   struct bpf_fdb_lookup *, params, int, plen, u32, flags)
+{
+	struct net_device *src, *dst;
+	struct net *net;
+
+	if (plen < sizeof(*params))
+		return -EINVAL;
+
+	net = dev_net(ctx->rxq->dev);
+
+	if (is_multicast_ether_addr(params->addr) ||
+	    is_broadcast_ether_addr(params->addr))
+		return BPF_FDB_LKUP_RET_NOENT;
+
+	src = dev_get_by_index_rcu(net, params->ifindex);
+	if (unlikely(!src))
+		return -ENODEV;
+
+	dst = br_fdb_find_port_xdp(src, params->addr, params->vlan_id);
+	if (dst) {
+		params->ifindex = dst->ifindex;
+		return BPF_FDB_LKUP_RET_SUCCESS;
+	}
+
+	return BPF_FDB_LKUP_RET_NOENT;
+}
+
+static const struct bpf_func_proto bpf_xdp_fdb_lookup_proto = {
+	.func		= bpf_xdp_fdb_lookup,
+	.gpl_only	= true,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_CTX,
+	.arg2_type	= ARG_PTR_TO_MEM,
+	.arg3_type	= ARG_CONST_SIZE,
+	.arg4_type	= ARG_ANYTHING,
+};
+#endif
+
 #if IS_ENABLED(CONFIG_IPV6_SEG6_BPF)
 static int bpf_push_seg6_encap(struct sk_buff *skb, u32 type, void *hdr, u32 len)
 {
@@ -6477,6 +6518,10 @@ xdp_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_xdp_adjust_tail_proto;
 	case BPF_FUNC_fib_lookup:
 		return &bpf_xdp_fib_lookup_proto;
+#if IS_ENABLED(CONFIG_BRIDGE)
+	case BPF_FUNC_fdb_lookup:
+		return &bpf_xdp_fdb_lookup_proto;
+#endif
 #ifdef CONFIG_INET
 	case BPF_FUNC_sk_lookup_udp:
 		return &bpf_xdp_sk_lookup_udp_proto;
diff --git a/scripts/bpf_helpers_doc.py b/scripts/bpf_helpers_doc.py
index 5bfa448b4704..49ebd2273614 100755
--- a/scripts/bpf_helpers_doc.py
+++ b/scripts/bpf_helpers_doc.py
@@ -448,6 +448,7 @@ class PrinterHelpers(Printer):
             '__wsum',
 
             'struct bpf_fib_lookup',
+            'struct bpf_fdb_lookup',
             'struct bpf_perf_event_data',
             'struct bpf_perf_event_value',
             'struct bpf_pidns_info',
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 54d0c886e3ba..f2e729dd1721 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -2149,6 +2149,22 @@ union bpf_attr {
  *		* > 0 one of **BPF_FIB_LKUP_RET_** codes explaining why the
  *		  packet is not forwarded or needs assist from full stack
  *
+ * long bpf_fdb_lookup(void *ctx, struct bpf_fdb_lookup *params, int plen, u32 flags)
+ *	Description
+ *		Do FDB lookup in kernel tables using parameters in *params*.
+ *		If lookup is successful (i.e., FDB lookup finds a destination entry),
+ *		ifindex is set to the egress device index from the FDB lookup.
+ *		Both multicast and broadcast packets are currently unsupported
+ *		in XDP layer.
+ *
+ *		*plen* argument is the size of the passed **struct bpf_fdb_lookup**.
+ *		*ctx* is only **struct xdp_md** for XDP programs.
+ *
+ *	Return
+ *		* < 0 if any input argument is invalid
+ *		* 0 on success (destination port is found)
+ *		* > 0 on failure (there is no entry)
+ *
  * long bpf_sock_hash_update(struct bpf_sock_ops *skops, struct bpf_map *map, void *key, u64 flags)
  *	Description
  *		Add an entry to, or update a sockhash *map* referencing sockets.
@@ -3449,6 +3465,7 @@ union bpf_attr {
 	FN(get_stack),			\
 	FN(skb_load_bytes_relative),	\
 	FN(fib_lookup),			\
+	FN(fdb_lookup),			\
 	FN(sock_hash_update),		\
 	FN(msg_redirect_hash),		\
 	FN(sk_redirect_hash),		\
@@ -4328,6 +4345,17 @@ struct bpf_fib_lookup {
 	__u8	dmac[6];     /* ETH_ALEN */
 };
 
+enum {
+	BPF_FDB_LKUP_RET_SUCCESS,	/* lookup successful */
+	BPF_FDB_LKUP_RET_NOENT,		/* entry is not found */
+};
+
+struct bpf_fdb_lookup {
+	unsigned char addr[6];	/* ETH_ALEN */
+	__u16 vlan_id;
+	__u32 ifindex;
+};
+
 enum bpf_task_fd_type {
 	BPF_FD_TYPE_RAW_TRACEPOINT,	/* tp name */
 	BPF_FD_TYPE_TRACEPOINT,		/* tp name */

From patchwork Fri Jul 31 04:44:20 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Yoshiki Komachi
X-Patchwork-Id: 1339221
X-Patchwork-Delegate: bpf@iogearbox.net
From: Yoshiki Komachi
To: "David S. Miller", Alexei Starovoitov, Daniel Borkmann,
 Jesper Dangaard Brouer, John Fastabend, Jakub Kicinski,
 Martin KaFai Lau, Song Liu, Yonghong Song, Andrii Nakryiko, KP Singh,
 Roopa Prabhu, Nikolay Aleksandrov, David Ahern
Cc: Yoshiki Komachi, netdev@vger.kernel.org,
 bridge@lists.linux-foundation.org, bpf@vger.kernel.org
Subject: [RFC PATCH bpf-next 3/3] samples/bpf: Add a simple bridge example accelerated with XDP
Date: Fri, 31 Jul 2020 13:44:20 +0900
Message-Id: <1596170660-5582-4-git-send-email-komachi.yoshiki@gmail.com>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1596170660-5582-1-git-send-email-komachi.yoshiki@gmail.com>
References: <1596170660-5582-1-git-send-email-komachi.yoshiki@gmail.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

This patch adds a simple example of an XDP-based bridge that uses the
new bpf_fdb_lookup helper. The program simply forwards packets to the
destination port given by the kernel FDB. Note that neither vlan
filtering nor learning is currently supported in this example.

There is a separate plan to rework the userspace application
(xdp_bridge_user.c) into a daemon process, which would automate both
detecting status changes of bridge ports and handling vlan protocol
updates.

Note: David Ahern suggested a new bpf helper [1] to get master
vlan/bonding devices in XDP programs attached to their slaves when the
master vlan/bonding devices are bridge ports. If this idea is accepted
and the helper is introduced in the future, we can handle interfaces
slaved to vlan/bonding devices in this sample by calling the suggested
bpf helper (I guess it can get vlan/bonding ifindex from their slave
ifindex).
Note that the bpf_fdb_lookup() API does not need to change to use such
a feature; only bpf programs like this sample need to be modified.

[1]: http://vger.kernel.org/lpc-networking2018.html#session-1

Signed-off-by: Yoshiki Komachi
---
 samples/bpf/Makefile          |   3 +
 samples/bpf/xdp_bridge_kern.c | 129 ++++++++++++++++++
 samples/bpf/xdp_bridge_user.c | 239 ++++++++++++++++++++++++++++++++++
 3 files changed, 371 insertions(+)
 create mode 100644 samples/bpf/xdp_bridge_kern.c
 create mode 100644 samples/bpf/xdp_bridge_user.c

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index f87ee02073ba..d470368fe8de 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -53,6 +53,7 @@ tprogs-y += task_fd_query
 tprogs-y += xdp_sample_pkts
 tprogs-y += ibumad
 tprogs-y += hbm
+tprogs-y += xdp_bridge
 
 # Libbpf dependencies
 LIBBPF = $(TOOLS_PATH)/lib/bpf/libbpf.a
@@ -109,6 +110,7 @@ task_fd_query-objs := bpf_load.o task_fd_query_user.o $(TRACE_HELPERS)
 xdp_sample_pkts-objs := xdp_sample_pkts_user.o $(TRACE_HELPERS)
 ibumad-objs := bpf_load.o ibumad_user.o $(TRACE_HELPERS)
 hbm-objs := bpf_load.o hbm.o $(CGROUP_HELPERS)
+xdp_bridge-objs := xdp_bridge_user.o
 
 # Tell kbuild to always build the programs
 always-y := $(tprogs-y)
@@ -170,6 +172,7 @@ always-y += ibumad_kern.o
 always-y += hbm_out_kern.o
 always-y += hbm_edt_kern.o
 always-y += xdpsock_kern.o
+always-y += xdp_bridge_kern.o
 
 ifeq ($(ARCH), arm)
 # Strip all except -D__LINUX_ARM_ARCH__ option needed to handle linux
diff --git a/samples/bpf/xdp_bridge_kern.c b/samples/bpf/xdp_bridge_kern.c
new file mode 100644
index 000000000000..00f802503199
--- /dev/null
+++ b/samples/bpf/xdp_bridge_kern.c
@@ -0,0 +1,129 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2020 NTT Corp. All Rights Reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ */
+#define KBUILD_MODNAME "foo"
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include
+
+struct {
+	__uint(type, BPF_MAP_TYPE_DEVMAP_HASH);
+	__uint(key_size, sizeof(int));
+	__uint(value_size, sizeof(int));
+	__uint(max_entries, 64);
+} xdp_tx_ports SEC(".maps");
+
+static __always_inline int xdp_bridge_proto(struct xdp_md *ctx, u16 br_vlan_proto)
+{
+	void *data_end = (void *)(long)ctx->data_end;
+	void *data = (void *)(long)ctx->data;
+	struct bpf_fdb_lookup fdb_lookup_params;
+	struct vlan_hdr *vlan_hdr = NULL;
+	struct ethhdr *eth = data;
+	u16 h_proto;
+	u64 nh_off;
+	int rc;
+
+	nh_off = sizeof(*eth);
+	if (data + nh_off > data_end)
+		return XDP_DROP;
+
+	__builtin_memset(&fdb_lookup_params, 0, sizeof(fdb_lookup_params));
+
+	h_proto = eth->h_proto;
+
+	if (unlikely(ntohs(h_proto) < ETH_P_802_3_MIN))
+		return XDP_PASS;
+
+	/* Handle VLAN tagged packet */
+	if (h_proto == br_vlan_proto) {
+		vlan_hdr = (void *)eth + nh_off;
+		nh_off += sizeof(*vlan_hdr);
+		if ((void *)eth + nh_off > data_end)
+			return XDP_PASS;
+
+		fdb_lookup_params.vlan_id = ntohs(vlan_hdr->h_vlan_TCI) &
+					VLAN_VID_MASK;
+	}
+
+	/* FIXME: Although Linux bridge provides us with vlan filtering (contains
+	 * PVID) at ingress, the feature is currently unsupported in this XDP program.
+	 *
+	 * Two ideas to realize the vlan filtering are below:
+	 *   1. userspace daemon monitors bridge vlan events and notifies XDP programs
+	 *      of them through BPF maps
+	 *   2. introduce another bpf helper to retrieve bridge vlan information
+	 *
+	 *
+	 * FIXME: After the vlan filtering, learning feature is required here, but
+	 * it is currently unsupported as well. If another bpf helper for learning
+	 * is accepted, the processing could be implemented in the future.
+	 */
+
+	memcpy(&fdb_lookup_params.addr, eth->h_dest, ETH_ALEN);
+
+	/* Note: This program always treats the ingress interface (ifindex) as
+	 * a bridge port. Linux networking devices can be stacked and physical
+	 * interfaces are not necessarily slaves of bridges (e.g., bonding or
+	 * vlan devices can be slaves of bridges), but stacked bridge ports are
+	 * currently unsupported in this program. In such cases, XDP programs
+	 * should be attached to a lower device in order to process packets with
+	 * higher speed. Then, a new bpf helper to find upper devices will be
+	 * required here in the future because they will be registered on FDB
+	 * in the kernel.
+	 */
+	fdb_lookup_params.ifindex = ctx->ingress_ifindex;
+
+	rc = bpf_fdb_lookup(ctx, &fdb_lookup_params, sizeof(fdb_lookup_params), 0);
+	if (rc != BPF_FDB_LKUP_RET_SUCCESS) {
+		/* In cases of flooding, XDP_PASS will be returned here */
+		return XDP_PASS;
+	}
+
+	/* FIXME: Although Linux bridge provides us with vlan filtering (contains
+	 * untagged policy) at egress as well, the feature is currently unsupported
+	 * in this XDP program.
+	 *
+	 * Two ideas to realize the vlan filtering are below:
+	 *   1. userspace daemon monitors bridge vlan events and notifies XDP programs
+	 *      of them through BPF maps
+	 *   2. introduce another bpf helper to retrieve bridge vlan information
+	 */
+
+	return bpf_redirect_map(&xdp_tx_ports, fdb_lookup_params.ifindex, XDP_PASS);
+}
+
+SEC("xdp_bridge")
+int xdp_bridge_prog(struct xdp_md *ctx)
+{
+	return xdp_bridge_proto(ctx, 0);
+}
+
+SEC("xdp_8021q_bridge")
+int xdp_8021q_bridge_prog(struct xdp_md *ctx)
+{
+	return xdp_bridge_proto(ctx, htons(ETH_P_8021Q));
+}
+
+SEC("xdp_8021ad_bridge")
+int xdp_8021ad_bridge_prog(struct xdp_md *ctx)
+{
+	return xdp_bridge_proto(ctx, htons(ETH_P_8021AD));
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/samples/bpf/xdp_bridge_user.c b/samples/bpf/xdp_bridge_user.c
new file mode 100644
index 000000000000..6ed0a2ece6f4
--- /dev/null
+++ b/samples/bpf/xdp_bridge_user.c
@@ -0,0 +1,239 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2020 NTT Corp. All Rights Reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include
+#include
+
+#define STRERR_BUFSIZE 128
+
+static __u32 xdp_flags = XDP_FLAGS_UPDATE_IF_NOEXIST;
+
+static int do_attach(int idx, int prog_fd, int map_fd, const char *name)
+{
+	int err;
+
+	err = bpf_set_link_xdp_fd(idx, prog_fd, xdp_flags);
+	if (err < 0) {
+		printf("ERROR: failed to attach program to %s\n", name);
+		return err;
+	}
+
+	/* Adding ifindex as a possible egress TX port */
+	err = bpf_map_update_elem(map_fd, &idx, &idx, 0);
+	if (err)
+		printf("ERROR: failed using device %s as TX-port\n", name);
+
+	return err;
+}
+
+static int do_detach(int idx, const char *name)
+{
+	int err;
+
+	err = bpf_set_link_xdp_fd(idx, -1, xdp_flags);
+	if (err < 0)
+		printf("ERROR: failed to detach program from %s\n", name);
+
+	/* FIXME: Need to delete the corresponding entry in shared devmap
+	 * with bpf_map_delete_elem(map_fd, &idx);
+	 */
+	return err;
+}
+
+static int do_reuse_map(struct bpf_map *map, char *pin_path, bool *pinned)
+{
+	const char *path = "/sys/fs/bpf/xdp_bridge";
+	char errmsg[STRERR_BUFSIZE];
+	int err, len, pin_fd;
+
+	len = snprintf(pin_path, PATH_MAX, "%s/%s", path, bpf_map__name(map));
+	if (len < 0)
+		return -EINVAL;
+	else if (len >= PATH_MAX)
+		return -ENAMETOOLONG;
+
+	pin_fd = bpf_obj_get(pin_path);
+	if (pin_fd < 0) {
+		err = -errno;
+		if (err == -ENOENT) {
+			*pinned = false;
+			return 0;
+		}
+
+		libbpf_strerror(-err, errmsg, sizeof(errmsg));
+		printf("couldn't retrieve pinned map: %s\n", errmsg);
+		return err;
+	}
+
+	err = bpf_map__reuse_fd(map, pin_fd);
+	if (err) {
+		printf("failed to reuse map: %s\n", strerror(errno));
+		close(pin_fd);
+	}
+
+	return err;
+}
+
+static void usage(const char *prog)
+{
+	fprintf(stderr,
+		"usage: %s [OPTS] interface-list\n"
+		"\nOPTS:\n"
+		" -Q enable vlan filtering (802.1Q)\n"
+		" -A enable vlan filtering (802.1ad)\n"
+		" -d detach program\n",
+		prog);
+}
+
+int main(int argc, char **argv)
+{
+	struct bpf_object_open_attr attr = {
+		.prog_type = BPF_PROG_TYPE_XDP,
+	};
+	char filename[PATH_MAX], pin_path[PATH_MAX];
+	const char *prog_name = "xdp_bridge";
+	int prog_fd = -1, map_fd = -1;
+	struct bpf_program *prog;
+	struct bpf_object *obj;
+	int opt, i, idx, err;
+	struct bpf_map *map;
+	bool pinned = true;
+	int attach = 1;
+	int ret = 0;
+
+	while ((opt = getopt(argc, argv, ":dQASF")) != -1) {
+		switch (opt) {
+		case 'd':
+			attach = 0;
+			break;
+		case 'S':
+			xdp_flags |= XDP_FLAGS_SKB_MODE;
+			break;
+		case 'F':
+			xdp_flags &= ~XDP_FLAGS_UPDATE_IF_NOEXIST;
+			break;
+		case 'Q':
+			prog_name = "xdp_8021q_bridge";
+			break;
+		case 'A':
+			prog_name = "xdp_8021ad_bridge";
+			break;
+		default:
+			usage(basename(argv[0]));
+			return 1;
+		}
+	}
+
+	if (!(xdp_flags & XDP_FLAGS_SKB_MODE))
+		xdp_flags |= XDP_FLAGS_DRV_MODE;
+
+	if (optind == argc) {
+		usage(basename(argv[0]));
+		return 1;
+	}
+
+	if (attach) {
+		snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
+		attr.file = filename;
+
+		if (access(filename, R_OK) < 0) {
+			printf("error accessing file %s: %s\n",
+			       filename, strerror(errno));
+			return 1;
+		}
+
+		obj = bpf_object__open_xattr(&attr);
+		if (libbpf_get_error(obj)) {
+			printf("cannot open xdp program: %s\n", strerror(errno));
+			return 1;
+		}
+
+		map = bpf_object__find_map_by_name(obj, "xdp_tx_ports");
+		if (libbpf_get_error(map)) {
+			printf("map not found: %s\n", strerror(errno));
+			goto err;
+		}
+
+		err = do_reuse_map(map, pin_path, &pinned);
+		if (err) {
+			printf("error reusing map %s: %s\n",
+			       bpf_map__name(map), strerror(errno));
+			goto err;
+		}
+
+		err = bpf_object__load(obj);
+		if (err) {
+			printf("cannot load xdp program: %s\n", strerror(errno));
+			goto err;
+		}
+
+		prog = bpf_object__find_program_by_title(obj, prog_name);
+		prog_fd = bpf_program__fd(prog);
+		if (prog_fd < 0) {
+			printf("program not found: %s\n", strerror(-prog_fd));
+			goto err;
+		}
+
+		map_fd = bpf_map__fd(map);
+		if (map_fd < 0) {
+			printf("map not found: %s\n", strerror(-map_fd));
+			goto err;
+		}
+
+		if (!pinned) {
+			err = bpf_map__pin(map, pin_path);
+			if (err) {
+				printf("failed to pin map: %s\n", strerror(errno));
+				goto err;
+			}
+		}
+	}
+
+	for (i = optind; i < argc; ++i) {
+		idx = if_nametoindex(argv[i]);
+		if (!idx)
+			idx = strtoul(argv[i], NULL, 0);
+
+		if (!idx) {
+			fprintf(stderr, "Invalid arg\n");
+			return 1;
+		}
+		if (attach) {
+			err = do_attach(idx, prog_fd, map_fd, argv[i]);
+			if (err)
+				ret = err;
+		} else {
+			err = do_detach(idx, argv[i]);
+			if (err)
+				ret = err;
+		}
+	}
+
+	return ret;
+err:
+	bpf_object__close(obj);
+	return 1;
+}