From patchwork Thu Nov 7 17:47:36 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Magnus Karlsson
X-Patchwork-Id: 1191339
X-Patchwork-Delegate: bpf@iogearbox.net
Return-Path:
X-Original-To: patchwork-incoming-netdev@ozlabs.org
Delivered-To: patchwork-incoming-netdev@ozlabs.org
Authentication-Results: ozlabs.org; spf=none (no SPF record)
 smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67;
 helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org;
 receiver=)
Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none)
 header.from=intel.com
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
 by ozlabs.org (Postfix) with ESMTP id 4789pz6dS2z9sPk
 for ; Fri, 8 Nov 2019 04:47:55 +1100 (AEDT)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S2389787AbfKGRry (ORCPT ); Thu, 7 Nov 2019 12:47:54 -0500
Received: from mga03.intel.com ([134.134.136.65]:33463 "EHLO mga03.intel.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1730591AbfKGRrx (ORCPT ); Thu, 7 Nov 2019 12:47:53 -0500
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga004.jf.intel.com ([10.7.209.38])
 by orsmga103.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
 07 Nov 2019 09:47:53 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.68,278,1569308400"; d="scan'208";a="353858401"
Received: from unknown (HELO VM.jf.intel.com) ([10.78.3.78])
 by orsmga004.jf.intel.com with ESMTP; 07 Nov 2019 09:47:52 -0800
From: Magnus Karlsson
To: magnus.karlsson@intel.com, bjorn.topel@intel.com, ast@kernel.org,
 daniel@iogearbox.net, netdev@vger.kernel.org, jonathan.lemon@gmail.com,
 u9012063@gmail.com
Cc: bpf@vger.kernel.org
Subject: [PATCH bpf-next 1/5] libbpf: support XDP_SHARED_UMEM with external XDP program
Date: Thu, 7 Nov 2019 18:47:36 +0100
Message-Id: <1573148860-30254-2-git-send-email-magnus.karlsson@intel.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1573148860-30254-1-git-send-email-magnus.karlsson@intel.com>
References: <1573148860-30254-1-git-send-email-magnus.karlsson@intel.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

Add support in libbpf for creating multiple sockets that share a single
umem. Note that an external XDP program needs to be supplied to route
the incoming traffic to the desired sockets, so you must set
XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD in libbpf_flags and load your own
XDP program.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Tested-by: William Tu <u9012063@gmail.com>
---
 tools/lib/bpf/xsk.c | 27 +++++++++++++++++----------
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
index 86c1b61..8ebd810 100644
--- a/tools/lib/bpf/xsk.c
+++ b/tools/lib/bpf/xsk.c
@@ -586,15 +586,21 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
 	if (!umem || !xsk_ptr || !rx || !tx)
 		return -EFAULT;
 
-	if (umem->refcount) {
-		pr_warn("Error: shared umems not supported by libbpf.\n");
-		return -EBUSY;
-	}
-
 	xsk = calloc(1, sizeof(*xsk));
 	if (!xsk)
 		return -ENOMEM;
 
+	err = xsk_set_xdp_socket_config(&xsk->config, usr_config);
+	if (err)
+		goto out_xsk_alloc;
+
+	if (umem->refcount &&
+	    !(xsk->config.libbpf_flags & XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD)) {
+		pr_warn("Error: shared umems not supported by libbpf supplied XDP program.\n");
+		err = -EBUSY;
+		goto out_xsk_alloc;
+	}
+
 	if (umem->refcount++ > 0) {
 		xsk->fd = socket(AF_XDP, SOCK_RAW, 0);
 		if (xsk->fd < 0) {
@@ -616,10 +622,6 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
 	memcpy(xsk->ifname, ifname, IFNAMSIZ - 1);
 	xsk->ifname[IFNAMSIZ - 1] = '\0';
 
-	err = xsk_set_xdp_socket_config(&xsk->config, usr_config);
-	if (err)
-		goto out_socket;
-
 	if (rx) {
 		err = setsockopt(xsk->fd, SOL_XDP, XDP_RX_RING,
 				 &xsk->config.rx_size,
@@ -687,7 +689,12 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
 	sxdp.sxdp_family = PF_XDP;
 	sxdp.sxdp_ifindex = xsk->ifindex;
 	sxdp.sxdp_queue_id = xsk->queue_id;
-	sxdp.sxdp_flags = xsk->config.bind_flags;
+	if (umem->refcount > 1) {
+		sxdp.sxdp_flags = XDP_SHARED_UMEM;
+		sxdp.sxdp_shared_umem_fd = umem->fd;
+	} else {
+		sxdp.sxdp_flags = xsk->config.bind_flags;
+	}
 
 	err = bind(xsk->fd, (struct sockaddr *)&sxdp, sizeof(sxdp));
 	if (err) {

From patchwork Thu Nov 7 17:47:37 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Magnus Karlsson
X-Patchwork-Id: 1191340
X-Patchwork-Delegate: bpf@iogearbox.net
Return-Path:
X-Original-To: patchwork-incoming-netdev@ozlabs.org
Delivered-To: patchwork-incoming-netdev@ozlabs.org
Authentication-Results: ozlabs.org; spf=none (no SPF record)
 smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67;
 helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org;
 receiver=)
Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none)
 header.from=intel.com
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
 by ozlabs.org (Postfix) with ESMTP id 4789q23llXz9sNx
 for ; Fri, 8 Nov 2019 04:47:58 +1100 (AEDT)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S2389799AbfKGRrz (ORCPT ); Thu, 7 Nov 2019 12:47:55 -0500
Received: from mga03.intel.com ([134.134.136.65]:33463 "EHLO mga03.intel.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1730400AbfKGRry (ORCPT ); Thu, 7 Nov 2019 12:47:54 -0500
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga004.jf.intel.com ([10.7.209.38])
 by orsmga103.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
 07 Nov 2019 09:47:54 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.68,278,1569308400"; d="scan'208";a="353858406"
Received: from unknown (HELO VM.jf.intel.com) ([10.78.3.78])
 by orsmga004.jf.intel.com with ESMTP; 07 Nov 2019 09:47:53 -0800
From: Magnus Karlsson
To:
 magnus.karlsson@intel.com, bjorn.topel@intel.com, ast@kernel.org,
 daniel@iogearbox.net, netdev@vger.kernel.org, jonathan.lemon@gmail.com,
 u9012063@gmail.com
Cc: bpf@vger.kernel.org
Subject: [PATCH bpf-next 2/5] samples/bpf: add XDP_SHARED_UMEM support to xdpsock
Date: Thu, 7 Nov 2019 18:47:37 +0100
Message-Id: <1573148860-30254-3-git-send-email-magnus.karlsson@intel.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1573148860-30254-1-git-send-email-magnus.karlsson@intel.com>
References: <1573148860-30254-1-git-send-email-magnus.karlsson@intel.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

Add support for the XDP_SHARED_UMEM mode to the xdpsock sample
application. As libbpf does not have a built-in XDP program for this
mode, we use an explicitly loaded XDP program. This also serves as an
example of how to write your own XDP program that can route to an
AF_XDP socket.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Tested-by: William Tu <u9012063@gmail.com>
---
 samples/bpf/Makefile       |   1 +
 samples/bpf/xdpsock.h      |  11 ++++
 samples/bpf/xdpsock_kern.c |  24 ++++++++
 samples/bpf/xdpsock_user.c | 141 +++++++++++++++++++++++++++++++--------------
 4 files changed, 135 insertions(+), 42 deletions(-)
 create mode 100644 samples/bpf/xdpsock.h
 create mode 100644 samples/bpf/xdpsock_kern.c

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 4df11dd..8a9af3a 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -167,6 +167,7 @@ always += xdp_sample_pkts_kern.o
 always += ibumad_kern.o
 always += hbm_out_kern.o
 always += hbm_edt_kern.o
+always += xdpsock_kern.o
 
 ifeq ($(ARCH), arm)
 # Strip all except -D__LINUX_ARM_ARCH__ option needed to handle linux
diff --git a/samples/bpf/xdpsock.h b/samples/bpf/xdpsock.h
new file mode 100644
index 0000000..b7eca15
--- /dev/null
+++ b/samples/bpf/xdpsock.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright(c) 2019 Intel Corporation.
+ */
+
+#ifndef XDPSOCK_H_
+#define XDPSOCK_H_
+
+#define MAX_SOCKS 4
+
+#endif /* XDPSOCK_H */
diff --git a/samples/bpf/xdpsock_kern.c b/samples/bpf/xdpsock_kern.c
new file mode 100644
index 0000000..a06177c
--- /dev/null
+++ b/samples/bpf/xdpsock_kern.c
@@ -0,0 +1,24 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/bpf.h>
+#include "bpf_helpers.h"
+#include "xdpsock.h"
+
+/* This XDP program is only needed for the XDP_SHARED_UMEM mode.
+ * If you do not use this mode, libbpf can supply an XDP program for you.
+ */
+
+struct {
+	__uint(type, BPF_MAP_TYPE_XSKMAP);
+	__uint(max_entries, MAX_SOCKS);
+	__uint(key_size, sizeof(int));
+	__uint(value_size, sizeof(int));
+} xsks_map SEC(".maps");
+
+static unsigned int rr;
+
+SEC("xdp_sock") int xdp_sock_prog(struct xdp_md *ctx)
+{
+	rr = (rr + 1) & (MAX_SOCKS - 1);
+
+	return bpf_redirect_map(&xsks_map, rr, XDP_DROP);
+}
diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c
index 405c4e0..d3dba93 100644
--- a/samples/bpf/xdpsock_user.c
+++ b/samples/bpf/xdpsock_user.c
@@ -29,6 +29,7 @@
 
 #include "libbpf.h"
 #include "xsk.h"
+#include "xdpsock.h"
 #include <bpf/bpf.h>
 
 #ifndef SOL_XDP
@@ -47,7 +48,6 @@
 #define BATCH_SIZE 64
 
 #define DEBUG_HEXDUMP 0
-#define MAX_SOCKS 8
 
 typedef __u64 u64;
 typedef __u32 u32;
@@ -75,7 +75,8 @@ static u32 opt_xdp_bind_flags;
 static int opt_xsk_frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE;
 static int opt_timeout = 1000;
 static bool opt_need_wakeup = true;
-static __u32 prog_id;
+static u32 opt_num_xsks = 1;
+static u32 prog_id;
 
 struct xsk_umem_info {
 	struct xsk_ring_prod fq;
@@ -179,7 +180,7 @@ static void *poller(void *arg)
 
 static void remove_xdp_program(void)
 {
-	__u32 curr_prog_id = 0;
+	u32 curr_prog_id = 0;
 
 	if (bpf_get_link_xdp_id(opt_ifindex, &curr_prog_id, opt_xdp_flags)) {
 		printf("bpf_get_link_xdp_id failed\n");
@@ -196,11 +197,11 @@ static void remove_xdp_program(void)
 static void int_exit(int sig)
 {
 	struct xsk_umem *umem = xsks[0]->umem->umem;
-
-	(void)sig;
+	int i;
 
 	dump_stats();
-	xsk_socket__delete(xsks[0]->xsk);
+	for (i = 0; i < num_socks; i++)
+		xsk_socket__delete(xsks[i]->xsk);
 	(void)xsk_umem__delete(umem);
 	remove_xdp_program();
 
@@ -290,8 +291,8 @@ static struct xsk_umem_info *xsk_configure_umem(void *buffer, u64 size)
 		.frame_headroom = XSK_UMEM__DEFAULT_FRAME_HEADROOM,
 		.flags = opt_umem_flags
 	};
-
-	int ret;
+	int ret, i;
+	u32 idx;
 
 	umem = calloc(1, sizeof(*umem));
 	if (!umem)
@@ -303,6 +304,15 @@ static struct xsk_umem_info *xsk_configure_umem(void *buffer, u64 size)
 	if (ret)
 		exit_with_error(-ret);
 
+	ret = xsk_ring_prod__reserve(&umem->fq,
+				     XSK_RING_PROD__DEFAULT_NUM_DESCS, &idx);
+	if (ret != XSK_RING_PROD__DEFAULT_NUM_DESCS)
+		exit_with_error(-ret);
+	for (i = 0; i < XSK_RING_PROD__DEFAULT_NUM_DESCS; i++)
+		*xsk_ring_prod__fill_addr(&umem->fq, idx++) =
+			i * opt_xsk_frame_size;
+	xsk_ring_prod__submit(&umem->fq, XSK_RING_PROD__DEFAULT_NUM_DESCS);
+
 	umem->buffer = buffer;
 	return umem;
 }
@@ -312,8 +322,6 @@ static struct xsk_socket_info *xsk_configure_socket(struct xsk_umem_info *umem)
 	struct xsk_socket_config cfg;
 	struct xsk_socket_info *xsk;
 	int ret;
-	u32 idx;
-	int i;
 
 	xsk = calloc(1, sizeof(*xsk));
 	if (!xsk)
@@ -322,11 +330,15 @@ static struct xsk_socket_info *xsk_configure_socket(struct xsk_umem_info *umem)
 	xsk->umem = umem;
 	cfg.rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS;
 	cfg.tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS;
-	cfg.libbpf_flags = 0;
+	if (opt_num_xsks > 1)
+		cfg.libbpf_flags = XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD;
+	else
+		cfg.libbpf_flags = 0;
 	cfg.xdp_flags = opt_xdp_flags;
 	cfg.bind_flags = opt_xdp_bind_flags;
-	ret = xsk_socket__create(&xsk->xsk, opt_if, opt_queue, umem->umem,
-				 &xsk->rx, &xsk->tx, &cfg);
+
+	ret = xsk_socket__create(&xsk->xsk, opt_if, opt_queue,
+				 umem->umem, &xsk->rx, &xsk->tx, &cfg);
 	if (ret)
 		exit_with_error(-ret);
 
@@ -334,17 +346,6 @@ static struct xsk_socket_info *xsk_configure_socket(struct xsk_umem_info *umem)
 	if (ret)
 		exit_with_error(-ret);
 
-	ret = xsk_ring_prod__reserve(&xsk->umem->fq,
-				     XSK_RING_PROD__DEFAULT_NUM_DESCS,
-				     &idx);
-	if (ret != XSK_RING_PROD__DEFAULT_NUM_DESCS)
-		exit_with_error(-ret);
-	for (i = 0; i < XSK_RING_PROD__DEFAULT_NUM_DESCS; i++)
-		*xsk_ring_prod__fill_addr(&xsk->umem->fq, idx++) =
-			i * opt_xsk_frame_size;
-	xsk_ring_prod__submit(&xsk->umem->fq,
-			      XSK_RING_PROD__DEFAULT_NUM_DESCS);
-
 	return xsk;
 }
 
@@ -363,6 +364,7 @@ static struct option long_options[] = {
 	{"frame-size", required_argument, 0, 'f'},
 	{"no-need-wakeup", no_argument, 0, 'm'},
 	{"unaligned", no_argument, 0, 'u'},
+	{"shared-umem", no_argument, 0, 'M'},
 	{0, 0, 0, 0}
 };
 
@@ -386,6 +388,7 @@ static void usage(const char *prog)
 		" -m, --no-need-wakeup Turn off use of driver need wakeup flag.\n"
 		" -f, --frame-size=n Set the frame size (must be a power of two in aligned mode, default is %d).\n"
 		" -u, --unaligned Enable unaligned chunk placement\n"
+		" -M, --shared-umem Enable XDP_SHARED_UMEM\n"
 		"\n";
 	fprintf(stderr, str, prog, XSK_UMEM__DEFAULT_FRAME_SIZE);
 	exit(EXIT_FAILURE);
@@ -398,7 +401,7 @@ static void parse_command_line(int argc, char **argv)
 	opterr = 0;
 
 	for (;;) {
-		c = getopt_long(argc, argv, "Frtli:q:psSNn:czf:mu",
+		c = getopt_long(argc, argv, "Frtli:q:psSNn:czf:muM",
 				long_options, &option_index);
 		if (c == -1)
 			break;
@@ -448,11 +451,14 @@ static void parse_command_line(int argc, char **argv)
 			break;
 		case 'f':
 			opt_xsk_frame_size = atoi(optarg);
+			break;
 		case 'm':
 			opt_need_wakeup = false;
 			opt_xdp_bind_flags &= ~XDP_USE_NEED_WAKEUP;
 			break;
-
+		case 'M':
+			opt_num_xsks = MAX_SOCKS;
+			break;
 		default:
 			usage(basename(argv[0]));
 		}
@@ -586,11 +592,9 @@ static void rx_drop(struct xsk_socket_info *xsk, struct pollfd *fds)
 
 static void rx_drop_all(void)
 {
-	struct pollfd fds[MAX_SOCKS + 1];
+	struct pollfd fds[MAX_SOCKS] = {};
 	int i, ret;
 
-	memset(fds, 0, sizeof(fds));
-
 	for (i = 0; i < num_socks; i++) {
 		fds[i].fd = xsk_socket__fd(xsks[i]->xsk);
 		fds[i].events = POLLIN;
@@ -633,11 +637,10 @@ static void tx_only(struct xsk_socket_info *xsk, u32 frame_nb)
 
 static void tx_only_all(void)
 {
-	struct pollfd fds[MAX_SOCKS];
+	struct pollfd fds[MAX_SOCKS] = {};
 	u32 frame_nb[MAX_SOCKS] = {};
 	int i, ret;
 
-	memset(fds, 0, sizeof(fds));
 	for (i = 0; i < num_socks; i++) {
 		fds[0].fd = xsk_socket__fd(xsks[i]->xsk);
 		fds[0].events = POLLOUT;
@@ -706,11 +709,9 @@ static void l2fwd(struct xsk_socket_info *xsk, struct pollfd *fds)
 
 static void l2fwd_all(void)
 {
-	struct pollfd fds[MAX_SOCKS];
+	struct pollfd fds[MAX_SOCKS] = {};
 	int i, ret;
 
-	memset(fds, 0, sizeof(fds));
-
 	for (i = 0; i < num_socks; i++) {
 		fds[i].fd = xsk_socket__fd(xsks[i]->xsk);
 		fds[i].events = POLLOUT | POLLIN;
@@ -728,13 +729,65 @@ static void l2fwd_all(void)
 	}
 }
 
+static void load_xdp_program(char **argv, struct bpf_object **obj)
+{
+	struct bpf_prog_load_attr prog_load_attr = {
+		.prog_type = BPF_PROG_TYPE_XDP,
+	};
+	char xdp_filename[256];
+	int prog_fd;
+
+	snprintf(xdp_filename, sizeof(xdp_filename), "%s_kern.o", argv[0]);
+	prog_load_attr.file = xdp_filename;
+
+	if (bpf_prog_load_xattr(&prog_load_attr, obj, &prog_fd))
+		exit(EXIT_FAILURE);
+	if (prog_fd < 0) {
+		fprintf(stderr, "ERROR: no program found: %s\n",
+			strerror(prog_fd));
+		exit(EXIT_FAILURE);
+	}
+
+	if (bpf_set_link_xdp_fd(opt_ifindex, prog_fd, opt_xdp_flags) < 0) {
+		fprintf(stderr, "ERROR: link set xdp fd failed\n");
+		exit(EXIT_FAILURE);
+	}
+}
+
+static void enter_xsks_into_map(struct bpf_object *obj)
+{
+	struct bpf_map *map;
+	int i, xsks_map;
+
+	map = bpf_object__find_map_by_name(obj, "xsks_map");
+	xsks_map = bpf_map__fd(map);
+	if (xsks_map < 0) {
+		fprintf(stderr, "ERROR: no xsks map found: %s\n",
+			strerror(xsks_map));
+		exit(EXIT_FAILURE);
+	}
+
+	for (i = 0; i < num_socks; i++) {
+		int fd = xsk_socket__fd(xsks[i]->xsk);
+		int key, ret;
+
+		key = i;
+		ret = bpf_map_update_elem(xsks_map, &key, &fd, 0);
+		if (ret) {
+			fprintf(stderr, "ERROR: bpf_map_update_elem %d\n", i);
+			exit(EXIT_FAILURE);
+		}
+	}
+}
+
 int main(int argc, char **argv)
 {
 	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
 	struct xsk_umem_info *umem;
+	struct bpf_object *obj;
 	pthread_t pt;
+	int i, ret;
 	void *bufs;
-	int ret;
 
 	parse_command_line(argc, argv);
 
@@ -744,6 +797,9 @@ int main(int argc, char **argv)
 		exit(EXIT_FAILURE);
 	}
 
+	if (opt_num_xsks > 1)
+		load_xdp_program(argv, &obj);
+
 	/* Reserve memory for the umem. Use hugepages if unaligned chunk mode */
 	bufs = mmap(NULL, NUM_FRAMES * opt_xsk_frame_size,
 		    PROT_READ | PROT_WRITE,
@@ -752,16 +808,17 @@ int main(int argc, char **argv)
 		printf("ERROR: mmap failed\n");
 		exit(EXIT_FAILURE);
 	}
-	/* Create sockets... */
+
+	/* Create sockets... */
 	umem = xsk_configure_umem(bufs, NUM_FRAMES * opt_xsk_frame_size);
-	xsks[num_socks++] = xsk_configure_socket(umem);
+	for (i = 0; i < opt_num_xsks; i++)
+		xsks[num_socks++] = xsk_configure_socket(umem);
 
-	if (opt_bench == BENCH_TXONLY) {
-		int i;
+	for (i = 0; i < NUM_FRAMES; i++)
+		gen_eth_frame(umem, i * opt_xsk_frame_size);
 
-		for (i = 0; i < NUM_FRAMES; i++)
-			(void)gen_eth_frame(umem, i * opt_xsk_frame_size);
-	}
+	if (opt_num_xsks > 1 && opt_bench != BENCH_TXONLY)
+		enter_xsks_into_map(obj);
 
 	signal(SIGINT, int_exit);
 	signal(SIGTERM, int_exit);

From patchwork Thu Nov 7 17:47:38 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Magnus Karlsson
X-Patchwork-Id: 1191342
X-Patchwork-Delegate: bpf@iogearbox.net
Return-Path:
X-Original-To: patchwork-incoming-netdev@ozlabs.org
Delivered-To: patchwork-incoming-netdev@ozlabs.org
Authentication-Results: ozlabs.org; spf=none (no SPF record)
 smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67;
 helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org;
 receiver=)
Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none)
 header.from=intel.com
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
 by ozlabs.org (Postfix) with ESMTP id 4789q36Nqzz9sPk
 for ; Fri, 8 Nov 2019 04:47:59 +1100 (AEDT)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S2389816AbfKGRr5 (ORCPT ); Thu, 7 Nov 2019 12:47:57 -0500
Received: from mga03.intel.com ([134.134.136.65]:33465 "EHLO mga03.intel.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S2389788AbfKGRry (ORCPT ); Thu, 7 Nov 2019 12:47:54 -0500
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga004.jf.intel.com ([10.7.209.38])
 by orsmga103.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
 07 Nov 2019 09:47:54 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.68,278,1569308400"; d="scan'208";a="353858409"
Received: from unknown (HELO VM.jf.intel.com) ([10.78.3.78])
 by orsmga004.jf.intel.com with ESMTP; 07 Nov 2019 09:47:54 -0800
From: Magnus Karlsson
To: magnus.karlsson@intel.com, bjorn.topel@intel.com, ast@kernel.org,
 daniel@iogearbox.net, netdev@vger.kernel.org, jonathan.lemon@gmail.com,
 u9012063@gmail.com
Cc: bpf@vger.kernel.org
Subject: [PATCH bpf-next 3/5] libbpf: allow for creating Rx or Tx only AF_XDP sockets
Date: Thu, 7 Nov 2019 18:47:38 +0100
Message-Id: <1573148860-30254-4-git-send-email-magnus.karlsson@intel.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1573148860-30254-1-git-send-email-magnus.karlsson@intel.com>
References: <1573148860-30254-1-git-send-email-magnus.karlsson@intel.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

Extend the libbpf AF_XDP code to allow the creation of Rx-only or
Tx-only sockets. Previously, xsk_socket__create() returned an error
unless the socket was initialized for both Rx and Tx.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Tested-by: William Tu <u9012063@gmail.com>
---
 tools/lib/bpf/xsk.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
index 8ebd810..303ed63 100644
--- a/tools/lib/bpf/xsk.c
+++ b/tools/lib/bpf/xsk.c
@@ -562,7 +562,8 @@ static int xsk_setup_xdp_prog(struct xsk_socket *xsk)
 		}
 	}
 
-	err = xsk_set_bpf_maps(xsk);
+	if (xsk->rx)
+		err = xsk_set_bpf_maps(xsk);
 	if (err) {
 		xsk_delete_bpf_maps(xsk);
 		close(xsk->prog_fd);
@@ -583,7 +584,7 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
 	struct xsk_socket *xsk;
 	int err;
 
-	if (!umem || !xsk_ptr || !rx || !tx)
+	if (!umem || !xsk_ptr || !(rx || tx))
 		return -EFAULT;
 
 	xsk = calloc(1, sizeof(*xsk));

From patchwork Thu Nov 7 17:47:39 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Magnus Karlsson
X-Patchwork-Id: 1191341
X-Patchwork-Delegate: bpf@iogearbox.net
Return-Path:
X-Original-To: patchwork-incoming-netdev@ozlabs.org
Delivered-To: patchwork-incoming-netdev@ozlabs.org
Authentication-Results: ozlabs.org; spf=none (no SPF record)
 smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67;
 helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org;
 receiver=)
Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none)
 header.from=intel.com
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
 by ozlabs.org (Postfix) with ESMTP id 4789q323wFz9sP6
 for ; Fri, 8 Nov 2019 04:47:59 +1100 (AEDT)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S2389824AbfKGRr5 (ORCPT ); Thu, 7 Nov 2019 12:47:57 -0500
Received: from mga03.intel.com ([134.134.136.65]:33466 "EHLO mga03.intel.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S2389795AbfKGRrz (ORCPT ); Thu, 7 Nov 2019 12:47:55 -0500
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga004.jf.intel.com ([10.7.209.38])
 by orsmga103.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
 07 Nov 2019 09:47:55 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.68,278,1569308400"; d="scan'208";a="353858413"
Received: from unknown (HELO VM.jf.intel.com) ([10.78.3.78])
 by orsmga004.jf.intel.com with ESMTP; 07 Nov 2019 09:47:54 -0800
From: Magnus Karlsson
To: magnus.karlsson@intel.com, bjorn.topel@intel.com, ast@kernel.org,
 daniel@iogearbox.net, netdev@vger.kernel.org, jonathan.lemon@gmail.com,
 u9012063@gmail.com
Cc: bpf@vger.kernel.org
Subject: [PATCH bpf-next 4/5] samples/bpf: use Rx-only and Tx-only sockets in xdpsock
Date: Thu, 7 Nov 2019 18:47:39 +0100
Message-Id: <1573148860-30254-5-git-send-email-magnus.karlsson@intel.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1573148860-30254-1-git-send-email-magnus.karlsson@intel.com>
References: <1573148860-30254-1-git-send-email-magnus.karlsson@intel.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

Use Rx-only sockets for the rxdrop sample and Tx-only sockets for the
txonly sample in the xdpsock application. This is so that we exercise
and showcase these socket types too.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Tested-by: William Tu <u9012063@gmail.com>
---
 samples/bpf/xdpsock_user.c | 41 +++++++++++++++++++++++++++++------------
 1 file changed, 29 insertions(+), 12 deletions(-)

diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c
index d3dba93..a1f96e5 100644
--- a/samples/bpf/xdpsock_user.c
+++ b/samples/bpf/xdpsock_user.c
@@ -291,8 +291,7 @@ static struct xsk_umem_info *xsk_configure_umem(void *buffer, u64 size)
 		.frame_headroom = XSK_UMEM__DEFAULT_FRAME_HEADROOM,
 		.flags = opt_umem_flags
 	};
-	int ret, i;
-	u32 idx;
+	int ret;
 
 	umem = calloc(1, sizeof(*umem));
 	if (!umem)
@@ -300,10 +299,18 @@ static struct xsk_umem_info *xsk_configure_umem(void *buffer, u64 size)
 
 	ret = xsk_umem__create(&umem->umem, buffer, size, &umem->fq, &umem->cq,
 			       &cfg);
-
 	if (ret)
 		exit_with_error(-ret);
 
+	umem->buffer = buffer;
+	return umem;
+}
+
+static void xsk_populate_fill_ring(struct xsk_umem_info *umem)
+{
+	int ret, i;
+	u32 idx;
+
 	ret = xsk_ring_prod__reserve(&umem->fq,
 				     XSK_RING_PROD__DEFAULT_NUM_DESCS, &idx);
 	if (ret != XSK_RING_PROD__DEFAULT_NUM_DESCS)
@@ -312,15 +319,15 @@ static struct xsk_umem_info *xsk_configure_umem(void *buffer, u64 size)
 		*xsk_ring_prod__fill_addr(&umem->fq, idx++) =
 			i * opt_xsk_frame_size;
 	xsk_ring_prod__submit(&umem->fq, XSK_RING_PROD__DEFAULT_NUM_DESCS);
-
-	umem->buffer = buffer;
-	return umem;
 }
 
-static struct xsk_socket_info *xsk_configure_socket(struct xsk_umem_info *umem)
+static struct xsk_socket_info *xsk_configure_socket(struct xsk_umem_info *umem,
+						    bool rx, bool tx)
 {
 	struct xsk_socket_config cfg;
 	struct xsk_socket_info *xsk;
+	struct xsk_ring_cons *rxr;
+	struct xsk_ring_prod *txr;
 	int ret;
 
 	xsk = calloc(1, sizeof(*xsk));
@@ -337,8 +344,10 @@ static struct xsk_socket_info *xsk_configure_socket(struct xsk_umem_info *umem)
 	cfg.xdp_flags = opt_xdp_flags;
 	cfg.bind_flags = opt_xdp_bind_flags;
 
-	ret = xsk_socket__create(&xsk->xsk, opt_if, opt_queue,
-				 umem->umem, &xsk->rx, &xsk->tx, &cfg);
+	rxr = rx ? &xsk->rx : NULL;
+	txr = tx ? &xsk->tx : NULL;
+	ret = xsk_socket__create(&xsk->xsk, opt_if, opt_queue, umem->umem,
+				 rxr, txr, &cfg);
 	if (ret)
 		exit_with_error(-ret);
 
@@ -783,6 +792,7 @@ static void enter_xsks_into_map(struct bpf_object *obj)
 int main(int argc, char **argv)
 {
 	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
+	bool rx = false, tx = false;
 	struct xsk_umem_info *umem;
 	struct bpf_object *obj;
 	pthread_t pt;
@@ -811,11 +821,18 @@ int main(int argc, char **argv)
 
 	/* Create sockets... */
 	umem = xsk_configure_umem(bufs, NUM_FRAMES * opt_xsk_frame_size);
+	if (opt_bench == BENCH_RXDROP || opt_bench == BENCH_L2FWD) {
+		rx = true;
+		xsk_populate_fill_ring(umem);
+	}
+	if (opt_bench == BENCH_L2FWD || opt_bench == BENCH_TXONLY)
+		tx = true;
 	for (i = 0; i < opt_num_xsks; i++)
-		xsks[num_socks++] = xsk_configure_socket(umem);
+		xsks[num_socks++] = xsk_configure_socket(umem, rx, tx);
 
-	for (i = 0; i < NUM_FRAMES; i++)
-		gen_eth_frame(umem, i * opt_xsk_frame_size);
+	if (opt_bench == BENCH_TXONLY)
+		for (i = 0; i < NUM_FRAMES; i++)
+			gen_eth_frame(umem, i * opt_xsk_frame_size);
 
 	if (opt_num_xsks > 1 && opt_bench != BENCH_TXONLY)
 		enter_xsks_into_map(obj);

From patchwork Thu Nov 7 17:47:40 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Magnus Karlsson
X-Patchwork-Id: 1191344
X-Patchwork-Delegate: bpf@iogearbox.net
Return-Path:
X-Original-To: patchwork-incoming-netdev@ozlabs.org
Delivered-To: patchwork-incoming-netdev@ozlabs.org
Authentication-Results: ozlabs.org; spf=none (no SPF record)
 smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67;
 helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org;
 receiver=)
Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none)
 header.from=intel.com
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
 by ozlabs.org (Postfix) with ESMTP id 4789q73pH0z9sNx
 for ; Fri, 8 Nov 2019 04:48:03 +1100 (AEDT)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S2389841AbfKGRsB (ORCPT ); Thu, 7 Nov 2019 12:48:01 -0500
Received: from mga03.intel.com ([134.134.136.65]:33466 "EHLO mga03.intel.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S2389795AbfKGRr6 (ORCPT ); Thu, 7 Nov 2019 12:47:58 -0500
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga004.jf.intel.com ([10.7.209.38])
 by orsmga103.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
 07 Nov 2019 09:47:58 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.68,278,1569308400"; d="scan'208";a="353858419"
Received: from unknown (HELO VM.jf.intel.com) ([10.78.3.78])
 by orsmga004.jf.intel.com with ESMTP; 07 Nov 2019 09:47:55 -0800
From: Magnus Karlsson
To: magnus.karlsson@intel.com, bjorn.topel@intel.com, ast@kernel.org,
 daniel@iogearbox.net, netdev@vger.kernel.org, jonathan.lemon@gmail.com,
 u9012063@gmail.com
Cc: bpf@vger.kernel.org
Subject: [PATCH bpf-next 5/5] xsk: extend documentation for Rx|Tx-only sockets and shared umems
Date: Thu, 7 Nov 2019 18:47:40 +0100
Message-Id: <1573148860-30254-6-git-send-email-magnus.karlsson@intel.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1573148860-30254-1-git-send-email-magnus.karlsson@intel.com>
References: <1573148860-30254-1-git-send-email-magnus.karlsson@intel.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

Add more documentation about the new Rx-only and Tx-only sockets in
libbpf and about how libbpf now supports shared umems. Also fix two
passages in the existing text that could be improved.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Tested-by: William Tu <u9012063@gmail.com>
---
 Documentation/networking/af_xdp.rst | 28 +++++++++++++++++++++++-----
 1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/Documentation/networking/af_xdp.rst b/Documentation/networking/af_xdp.rst
index 7a4caaa..5bc55a4 100644
--- a/Documentation/networking/af_xdp.rst
+++ b/Documentation/networking/af_xdp.rst
@@ -295,7 +295,7 @@ round-robin example of distributing packets is shown below:
    {
        rr = (rr + 1) & (MAX_SOCKS - 1);
 
-       return bpf_redirect_map(&xsks_map, rr, 0);
+       return bpf_redirect_map(&xsks_map, rr, XDP_DROP);
    }
 
 Note, that since there is only a single set of FILL and COMPLETION
@@ -304,6 +304,12 @@ to make sure that multiple processes or threads do not use these rings
 concurrently. There are no synchronization primitives in the libbpf
 code that protects multiple users at this point in time.
 
+Libbpf uses this mode if you create more than one socket tied to the
+same umem. However, note that you need to supply the
+XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD libbpf_flag with the
+xsk_socket__create calls and load your own XDP program as there is no
+built in one in libbpf that will route the traffic for you.
+
 XDP_USE_NEED_WAKEUP bind flag
 -----------------------------
 
@@ -355,10 +361,22 @@ to set the size of at least one of the RX and TX rings. If you set
 both, you will be able to both receive and send traffic from your
 application, but if you only want to do one of them, you can save
 resources by only setting up one of them. Both the FILL ring and the
-COMPLETION ring are mandatory if you have a UMEM tied to your socket,
-which is the normal case. But if the XDP_SHARED_UMEM flag is used, any
-socket after the first one does not have a UMEM and should in that
-case not have any FILL or COMPLETION rings created.
+COMPLETION ring are mandatory as you need to have a UMEM tied to your
+socket. But if the XDP_SHARED_UMEM flag is used, any socket after the
+first one does not have a UMEM and should in that case not have any
+FILL or COMPLETION rings created as the ones from the shared umem will
+be used. Note, that the rings are single-producer single-consumer, so
+do not try to access them from multiple processes at the same
+time. See the XDP_SHARED_UMEM section.
+
+In libbpf, you can create Rx-only and Tx-only sockets by supplying
+NULL to the tx and rx arguments, respectively, to the
+xsk_socket__create function.
+
+If you create a Tx-only socket, we recommend that you do not put any
+packets on the fill ring. If you do this, drivers might think you are
+going to receive something when you in fact will not, and this can
+negatively impact performance.
 
 XDP_UMEM_REG setsockopt
 -----------------------
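
The argument checks and bind-flag selection that this series adds to
xsk_socket__create() can be sketched as a small stand-alone model. This
is only an illustration under stated assumptions, not the libbpf
implementation: every sketch_* type and function and both SKETCH_*
constants are hypothetical stand-ins, and socket creation, ring setup
and the real bind() call are left out.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in values; the real constants come from
 * <linux/if_xdp.h> (XDP_SHARED_UMEM) and libbpf's xsk.h
 * (XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD). */
#define SKETCH_SHARED_UMEM       (1u << 0)
#define SKETCH_INHIBIT_PROG_LOAD (1u << 0)

struct sketch_umem { int fd; int refcount; };
struct sketch_ring { int unused; };

/* The two bind() fields the series touches. */
struct sketch_bind {
	unsigned int flags;
	int shared_umem_fd;
};

/* Models the order of checks in the patched function:
 * 1. at least one of rx/tx must be non-NULL (patch 3/5),
 * 2. a second socket on the same umem is refused unless the caller
 *    inhibits libbpf's built-in XDP program (patch 1/5),
 * 3. every socket after the first binds with the shared-umem flag
 *    and the fd of the umem (patch 1/5).
 * Returns 0 on success or a negative errno value. */
static int sketch_socket_create(struct sketch_umem *umem,
				const struct sketch_ring *rx,
				const struct sketch_ring *tx,
				unsigned int libbpf_flags,
				unsigned int bind_flags,
				struct sketch_bind *out)
{
	if (!umem || !out || !(rx || tx))
		return -EFAULT;

	if (umem->refcount && !(libbpf_flags & SKETCH_INHIBIT_PROG_LOAD))
		return -EBUSY;

	umem->refcount++;
	if (umem->refcount > 1) {
		out->flags = SKETCH_SHARED_UMEM;
		out->shared_umem_fd = umem->fd;
	} else {
		out->flags = bind_flags;
		out->shared_umem_fd = -1;
	}
	return 0;
}
```

In this model, passing NULL for both rings fails with -EFAULT, a first
Rx-only socket (tx == NULL) binds with the caller's own bind flags, and
a second socket on the same umem succeeds only when
SKETCH_INHIBIT_PROG_LOAD is set, in which case it binds with the
shared-umem flag and the fd of the umem.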