Message ID: 20181207114431.18038-1-bjorn.topel@gmail.com
Series: Add XDP_ATTACH bind() flag to AF_XDP sockets
On Fri, 7 Dec 2018 12:44:24 +0100 Björn Töpel <bjorn.topel@gmail.com> wrote:

> The rationale behind attach is performance and ease of use. Many XDP
> socket users just need a simple way of creating/binding a socket and
> receiving frames right away without loading an XDP program.
>
> XDP_ATTACH adds a mechanism we call "builtin XDP program" that simply
> is a kernel-provided XDP program that is installed to the netdev when
> XDP_ATTACH is passed as a bind() flag.
>
> The builtin program is the simplest program possible to redirect a
> frame to an attached socket. In restricted C it would look like this:
>
>   SEC("xdp")
>   int xdp_prog(struct xdp_md *ctx)
>   {
>           return bpf_xsk_redirect(ctx);
>   }
>
> The builtin program loaded via XDP_ATTACH behaves, from an
> install-to-netdev/uninstall-from-netdev point of view, differently
> from regular XDP programs. The easiest way to look at it is as a
> 2-level hierarchy, where regular XDP programs have precedence over
> the builtin one.
>
> If no regular XDP program is installed to the netdev, the builtin one
> will be installed. If the builtin program is installed and a regular
> one is then installed, the regular XDP program will take precedence
> over the builtin one.
>
> Further, if a regular program is installed and later removed, the
> builtin one will automatically be reinstalled.
>
> The sxdp_flags field of struct sockaddr_xdp gets two new options,
> XDP_BUILTIN_SKB_MODE and XDP_BUILTIN_DRV_MODE, which map to the
> corresponding XDP netlink install flags.
>
> The builtin XDP program functionally adds even more complexity to the
> already hard-to-read dev_change_xdp_fd. Maybe it would be simpler to
> store the program in the struct net_device together with the install
> flags, instead of calling ndo_bpf multiple times?

(As far as I can see from reading the code; correct me if I'm wrong.)

If an AF_XDP program uses XDP_ATTACH, then it installs the builtin
program as the XDP program on the "entire" device. That means all
RX-queues will call this XDP/BPF program (an indirect call), even
though it is only relevant for the specific queue_index. Yes, the
helper does check 'xdp->rxq->queue_index' for an attached 'xsk' and
returns XDP_PASS if there is none:

+BPF_CALL_1(bpf_xdp_xsk_redirect, struct xdp_buff *, xdp)
+{
+	struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);
+	struct xdp_sock *xsk;
+
+	xsk = READ_ONCE(xdp->rxq->dev->_rx[xdp->rxq->queue_index].xsk);
+	if (xsk) {
+		ri->xsk = xsk;
+		return XDP_REDIRECT;
+	}
+
+	return XDP_PASS;
+}

Why does every normal XDP_PASS packet have to pay this overhead (an
indirect call) when someone loads an AF_XDP socket program? The AF_XDP
socket is tied hard to a specific RX-queue and is only relevant there
(which is why we get a performance boost from the SPSC queues).

I acknowledge there is a need for this, but this use-case shows there
is a need for attaching XDP programs on a per-RX-queue basis.
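The per-queue check discussed above can be modeled in a few lines of plain userspace C. This is only an illustration of the control flow, not kernel code; the struct and function names below are invented: every queue executes the program, but only the queue with a bound socket actually redirects.

```c
#include <assert.h>
#include <stddef.h>

/* Values from the kernel's enum xdp_action. */
enum { XDP_PASS = 2, XDP_REDIRECT = 4 };

struct xdp_sock_model { int id; };      /* stand-in for struct xdp_sock */

struct rxq_model {
	struct xdp_sock_model *xsk;     /* non-NULL only on the bound queue */
};

/* Model of the builtin helper: redirect if this RX-queue has an
 * attached socket, otherwise fall through to the normal stack. */
static int xsk_builtin_redirect(const struct rxq_model *rxq)
{
	if (rxq->xsk)
		return XDP_REDIRECT;
	return XDP_PASS;
}
```

The objection is visible in the model: the unbound queues still enter the function (in the kernel, via an indirect call) only to take the XDP_PASS branch.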
On Fri, 7 Dec 2018 at 14:42, Jesper Dangaard Brouer <brouer@redhat.com> wrote:
>
> On Fri, 7 Dec 2018 12:44:24 +0100
> Björn Töpel <bjorn.topel@gmail.com> wrote:
>
> > The rationale behind attach is performance and ease of use. Many XDP
> > socket users just need a simple way of creating/binding a socket and
> > receiving frames right away without loading an XDP program.
> >
> > XDP_ATTACH adds a mechanism we call "builtin XDP program" that simply
> > is a kernel-provided XDP program that is installed to the netdev when
> > XDP_ATTACH is passed as a bind() flag.
> >
> > The builtin program is the simplest program possible to redirect a
> > frame to an attached socket. In restricted C it would look like this:
> >
> >   SEC("xdp")
> >   int xdp_prog(struct xdp_md *ctx)
> >   {
> >           return bpf_xsk_redirect(ctx);
> >   }
> >
> > The builtin program loaded via XDP_ATTACH behaves, from an
> > install-to-netdev/uninstall-from-netdev point of view, differently
> > from regular XDP programs. The easiest way to look at it is as a
> > 2-level hierarchy, where regular XDP programs have precedence over
> > the builtin one.
> >
> > If no regular XDP program is installed to the netdev, the builtin one
> > will be installed. If the builtin program is installed and a regular
> > one is then installed, the regular XDP program will take precedence
> > over the builtin one.
> >
> > Further, if a regular program is installed and later removed, the
> > builtin one will automatically be reinstalled.
> >
> > The sxdp_flags field of struct sockaddr_xdp gets two new options,
> > XDP_BUILTIN_SKB_MODE and XDP_BUILTIN_DRV_MODE, which map to the
> > corresponding XDP netlink install flags.
> >
> > The builtin XDP program functionally adds even more complexity to the
> > already hard-to-read dev_change_xdp_fd. Maybe it would be simpler to
> > store the program in the struct net_device together with the install
> > flags, instead of calling ndo_bpf multiple times?
>
> (As far as I can see from reading the code; correct me if I'm wrong.)
>
> If an AF_XDP program uses XDP_ATTACH, then it installs the builtin
> program as the XDP program on the "entire" device. That means all
> RX-queues will call this XDP/BPF program (an indirect call), even
> though it is only relevant for the specific queue_index. Yes, the
> helper does check 'xdp->rxq->queue_index' for an attached 'xsk' and
> returns XDP_PASS if there is none:
>

Yes, you are correct. The builtin XDP program, just like a regular XDP
program, affects the whole netdev. So, yes, the non-AF_XDP queues
would take a performance hit from this. Just to reiterate -- this
isn't new with this series; this has always been the case for XDP
when acting on just one queue.

> +BPF_CALL_1(bpf_xdp_xsk_redirect, struct xdp_buff *, xdp)
> +{
> +	struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);
> +	struct xdp_sock *xsk;
> +
> +	xsk = READ_ONCE(xdp->rxq->dev->_rx[xdp->rxq->queue_index].xsk);
> +	if (xsk) {
> +		ri->xsk = xsk;
> +		return XDP_REDIRECT;
> +	}
> +
> +	return XDP_PASS;
> +}
>
> Why does every normal XDP_PASS packet have to pay this overhead (an
> indirect call) when someone loads an AF_XDP socket program? The
> AF_XDP socket is tied hard to a specific RX-queue and is only
> relevant there (which is why we get a performance boost from the
> SPSC queues).
>
> I acknowledge there is a need for this, but this use-case shows there
> is a need for attaching XDP programs on a per-RX-queue basis.
>

From my AF_XDP perspective, having a program per queue would make
sense. The discussion of per-queue programs has come up before, and I
think the conclusion was that it would be too complex from a
configuration/tooling point of view. Again, for AF_XDP this would be
great.

When we started to hack on AF_PACKET V4, we had some ideas of doing
the "queue slicing" on a netdev level. So, e.g.
take a netdev, and create, say, macvlans that take over parts of the
parent's queues (something along the lines of what John did with
NETIF_F_HW_L2FW_DOFFLOAD for macvlan), and then use the macvlan
interface as the dedicated AF_XDP interface.

Personally, I like the current queue-slicing model, and having a way
of loading an XDP program per queue would be nice -- unless the UX
for the poor sysadmin will be terrible. :-)

Björn

> --
> Best regards,
> Jesper Dangaard Brouer
> MSc.CS, Principal Kernel Engineer at Red Hat
> LinkedIn: http://www.linkedin.com/in/brouer
On Fri, 7 Dec 2018 at 14:42, Jesper Dangaard Brouer <brouer@redhat.com> wrote:
>
> On Fri, 7 Dec 2018 12:44:24 +0100
> Björn Töpel <bjorn.topel@gmail.com> wrote:
>
> > The rationale behind attach is performance and ease of use. Many XDP
> > socket users just need a simple way of creating/binding a socket and
> > receiving frames right away without loading an XDP program.
> >
> > XDP_ATTACH adds a mechanism we call "builtin XDP program" that simply
> > is a kernel-provided XDP program that is installed to the netdev when
> > XDP_ATTACH is passed as a bind() flag.
> >
> > The builtin program is the simplest program possible to redirect a
> > frame to an attached socket. In restricted C it would look like this:
> >
> >   SEC("xdp")
> >   int xdp_prog(struct xdp_md *ctx)
> >   {
> >           return bpf_xsk_redirect(ctx);
> >   }
> >
> > The builtin program loaded via XDP_ATTACH behaves, from an
> > install-to-netdev/uninstall-from-netdev point of view, differently
> > from regular XDP programs. The easiest way to look at it is as a
> > 2-level hierarchy, where regular XDP programs have precedence over
> > the builtin one.
> >
> > If no regular XDP program is installed to the netdev, the builtin one
> > will be installed. If the builtin program is installed and a regular
> > one is then installed, the regular XDP program will take precedence
> > over the builtin one.
> >
> > Further, if a regular program is installed and later removed, the
> > builtin one will automatically be reinstalled.
> >
> > The sxdp_flags field of struct sockaddr_xdp gets two new options,
> > XDP_BUILTIN_SKB_MODE and XDP_BUILTIN_DRV_MODE, which map to the
> > corresponding XDP netlink install flags.
> >
> > The builtin XDP program functionally adds even more complexity to the
> > already hard-to-read dev_change_xdp_fd. Maybe it would be simpler to
> > store the program in the struct net_device together with the install
> > flags, instead of calling ndo_bpf multiple times?
>
> (As far as I can see from reading the code; correct me if I'm wrong.)
>
> If an AF_XDP program uses XDP_ATTACH, then it installs the builtin
> program as the XDP program on the "entire" device. That means all
> RX-queues will call this XDP/BPF program (an indirect call), even
> though it is only relevant for the specific queue_index. Yes, the
> helper does check 'xdp->rxq->queue_index' for an attached 'xsk' and
> returns XDP_PASS if there is none:
>

A side note: what one can do, which I did for the Plumbers work, is
bypass the indirect call in bpf_prog_run_xdp by doing "if the XDP
program is a builtin, call internal_bpf_xsk_redirect directly". Then
the XDP_PASS path won't be taxed by the indirect call -- but only for
this special XDP_ATTACH program. And you'll still get an additional
if-statement on the skb path.

Björn

> +BPF_CALL_1(bpf_xdp_xsk_redirect, struct xdp_buff *, xdp)
> +{
> +	struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);
> +	struct xdp_sock *xsk;
> +
> +	xsk = READ_ONCE(xdp->rxq->dev->_rx[xdp->rxq->queue_index].xsk);
> +	if (xsk) {
> +		ri->xsk = xsk;
> +		return XDP_REDIRECT;
> +	}
> +
> +	return XDP_PASS;
> +}
>
> Why does every normal XDP_PASS packet have to pay this overhead (an
> indirect call) when someone loads an AF_XDP socket program? The
> AF_XDP socket is tied hard to a specific RX-queue and is only
> relevant there (which is why we get a performance boost from the
> SPSC queues).
>
> I acknowledge there is a need for this, but this use-case shows there
> is a need for attaching XDP programs on a per-RX-queue basis.
>
> --
> Best regards,
> Jesper Dangaard Brouer
> MSc.CS, Principal Kernel Engineer at Red Hat
> LinkedIn: http://www.linkedin.com/in/brouer
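The trick described above -- trading the indirect call for a direct one when the installed program is the kernel-known builtin -- can be sketched as a userspace C model. The names here (prog_model, the handler functions) are invented for illustration; this is not the kernel's bpf_prog_run_xdp:

```c
#include <assert.h>
#include <stddef.h>

/* Values from the kernel's enum xdp_action. */
enum { XDP_PASS = 2, XDP_REDIRECT = 4 };

typedef int (*run_fn)(void *ctx);

/* Invented stand-in for a loaded XDP program. */
struct prog_model {
	run_fn run;
	int is_builtin; /* set only for the kernel-provided builtin */
};

static int builtin_run(void *ctx)
{
	(void)ctx;
	return XDP_REDIRECT; /* the builtin always tries to redirect */
}

static int pass_run(void *ctx)
{
	(void)ctx;
	return XDP_PASS; /* example of a regular, user-loaded program */
}

/* Dispatch: a known builtin gets a direct, branch-predictable call;
 * anything else goes through the usual indirect call. */
static int run_xdp(const struct prog_model *prog, void *ctx)
{
	if (prog->is_builtin)
		return builtin_run(ctx);
	return prog->run(ctx);
}
```

Regular programs still go through the function pointer; only the builtin takes the direct path, which is also why the skb path still carries the extra if-statement mentioned above.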
On Fri, Dec 07, 2018 at 12:44:24PM +0100, Björn Töpel wrote:
> From: Björn Töpel <bjorn.topel@intel.com>
>
> Hi!
>
> This patch set adds support for a new XDP socket bind option,
> XDP_ATTACH.
>
> The rationale behind attach is performance and ease of use. Many XDP
> socket users just need a simple way of creating/binding a socket and
> receiving frames right away without loading an XDP program.
>
> XDP_ATTACH adds a mechanism we call "builtin XDP program" that simply
> is a kernel-provided XDP program that is installed to the netdev when
> XDP_ATTACH is passed as a bind() flag.
>
> The builtin program is the simplest program possible to redirect a
> frame to an attached socket. In restricted C it would look like this:
>
>   SEC("xdp")
>   int xdp_prog(struct xdp_md *ctx)
>   {
>           return bpf_xsk_redirect(ctx);
>   }
>
> The builtin program loaded via XDP_ATTACH behaves, from an
> install-to-netdev/uninstall-from-netdev point of view, differently
> from regular XDP programs. The easiest way to look at it is as a
> 2-level hierarchy, where regular XDP programs have precedence over
> the builtin one.

The feature makes sense to me. Maybe XDP_ATTACH_BUILTIN would be a
better name?

Also, I think it needs another parameter to say which builtin program
to use. This unconditional xsk_redirect is fine for performance
benchmarking, but for production I suspect the users would want an
easy way to stay safe when they're playing with AF_XDP. So another
builtin program that redirects ssh and ping traffic back to the
kernel would be a nice addition.
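The "stay safe" builtin suggested above could, in spirit, look like the following userspace model. It assumes the flow has already been parsed into protocol and destination port (a real builtin would have to walk the Ethernet/IP/TCP headers itself, which is exactly the overhead objected to later in the thread); all names are invented:

```c
#include <assert.h>

/* Values from the kernel's enum xdp_action. */
enum { XDP_PASS = 2, XDP_REDIRECT = 4 };

/* IANA protocol numbers. */
enum { PROTO_ICMP = 1, PROTO_TCP = 6 };

/* Pre-parsed flow; a real builtin would extract this from the frame. */
struct flow {
	int ip_proto;
	int dst_port; /* host byte order, TCP only */
};

/* "Stay safe" policy: keep ssh and ping flowing to the kernel,
 * redirect everything else to the AF_XDP socket. */
static int safe_builtin(const struct flow *f)
{
	if (f->ip_proto == PROTO_ICMP)
		return XDP_PASS;   /* ping still works */
	if (f->ip_proto == PROTO_TCP && f->dst_port == 22)
		return XDP_PASS;   /* ssh still works */
	return XDP_REDIRECT;
}
```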
On Fri, 7 Dec 2018 at 22:21, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
>
> On Fri, Dec 07, 2018 at 12:44:24PM +0100, Björn Töpel wrote:
> > From: Björn Töpel <bjorn.topel@intel.com>
> >
> > Hi!
> >
> > This patch set adds support for a new XDP socket bind option,
> > XDP_ATTACH.
> >
> > The rationale behind attach is performance and ease of use. Many XDP
> > socket users just need a simple way of creating/binding a socket and
> > receiving frames right away without loading an XDP program.
> >
> > XDP_ATTACH adds a mechanism we call "builtin XDP program" that simply
> > is a kernel-provided XDP program that is installed to the netdev when
> > XDP_ATTACH is passed as a bind() flag.
> >
> > The builtin program is the simplest program possible to redirect a
> > frame to an attached socket. In restricted C it would look like this:
> >
> >   SEC("xdp")
> >   int xdp_prog(struct xdp_md *ctx)
> >   {
> >           return bpf_xsk_redirect(ctx);
> >   }
> >
> > The builtin program loaded via XDP_ATTACH behaves, from an
> > install-to-netdev/uninstall-from-netdev point of view, differently
> > from regular XDP programs. The easiest way to look at it is as a
> > 2-level hierarchy, where regular XDP programs have precedence over
> > the builtin one.
>
> The feature makes sense to me.
> Maybe XDP_ATTACH_BUILTIN would be a better name?

Yes, agree, or maybe XDP_BUILTIN_ATTACH? Regardless, I'll change the
name for the next revision.

> Also I think it needs another parameter to say which builtin
> program to use.

Yup, I had planned to add that parameter once there is actually more
than *one* builtin, but you're right, let's do it right away. I'll
add a builtin-prog enum field to struct sockaddr_xdp.

> This unconditional xsk_redirect is fine for performance
> benchmarking, but for production I suspect the users would want
> an easy way to stay safe when they're playing with AF_XDP.

For setups that don't direct the flows explicitly by HW filters, yes!
> So another builtin program that redirects ssh and ping traffic
> back to the kernel would be a nice addition.
>

I suspect AF_XDP users would prefer redirecting packets to the kernel
via the CPUMAP instead of XDP_PASS -- not paying for the IP stack on
the AF_XDP core. Another builtin would be tcpdump-like behavior, but
that would require an XDP clone (which Magnus is actually
experimenting with!).

I'll address your input and get back with a new revision. Thanks for
spending time on the series!

Björn
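For illustration, the builtin-program selector agreed to above could extend the bind parameters roughly as follows. This is a hypothetical sketch, not the actual uapi: the enum values, the field name sxdp_builtin_prog, and the second builtin are all invented:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical builtin-program selector (invented names). */
enum xdp_builtin_prog {
	XDP_BUILTIN_XSK_REDIRECT = 0, /* unconditional redirect to the socket */
	XDP_BUILTIN_PASS_MGMT    = 1, /* e.g. keep ssh/ping in the kernel */
};

/* Model of struct sockaddr_xdp with the extra selector field added. */
struct sockaddr_xdp_model {
	uint16_t sxdp_family;
	uint16_t sxdp_flags;          /* XDP_ATTACH, XDP_BUILTIN_*_MODE, ... */
	uint32_t sxdp_ifindex;
	uint32_t sxdp_queue_id;
	uint32_t sxdp_shared_umem_fd;
	uint32_t sxdp_builtin_prog;   /* enum xdp_builtin_prog */
};
```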
Hi:

Seems there are only 6 patches of the patch set in patchwork:

https://patchwork.ozlabs.org/project/netdev/list/?submitter=70569
https://patchwork.ozlabs.org/project/netdev/list/?series=80389

Can anyone help check why patch 7/7 is missing?

Thanks
Qi

> -----Original Message-----
> From: Björn Töpel [mailto:bjorn.topel@gmail.com]
> Sent: Friday, December 7, 2018 7:45 PM
> To: bjorn.topel@gmail.com; Karlsson, Magnus <magnus.karlsson@intel.com>;
> magnus.karlsson@gmail.com; ast@kernel.org; daniel@iogearbox.net;
> netdev@vger.kernel.org
> Cc: Topel, Bjorn <bjorn.topel@intel.com>; brouer@redhat.com;
> u9012063@gmail.com; Zhang, Qi Z <qi.z.zhang@intel.com>
> Subject: [PATCH bpf-next 7/7] samples: bpf: add support for XDP_ATTACH to
> xdpsock
>
> From: Björn Töpel <bjorn.topel@intel.com>
>
> Teach the sample xdpsock application about XDP_ATTACH.
>
> Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
> ---
>  samples/bpf/xdpsock_user.c | 108 +++++++++++++++++++++++--------------
>  1 file changed, 67 insertions(+), 41 deletions(-)
>
> diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c
> index 57ecadc58403..12f908b60468 100644
> --- a/samples/bpf/xdpsock_user.c
> +++ b/samples/bpf/xdpsock_user.c
> @@ -633,7 +633,8 @@ static void int_exit(int sig)
>  {
>  	(void)sig;
>  	dump_stats();
> -	bpf_set_link_xdp_fd(opt_ifindex, -1, opt_xdp_flags);
> +	if (!(opt_xdp_bind_flags & XDP_ATTACH))
> +		bpf_set_link_xdp_fd(opt_ifindex, -1, opt_xdp_flags);
>  	exit(EXIT_SUCCESS);
>  }
>
> @@ -650,6 +651,7 @@ static struct option long_options[] = {
>  	{"interval", required_argument, 0, 'n'},
>  	{"zero-copy", no_argument, 0, 'z'},
>  	{"copy", no_argument, 0, 'c'},
> +	{"attach", no_argument, 0, 'a'},
>  	{0, 0, 0, 0}
>  };
>
> @@ -670,6 +672,7 @@ static void usage(const char *prog)
>  		"  -n, --interval=n	Specify statistics update interval (default 1 sec).\n"
>  		"  -z, --zero-copy	Force zero-copy mode.\n"
>  		"  -c, --copy		Force copy mode.\n"
> +		"  -a, --attach		XDP_ATTACH mode.\n"
>  		"\n";
>  	fprintf(stderr, str, prog);
>  	exit(EXIT_FAILURE);
> @@ -682,7 +685,7 @@ static void parse_command_line(int argc, char **argv)
>  	opterr = 0;
>
>  	for (;;) {
> -		c = getopt_long(argc, argv, "rtli:q:psSNn:cz", long_options,
> +		c = getopt_long(argc, argv, "rtli:q:psSNn:cza", long_options,
>  				&option_index);
>  		if (c == -1)
>  			break;
> @@ -725,11 +728,22 @@ static void parse_command_line(int argc, char **argv)
>  		case 'c':
>  			opt_xdp_bind_flags |= XDP_COPY;
>  			break;
> +		case 'a':
> +			opt_xdp_bind_flags |= XDP_ATTACH;
> +			break;
>  		default:
>  			usage(basename(argv[0]));
>  		}
>  	}
>
> +	if (opt_xdp_bind_flags & XDP_ATTACH) {
> +		if (opt_xdp_flags & XDP_FLAGS_DRV_MODE)
> +			opt_xdp_bind_flags |= XDP_BUILTIN_DRV_MODE;
> +		if (opt_xdp_flags & XDP_FLAGS_SKB_MODE)
> +			opt_xdp_bind_flags |= XDP_BUILTIN_SKB_MODE;
> +
> +	}
> +
>  	opt_ifindex = if_nametoindex(opt_if);
>  	if (!opt_ifindex) {
>  		fprintf(stderr, "ERROR: interface \"%s\" does not exist\n",
> @@ -903,7 +917,7 @@ int main(int argc, char **argv)
>  	struct bpf_prog_load_attr prog_load_attr = {
>  		.prog_type	= BPF_PROG_TYPE_XDP,
>  	};
> -	int prog_fd, qidconf_map, xsks_map;
> +	int prog_fd, qidconf_map, xsks_map = -1;
>  	struct bpf_object *obj;
>  	char xdp_filename[256];
>  	struct bpf_map *map;
> @@ -918,59 +932,71 @@ int main(int argc, char **argv)
>  		exit(EXIT_FAILURE);
>  	}
>
> -	snprintf(xdp_filename, sizeof(xdp_filename), "%s_kern.o", argv[0]);
> -	prog_load_attr.file = xdp_filename;
> +	if (!(opt_xdp_bind_flags & XDP_ATTACH)) {
> +		snprintf(xdp_filename, sizeof(xdp_filename), "%s_kern.o",
> +			 argv[0]);
> +		prog_load_attr.file = xdp_filename;
>
> -	if (bpf_prog_load_xattr(&prog_load_attr, &obj, &prog_fd))
> -		exit(EXIT_FAILURE);
> -	if (prog_fd < 0) {
> -		fprintf(stderr, "ERROR: no program found: %s\n",
> -			strerror(prog_fd));
> -		exit(EXIT_FAILURE);
> -	}
> +		if (bpf_prog_load_xattr(&prog_load_attr, &obj, &prog_fd))
> +			exit(EXIT_FAILURE);
> +		if (prog_fd < 0) {
> +			fprintf(stderr, "ERROR: no program found: %s\n",
> +				strerror(prog_fd));
> +			exit(EXIT_FAILURE);
> +		}
>
> -	map = bpf_object__find_map_by_name(obj, "qidconf_map");
> -	qidconf_map = bpf_map__fd(map);
> -	if (qidconf_map < 0) {
> -		fprintf(stderr, "ERROR: no qidconf map found: %s\n",
> -			strerror(qidconf_map));
> -		exit(EXIT_FAILURE);
> -	}
> +		map = bpf_object__find_map_by_name(obj, "qidconf_map");
> +		qidconf_map = bpf_map__fd(map);
> +		if (qidconf_map < 0) {
> +			fprintf(stderr, "ERROR: no qidconf map found: %s\n",
> +				strerror(qidconf_map));
> +			exit(EXIT_FAILURE);
> +		}
>
> -	map = bpf_object__find_map_by_name(obj, "xsks_map");
> -	xsks_map = bpf_map__fd(map);
> -	if (xsks_map < 0) {
> -		fprintf(stderr, "ERROR: no xsks map found: %s\n",
> -			strerror(xsks_map));
> -		exit(EXIT_FAILURE);
> -	}
> +		map = bpf_object__find_map_by_name(obj, "xsks_map");
> +		xsks_map = bpf_map__fd(map);
> +		if (xsks_map < 0) {
> +			fprintf(stderr, "ERROR: no xsks map found: %s\n",
> +				strerror(xsks_map));
> +			exit(EXIT_FAILURE);
> +		}
>
> -	if (bpf_set_link_xdp_fd(opt_ifindex, prog_fd, opt_xdp_flags) < 0) {
> -		fprintf(stderr, "ERROR: link set xdp fd failed\n");
> -		exit(EXIT_FAILURE);
> -	}
> +		if (bpf_set_link_xdp_fd(opt_ifindex, prog_fd,
> +					opt_xdp_flags) < 0) {
> +			fprintf(stderr, "ERROR: link set xdp fd failed\n");
> +			exit(EXIT_FAILURE);
> +		}
>
> -	ret = bpf_map_update_elem(qidconf_map, &key, &opt_queue, 0);
> -	if (ret) {
> -		fprintf(stderr, "ERROR: bpf_map_update_elem qidconf\n");
> -		exit(EXIT_FAILURE);
> +		ret = bpf_map_update_elem(qidconf_map, &key, &opt_queue, 0);
> +		if (ret) {
> +			fprintf(stderr, "ERROR: bpf_map_update_elem qidconf\n");
> +			exit(EXIT_FAILURE);
> +		}
>  	}
>
>  	/* Create sockets... */
>  	xsks[num_socks++] = xsk_configure(NULL);
>
>  #if RR_LB
> +	if (opt_xdp_bind_flags & XDP_ATTACH)
> +		exit(EXIT_FAILURE);
> +
>  	for (i = 0; i < MAX_SOCKS - 1; i++)
>  		xsks[num_socks++] = xsk_configure(xsks[0]->umem);
>  #endif
>
> -	/* ...and insert them into the map. */
> -	for (i = 0; i < num_socks; i++) {
> -		key = i;
> -		ret = bpf_map_update_elem(xsks_map, &key, &xsks[i]->sfd, 0);
> -		if (ret) {
> -			fprintf(stderr, "ERROR: bpf_map_update_elem %d\n", i);
> -			exit(EXIT_FAILURE);
> +	if (!(opt_xdp_bind_flags & XDP_ATTACH)) {
> +		/* ...and insert them into the map. */
> +		for (i = 0; i < num_socks; i++) {
> +			key = i;
> +			ret = bpf_map_update_elem(xsks_map, &key,
> +						  &xsks[i]->sfd, 0);
> +			if (ret) {
> +				fprintf(stderr,
> +					"ERROR: bpf_map_update_elem %d\n",
> +					i);
> +				exit(EXIT_FAILURE);
> +			}
>  		}
>  	}
>
> --
> 2.19.1
On Fri, 7 Dec 2018 15:01:55 +0100 Björn Töpel <bjorn.topel@gmail.com> wrote:

> On Fri, 7 Dec 2018 at 14:42, Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> >
> > On Fri, 7 Dec 2018 12:44:24 +0100
> > Björn Töpel <bjorn.topel@gmail.com> wrote:
> >
> > > The rationale behind attach is performance and ease of use. Many XDP
> > > socket users just need a simple way of creating/binding a socket and
> > > receiving frames right away without loading an XDP program.
> > >
> > > XDP_ATTACH adds a mechanism we call "builtin XDP program" that simply
> > > is a kernel-provided XDP program that is installed to the netdev when
> > > XDP_ATTACH is passed as a bind() flag.
> > >
> > > The builtin program is the simplest program possible to redirect a
> > > frame to an attached socket. In restricted C it would look like this:
> > >
> > >   SEC("xdp")
> > >   int xdp_prog(struct xdp_md *ctx)
> > >   {
> > >           return bpf_xsk_redirect(ctx);
> > >   }
> > >
> > > The builtin program loaded via XDP_ATTACH behaves, from an
> > > install-to-netdev/uninstall-from-netdev point of view, differently
> > > from regular XDP programs. The easiest way to look at it is as a
> > > 2-level hierarchy, where regular XDP programs have precedence over
> > > the builtin one.
> > >
> > > If no regular XDP program is installed to the netdev, the builtin one
> > > will be installed. If the builtin program is installed and a regular
> > > one is then installed, the regular XDP program will take precedence
> > > over the builtin one.
> > >
> > > Further, if a regular program is installed and later removed, the
> > > builtin one will automatically be reinstalled.
> > >
> > > The sxdp_flags field of struct sockaddr_xdp gets two new options,
> > > XDP_BUILTIN_SKB_MODE and XDP_BUILTIN_DRV_MODE, which map to the
> > > corresponding XDP netlink install flags.
> > >
> > > The builtin XDP program functionally adds even more complexity to the
> > > already hard-to-read dev_change_xdp_fd. Maybe it would be simpler to
> > > store the program in the struct net_device together with the install
> > > flags, instead of calling ndo_bpf multiple times?
> >
> > (As far as I can see from reading the code; correct me if I'm wrong.)
> >
> > If an AF_XDP program uses XDP_ATTACH, then it installs the builtin
> > program as the XDP program on the "entire" device. That means all
> > RX-queues will call this XDP/BPF program (an indirect call), even
> > though it is only relevant for the specific queue_index. Yes, the
> > helper does check 'xdp->rxq->queue_index' for an attached 'xsk' and
> > returns XDP_PASS if there is none:
> >
>
> Yes, you are correct. The builtin XDP program, just like a regular
> XDP program, affects the whole netdev. So, yes, the non-AF_XDP
> queues would take a performance hit from this. Just to reiterate --
> this isn't new with this series; this has always been the case for
> XDP when acting on just one queue.
>
> > +BPF_CALL_1(bpf_xdp_xsk_redirect, struct xdp_buff *, xdp)
> > +{
> > +	struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);
> > +	struct xdp_sock *xsk;
> > +
> > +	xsk = READ_ONCE(xdp->rxq->dev->_rx[xdp->rxq->queue_index].xsk);
> > +	if (xsk) {
> > +		ri->xsk = xsk;
> > +		return XDP_REDIRECT;
> > +	}
> > +
> > +	return XDP_PASS;
> > +}
> >
> > Why does every normal XDP_PASS packet have to pay this overhead (an
> > indirect call) when someone loads an AF_XDP socket program? The
> > AF_XDP socket is tied hard to a specific RX-queue and is only
> > relevant there (which is why we get a performance boost from the
> > SPSC queues).
> >
> > I acknowledge there is a need for this, but this use-case shows
> > there is a need for attaching XDP programs on a per-RX-queue basis.
> >
>
> From my AF_XDP perspective, having a program per queue would make
> sense. The discussion of per-queue programs has come up before, and
> I think the conclusion was that it would be too complex from a
> configuration/tooling point of view. Again, for AF_XDP this would be
> great.

I really think it is about time that we introduce these per-queue XDP
programs. From a configuration/tooling point of view, your proposed
solution has the same issues that we have to solve (e.g. listing/
introspection needs to show the per-queue XDP/BPF programs). And you
have almost solved it, by keeping the per-queue program in the
net_device struct.

Alexei already requested more types of builtin programs. But as the
XDP driver API can only load XDP for the entire NIC, how can you use
two builtin programs at the same time?
On Fri, 7 Dec 2018 13:21:08 -0800 Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> for production I suspect the users would want
> an easy way to stay safe when they're playing with AF_XDP.
> So another builtin program that redirects ssh and ping traffic
> back to the kernel would be a nice addition.

Are you saying a builtin program that needs to parse different kinds
of Eth-type headers (DSA, VLAN, QinQ) and find the TCP port to match
port 22 to return XDP_PASS, or else do the AF_XDP redirect? That
seems to be pure overhead for this fast-path builtin program for
AF_XDP.

Would a solution be to install a NIC hardware filter that redirects
SSH port 22 to another RX-queue, and then have a builtin program that
returns XDP_PASS installed on that RX-queue? And change Björn's
semantics, such that RX-queue programs take precedence over the
global XDP program? This would also be a good fail-safe in general
for XDP.

If the RX-queues take precedence, I can use this fail-safe approach.
E.g. when I want to test my new global XDP program, I'll use ethtool
to match my management IP and send that traffic to a specific
RX-queue with my fail-safe BPF program.
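The precedence rule proposed above -- a per-RX-queue program, when present, overriding the global one -- can be sketched as a userspace C model (all types and names invented for illustration; this is not an existing kernel interface):

```c
#include <assert.h>
#include <stddef.h>

/* Values from the kernel's enum xdp_action. */
enum { XDP_PASS = 2, XDP_REDIRECT = 4 };

#define NUM_QUEUES 4

typedef int (*xdp_fn)(void *ctx);

/* Invented stand-in for a netdev carrying both kinds of programs. */
struct netdev_model {
	xdp_fn global_prog;            /* may be NULL */
	xdp_fn queue_prog[NUM_QUEUES]; /* per-RX-queue, may be NULL */
};

static int always_pass(void *ctx)     { (void)ctx; return XDP_PASS; }
static int always_redirect(void *ctx) { (void)ctx; return XDP_REDIRECT; }

/* A per-queue program, when installed, takes precedence over the
 * global one; with no program at all the frame goes to the stack. */
static int run_rxq(const struct netdev_model *dev, int qid, void *ctx)
{
	if (dev->queue_prog[qid])
		return dev->queue_prog[qid](ctx);
	if (dev->global_prog)
		return dev->global_prog(ctx);
	return XDP_PASS;
}
```

With this ordering, steering management traffic (e.g. port 22, via a hardware filter) to a queue whose program simply returns XDP_PASS gives the fail-safe described above, without the global program having to parse any headers.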
On Sat, Dec 08, 2018 at 04:12:12PM +0100, Jesper Dangaard Brouer wrote:
> On Fri, 7 Dec 2018 13:21:08 -0800
> Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
>
> > for production I suspect the users would want
> > an easy way to stay safe when they're playing with AF_XDP.
> > So another builtin program that redirects ssh and ping traffic
> > back to the kernel would be a nice addition.
>
> Are you saying a builtin program that needs to parse different kinds
> of Eth-type headers (DSA, VLAN, QinQ)

Hi Jesper

Parsing DSA headers is quite hard, since most of them are not
discoverable: you need to know they are there, and where they
actually are. I also don't think there are any use cases for actually
trying to parse them on the master interface. You are more likely to
want to run the eBPF program on the slave interface, once the DSA
header has been removed and it is clear what interface the frame is
actually for.

	Andrew
Den lör 8 dec. 2018 kl 15:52 skrev Jesper Dangaard Brouer <brouer@redhat.com>: > > On Fri, 7 Dec 2018 15:01:55 +0100 > Björn Töpel <bjorn.topel@gmail.com> wrote: > > > Den fre 7 dec. 2018 kl 14:42 skrev Jesper Dangaard Brouer <brouer@redhat.com>: > > > > > > On Fri, 7 Dec 2018 12:44:24 +0100 > > > Björn Töpel <bjorn.topel@gmail.com> wrote: > > > > > > > The rationale behind attach is performance and ease of use. Many XDP > > > > socket users just need a simple way of creating/binding a socket and > > > > receiving frames right away without loading an XDP program. > > > > > > > > XDP_ATTACH adds a mechanism we call "builtin XDP program" that simply > > > > is a kernel provided XDP program that is installed to the netdev when > > > > XDP_ATTACH is being passed as a bind() flag. > > > > > > > > The builtin program is the simplest program possible to redirect a > > > > frame to an attached socket. In restricted C it would look like this: > > > > > > > > SEC("xdp") > > > > int xdp_prog(struct xdp_md *ctx) > > > > { > > > > return bpf_xsk_redirect(ctx); > > > > } > > > > > > > > The builtin program loaded via XDP_ATTACH behaves, from an > > > > install-to-netdev/uninstall-from-netdev point of view, differently > > > > from regular XDP programs. The easiest way to look at it is as a > > > > 2-level hierarchy, where regular XDP programs has precedence over the > > > > builtin one. > > > > > > > > If no regular XDP program is installed to the netdev, the builtin will > > > > be install. If the builtin program is installed, and a regular is > > > > installed, regular XDP program will have precedence over the builtin > > > > one. > > > > > > > > Further, if a regular program is installed, and later removed, the > > > > builtin one will automatically be installed. > > > > > > > > The sxdp_flags field of struct sockaddr_xdp gets two new options > > > > XDP_BUILTIN_SKB_MODE and XDP_BUILTIN_DRV_MODE, which maps to the > > > > corresponding XDP netlink install flags. 
> > > > > > > > The builtin XDP program functionally adds even more complexity to the > > > > already hard to read dev_change_xdp_fd. Maybe it would be simpler to > > > > store the program in the struct net_device together with the install > > > > flags instead of calling the ndo_bpf multiple times? > > > > > > (As far as I can see from reading the code, correct me if I'm wrong.) > > > > > > If an AF_XDP program uses XDP_ATTACH, then it installs the > > > builtin-program as the XDP program on the "entire" device. That means > > > all RX-queues will call this XDP-bpf program (indirect call), and it is > > > actually only relevant for the specific queue_index. Yes, the helper > > > call does check that the 'xdp->rxq->queue_index' for an attached 'xsk' > > > and return XDP_PASS if it is NULL: > > > > > > > Yes, you are correct. The builtin XDP program, just like a regular XDP > > program, affects the whole netdev. So, yes the non-AF_XDP queues would > > get a performance hit from this. Just to reiterate -- this isn't new > > for this series. This has always been the case for XDP when acting on > > just one queue. > > > > > +BPF_CALL_1(bpf_xdp_xsk_redirect, struct xdp_buff *, xdp) > > > +{ > > > + struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); > > > + struct xdp_sock *xsk; > > > + > > > + xsk = READ_ONCE(xdp->rxq->dev->_rx[xdp->rxq->queue_index].xsk); > > > + if (xsk) { > > > + ri->xsk = xsk; > > > + return XDP_REDIRECT; > > > + } > > > + > > > + return XDP_PASS; > > > +} > > > > > > Why do every normal XDP_PASS packet have to pay this overhead (indirect > > > call), when someone loads an AF_XDP socket program? The AF_XDP socket > > > is tied hard and only relevant to a specific RX-queue (which is why we > > > get a performance boost due to SPSC queues). > > > > > > I acknowledge there is a need for this, but this use-case shows there > > > is a need for attaching XDP programs per RX-queue basis. 
> > > > > > > From my AF_XDP perspective, having a program per queue would make > > sense. The discussion of a per-queue program has been up before, and I think > > the conclusion was that it would be too complex from a > > configuration/tooling point-of-view. Again, for AF_XDP this would be > > great. > > I really think it is about time that we introduce these per-queue XDP > programs. From a configuration/tooling point-of-view, your proposed > solution has the same issues we have to solve. (Like listing > introspection needs to show the per-queue XDP/BPF programs.) And you have > almost solved it, by keeping the per-queue program in the net_device > struct. > I just want to emphasize: my series is only a convenience method for loading a (sticky/hierarchical) XDP program and a new bpf-function. Nothing more. I'm not sure what you mean when you say that the "proposed solution has the same issues". Do you mean XDP in general? *I'm* all in for a per-queue XDP program -- but I might be biased due to AF_XDP :-P Let's explore in this thread what that would look like and what its semantics would be! > Alexei already requested more types of builtin programs. But as the > XDP-driver API can only load XDP for the entire NIC, then how can you > use two builtin programs at the same time? > You can't. Again, this is just another way of loading an XDP program. If socket A loads a builtin B on dev C, and socket X tries to load a builtin Y on dev C, this will fail/clash. This is no different from today. My view on builtin has been "an XDP program provided by the kernel". A related problem is if socket A attaches with XDP_DRV and socket B attaches with XDP_SKB on the same netdev. (Ick, these are the messy parts of XDP. :-() > -- > Best regards, > Jesper Dangaard Brouer > MSc.CS, Principal Kernel Engineer at Red Hat > LinkedIn: http://www.linkedin.com/in/brouer
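The fail/clash semantics Björn describes (socket X's builtin Y clashing with socket A's builtin B on the same dev) can be sketched as a small host-side C model. The `-EBUSY`-style error code and all names here are illustrative assumptions, not taken from the patches:

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_EBUSY 16	/* assumed error code, for illustration only */

/* Minimal stand-in for the per-netdev builtin-program slot. */
struct model_dev {
	const void *builtin_prog;	/* currently installed builtin, if any */
};

/* A netdev can hold at most one builtin program: installing the same
 * program again is a no-op, but a *different* builtin clashes. */
static int model_install_builtin(struct model_dev *dev, const void *prog)
{
	if (dev->builtin_prog && dev->builtin_prog != prog)
		return -MODEL_EBUSY;	/* builtin Y clashes with builtin B */
	dev->builtin_prog = prog;
	return 0;
}

static struct model_dev dev_c;		/* "dev C" from the discussion */
static int prog_b, prog_y;		/* builtins "B" and "Y" */
```

As Björn notes, this is the same single-program-per-netdev constraint that regular XDP has today; the builtin mechanism does not add a new composition facility.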
On Sat, 8 Dec 2018 at 16:12, Jesper Dangaard Brouer <brouer@redhat.com> wrote: > > On Fri, 7 Dec 2018 13:21:08 -0800 > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > > for production I suspect the users would want > > an easy way to stay safe when they're playing with AF_XDP. > > So another builtin program that redirects ssh and ping traffic > > back to the kernel would be a nice addition. > > Are you saying a builtin program that needs to parse different kinds of > Eth-type headers (DSA, VLAN, QinQ) and find the TCP port to match port > 22 to return XDP_PASS, or else call the AF_XDP redirect? That seems to be > pure overhead for this fast-path builtin program for AF_XDP. > > Would a solution be to install a NIC hardware filter that redirects SSH > port 22 to another RX-queue, then have a builtin program that > returns XDP_PASS installed on that RX-queue, and change Björn's > semantics, such that RX-queue programs take precedence over the global > XDP program? This would also be a good fail-safe in general for XDP. > Exactly this; I'd say this is the most common way of using AF_XDP, i.e. steer a certain flow to an Rx queue, and *all* packets on that queue are pulled to the AF_XDP socket. This is why the builtin program is the way it is; it reflects how AF_XDP is actually being used, not only benchmarking scenarios. > If the RX-queues take precedence, I can use this fail-safe approach. > E.g. when I want to test my new global XDP program, I'll use ethtool to > match my management IP and send that to a specific RX-queue and my > fail-safe BPF program. > Interesting take, to have a *per-queue* XDP program that overrides the regular one. Maybe this is a better way to use builtins? > -- > Best regards, > Jesper Dangaard Brouer > MSc.CS, Principal Kernel Engineer at Red Hat > LinkedIn: http://www.linkedin.com/in/brouer
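The per-queue-takes-precedence semantics Jesper proposes can be sketched as a host-side C model (all names are illustrative; no kernel today implements per-queue XDP programs this way). In Jesper's fail-safe scenario, a hardware filter steers management traffic to queue 1, a pass-through program sits on that queue, and the global program handles everything else:

```c
#include <assert.h>
#include <stddef.h>

enum { MODEL_PASS = 2, MODEL_REDIRECT = 4 };	/* XDP_PASS / XDP_REDIRECT */

typedef int (*model_prog_t)(unsigned int queue);

/* Stand-ins for a global XDP program and a per-queue fail-safe. */
static int model_prog_redirect(unsigned int q) { (void)q; return MODEL_REDIRECT; }
static int model_prog_pass(unsigned int q) { (void)q; return MODEL_PASS; }

/* One optional program slot per RX queue. */
static model_prog_t per_queue_prog[4];

/* Proposed precedence rule: a per-RX-queue program, when present,
 * overrides the netdev-global one; with neither, packets just pass. */
static int model_run_xdp(model_prog_t global, unsigned int queue)
{
	model_prog_t prog = per_queue_prog[queue] ? per_queue_prog[queue] : global;

	return prog ? prog(queue) : MODEL_PASS;
}
```

With `model_prog_pass` installed on queue 1 and `model_prog_redirect` as the global program, queue 1 traffic survives even a buggy global program, which is exactly the fail-safe property being discussed.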
On Sat, 8 Dec 2018 17:55:05 +0100 Andrew Lunn <andrew@lunn.ch> wrote: > On Sat, Dec 08, 2018 at 04:12:12PM +0100, Jesper Dangaard Brouer wrote: > > On Fri, 7 Dec 2018 13:21:08 -0800 > > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > > > > for production I suspect the users would want > > > an easy way to stay safe when they're playing with AF_XDP. > > > So another builtin program that redirects ssh and ping traffic > > > back to the kernel would be a nice addition. > > > > Are you saying a buildin program that need to parse different kinds of > > Eth-type headers (DSA, VLAN, QinqQ) > > Hi Jesper > > Parsing DSA headers is quite hard, since most of them are not > discoverable, you need to know they are there, and where they actually > are. > > I also don't think there are any use cases for actually trying to > parse them on the master interface. You are more likely to want to run > the eBPF program on the slave interface, once the DSA header has been > removed, and it is clear what interface the frame is actually for. That is not how XDP works. XDP must run on the "master" physical driver interface, as the "slave" interface is a virtual DSA interface. I did mention DSA because I'm handling that on the EspressoBin implementation.
On Sat, 8 Dec 2018 19:43:38 +0100 Björn Töpel <bjorn.topel@gmail.com> wrote: > Den lör 8 dec. 2018 kl 15:52 skrev Jesper Dangaard Brouer <brouer@redhat.com>: > > > > On Fri, 7 Dec 2018 15:01:55 +0100 > > Björn Töpel <bjorn.topel@gmail.com> wrote: > > > > > Den fre 7 dec. 2018 kl 14:42 skrev Jesper Dangaard Brouer <brouer@redhat.com>: > > > > > > > > On Fri, 7 Dec 2018 12:44:24 +0100 > > > > Björn Töpel <bjorn.topel@gmail.com> wrote: > > > > > > > > > The rationale behind attach is performance and ease of use. Many XDP > > > > > socket users just need a simple way of creating/binding a socket and > > > > > receiving frames right away without loading an XDP program. > > > > > > > > > > XDP_ATTACH adds a mechanism we call "builtin XDP program" that simply > > > > > is a kernel provided XDP program that is installed to the netdev when > > > > > XDP_ATTACH is being passed as a bind() flag. > > > > > > > > > > The builtin program is the simplest program possible to redirect a > > > > > frame to an attached socket. In restricted C it would look like this: > > > > > > > > > > SEC("xdp") > > > > > int xdp_prog(struct xdp_md *ctx) > > > > > { > > > > > return bpf_xsk_redirect(ctx); > > > > > } > > > > > > > > > > The builtin program loaded via XDP_ATTACH behaves, from an > > > > > install-to-netdev/uninstall-from-netdev point of view, differently > > > > > from regular XDP programs. The easiest way to look at it is as a > > > > > 2-level hierarchy, where regular XDP programs has precedence over the > > > > > builtin one. > > > > > > > > > > If no regular XDP program is installed to the netdev, the builtin will > > > > > be install. If the builtin program is installed, and a regular is > > > > > installed, regular XDP program will have precedence over the builtin > > > > > one. > > > > > > > > > > Further, if a regular program is installed, and later removed, the > > > > > builtin one will automatically be installed. 
> > > > > > > > > > The sxdp_flags field of struct sockaddr_xdp gets two new options > > > > > XDP_BUILTIN_SKB_MODE and XDP_BUILTIN_DRV_MODE, which maps to the > > > > > corresponding XDP netlink install flags. > > > > > > > > > > The builtin XDP program functionally adds even more complexity to the > > > > > already hard to read dev_change_xdp_fd. Maybe it would be simpler to > > > > > store the program in the struct net_device together with the install > > > > > flags instead of calling the ndo_bpf multiple times? > > > > > > > > (As far as I can see from reading the code, correct me if I'm wrong.) > > > > > > > > If an AF_XDP program uses XDP_ATTACH, then it installs the > > > > builtin-program as the XDP program on the "entire" device. That means > > > > all RX-queues will call this XDP-bpf program (indirect call), and it is > > > > actually only relevant for the specific queue_index. Yes, the helper > > > > call does check that the 'xdp->rxq->queue_index' for an attached 'xsk' > > > > and return XDP_PASS if it is NULL: > > > > > > > > > > Yes, you are correct. The builtin XDP program, just like a regular XDP > > > program, affects the whole netdev. So, yes the non-AF_XDP queues would > > > get a performance hit from this. Just to reiterate -- this isn't new > > > for this series. This has always been the case for XDP when acting on > > > just one queue. > > > > > > > +BPF_CALL_1(bpf_xdp_xsk_redirect, struct xdp_buff *, xdp) > > > > +{ > > > > + struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); > > > > + struct xdp_sock *xsk; > > > > + > > > > + xsk = READ_ONCE(xdp->rxq->dev->_rx[xdp->rxq->queue_index].xsk); > > > > + if (xsk) { > > > > + ri->xsk = xsk; > > > > + return XDP_REDIRECT; > > > > + } > > > > + > > > > + return XDP_PASS; > > > > +} > > > > > > > > Why do every normal XDP_PASS packet have to pay this overhead (indirect > > > > call), when someone loads an AF_XDP socket program? 
The AF_XDP socket > > > > is tied hard and only relevant to a specific RX-queue (which is why we > > > > get a performance boost due to SPSC queues). > > > > > > > > I acknowledge there is a need for this, but this use-case shows there > > > > is a need for attaching XDP programs per RX-queue basis. > > > > > > > > > > From my AF_XDP perspective, having a program per queue would make > > > sense. The discussion of a per-queue has been up before, and I think > > > the conclusion was that it would be too complex from a > > > configuration/tooling point-of-view. Again, for AF_XDP this would be > > > great. > > > > I really think it is about time that we introduce these per-queue XDP > > programs. From a configuration/tooling point-of-view, your proposed > > solution have the same issues, we have to solve. (Like listing > > introspection need to show the per-queue XDP/BPF programs). And you have > > almost solved it, by keeping the per-queue program in net_device > > struct. > > > > I just to emphasize; My series is only a convenience method for > loading a (sticky/hierarchical) XDP program and a new bpf-function. > Nothing more. I'm not sure what you mean when you say that "proposed > solution have the same issues". Do you mean XDP in general? > > *I'm* all in for a per-queue XDP program -- but I might be biased due > to AF_XDP :-P Let's explore in this thread what that would look like > and the semantics of that! > Yes, please! (p.s. I'm going on vacation next week, which is unfortunate as I really want to find a good semantics for this). > > Alexei already requested more types of builtin programs. But as the > > XDP-driver API can only load XDP for the entire NIC, then how can > > you use two builtin programs at the same time? > > > > You can't. Again, this is just another way of loading an XDP program. > If socket A loads a builtin B on dev C, and socket X tries to load a > builtin Y on dev C this will fail/clash. This is not different from > today. 
My view on builtin has been "an XDP program provided by the > kernel". A related problem is if socket A attaches with XDP_DRV and > socket B attaches with XDP_SKB on the same netdev. (Ick, this is the > messy parts of XDP. :-() This again shows that we need per-RX-queue XDP programs.
> That is not how XDP works. XDP must run on the "master" physical driver > interface, as the "slave" interface is a virtual DSA interface. I did > mention DSA because I'm handling that on the EspressoBin implementation. Hi Jesper It is going to be interesting to see how you do that. As a user, I want to attach the XDP program to a slave interface. So I assume you somehow redirect that to the master, with some extra metadata to allow XDP on the master to detect packets for a specific slave? Andrew
Den fre 7 dec. 2018 kl 12:44 skrev Björn Töpel <bjorn.topel@gmail.com>: > > From: Björn Töpel <bjorn.topel@intel.com> > > Hi! > > This patch set adds support for a new XDP socket bind option, > XDP_ATTACH. > > The rationale behind attach is performance and ease of use. Many XDP > socket users just need a simple way of creating/binding a socket and > receiving frames right away without loading an XDP program. > > XDP_ATTACH adds a mechanism we call "builtin XDP program" that simply > is a kernel provided XDP program that is installed to the netdev when > XDP_ATTACH is being passed as a bind() flag. > > The builtin program is the simplest program possible to redirect a > frame to an attached socket. In restricted C it would look like this: > > SEC("xdp") > int xdp_prog(struct xdp_md *ctx) > { > return bpf_xsk_redirect(ctx); > } > > The builtin program loaded via XDP_ATTACH behaves, from an > install-to-netdev/uninstall-from-netdev point of view, differently > from regular XDP programs. The easiest way to look at it is as a > 2-level hierarchy, where regular XDP programs has precedence over the > builtin one. > > If no regular XDP program is installed to the netdev, the builtin will > be install. If the builtin program is installed, and a regular is > installed, regular XDP program will have precedence over the builtin > one. > > Further, if a regular program is installed, and later removed, the > builtin one will automatically be installed. > > The sxdp_flags field of struct sockaddr_xdp gets two new options > XDP_BUILTIN_SKB_MODE and XDP_BUILTIN_DRV_MODE, which maps to the > corresponding XDP netlink install flags. > > The builtin XDP program functionally adds even more complexity to the > already hard to read dev_change_xdp_fd. Maybe it would be simpler to > store the program in the struct net_device together with the install > flags instead of calling the ndo_bpf multiple times? 
> > The outline of the series is as follows: > patch 1-2: Introduce the first part of XDP_ATTACH, simply adding > the socket to the netdev structure. > patch 3: Add a new BPF function, bpf_xsk_redirect, that > redirects a frame to an attached socket. > patch 4-5: Preparatory commits for builtin BPF programs > patch 6: Make XDP_ATTACH load a builtin XDP program > patch 7: Extend the samples application with XDP_ATTACH > support > > Patch 1 through 3 give the performance boost and make it possible to > use AF_XDP sockets without an XSKMAP, but still require an explicit > XDP program to be loaded. > > Patch 4 through 6 make it possible to use XDP sockets without explicitly > loading an XDP program. > > The performance numbers for rxdrop (Intel(R) Xeon(R) Gold 6154 CPU @ > 3.00GHz): > > XDP_SKB: > XSKMAP: 2.8 Mpps > XDP_ATTACH: 2.9 Mpps > > XDP_DRV - copy: > XSKMAP: 8.5 Mpps > XDP_ATTACH: 9.3 Mpps > > XDP_DRV - zero-copy: > XSKMAP: 15.1 Mpps > XDP_ATTACH: 17.3 Mpps > > Thanks! > Björn > Firstly, thanks for all the feedback on the series. I will do a respin of the series, but without the builtin functionality, IOW only the XDP_ATTACH bind() option to assign an XDP socket to a netdev Rx queue, and the corresponding bpf_xsk_redirect (patch 1 to 3). This will give a performance boost, and also requires a less complicated XDP program. The builtin/sticky XDP program has too many downsides: * Explicit BPF instructions in the kernel. * Composability: mixing regular BPF programs with builtin ones. * The sticky XDP program is not intuitive for a user, e.g. "ip link set dev eth0 xdp off" will not disable the sticky one. * More complex XDP install code. We'll go this route instead: introduce XDP_ATTACH/bpf_xsk_redirect (as above) and let libbpf load the trivial AF_XDP program.
Björn > > Björn Töpel (7): > xsk: simplify AF_XDP socket teardown > xsk: add XDP_ATTACH bind() flag > bpf: add bpf_xsk_redirect function > bpf: prepare for builtin bpf program > bpf: add function to load builtin BPF program > xsk: load a builtin XDP program on XDP_ATTACH > samples: bpf: add support for XDP_ATTACH to xdpsock > > include/linux/bpf.h | 2 + > include/linux/filter.h | 4 + > include/linux/netdevice.h | 11 +++ > include/net/xdp_sock.h | 2 + > include/trace/events/xdp.h | 61 +++++++++++++++ > include/uapi/linux/bpf.h | 14 +++- > include/uapi/linux/if_xdp.h | 9 ++- > kernel/bpf/syscall.c | 91 ++++++++++++++-------- > net/core/dev.c | 84 +++++++++++++++++++-- > net/core/filter.c | 100 ++++++++++++++++++++++++ > net/xdp/xsk.c | 146 +++++++++++++++++++++++++++++------- > samples/bpf/xdpsock_user.c | 108 ++++++++++++++++---------- > 12 files changed, 524 insertions(+), 108 deletions(-) > > -- > 2.19.1 >
From: Björn Töpel <bjorn.topel@intel.com> Hi! This patch set adds support for a new XDP socket bind option, XDP_ATTACH. The rationale behind attach is performance and ease of use. Many XDP socket users just need a simple way of creating/binding a socket and receiving frames right away without loading an XDP program. XDP_ATTACH adds a mechanism we call "builtin XDP program", which is simply a kernel-provided XDP program that is installed on the netdev when XDP_ATTACH is passed as a bind() flag. The builtin program is the simplest program possible to redirect a frame to an attached socket. In restricted C it would look like this: SEC("xdp") int xdp_prog(struct xdp_md *ctx) { return bpf_xsk_redirect(ctx); } The builtin program loaded via XDP_ATTACH behaves, from an install-to-netdev/uninstall-from-netdev point of view, differently from regular XDP programs. The easiest way to look at it is as a 2-level hierarchy, where regular XDP programs have precedence over the builtin one. If no regular XDP program is installed on the netdev, the builtin one will be installed. If the builtin program is installed, and a regular one is then installed, the regular XDP program will take precedence over the builtin one. Further, if a regular program is installed, and later removed, the builtin one will automatically be reinstalled. The sxdp_flags field of struct sockaddr_xdp gets two new options, XDP_BUILTIN_SKB_MODE and XDP_BUILTIN_DRV_MODE, which map to the corresponding XDP netlink install flags. The builtin XDP program adds even more complexity to the already hard-to-read dev_change_xdp_fd. Maybe it would be simpler to store the program in the struct net_device together with the install flags instead of calling ndo_bpf multiple times? The outline of the series is as follows: patch 1-2: Introduce the first part of XDP_ATTACH, simply adding the socket to the netdev structure. patch 3: Add a new BPF function, bpf_xsk_redirect, that redirects a frame to an attached socket.
patch 4-5: Preparatory commits for builtin BPF programs patch 6: Make XDP_ATTACH load a builtin XDP program patch 7: Extend the samples application with XDP_ATTACH support Patch 1 through 3 give the performance boost and make it possible to use AF_XDP sockets without an XSKMAP, but still require an explicit XDP program to be loaded. Patch 4 through 6 make it possible to use XDP sockets without explicitly loading an XDP program. The performance numbers for rxdrop (Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz): XDP_SKB: XSKMAP: 2.8 Mpps XDP_ATTACH: 2.9 Mpps XDP_DRV - copy: XSKMAP: 8.5 Mpps XDP_ATTACH: 9.3 Mpps XDP_DRV - zero-copy: XSKMAP: 15.1 Mpps XDP_ATTACH: 17.3 Mpps Thanks! Björn Björn Töpel (7): xsk: simplify AF_XDP socket teardown xsk: add XDP_ATTACH bind() flag bpf: add bpf_xsk_redirect function bpf: prepare for builtin bpf program bpf: add function to load builtin BPF program xsk: load a builtin XDP program on XDP_ATTACH samples: bpf: add support for XDP_ATTACH to xdpsock include/linux/bpf.h | 2 + include/linux/filter.h | 4 + include/linux/netdevice.h | 11 +++ include/net/xdp_sock.h | 2 + include/trace/events/xdp.h | 61 +++++++++++++++ include/uapi/linux/bpf.h | 14 +++- include/uapi/linux/if_xdp.h | 9 ++- kernel/bpf/syscall.c | 91 ++++++++++++++-------- net/core/dev.c | 84 +++++++++++++++++++-- net/core/filter.c | 100 ++++++++++++++++++++++++ net/xdp/xsk.c | 146 +++++++++++++++++++++++++++++------- samples/bpf/xdpsock_user.c | 108 ++++++++++++++++---------- 12 files changed, 524 insertions(+), 108 deletions(-)
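The 2-level hierarchy the cover letter describes (regular over builtin, builtin sticky) can be modeled as a few lines of host-side C. The struct and names here are illustrative, not the actual net_device fields from the patches:

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for a netdev with two program slots. */
struct model_netdev {
	const void *regular;	/* program installed via the normal XDP path */
	const void *builtin;	/* program installed via XDP_ATTACH */
};

/* The hierarchy rule: the regular XDP program takes precedence over the
 * builtin one; removing the regular program re-exposes the builtin. */
static const void *model_active_prog(const struct model_netdev *dev)
{
	return dev->regular ? dev->regular : dev->builtin;
}

static struct model_netdev dev0;
static int builtin_prog, regular_prog;	/* opaque program markers */
```

Storing the two programs side by side like this is essentially the simplification the cover letter itself wonders about: keep both in struct net_device and derive the active program, instead of threading the state through multiple ndo_bpf calls.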