[{"id":2544601,"web_url":"http://patchwork.ozlabs.org/comment/2544601/","msgid":"<5f77467dbc1_38b0208ef@john-XPS-13-9370.notmuch>","list_archive_url":null,"date":"2020-10-02T15:25:49","subject":"RE: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer\n support","submitter":{"id":20028,"url":"http://patchwork.ozlabs.org/api/people/20028/","name":"John Fastabend","email":"john.fastabend@gmail.com"},"content":"Lorenzo Bianconi wrote:\n> This series introduce XDP multi-buffer support. The mvneta driver is\n> the first to support these new \"non-linear\" xdp_{buff,frame}. Reviewers\n> please focus on how these new types of xdp_{buff,frame} packets\n> traverse the different layers and the layout design. It is on purpose\n> that BPF-helpers are kept simple, as we don't want to expose the\n> internal layout to allow later changes.\n> \n> For now, to keep the design simple and to maintain performance, the XDP\n> BPF-prog (still) only have access to the first-buffer. It is left for\n> later (another patchset) to add payload access across multiple buffers.\n> This patchset should still allow for these future extensions. The goal\n> is to lift the XDP MTU restriction that comes with XDP, but maintain\n> same performance as before.\n> \n> The main idea for the new multi-buffer layout is to reuse the same\n> layout used for non-linear SKB. This rely on the \"skb_shared_info\"\n> struct at the end of the first buffer to link together subsequent\n> buffers. Keeping the layout compatible with SKBs is also done to ease\n> and speedup creating an SKB from an xdp_{buff,frame}. Converting\n> xdp_frame to SKB and deliver it to the network stack is shown in cpumap\n> code (patch 13/13).\n\nUsing the end of the buffer for the skb_shared_info struct is going to\nbecome driver API so unwinding it if it proves to be a performance issue\nis going to be ugly. 
So same question as before, for the use case where\nwe receive packet and do XDP_TX with it how do we avoid cache miss\noverhead? This is not just a hypothetical use case, the Facebook\nload balancer is doing this as well as Cilium and allowing this with\nmulti-buffer packets >1500B would be useful.\n\nCan we write the skb_shared_info lazily? It should only be needed once\nwe know the packet is going up the stack to some place that needs the\ninfo. Which we could learn from the return code of the XDP program.\n\n> \n> A multi-buffer bit (mb) has been introduced in xdp_{buff,frame} structure\n> to notify the bpf/network layer if this is a xdp multi-buffer frame (mb = 1)\n> or not (mb = 0).\n> The mb bit will be set by a xdp multi-buffer capable driver only for\n> non-linear frames maintaining the capability to receive linear frames\n> without any extra cost since the skb_shared_info structure at the end\n> of the first buffer will be initialized only if mb is set.\n\nThanks above is clearer.\n\n> \n> In order to provide to userspace some metadata about the non-linear\n> xdp_{buff,frame}, we introduced 2 bpf helpers:\n> - bpf_xdp_get_frags_count:\n>   get the number of fragments for a given xdp multi-buffer.\n> - bpf_xdp_get_frags_total_size:\n>   get the total size of fragments for a given xdp multi-buffer.\n\nWhats the use case for these? 
Do you have an example where knowing\nthe frags count is going to be something a BPF program will use?\nHaving total size seems interesting but perhaps we should push that\ninto the metadata so its pulled into the cache if users are going to\nbe reading it on every packet or something.\n\n> \n> Typical use cases for this series are:\n> - Jumbo-frames\n> - Packet header split (please see Google's use-case @ NetDevConf 0x14, [0])\n> - TSO\n> \n> More info about the main idea behind this approach can be found here [1][2].\n> \n> We carried out some throughput tests in a standard linear frame scenario in order\n> to verify we did not introduced any performance regression adding xdp multi-buff\n> support to mvneta:\n> \n> offered load is ~ 1000Kpps, packet size is 64B, mvneta descriptor size is one PAGE\n> \n> commit: 879456bedbe5 (\"net: mvneta: avoid possible cache misses in mvneta_rx_swbm\")\n> - xdp-pass:      ~162Kpps\n> - xdp-drop:      ~701Kpps\n> - xdp-tx:        ~185Kpps\n> - xdp-redirect:  ~202Kpps\n> \n> mvneta xdp multi-buff:\n> - xdp-pass:      ~163Kpps\n> - xdp-drop:      ~739Kpps\n> - xdp-tx:        ~182Kpps\n> - xdp-redirect:  ~202Kpps\n> \n> Changes since v3:\n> - rebase ontop of bpf-next\n> - add patch 10/13 to copy back paged data from a xdp multi-buff frame to\n>   userspace buffer for xdp multi-buff selftests\n> \n> Changes since v2:\n> - add throughput measurements\n> - drop bpf_xdp_adjust_mb_header bpf helper\n> - introduce selftest for xdp multibuffer\n> - addressed comments on bpf_xdp_get_frags_count\n> - introduce xdp multi-buff support to cpumaps\n> \n> Changes since v1:\n> - Fix use-after-free in xdp_return_{buff/frame}\n> - Introduce bpf helpers\n> - Introduce xdp_mb sample program\n> - access skb_shared_info->nr_frags only on the last fragment\n> \n> Changes since RFC:\n> - squash multi-buffer bit initialization in a single patch\n> - add mvneta non-linear XDP buff support for tx side\n> \n> [0] 
https://netdevconf.info/0x14/session.html?talk-the-path-to-tcp-4k-mtu-and-rx-zerocopy\n> [1] https://github.com/xdp-project/xdp-project/blob/master/areas/core/xdp-multi-buffer01-design.org\n> [2] https://netdevconf.info/0x14/session.html?tutorial-add-XDP-support-to-a-NIC-driver (XDPmulti-buffers section)\n> \n> Lorenzo Bianconi (11):\n>   xdp: introduce mb in xdp_buff/xdp_frame\n>   xdp: initialize xdp_buff mb bit to 0 in all XDP drivers\n>   net: mvneta: update mb bit before passing the xdp buffer to eBPF layer\n>   xdp: add multi-buff support to xdp_return_{buff/frame}\n>   net: mvneta: add multi buffer support to XDP_TX\n>   bpf: move user_size out of bpf_test_init\n>   bpf: introduce multibuff support to bpf_prog_test_run_xdp()\n>   bpf: test_run: add skb_shared_info pointer in bpf_test_finish\n>     signature\n>   bpf: add xdp multi-buffer selftest\n>   net: mvneta: enable jumbo frames for XDP\n>   bpf: cpumap: introduce xdp multi-buff support\n> \n> Sameeh Jubran (2):\n>   bpf: introduce bpf_xdp_get_frags_{count, total_size} helpers\n>   samples/bpf: add bpf program that uses xdp mb helpers\n> \n>  drivers/net/ethernet/amazon/ena/ena_netdev.c  |   1 +\n>  drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c |   1 +\n>  .../net/ethernet/cavium/thunder/nicvf_main.c  |   1 +\n>  .../net/ethernet/freescale/dpaa2/dpaa2-eth.c  |   1 +\n>  drivers/net/ethernet/intel/i40e/i40e_txrx.c   |   1 +\n>  drivers/net/ethernet/intel/ice/ice_txrx.c     |   1 +\n>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |   1 +\n>  .../net/ethernet/intel/ixgbevf/ixgbevf_main.c |   1 +\n>  drivers/net/ethernet/marvell/mvneta.c         | 131 +++++++------\n>  .../net/ethernet/marvell/mvpp2/mvpp2_main.c   |   1 +\n>  drivers/net/ethernet/mellanox/mlx4/en_rx.c    |   1 +\n>  .../net/ethernet/mellanox/mlx5/core/en_rx.c   |   1 +\n>  .../ethernet/netronome/nfp/nfp_net_common.c   |   1 +\n>  drivers/net/ethernet/qlogic/qede/qede_fp.c    |   1 +\n>  drivers/net/ethernet/sfc/rx.c                 |   1 
+\n>  drivers/net/ethernet/socionext/netsec.c       |   1 +\n>  drivers/net/ethernet/ti/cpsw.c                |   1 +\n>  drivers/net/ethernet/ti/cpsw_new.c            |   1 +\n>  drivers/net/hyperv/netvsc_bpf.c               |   1 +\n>  drivers/net/tun.c                             |   2 +\n>  drivers/net/veth.c                            |   1 +\n>  drivers/net/virtio_net.c                      |   2 +\n>  drivers/net/xen-netfront.c                    |   1 +\n>  include/net/xdp.h                             |  31 ++-\n>  include/uapi/linux/bpf.h                      |  14 ++\n>  kernel/bpf/cpumap.c                           |  45 +----\n>  net/bpf/test_run.c                            | 118 ++++++++++--\n>  net/core/dev.c                                |   1 +\n>  net/core/filter.c                             |  42 ++++\n>  net/core/xdp.c                                | 104 ++++++++++\n>  samples/bpf/Makefile                          |   3 +\n>  samples/bpf/xdp_mb_kern.c                     |  68 +++++++\n>  samples/bpf/xdp_mb_user.c                     | 182 ++++++++++++++++++\n>  tools/include/uapi/linux/bpf.h                |  14 ++\n>  .../testing/selftests/bpf/prog_tests/xdp_mb.c |  79 ++++++++\n>  .../selftests/bpf/progs/test_xdp_multi_buff.c |  24 +++\n>  36 files changed, 757 insertions(+), 123 deletions(-)\n>  create mode 100644 samples/bpf/xdp_mb_kern.c\n>  create mode 100644 samples/bpf/xdp_mb_user.c\n>  create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_mb.c\n>  create mode 100644 tools/testing/selftests/bpf/progs/test_xdp_multi_buff.c\n> \n> -- \n> 2.26.2\n>","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming-netdev@ozlabs.org","Delivered-To":"patchwork-incoming-netdev@ozlabs.org","Authentication-Results":["ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org\n (client-ip=23.128.96.18; helo=vger.kernel.org;\n envelope-from=netdev-owner@vger.kernel.org; 
receiver=<UNKNOWN>)","ozlabs.org;\n dmarc=pass (p=none dis=none) header.from=gmail.com","ozlabs.org;\n\tdkim=pass (2048-bit key;\n unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256\n header.s=20161025 header.b=i5Rio7kF;\n\tdkim-atps=neutral"],"Received":["from vger.kernel.org (vger.kernel.org [23.128.96.18])\n\tby ozlabs.org (Postfix) with ESMTP id 4C2v2w1DJRz9sSG\n\tfor <patchwork-incoming-netdev@ozlabs.org>;\n Sat,  3 Oct 2020 01:26:00 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n        id S2388314AbgJBPZ6 (ORCPT\n        <rfc822;patchwork-incoming-netdev@ozlabs.org>);\n        Fri, 2 Oct 2020 11:25:58 -0400","from lindbergh.monkeyblade.net ([23.128.96.19]:35002 \"EHLO\n        lindbergh.monkeyblade.net\" rhost-flags-OK-OK-OK-OK) by vger.kernel.org\n        with ESMTP id S1726017AbgJBPZ6 (ORCPT\n        <rfc822;netdev@vger.kernel.org>); Fri, 2 Oct 2020 11:25:58 -0400","from mail-io1-xd42.google.com (mail-io1-xd42.google.com\n [IPv6:2607:f8b0:4864:20::d42])\n        by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7E60DC0613D0;\n        Fri,  2 Oct 2020 08:25:58 -0700 (PDT)","by mail-io1-xd42.google.com with SMTP id y74so1921716iof.12;\n        Fri, 02 Oct 2020 08:25:58 -0700 (PDT)","from localhost ([184.63.162.180])\n        by smtp.gmail.com with ESMTPSA id\n a86sm925816ill.11.2020.10.02.08.25.54\n        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);\n        Fri, 02 Oct 2020 08:25:56 -0700 (PDT)"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n        d=gmail.com; s=20161025;\n        h=date:from:to:cc:message-id:in-reply-to:references:subject\n         :mime-version:content-transfer-encoding;\n        bh=jPQqkXAfGZC41RKCtEmG0SNXADy8ZPQvJecEQs0H2Cc=;\n        b=i5Rio7kFRSA1lxsmlTCguMMQyAeQiikc0Swb8BJTKvsb4pCNl3TgJPoxNCUVlGnwXz\n         HhTO95QdZVrzAglauWfaaMQp/X9lhTx3bfyZrYZMAPjapBn6notPNX9xGynAGg0UXfB3\n         
GfTmlm7xH0xiEi+QL9ho8bmYczH8g1chYfWRuv7gnGBeksoWP0WjaTaEvPg/gyp6xeLi\n         TNWBEffyS4zioqn5wR9ftrm9zAnm8FpA6ObtW61SgcJ1zf4P1hUurP5W9IagcKyD8IY6\n         ekEb/IbQ2OPsBWTfSdT6tR+ktt88gIUbSIPODpzKo/EttxAjHVtItMVE78c6PDJNQ49g\n         ghBQ==","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n        d=1e100.net; s=20161025;\n        h=x-gm-message-state:date:from:to:cc:message-id:in-reply-to\n         :references:subject:mime-version:content-transfer-encoding;\n        bh=jPQqkXAfGZC41RKCtEmG0SNXADy8ZPQvJecEQs0H2Cc=;\n        b=HRqicEpiv2MsbnnCEo15unq4TkbtNTqehErWFdSFN1I2WN02yiFvl4IyQxNoa9rz52\n         bpaxB24Uc9uBB7CrY/zZws+7AcDExhM28R9v8HFMKcmUalAGtBF03Xwagq2XvSAVQtXk\n         GSPwUqzhynGOMdbzxMEl5n19pTERNrk2rUBWuk4hYiui7l6+AE3FwG2Q1RSoDOzZkpvY\n         gZkvGfLRLURZB+A24qvVGSHAatyUq5sd0YHUpdSmipNPbrbQg9DNllVwtaqdC7Vp21es\n         +UaUf/Zzd2RAH95AU2kS3NvL3FcVZfdoUNr7NFOIQYVSyZsX8tK7nebRrjE5TxKiTLDE\n         FZGQ==","X-Gm-Message-State":"AOAM530iZHomZQ4qqF7Z7Fki3DyfcAlyWAJAyTOm7B2/1xrTzlx2DV7r\n        LF3MF2qHPlK7eDe+SrdfBuj15iHhAn5wUg==","X-Google-Smtp-Source":"\n ABdhPJwYlvjTnYUmkraS7ITOGv3yUV6yNeT5sQEKhEDd5/65aE/TC6241CnZUNO3yVcMLVlPFwzUnQ==","X-Received":"by 2002:a05:6638:25d0:: with SMTP id\n u16mr2869153jat.0.1601652357761;\n        Fri, 02 Oct 2020 08:25:57 -0700 (PDT)","Date":"Fri, 02 Oct 2020 08:25:49 -0700","From":"John Fastabend <john.fastabend@gmail.com>","To":"Lorenzo Bianconi <lorenzo@kernel.org>, bpf@vger.kernel.org,\n        netdev@vger.kernel.org","Cc":"davem@davemloft.net, kuba@kernel.org, ast@kernel.org,\n        daniel@iogearbox.net, shayagr@amazon.com, sameehj@amazon.com,\n        john.fastabend@gmail.com, dsahern@kernel.org, brouer@redhat.com,\n        lorenzo.bianconi@redhat.com, echaudro@redhat.com","Message-ID":"<5f77467dbc1_38b0208ef@john-XPS-13-9370.notmuch>","In-Reply-To":"<cover.1601648734.git.lorenzo@kernel.org>","References":"<cover.1601648734.git.lorenzo@kernel.org>","Subject":"RE: [PATCH v4 bpf-next 00/13] 
mvneta: introduce XDP multi-buffer\n support","Mime-Version":"1.0","Content-Type":"text/plain;\n charset=utf-8","Content-Transfer-Encoding":"quoted-printable","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":2544658,"web_url":"http://patchwork.ozlabs.org/comment/2544658/","msgid":"<20201002160623.GA40027@lore-desk>","list_archive_url":null,"date":"2020-10-02T16:06:23","subject":"Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer\n support","submitter":{"id":73083,"url":"http://patchwork.ozlabs.org/api/people/73083/","name":"Lorenzo Bianconi","email":"lorenzo.bianconi@redhat.com"},"content":"> Lorenzo Bianconi wrote:\n> > This series introduce XDP multi-buffer support. The mvneta driver is\n> > the first to support these new \"non-linear\" xdp_{buff,frame}. Reviewers\n> > please focus on how these new types of xdp_{buff,frame} packets\n> > traverse the different layers and the layout design. It is on purpose\n> > that BPF-helpers are kept simple, as we don't want to expose the\n> > internal layout to allow later changes.\n> > \n> > For now, to keep the design simple and to maintain performance, the XDP\n> > BPF-prog (still) only have access to the first-buffer. It is left for\n> > later (another patchset) to add payload access across multiple buffers.\n> > This patchset should still allow for these future extensions. The goal\n> > is to lift the XDP MTU restriction that comes with XDP, but maintain\n> > same performance as before.\n> > \n> > The main idea for the new multi-buffer layout is to reuse the same\n> > layout used for non-linear SKB. This rely on the \"skb_shared_info\"\n> > struct at the end of the first buffer to link together subsequent\n> > buffers. Keeping the layout compatible with SKBs is also done to ease\n> > and speedup creating an SKB from an xdp_{buff,frame}. 
Converting\n> > xdp_frame to SKB and deliver it to the network stack is shown in cpumap\n> > code (patch 13/13).\n> \n> Using the end of the buffer for the skb_shared_info struct is going to\n> become driver API so unwinding it if it proves to be a performance issue\n> is going to be ugly. So same question as before, for the use case where\n> we receive packet and do XDP_TX with it how do we avoid cache miss\n> overhead? This is not just a hypothetical use case, the Facebook\n> load balancer is doing this as well as Cilium and allowing this with\n> multi-buffer packets >1500B would be useful.\n> \n> Can we write the skb_shared_info lazily? It should only be needed once\n> we know the packet is going up the stack to some place that needs the\n> info. Which we could learn from the return code of the XDP program.\n\nHi John,\n\nI agree, I think for XDP_TX use-case it is not strictly necessary to fill the\nskb_shared_info. The driver can just keep this info on the stack and use it\ninserting the packet back to the DMA ring.\nFor mvneta I implemented it in this way to keep the code aligned with ndo_xdp_xmit\npath since it is a low-end device. I guess we are not introducing any API constraint\nfor XDP_TX. 
A high-end device can implement multi-buff for XDP_TX in a different way\nin order to avoid the cache miss.\n\nWe need to fill the skb_shared_info only when we want to pass the frame to the\nnetwork stack (build_skb() can directly reuse skb_shared_info->frags[]) or for\nXDP_REDIRECT use-case.\n\n> \n> > \n> > A multi-buffer bit (mb) has been introduced in xdp_{buff,frame} structure\n> > to notify the bpf/network layer if this is a xdp multi-buffer frame (mb = 1)\n> > or not (mb = 0).\n> > The mb bit will be set by a xdp multi-buffer capable driver only for\n> > non-linear frames maintaining the capability to receive linear frames\n> > without any extra cost since the skb_shared_info structure at the end\n> > of the first buffer will be initialized only if mb is set.\n> \n> Thanks above is clearer.\n> \n> > \n> > In order to provide to userspace some metadata about the non-linear\n> > xdp_{buff,frame}, we introduced 2 bpf helpers:\n> > - bpf_xdp_get_frags_count:\n> >   get the number of fragments for a given xdp multi-buffer.\n> > - bpf_xdp_get_frags_total_size:\n> >   get the total size of fragments for a given xdp multi-buffer.\n> \n> Whats the use case for these? Do you have an example where knowing\n> the frags count is going to be something a BPF program will use?\n> Having total size seems interesting but perhaps we should push that\n> into the metadata so its pulled into the cache if users are going to\n> be reading it on every packet or something.\n\nAt the moment we do not have any use-case for these helpers (not considering\nthe sample in the series :)). 
We introduced them to provide some basic metadata\nabout the non-linear xdp_frame.\nIIRC we decided to introduce some helpers instead of adding this info in xdp_frame\nin order to save space on it (for xdp it is essential xdp_frame to fit in a single\ncache-line).\n\nRegards,\nLorenzo\n\n> \n> > \n> > Typical use cases for this series are:\n> > - Jumbo-frames\n> > - Packet header split (please see Google's use-case @ NetDevConf 0x14, [0])\n> > - TSO\n> > \n> > More info about the main idea behind this approach can be found here [1][2].\n> > \n> > We carried out some throughput tests in a standard linear frame scenario in order\n> > to verify we did not introduced any performance regression adding xdp multi-buff\n> > support to mvneta:\n> > \n> > offered load is ~ 1000Kpps, packet size is 64B, mvneta descriptor size is one PAGE\n> > \n> > commit: 879456bedbe5 (\"net: mvneta: avoid possible cache misses in mvneta_rx_swbm\")\n> > - xdp-pass:      ~162Kpps\n> > - xdp-drop:      ~701Kpps\n> > - xdp-tx:        ~185Kpps\n> > - xdp-redirect:  ~202Kpps\n> > \n> > mvneta xdp multi-buff:\n> > - xdp-pass:      ~163Kpps\n> > - xdp-drop:      ~739Kpps\n> > - xdp-tx:        ~182Kpps\n> > - xdp-redirect:  ~202Kpps\n> > \n> > Changes since v3:\n> > - rebase ontop of bpf-next\n> > - add patch 10/13 to copy back paged data from a xdp multi-buff frame to\n> >   userspace buffer for xdp multi-buff selftests\n> > \n> > Changes since v2:\n> > - add throughput measurements\n> > - drop bpf_xdp_adjust_mb_header bpf helper\n> > - introduce selftest for xdp multibuffer\n> > - addressed comments on bpf_xdp_get_frags_count\n> > - introduce xdp multi-buff support to cpumaps\n> > \n> > Changes since v1:\n> > - Fix use-after-free in xdp_return_{buff/frame}\n> > - Introduce bpf helpers\n> > - Introduce xdp_mb sample program\n> > - access skb_shared_info->nr_frags only on the last fragment\n> > \n> > Changes since RFC:\n> > - squash multi-buffer bit initialization in a single patch\n> > - add 
mvneta non-linear XDP buff support for tx side\n> > \n> > [0] https://netdevconf.info/0x14/session.html?talk-the-path-to-tcp-4k-mtu-and-rx-zerocopy\n> > [1] https://github.com/xdp-project/xdp-project/blob/master/areas/core/xdp-multi-buffer01-design.org\n> > [2] https://netdevconf.info/0x14/session.html?tutorial-add-XDP-support-to-a-NIC-driver (XDPmulti-buffers section)\n> > \n> > Lorenzo Bianconi (11):\n> >   xdp: introduce mb in xdp_buff/xdp_frame\n> >   xdp: initialize xdp_buff mb bit to 0 in all XDP drivers\n> >   net: mvneta: update mb bit before passing the xdp buffer to eBPF layer\n> >   xdp: add multi-buff support to xdp_return_{buff/frame}\n> >   net: mvneta: add multi buffer support to XDP_TX\n> >   bpf: move user_size out of bpf_test_init\n> >   bpf: introduce multibuff support to bpf_prog_test_run_xdp()\n> >   bpf: test_run: add skb_shared_info pointer in bpf_test_finish\n> >     signature\n> >   bpf: add xdp multi-buffer selftest\n> >   net: mvneta: enable jumbo frames for XDP\n> >   bpf: cpumap: introduce xdp multi-buff support\n> > \n> > Sameeh Jubran (2):\n> >   bpf: introduce bpf_xdp_get_frags_{count, total_size} helpers\n> >   samples/bpf: add bpf program that uses xdp mb helpers\n> > \n> >  drivers/net/ethernet/amazon/ena/ena_netdev.c  |   1 +\n> >  drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c |   1 +\n> >  .../net/ethernet/cavium/thunder/nicvf_main.c  |   1 +\n> >  .../net/ethernet/freescale/dpaa2/dpaa2-eth.c  |   1 +\n> >  drivers/net/ethernet/intel/i40e/i40e_txrx.c   |   1 +\n> >  drivers/net/ethernet/intel/ice/ice_txrx.c     |   1 +\n> >  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |   1 +\n> >  .../net/ethernet/intel/ixgbevf/ixgbevf_main.c |   1 +\n> >  drivers/net/ethernet/marvell/mvneta.c         | 131 +++++++------\n> >  .../net/ethernet/marvell/mvpp2/mvpp2_main.c   |   1 +\n> >  drivers/net/ethernet/mellanox/mlx4/en_rx.c    |   1 +\n> >  .../net/ethernet/mellanox/mlx5/core/en_rx.c   |   1 +\n> >  
.../ethernet/netronome/nfp/nfp_net_common.c   |   1 +\n> >  drivers/net/ethernet/qlogic/qede/qede_fp.c    |   1 +\n> >  drivers/net/ethernet/sfc/rx.c                 |   1 +\n> >  drivers/net/ethernet/socionext/netsec.c       |   1 +\n> >  drivers/net/ethernet/ti/cpsw.c                |   1 +\n> >  drivers/net/ethernet/ti/cpsw_new.c            |   1 +\n> >  drivers/net/hyperv/netvsc_bpf.c               |   1 +\n> >  drivers/net/tun.c                             |   2 +\n> >  drivers/net/veth.c                            |   1 +\n> >  drivers/net/virtio_net.c                      |   2 +\n> >  drivers/net/xen-netfront.c                    |   1 +\n> >  include/net/xdp.h                             |  31 ++-\n> >  include/uapi/linux/bpf.h                      |  14 ++\n> >  kernel/bpf/cpumap.c                           |  45 +----\n> >  net/bpf/test_run.c                            | 118 ++++++++++--\n> >  net/core/dev.c                                |   1 +\n> >  net/core/filter.c                             |  42 ++++\n> >  net/core/xdp.c                                | 104 ++++++++++\n> >  samples/bpf/Makefile                          |   3 +\n> >  samples/bpf/xdp_mb_kern.c                     |  68 +++++++\n> >  samples/bpf/xdp_mb_user.c                     | 182 ++++++++++++++++++\n> >  tools/include/uapi/linux/bpf.h                |  14 ++\n> >  .../testing/selftests/bpf/prog_tests/xdp_mb.c |  79 ++++++++\n> >  .../selftests/bpf/progs/test_xdp_multi_buff.c |  24 +++\n> >  36 files changed, 757 insertions(+), 123 deletions(-)\n> >  create mode 100644 samples/bpf/xdp_mb_kern.c\n> >  create mode 100644 samples/bpf/xdp_mb_user.c\n> >  create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_mb.c\n> >  create mode 100644 tools/testing/selftests/bpf/progs/test_xdp_multi_buff.c\n> > \n> > -- \n> > 2.26.2\n> > \n> 
\n>","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming-netdev@ozlabs.org","Delivered-To":"patchwork-incoming-netdev@ozlabs.org","Authentication-Results":["ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org\n (client-ip=23.128.96.18; helo=vger.kernel.org;\n envelope-from=netdev-owner@vger.kernel.org; receiver=<UNKNOWN>)","ozlabs.org;\n dmarc=pass (p=none dis=none) header.from=redhat.com","ozlabs.org;\n\tdkim=pass (1024-bit key;\n unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256\n header.s=mimecast20190719 header.b=IaLWWYMF;\n\tdkim-atps=neutral"],"Received":["from vger.kernel.org (vger.kernel.org [23.128.96.18])\n\tby ozlabs.org (Postfix) with ESMTP id 4C2vxq3Mm9z9sSn\n\tfor <patchwork-incoming-netdev@ozlabs.org>;\n Sat,  3 Oct 2020 02:06:39 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n        id S1726386AbgJBQGh (ORCPT\n        <rfc822;patchwork-incoming-netdev@ozlabs.org>);\n        Fri, 2 Oct 2020 12:06:37 -0400","from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:41048 \"EHLO\n        us-smtp-delivery-124.mimecast.com\" rhost-flags-OK-OK-OK-OK)\n        by vger.kernel.org with ESMTP id S2387688AbgJBQGh (ORCPT\n        <rfc822;netdev@vger.kernel.org>); Fri, 2 Oct 2020 12:06:37 -0400","from mail-wm1-f71.google.com (mail-wm1-f71.google.com\n [209.85.128.71]) (Using TLS) by relay.mimecast.com with ESMTP id\n us-mta-291-gCc5fZndMRWgnBUBktzGQQ-1; Fri, 02 Oct 2020 12:06:30 -0400","by mail-wm1-f71.google.com with SMTP id l15so678518wmh.9\n        for <netdev@vger.kernel.org>; Fri, 02 Oct 2020 09:06:29 -0700 (PDT)","from localhost ([176.207.245.61])\n        by smtp.gmail.com with ESMTPSA id m3sm2658963wme.3.2020.10.02.09.06.26\n        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);\n        Fri, 02 Oct 2020 09:06:26 -0700 (PDT)"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;\n        
s=mimecast20190719; t=1601654794;\n        h=from:from:reply-to:subject:subject:date:date:message-id:message-id:\n         to:to:cc:cc:mime-version:mime-version:content-type:content-type:\n         in-reply-to:in-reply-to:references:references;\n        bh=f37y1NeOmt1jMzW727aJwskGl0zLABG5qEvMjStTgzM=;\n        b=IaLWWYMFZz0AUO/57nbq/+O9+lTfz9vzIIulHsz1NiuwQuZ94FCRZtrSzAX05b2f9w5YVu\n        kxOxdrUUdNvYRSLYSUoFclNE+BDADLig/KzY6k5V05sv3CZIVRXPDdzhjGXdF/Jo7cRdu3\n        MIiHsmNyEwuAAZEQ3F54nsyOGkTiIpM=","X-MC-Unique":"gCc5fZndMRWgnBUBktzGQQ-1","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n        d=1e100.net; s=20161025;\n        h=x-gm-message-state:date:from:to:cc:subject:message-id:references\n         :mime-version:content-disposition:in-reply-to;\n        bh=f37y1NeOmt1jMzW727aJwskGl0zLABG5qEvMjStTgzM=;\n        b=XRBcrFh3/HOFqFPnTvmxaA1UcN5esLVHIaPUgxE7zQDoBavdvnxtQcQ3JFc54EnETs\n         puTQB8JiRHxv5DUBgb9ABMV69DwgncEQq/lg4TP95u7czLirRxaxZPAXQV2CZVa8O7Qw\n         ayUD2o7or1tVLJGfT7wTfuCs4cq9WvRVgoqg9hYkMxRqyHVy4PzPhOkwPYlB4XJlZO6M\n         Bs+byggUsD4cxmv+XdW3I89EzPXoUCy1/qdKcg0eCPhEAbBHzWB1JSN61PNfVxCZgdxm\n         cifCWDz8S5qJepqgdoNkw9vJsMQHBbLICqzeR6Ei9sjujVQPiqCnoWDE5Y03YHtFm3jz\n         CLNA==","X-Gm-Message-State":"AOAM530SemkuXd7OPN4Qe6RvdcXtLU0DyrxemRdfqRZ9Kd0owwe100RL\n        2d4Bk6pWnkj/5DggTWPxvmTzbu6+5/qElGv3pRHQEFsB+t9OvL2SF50mZWk1l9NpV+vUzgwOqy/\n        DmlLzU4PsJoOPMhha","X-Received":["by 2002:a1c:2903:: with SMTP id p3mr3789775wmp.170.1601654788846;\n        Fri, 02 Oct 2020 09:06:28 -0700 (PDT)","by 2002:a1c:2903:: with SMTP id p3mr3789740wmp.170.1601654788519;\n        Fri, 02 Oct 2020 09:06:28 -0700 (PDT)"],"X-Google-Smtp-Source":"\n ABdhPJwxzUC/G45T3wmsBxcEZe6q2wPURQSbmB2F/nRB0nbAOUKdHdQUP3bCRU7bSNd7zD7y3073qQ==","Date":"Fri, 2 Oct 2020 18:06:23 +0200","From":"Lorenzo Bianconi <lorenzo.bianconi@redhat.com>","To":"John Fastabend <john.fastabend@gmail.com>","Cc":"Lorenzo Bianconi <lorenzo@kernel.org>, 
bpf@vger.kernel.org,\n        netdev@vger.kernel.org, davem@davemloft.net, kuba@kernel.org,\n        ast@kernel.org, daniel@iogearbox.net, shayagr@amazon.com,\n        sameehj@amazon.com, dsahern@kernel.org, brouer@redhat.com,\n        echaudro@redhat.com","Subject":"Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer\n support","Message-ID":"<20201002160623.GA40027@lore-desk>","References":"<cover.1601648734.git.lorenzo@kernel.org>\n <5f77467dbc1_38b0208ef@john-XPS-13-9370.notmuch>","MIME-Version":"1.0","Content-Type":"multipart/signed; micalg=pgp-sha256;\n        protocol=\"application/pgp-signature\"; boundary=\"envbJBWh7q8WU6mo\"","Content-Disposition":"inline","In-Reply-To":"<5f77467dbc1_38b0208ef@john-XPS-13-9370.notmuch>","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":2544772,"web_url":"http://patchwork.ozlabs.org/comment/2544772/","msgid":"<5f776c14d69b3_a6402087e@john-XPS-13-9370.notmuch>","list_archive_url":null,"date":"2020-10-02T18:06:12","subject":"Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer\n support","submitter":{"id":20028,"url":"http://patchwork.ozlabs.org/api/people/20028/","name":"John Fastabend","email":"john.fastabend@gmail.com"},"content":"Lorenzo Bianconi wrote:\n> > Lorenzo Bianconi wrote:\n> > > This series introduce XDP multi-buffer support. The mvneta driver is\n> > > the first to support these new \"non-linear\" xdp_{buff,frame}. Reviewers\n> > > please focus on how these new types of xdp_{buff,frame} packets\n> > > traverse the different layers and the layout design. It is on purpose\n> > > that BPF-helpers are kept simple, as we don't want to expose the\n> > > internal layout to allow later changes.\n> > > \n> > > For now, to keep the design simple and to maintain performance, the XDP\n> > > BPF-prog (still) only have access to the first-buffer. 
It is left for\n> > > later (another patchset) to add payload access across multiple buffers.\n> > > This patchset should still allow for these future extensions. The goal\n> > > is to lift the XDP MTU restriction that comes with XDP, but maintain\n> > > same performance as before.\n> > > \n> > > The main idea for the new multi-buffer layout is to reuse the same\n> > > layout used for non-linear SKB. This rely on the \"skb_shared_info\"\n> > > struct at the end of the first buffer to link together subsequent\n> > > buffers. Keeping the layout compatible with SKBs is also done to ease\n> > > and speedup creating an SKB from an xdp_{buff,frame}. Converting\n> > > xdp_frame to SKB and deliver it to the network stack is shown in cpumap\n> > > code (patch 13/13).\n> > \n> > Using the end of the buffer for the skb_shared_info struct is going to\n> > become driver API so unwinding it if it proves to be a performance issue\n> > is going to be ugly. So same question as before, for the use case where\n> > we receive packet and do XDP_TX with it how do we avoid cache miss\n> > overhead? This is not just a hypothetical use case, the Facebook\n> > load balancer is doing this as well as Cilium and allowing this with\n> > multi-buffer packets >1500B would be useful.\n> > \n> > Can we write the skb_shared_info lazily? It should only be needed once\n> > we know the packet is going up the stack to some place that needs the\n> > info. Which we could learn from the return code of the XDP program.\n> \n> Hi John,\n\nHi, I'll try to join the two threads this one and the one on helpers here\nso we don't get too fragmented.\n\n> \n> I agree, I think for XDP_TX use-case it is not strictly necessary to fill the\n> skb_shared_info. The driver can just keep this info on the stack and use it\n> inserting the packet back to the DMA ring.\n> For mvneta I implemented it in this way to keep the code aligned with ndo_xdp_xmit\n> path since it is a low-end device. 
I guess we are not introducing any API constraint\n> for XDP_TX. A high-end device can implement multi-buff for XDP_TX in a different way\n> in order to avoid the cache miss.\n\nAgree it would be an implementation detail for XDP_TX except the two helpers added\nin this series currently require it to be there.\n\n> \n> We need to fill the skb_shared info only when we want to pass the frame to the\n> network stack (build_skb() can directly reuse skb_shared_info->frags[]) or for\n> XDP_REDIRECT use-case.\n\nIt might be good to think about the XDP_REDIRECT case as well then. If the\nfrags list fit in the metadata/xdp_frame would we expect better\nperformance?\n\nLooking at skb_shared_info{} that is a rather large structure with many\nfields that look unnecessary for XDP_REDIRECT case and only needed when\npassing to the stack. Fundamentally, a frag just needs\n\n struct bio_vec {\n     struct page *bv_page;     // 8B\n     unsigned int bv_len;      // 4B\n     unsigned int bv_offset;   // 4B\n } // 16B\n\nWith header split + data we only need a single frag so we could use just\n16B. And worse case jumbo frame + header split seems 3 entries would be\nenough giving 48B (header plus 3 4k pages). Could we just stick this in\nthe metadata and make it read only? Then programs that care can read it\nand get all the info they need without helpers. I would expect performance\nto be better in the XDP_TX and XDP_REDIRECT cases. 
And copying an extra\nworse case 48B in passing to the stack I guess is not measurable given\nall the work needed in that path.\n\n> \n> > \n> > > \n> > > A multi-buffer bit (mb) has been introduced in xdp_{buff,frame} structure\n> > > to notify the bpf/network layer if this is a xdp multi-buffer frame (mb = 1)\n> > > or not (mb = 0).\n> > > The mb bit will be set by a xdp multi-buffer capable driver only for\n> > > non-linear frames maintaining the capability to receive linear frames\n> > > without any extra cost since the skb_shared_info structure at the end\n> > > of the first buffer will be initialized only if mb is set.\n> > \n> > Thanks above is clearer.\n> > \n> > > \n> > > In order to provide to userspace some metdata about the non-linear\n> > > xdp_{buff,frame}, we introduced 2 bpf helpers:\n> > > - bpf_xdp_get_frags_count:\n> > >   get the number of fragments for a given xdp multi-buffer.\n> > > - bpf_xdp_get_frags_total_size:\n> > >   get the total size of fragments for a given xdp multi-buffer.\n> > \n> > Whats the use case for these? Do you have an example where knowing\n> > the frags count is going to be something a BPF program will use?\n> > Having total size seems interesting but perhaps we should push that\n> > into the metadata so its pulled into the cache if users are going to\n> > be reading it on every packet or something.\n> \n> At the moment we do not have any use-case for these helpers (not considering\n> the sample in the series :)). We introduced them to provide some basic metadata\n> about the non-linear xdp_frame.\n> IIRC we decided to introduce some helpers instead of adding this info in xdp_frame\n> in order to save space on it (for xdp it is essential xdp_frame to fit in a single\n> cache-line).\n\nSure, how about in the metadata then? 
(From other thread I was suggesting putting\nthe total length in metadata) We could even allow programs to overwrite it if\nthey wanted if it's not used by the stack for anything other than packet length\nvisibility. Of course users would then need to be a bit careful not to overwrite\nit and then read it again expecting the length to be correct. I think from a\nuser's perspective though that would be expected.\n\n> \n> Regards,\n> Lorenzo\n>","headers":{"Received":["by mail-il1-x143.google.com with 
SMTP id e5so2045045ils.10;\n        Fri, 02 Oct 2020 11:06:22 -0700 (PDT)"],"X-Received":"by 2002:a92:ccc5:: with SMTP id 
u5mr2845241ilq.178.1601661981275;\n        Fri, 02 Oct 2020 11:06:21 -0700 (PDT)","Date":"Fri, 02 Oct 2020 11:06:12 -0700","From":"John Fastabend <john.fastabend@gmail.com>","To":"Lorenzo Bianconi <lorenzo.bianconi@redhat.com>,\n        John Fastabend <john.fastabend@gmail.com>","Cc":"Lorenzo Bianconi <lorenzo@kernel.org>, bpf@vger.kernel.org,\n        netdev@vger.kernel.org, davem@davemloft.net, kuba@kernel.org,\n        ast@kernel.org, daniel@iogearbox.net, shayagr@amazon.com,\n        sameehj@amazon.com, dsahern@kernel.org, brouer@redhat.com,\n        echaudro@redhat.com","Message-ID":"<5f776c14d69b3_a6402087e@john-XPS-13-9370.notmuch>","In-Reply-To":"<20201002160623.GA40027@lore-desk>","References":"<cover.1601648734.git.lorenzo@kernel.org>\n <5f77467dbc1_38b0208ef@john-XPS-13-9370.notmuch>\n <20201002160623.GA40027@lore-desk>","Subject":"Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer\n support","Mime-Version":"1.0","Content-Type":"text/plain;\n charset=utf-8","Content-Transfer-Encoding":"7bit","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":2544840,"web_url":"http://patchwork.ozlabs.org/comment/2544840/","msgid":"<5c22ee38-e2c3-0724-5033-603d19c4169f@iogearbox.net>","list_archive_url":null,"date":"2020-10-02T19:53:27","subject":"Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer\n support","submitter":{"id":65705,"url":"http://patchwork.ozlabs.org/api/people/65705/","name":"Daniel Borkmann","email":"daniel@iogearbox.net"},"content":"On 10/2/20 5:25 PM, John Fastabend wrote:\n> Lorenzo Bianconi wrote:\n>> This series introduce XDP multi-buffer support. The mvneta driver is\n>> the first to support these new \"non-linear\" xdp_{buff,frame}. Reviewers\n>> please focus on how these new types of xdp_{buff,frame} packets\n>> traverse the different layers and the layout design. 
It is on purpose\n>> that BPF-helpers are kept simple, as we don't want to expose the\n>> internal layout to allow later changes.\n>>\n>> For now, to keep the design simple and to maintain performance, the XDP\n>> BPF-prog (still) only have access to the first-buffer. It is left for\n>> later (another patchset) to add payload access across multiple buffers.\n>> This patchset should still allow for these future extensions. The goal\n>> is to lift the XDP MTU restriction that comes with XDP, but maintain\n>> same performance as before.\n>>\n>> The main idea for the new multi-buffer layout is to reuse the same\n>> layout used for non-linear SKB. This rely on the \"skb_shared_info\"\n>> struct at the end of the first buffer to link together subsequent\n>> buffers. Keeping the layout compatible with SKBs is also done to ease\n>> and speedup creating an SKB from an xdp_{buff,frame}. Converting\n>> xdp_frame to SKB and deliver it to the network stack is shown in cpumap\n>> code (patch 13/13).\n> \n> Using the end of the buffer for the skb_shared_info struct is going to\n> become driver API so unwinding it if it proves to be a performance issue\n> is going to be ugly. So same question as before, for the use case where\n> we receive packet and do XDP_TX with it how do we avoid cache miss\n> overhead? This is not just a hypothetical use case, the Facebook\n> load balancer is doing this as well as Cilium and allowing this with\n> multi-buffer packets >1500B would be useful.\n[...]\n\nFully agree. My other question would be if someone else right now is in the process\nof implementing this scheme for a 40G+ NIC? My concern is the numbers below are rather\non the lower end of the spectrum, so I would like to see a comparison of XDP as-is\ntoday vs XDP multi-buff on a higher end NIC so that we have a picture how well the\ncurrent designed scheme works there and into which performance issue we'll run e.g.\nunder typical XDP L4 load balancer scenario with XDP_TX. 
I think this would be crucial\nbefore the driver API becomes 'sort of' set in stone where others start to adapting\nit and changing design becomes painful. Do ena folks have an implementation ready as\nwell? And what about virtio_net, for example, anyone committing there too? Typically\nfor such features to land is to require at least 2 drivers implementing it.\n\n>> Typical use cases for this series are:\n>> - Jumbo-frames\n>> - Packet header split (please see Google���s use-case @ NetDevConf 0x14, [0])\n>> - TSO\n>>\n>> More info about the main idea behind this approach can be found here [1][2].\n>>\n>> We carried out some throughput tests in a standard linear frame scenario in order\n>> to verify we did not introduced any performance regression adding xdp multi-buff\n>> support to mvneta:\n>>\n>> offered load is ~ 1000Kpps, packet size is 64B, mvneta descriptor size is one PAGE\n>>\n>> commit: 879456bedbe5 (\"net: mvneta: avoid possible cache misses in mvneta_rx_swbm\")\n>> - xdp-pass:      ~162Kpps\n>> - xdp-drop:      ~701Kpps\n>> - xdp-tx:        ~185Kpps\n>> - xdp-redirect:  ~202Kpps\n>>\n>> mvneta xdp multi-buff:\n>> - xdp-pass:      ~163Kpps\n>> - xdp-drop:      ~739Kpps\n>> - xdp-tx:        ~182Kpps\n>> - xdp-redirect:  ~202Kpps\n[...]","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming-netdev@ozlabs.org","Delivered-To":"patchwork-incoming-netdev@ozlabs.org","Authentication-Results":["ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org\n (client-ip=23.128.96.18; helo=vger.kernel.org;\n envelope-from=netdev-owner@vger.kernel.org; receiver=<UNKNOWN>)","ozlabs.org;\n dmarc=none (p=none dis=none) header.from=iogearbox.net"],"Received":["from vger.kernel.org (vger.kernel.org [23.128.96.18])\n\tby ozlabs.org (Postfix) with ESMTP id 4C30zf0Bb3z9sSf\n\tfor <patchwork-incoming-netdev@ozlabs.org>;\n Sat,  3 Oct 2020 05:53:34 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via 
listexpand\n        id S1725797AbgJBTxd"],"Subject":"Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer\n support","From":"Daniel Borkmann <daniel@iogearbox.net>","Message-ID":"<5c22ee38-e2c3-0724-5033-603d19c4169f@iogearbox.net>","Date":"Fri, 2 Oct 2020 21:53:27 +0200","In-Reply-To":"<5f77467dbc1_38b0208ef@john-XPS-13-9370.notmuch>","Content-Type":"text/plain; charset=utf-8; 
format=flowed","Content-Language":"en-US","Content-Transfer-Encoding":"8bit","X-Authenticated-Sender":"daniel@iogearbox.net","X-Virus-Scanned":"Clear (ClamAV 0.102.4/25945/Fri Oct  2 15:54:22 2020)","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":2545817,"web_url":"http://patchwork.ozlabs.org/comment/2545817/","msgid":"<20201005115247.72429157@carbon>","list_archive_url":null,"date":"2020-10-05T09:52:47","subject":"Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer\n support","submitter":{"id":13625,"url":"http://patchwork.ozlabs.org/api/people/13625/","name":"Jesper Dangaard Brouer","email":"brouer@redhat.com"},"content":"On Fri, 02 Oct 2020 11:06:12 -0700\nJohn Fastabend <john.fastabend@gmail.com> wrote:\n\n> Lorenzo Bianconi wrote:\n> > > Lorenzo Bianconi wrote:  \n> > > > This series introduce XDP multi-buffer support. The mvneta driver is\n> > > > the first to support these new \"non-linear\" xdp_{buff,frame}. Reviewers\n> > > > please focus on how these new types of xdp_{buff,frame} packets\n> > > > traverse the different layers and the layout design. It is on purpose\n> > > > that BPF-helpers are kept simple, as we don't want to expose the\n> > > > internal layout to allow later changes.\n> > > > \n> > > > For now, to keep the design simple and to maintain performance, the XDP\n> > > > BPF-prog (still) only have access to the first-buffer. It is left for\n> > > > later (another patchset) to add payload access across multiple buffers.\n> > > > This patchset should still allow for these future extensions. The goal\n> > > > is to lift the XDP MTU restriction that comes with XDP, but maintain\n> > > > same performance as before.\n> > > > \n> > > > The main idea for the new multi-buffer layout is to reuse the same\n> > > > layout used for non-linear SKB. This rely on the \"skb_shared_info\"\n> > > > struct at the end of the first buffer to link together subsequent\n> > > > buffers. 
Keeping the layout compatible with SKBs is also done to ease\n> > > > and speedup creating an SKB from an xdp_{buff,frame}. Converting\n> > > > xdp_frame to SKB and deliver it to the network stack is shown in cpumap\n> > > > code (patch 13/13).  \n> > > \n> > > Using the end of the buffer for the skb_shared_info struct is going to\n> > > become driver API so unwinding it if it proves to be a performance issue\n> > > is going to be ugly. So same question as before, for the use case where\n> > > we receive packet and do XDP_TX with it how do we avoid cache miss\n> > > overhead? This is not just a hypothetical use case, the Facebook\n> > > load balancer is doing this as well as Cilium and allowing this with\n> > > multi-buffer packets >1500B would be useful.\n> > > \n> > > Can we write the skb_shared_info lazily? It should only be needed once\n> > > we know the packet is going up the stack to some place that needs the\n> > > info. Which we could learn from the return code of the XDP program.  \n> > \n> > Hi John,  \n> \n> Hi, I'll try to join the two threads this one and the one on helpers here\n> so we don't get too fragmented.\n> \n> > \n> > I agree, I think for XDP_TX use-case it is not strictly necessary to fill the\n> > skb_hared_info. The driver can just keep this info on the stack and use it\n> > inserting the packet back to the DMA ring.\n> > For mvneta I implemented it in this way to keep the code aligned with ndo_xdp_xmit\n> > path since it is a low-end device. I guess we are not introducing any API constraint\n> > for XDP_TX. A high-end device can implement multi-buff for XDP_TX in a different way\n> > in order to avoid the cache miss.  \n> \n> Agree it would be an implementation detail for XDP_TX except the two\n> helpers added in this series currently require it to be there.\n\nThat is a good point.  If you look at the details, the helpers use\nxdp_buff->mb bit to guard against accessing the \"shared_info\"\ncacheline. 
Thus, for the normal single frame case XDP_TX should not see\na slowdown.  Do we really need to optimize XDP_TX multi-frame case(?)\n\n\n> > \n> > We need to fill the skb_shared info only when we want to pass the frame to the\n> > network stack (build_skb() can directly reuse skb_shared_info->frags[]) or for\n> > XDP_REDIRECT use-case.  \n> \n> It might be good to think about the XDP_REDIRECT case as well then. If the\n> frags list fit in the metadata/xdp_frame would we expect better\n> performance?\n\nI don't like to use space in xdp_frame for this. (1) We (Ahern and I)\nare planning to use the space in xdp_frame for RX-csum + RX-hash +vlan,\nwhich will be more common (e.g. all packets will have HW RX+csum).  (2)\nI consider XDP multi-buffer the exception case, that will not be used\nin most cases, so why reserve space for that in this cache-line.\n\nIMHO we CANNOT allow any slowdown for existing XDP use-cases, but IMHO\nXDP multi-buffer use-cases are allowed to run \"slower\".\n\n\n> Looking at skb_shared_info{} that is a rather large structure with many\n\nA cache-line detail about skb_shared_info: The first frags[0] member is\nin the first cache-line.  Meaning that it is still fast to have xdp\nframes with 1 extra buffer.\n\n> fields that look unnecessary for XDP_REDIRECT case and only needed when\n> passing to the stack. \n\nYes, I think we can use first cache-line of skb_shared_info more\noptimally (via defining a xdp_shared_info struct). But I still want us\nto use this specific cache-line.  Let me explain why below. (Avoiding\ncache-line misses is all about the details, so I hope you can follow).\n\nHopefully most driver developers understand/knows this.  In the RX-loop\nthe current RX-descriptor have a status that indicate there are more\nframe, usually expressed as non-EOP (End-Of-Packet).  
Thus, a driver\ncan start a prefetchw of this shared_info cache-line, prior to\nprocessing the RX-desc that describe the multi-buffer.\n (Remember this shared_info is constructed prior to calling XDP and any\nXDP_TX action, thus the XDP prog should not see a cache-line miss when\nusing the BPF-helper to read shared_info area).\n\n\n> Fundamentally, a frag just needs\n> \n>  struct bio_vec {\n>      struct page *bv_page;     // 8B\n>      unsigned int bv_len;      // 4B\n>      unsigned int bv_offset;   // 4B\n>  } // 16B\n> \n> With header split + data we only need a single frag so we could use just\n> 16B. And worse case jumbo frame + header split seems 3 entries would be\n> enough giving 48B (header plus 3 4k pages). \n\nFor jumbo-frame 9000 MTU 2 entries might be enough, as we also have\nroom in the first buffer (((9000-(4096-256-320))/4096 = 1.33789).\n\nThe problem is that we need to support TSO (TCP Segmentation Offload)\nuse-case, which can have more frames. Thus, 3 entries will not be\nenough.\n\n> Could we just stick this in the metadata and make it read only? Then\n> programs that care can read it and get all the info they need without\n> helpers.\n\nI don't see how that is possible. (1) the metadata area is only 32\nbytes, (2) when freeing an xdp_frame the kernel need to know the layout\nas these points will be free'ed.\n\n> I would expect performance to be better in the XDP_TX and\n> XDP_REDIRECT cases. And copying an extra worse case 48B in passing to\n> the stack I guess is not measurable given all the work needed in that\n> path.\n\nI do agree, that when passing to netstack we can do a transformation\nfrom xdp_shared_info to skb_shared_info with a fairly small cost.  (The\nTSO case would require more copying).\n\nNotice that allocating an SKB, will always clear the first 32 bytes of\nskb_shared_info.  
If the XDP driver-code path have done the prefetch\nas described above, then we should see a speedup for netstack delivery.","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming-netdev@ozlabs.org","Delivered-To":"patchwork-incoming-netdev@ozlabs.org","Authentication-Results":["ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org\n (client-ip=23.128.96.18; helo=vger.kernel.org;\n envelope-from=netdev-owner@vger.kernel.org; receiver=<UNKNOWN>)","ozlabs.org;\n dmarc=pass (p=none dis=none) header.from=redhat.com","ozlabs.org;\n\tdkim=pass (1024-bit key;\n unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256\n header.s=mimecast20190719 header.b=hXfqOfA4;\n\tdkim-atps=neutral"],"Received":["from vger.kernel.org (vger.kernel.org [23.128.96.18])\n\tby ozlabs.org (Postfix) with ESMTP id 4C4bWQ6Fp8z9sSs\n\tfor <patchwork-incoming-netdev@ozlabs.org>;\n Mon,  5 Oct 2020 20:53:06 +1100 (AEDT)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n        id S1725946AbgJEJxG (ORCPT\n        <rfc822;patchwork-incoming-netdev@ozlabs.org>);\n        Mon, 5 Oct 2020 05:53:06 -0400","from us-smtp-delivery-124.mimecast.com ([63.128.21.124]:25162 \"EHLO\n        us-smtp-delivery-124.mimecast.com\" rhost-flags-OK-OK-OK-OK)\n        by vger.kernel.org with ESMTP id S1725887AbgJEJxD (ORCPT\n        <rfc822;netdev@vger.kernel.org>); Mon, 5 Oct 2020 05:53:03 -0400","from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com\n [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id\n us-mta-358-E00guPkiPwO8P1ETtkhh3Q-1; Mon, 05 Oct 2020 05:52:59 -0400","from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com\n [10.5.11.11])\n        (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))\n        (No client certificate requested)\n        by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 80172100A246;\n        Mon,  5 Oct 2020 09:52:57 +0000 (UTC)","from carbon (unknown 
[10.36.110.30])\n        by smtp.corp.redhat.com (Postfix) with ESMTP id 7F96278814;\n        Mon,  5 Oct 2020 09:52:48 +0000 (UTC)"],"Date":"Mon, 5 Oct 2020 11:52:47 +0200","From":"Jesper Dangaard Brouer <brouer@redhat.com>","To":"John Fastabend <john.fastabend@gmail.com>","Subject":"Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer\n support","Message-ID":"<20201005115247.72429157@carbon>","In-Reply-To":"<5f776c14d69b3_a6402087e@john-XPS-13-9370.notmuch>","X-Scanned-By":"MIMEDefang 2.79 on 
10.5.11.11","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":2546282,"web_url":"http://patchwork.ozlabs.org/comment/2546282/","msgid":"<20201005155016.9195-1-tirthendu.sarkar@intel.com>","list_archive_url":null,"date":"2020-10-05T15:50:16","subject":"Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer\n support","submitter":{"id":80277,"url":"http://patchwork.ozlabs.org/api/people/80277/","name":"Tirthendu Sarkar","email":"tirtha@gmail.com"},"content":"On 10/2/20 5:25 PM, John Fastabend wrote:\n>>[..] Typically for such features to land is to require at least 2 drivers\n>>implementing it.\n\nI am working on making changes to Intel NIC drivers for XDP multi buffer based\non these patches. Respective patches Will be posted once ready.","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming-netdev@ozlabs.org","Delivered-To":"patchwork-incoming-netdev@ozlabs.org","Authentication-Results":["ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org\n (client-ip=23.128.96.18; helo=vger.kernel.org;\n envelope-from=netdev-owner@vger.kernel.org; receiver=<UNKNOWN>)","ozlabs.org;\n dmarc=pass (p=none dis=none) header.from=gmail.com","ozlabs.org;\n\tdkim=pass (2048-bit key;\n unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256\n header.s=20161025 header.b=HqUI1eAy;\n\tdkim-atps=neutral"],"Received":["from vger.kernel.org (vger.kernel.org [23.128.96.18])\n\tby ozlabs.org (Postfix) with ESMTP id 4C4twb0W1Rz9sXF\n\tfor <patchwork-incoming-netdev@ozlabs.org>;\n Tue,  6 Oct 2020 08:27:27 +1100 (AEDT)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n        id S1728820AbgJEPyz (ORCPT\n        <rfc822;patchwork-incoming-netdev@ozlabs.org>);\n        Mon, 5 Oct 2020 11:54:55 -0400","from lindbergh.monkeyblade.net ([23.128.96.19]:52822 \"EHLO\n        lindbergh.monkeyblade.net\" rhost-flags-OK-OK-OK-OK) by vger.kernel.org\n 
       with ESMTP id S1726567AbgJEPyt (ORCPT\n        <rfc822;netdev@vger.kernel.org>); Mon, 5 Oct 2020 11:54:49 -0400"],"X-Google-DKIM-Signature":"
AClQ==","X-Gm-Message-State":"AOAM532X9s3IL9LKJF76Q1SzY9Mvot/jUPZovgQjCvJlMhAtcUKl8k2G\n        LZx5hVpDm4+lRyTpT98vn0g=","X-Google-Smtp-Source":"\n ABdhPJwajglKbXLiJskVArVqR9JeqRDLc5peOqyIW1kMnAWLq9FSq9p18X+qh0gpWvnyWJJPlIK+lw==","X-Received":"by 2002:a63:ee46:: with SMTP id n6mr156788pgk.120.1601913289112;\n        Mon, 05 Oct 2020 08:54:49 -0700 (PDT)","From":"Tirthendu Sarkar <tirtha@gmail.com>","X-Google-Original-From":"Tirthendu Sarkar <tirthendu.sarkar@intel.com>","To":"daniel@iogearbox.net","Cc":"ast@kernel.org, bpf@vger.kernel.org, brouer@redhat.com,\n        davem@davemloft.net, dsahern@kernel.org, echaudro@redhat.com,\n        john.fastabend@gmail.com, kuba@kernel.org,\n        lorenzo.bianconi@redhat.com, lorenzo@kernel.org,\n        netdev@vger.kernel.org, sameehj@amazon.com, shayagr@amazon.com,\n        Tirthendu Sarkar <tirthendu.sarkar@intel.com>","Subject":"Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer\n support","Date":"Mon,  5 Oct 2020 21:20:16 +0530","Message-Id":"<20201005155016.9195-1-tirthendu.sarkar@intel.com>","X-Mailer":"git-send-email 2.17.1","In-Reply-To":"<5c22ee38-e2c3-0724-5033-603d19c4169f@iogearbox.net>","References":"<5c22ee38-e2c3-0724-5033-603d19c4169f@iogearbox.net>","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":2546344,"web_url":"http://patchwork.ozlabs.org/comment/2546344/","msgid":"<5f7b8e7a5ebfc_4f19a208ba@john-XPS-13-9370.notmuch>","list_archive_url":null,"date":"2020-10-05T21:22:02","subject":"Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer\n support","submitter":{"id":20028,"url":"http://patchwork.ozlabs.org/api/people/20028/","name":"John Fastabend","email":"john.fastabend@gmail.com"},"content":"Jesper Dangaard Brouer wrote:\n> On Fri, 02 Oct 2020 11:06:12 -0700\n> John Fastabend <john.fastabend@gmail.com> wrote:\n> \n> > Lorenzo Bianconi wrote:\n> > > > Lorenzo Bianconi wrote:  \n> > > > > This series introduce XDP 
multi-buffer support. The mvneta driver is\n> > > > > the first to support these new \"non-linear\" xdp_{buff,frame}. Reviewers\n> > > > > please focus on how these new types of xdp_{buff,frame} packets\n> > > > > traverse the different layers and the layout design. It is on purpose\n> > > > > that BPF-helpers are kept simple, as we don't want to expose the\n> > > > > internal layout to allow later changes.\n> > > > > \n> > > > > For now, to keep the design simple and to maintain performance, the XDP\n> > > > > BPF-prog (still) only have access to the first-buffer. It is left for\n> > > > > later (another patchset) to add payload access across multiple buffers.\n> > > > > This patchset should still allow for these future extensions. The goal\n> > > > > is to lift the XDP MTU restriction that comes with XDP, but maintain\n> > > > > same performance as before.\n> > > > > \n> > > > > The main idea for the new multi-buffer layout is to reuse the same\n> > > > > layout used for non-linear SKB. This rely on the \"skb_shared_info\"\n> > > > > struct at the end of the first buffer to link together subsequent\n> > > > > buffers. Keeping the layout compatible with SKBs is also done to ease\n> > > > > and speedup creating an SKB from an xdp_{buff,frame}. Converting\n> > > > > xdp_frame to SKB and deliver it to the network stack is shown in cpumap\n> > > > > code (patch 13/13).  \n> > > > \n> > > > Using the end of the buffer for the skb_shared_info struct is going to\n> > > > become driver API so unwinding it if it proves to be a performance issue\n> > > > is going to be ugly. So same question as before, for the use case where\n> > > > we receive packet and do XDP_TX with it how do we avoid cache miss\n> > > > overhead? This is not just a hypothetical use case, the Facebook\n> > > > load balancer is doing this as well as Cilium and allowing this with\n> > > > multi-buffer packets >1500B would be useful.\n> > > > \n> > > > Can we write the skb_shared_info lazily? 
It should only be needed once\n> > > > we know the packet is going up the stack to some place that needs the\n> > > > info. Which we could learn from the return code of the XDP program.  \n> > > \n> > > Hi John,  \n> > \n> > Hi, I'll try to join the two threads (this one and the one on helpers) here\n> > so we don't get too fragmented.\n> > \n> > > \n> > > I agree, I think for XDP_TX use-case it is not strictly necessary to fill the\n> > > skb_shared_info. The driver can just keep this info on the stack and use it\n> > > inserting the packet back to the DMA ring.\n> > > For mvneta I implemented it in this way to keep the code aligned with ndo_xdp_xmit\n> > > path since it is a low-end device. I guess we are not introducing any API constraint\n> > > for XDP_TX. A high-end device can implement multi-buff for XDP_TX in a different way\n> > > in order to avoid the cache miss.  \n> > \n> > Agree it would be an implementation detail for XDP_TX except the two\n> > helpers added in this series currently require it to be there.\n> \n> That is a good point.  If you look at the details, the helpers use\n> xdp_buff->mb bit to guard against accessing the \"shared_info\"\n> cacheline. Thus, for the normal single frame case XDP_TX should not see\n> a slowdown.  Do we really need to optimize XDP_TX multi-frame case(?)\n\nAgree it is guarded by xdp_buff->mb which is why I asked for that detail\nto be posted in the cover letter so it was easy to understand that bit\nof info.\n\nDo we really need to optimize XDP_TX multi-frame case? Yes I think so.\nThe use case is jumbo frames (or 4kB) LB. XDP_TX is the common case in\nmany configurations. For us these use cases include cloud providers\nand bare-metal data centers.\n\nKeeping the implementation out of the helpers allows drivers to optimize\nfor this case. Also it doesn't seem like the helpers in this series\nhave a strong use case. 
Happy to hear what it is, but I can't see how\nto use them myself.\n\n> \n> \n> > > \n> > > We need to fill the skb_shared info only when we want to pass the frame to the\n> > > network stack (build_skb() can directly reuse skb_shared_info->frags[]) or for\n> > > XDP_REDIRECT use-case.  \n> > \n> > It might be good to think about the XDP_REDIRECT case as well then. If the\n> > frags list fit in the metadata/xdp_frame would we expect better\n> > performance?\n> \n> I don't like to use space in xdp_frame for this. (1) We (Ahern and I)\n> are planning to use the space in xdp_frame for RX-csum + RX-hash +vlan,\n> which will be more common (e.g. all packets will have HW RX+csum).  (2)\n> I consider XDP multi-buffer the exception case, that will not be used\n> in most cases, so why reserve space for that in this cache-line.\n\nSure.\n\n> \n> IMHO we CANNOT allow any slowdown for existing XDP use-cases, but IMHO\n> XDP multi-buffer use-cases are allowed to run \"slower\".\n\nI agree we cannot slow down existing use cases. But I disagree that multi-buffer\nuse cases can be slower. If folks enable jumbo-frames and things\nslow down that's a problem.\n\n> \n> \n> > Looking at skb_shared_info{} that is a rather large structure with many\n> \n> A cache-line detail about skb_shared_info: The first frags[0] member is\n> in the first cache-line.  Meaning that it is still fast to have xdp\n> frames with 1 extra buffer.\n\nThat's nice in theory.\n\n> \n> > fields that look unnecessary for XDP_REDIRECT case and only needed when\n> > passing to the stack. \n> \n> Yes, I think we can use first cache-line of skb_shared_info more\n> optimally (via defining a xdp_shared_info struct). But I still want us\n> to use this specific cache-line.  Let me explain why below. (Avoiding\n> cache-line misses is all about the details, so I hope you can follow).\n> \n> Hopefully most driver developers understand/knows this.  
In the RX-loop\n> the current RX-descriptor have a status that indicate there are more\n> frame, usually expressed as non-EOP (End-Of-Packet).  Thus, a driver\n> can start a prefetchw of this shared_info cache-line, prior to\n> processing the RX-desc that describe the multi-buffer.\n>  (Remember this shared_info is constructed prior to calling XDP and any\n> XDP_TX action, thus the XDP prog should not see a cache-line miss when\n> using the BPF-helper to read shared_info area).\n\nIn general I see no reason to populate these fields before the XDP\nprogram runs. Someone needs to convince me why having frags info before\nprogram runs is useful. In general headers should be preserved and first\nfrag already included in the data pointers. If users start parsing further\nthey might need it, but this series doesn't provide a way to do that\nso IMO without those helpers it's a bit difficult to debate.\n\nSpecifically for the XDP_TX case we can just flip the descriptors from RX\nring to TX ring and keep moving along. This is going to be ideal on\n40/100Gbps nics.\n\nI'm not arguing that it's likely possible to put some prefetch logic\nin there and keep the pipe full, but I would need to see that on\na 100gbps nic to be convinced the details here are going to work. Or\nat minimum a 40gbps nic.\n\n> \n> \n> > Fundamentally, a frag just needs\n> > \n> >  struct bio_vec {\n> >      struct page *bv_page;     // 8B\n> >      unsigned int bv_len;      // 4B\n> >      unsigned int bv_offset;   // 4B\n> >  } // 16B\n> > \n> > With header split + data we only need a single frag so we could use just\n> > 16B. And worse case jumbo frame + header split seems 3 entries would be\n> > enough giving 48B (header plus 3 4k pages). \n> \n> For jumbo-frame 9000 MTU 2 entries might be enough, as we also have\n> room in the first buffer (((9000-(4096-256-320))/4096 = 1.33789).\n\nSure. 
I was just counting the first buffer as a frag, understanding it\nwouldn't actually be in the frag list.\n\n> \n> The problem is that we need to support TSO (TCP Segmentation Offload)\n> use-case, which can have more frames. Thus, 3 entries will not be\n> enough.\n\nSorry, not following. TSO? Explain how TSO is going to work for XDP_TX\nand XDP_REDIRECT? I guess in theory you can header split and coalesce,\nbut we are a ways off from that and this series certainly doesn't\ntalk about TSO unless I missed something.\n\n> \n> > Could we just stick this in the metadata and make it read only? Then\n> > programs that care can read it and get all the info they need without\n> > helpers.\n> \n> I don't see how that is possible. (1) the metadata area is only 32\n> bytes, (2) when freeing an xdp_frame the kernel need to know the layout\n> as these points will be free'ed.\n\nAgree it's tight, probably too tight to be useful.\n\n> \n> > I would expect performance to be better in the XDP_TX and\n> > XDP_REDIRECT cases. And copying an extra worse case 48B in passing to\n> > the stack I guess is not measurable given all the work needed in that\n> > path.\n> \n> I do agree, that when passing to netstack we can do a transformation\n> from xdp_shared_info to skb_shared_info with a fairly small cost.  (The\n> TSO case would require more copying).\n\nI'm lost on the TSO case. Explain how TSO is related here? \n\n> \n> Notice that allocating an SKB, will always clear the first 32 bytes of\n> skb_shared_info.  If the XDP driver-code path have done the prefetch\n> as described above, then we should see a speedup for netstack delivery.\n\nNot against it, but these things are a bit tricky. 
Couple things I still\nwant to see/understand\n\n - Lets see a 40gbps use a prefetch and verify it works in practice\n - Explain why we can't just do this after XDP program runs\n - How will we read data in the frag list if we need to parse headers\n   inside the frags[].\n\nThe above would be best to answer now rather than later IMO.\n\nThanks,\nJohn","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming-netdev@ozlabs.org","Delivered-To":"patchwork-incoming-netdev@ozlabs.org","Authentication-Results":["ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org\n (client-ip=23.128.96.18; helo=vger.kernel.org;\n envelope-from=netdev-owner@vger.kernel.org; receiver=<UNKNOWN>)","ozlabs.org;\n dmarc=pass (p=none dis=none) header.from=gmail.com","ozlabs.org;\n\tdkim=pass (2048-bit key;\n unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256\n header.s=20161025 header.b=pfz+mDXp;\n\tdkim-atps=neutral"],"Received":["from vger.kernel.org (vger.kernel.org [23.128.96.18])\n\tby ozlabs.org (Postfix) with ESMTP id 4C4v3w6wnkzB44v\n\tfor <patchwork-incoming-netdev@ozlabs.org>;\n Tue,  6 Oct 2020 08:33:48 +1100 (AEDT)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n        id S1726012AbgJEVWO (ORCPT\n        <rfc822;patchwork-incoming-netdev@ozlabs.org>);\n        Mon, 5 Oct 2020 17:22:14 -0400","from lindbergh.monkeyblade.net ([23.128.96.19]:46896 \"EHLO\n        lindbergh.monkeyblade.net\" rhost-flags-OK-OK-OK-OK) by vger.kernel.org\n        with ESMTP id S1725785AbgJEVWN (ORCPT\n        <rfc822;netdev@vger.kernel.org>); Mon, 5 Oct 2020 17:22:13 -0400","from mail-il1-x144.google.com (mail-il1-x144.google.com\n [IPv6:2607:f8b0:4864:20::144])\n        by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1C0DAC0613CE;\n        Mon,  5 Oct 2020 14:22:12 -0700 (PDT)","by mail-il1-x144.google.com with SMTP id q1so9160727ilt.6;\n        Mon, 05 Oct 2020 14:22:12 -0700 (PDT)","from localhost 
([184.63.162.180])\n        by smtp.gmail.com with ESMTPSA id\n z200sm523091iof.47.2020.10.05.14.22.07\n        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);\n        Mon, 05 Oct 2020 14:22:10 -0700 (PDT)"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n        d=gmail.com; s=20161025;\n        h=date:from:to:cc:message-id:in-reply-to:references:subject\n         :mime-version:content-transfer-encoding;\n        bh=0f1CLo503MnuWdm5KqmcasaFJsBdWmxKylqIcnejrWU=;\n        b=pfz+mDXp/TJnzAStKn9HxWsjHMXbkD56z/pHWTuuR4Be5dq/OLF/YKs3cl66PODaq/\n         Le/u8aIeL4HGdZXt4wVnEvm5vfMr6gVHzffhLX2yfpkeGZKdwPfjMc4qUzlbcdwGCLFc\n         8ZxRW3IBDexj5V7kBdw8gftHeeOE3mZqjffG1STiEhKt1sLXffeD3Ulu1iSmNNI2Icyc\n         G6/ixKyhxBvMxNQIR7xK1njuAjxpMGuoUGBqcU1HtLN2LZRHg66sE7gPYZL4Sd8q6YE4\n         BomPxAwpmp2/fSzXogfeqTfykpfcZ5C2xz5Cvwwxqaf+g16KGqgPGvLxdWbT2TDN/UQk\n         ZM/w==","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n        d=1e100.net; s=20161025;\n        h=x-gm-message-state:date:from:to:cc:message-id:in-reply-to\n         :references:subject:mime-version:content-transfer-encoding;\n        bh=0f1CLo503MnuWdm5KqmcasaFJsBdWmxKylqIcnejrWU=;\n        b=WTH04gkOqCZbiL7I+7iZCZyopk9q222HZt3CquXmRHWGPxShZ9QgWb9hlRXUVsNmhc\n         eaZcYn+wQPTJV3I1lRwvdoZnlEDnJzWDvYgGzpOceI9zBAVk6P6O89M8j7+JoXvWVk7e\n         yEtTe2bCDhsAYB761JXGdHHLLTC4SCzKYorMd2oI+bFBEMjBf81V/M2Ks2hTddCZVXqB\n         XkZkSRML2pOOokz1NC+rlNhGgAVh2TJ6/am/kqiEX9gUMDeDplZRwjy6NSVB4X0XJzHo\n         MpeBoevkso6jKR8f4YsEHFtVuG4lFcpgToma5g1fC45GQDYeqAes+Fh1NFNcaVCruLN0\n         gd1A==","X-Gm-Message-State":"AOAM530A6MBD/VasY8AHVPNWSbQ+1nX4zDCgVXbE8o2z6lavIIA4H3sj\n        3y41zwfCGYtX+NalXsFSnFI=","X-Google-Smtp-Source":"\n ABdhPJyOc03yXIaYurZOFS8bes5EwWCIfvjza5cSaUFftVQNbpQIlstCxs314tr5qKvg92tvwTsVzA==","X-Received":"by 2002:a92:24cf:: with SMTP id k198mr1054036ilk.3.1601932931335;\n        Mon, 05 Oct 2020 14:22:11 -0700 (PDT)","Date":"Mon, 05 Oct 2020 14:22:02 
-0700","From":"John Fastabend <john.fastabend@gmail.com>","To":"Jesper Dangaard Brouer <brouer@redhat.com>,\n        John Fastabend <john.fastabend@gmail.com>","Cc":"Lorenzo Bianconi <lorenzo.bianconi@redhat.com>,\n        Lorenzo Bianconi <lorenzo@kernel.org>, bpf@vger.kernel.org,\n        netdev@vger.kernel.org, davem@davemloft.net, kuba@kernel.org,\n        ast@kernel.org, daniel@iogearbox.net, shayagr@amazon.com,\n        sameehj@amazon.com, dsahern@kernel.org, echaudro@redhat.com,\n        brouer@redhat.com","Message-ID":"<5f7b8e7a5ebfc_4f19a208ba@john-XPS-13-9370.notmuch>","In-Reply-To":"<20201005115247.72429157@carbon>","References":"<cover.1601648734.git.lorenzo@kernel.org>\n <5f77467dbc1_38b0208ef@john-XPS-13-9370.notmuch>\n <20201002160623.GA40027@lore-desk>\n <5f776c14d69b3_a6402087e@john-XPS-13-9370.notmuch>\n <20201005115247.72429157@carbon>","Subject":"Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer\n support","Mime-Version":"1.0","Content-Type":"text/plain;\n charset=utf-8","Content-Transfer-Encoding":"7bit","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":2546385,"web_url":"http://patchwork.ozlabs.org/comment/2546385/","msgid":"<20201005222454.GB3501@localhost.localdomain>","list_archive_url":null,"date":"2020-10-05T22:24:54","subject":"Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer\n support","submitter":{"id":76007,"url":"http://patchwork.ozlabs.org/api/people/76007/","name":"Lorenzo Bianconi","email":"lorenzo@kernel.org"},"content":"[...]\n\n> \n> In general I see no reason to populate these fields before the XDP\n> program runs. Someone needs to convince me why having frags info before\n> program runs is useful. In general headers should be preserved and first\n> frag already included in the data pointers. 
If users start parsing further\n> they might need it, but this series doesn't provide a way to do that\n> so IMO without those helpers its a bit difficult to debate.\n\nWe need to populate the skb_shared_info before running the xdp program in order to\nallow the ebpf sandbox to access this data. If we restrict the access to the first\nbuffer only I guess we can avoid doing that, but I think there is value in allowing\nthe xdp program to access this data.\nA possible optimization could be to access the shared_info only once before running\nthe ebpf program, constructing the shared_info using a struct allocated on the\nstack.\nMoreover we can define a \"xdp_shared_info\" struct to alias the skb_shared_info\none in order to have most of the frags elements in the first \"shared_info\" cache line.\n\n> \n> Specifically for XDP_TX case we can just flip the descriptors from RX\n> ring to TX ring and keep moving along. This is going to be ideal on\n> 40/100Gbps nics.\n> \n> I'm not arguing that its likely possible to put some prefetch logic\n> in there and keep the pipe full, but I would need to see that on\n> a 100gbps nic to be convinced the details here are going to work. Or\n> at minimum a 40gbps nic.\n> \n> > \n> > \n\n[...]\n\n> Not against it, but these things are a bit tricky. 
Couple things I still\n> want to see/understand\n> \n>  - Lets see a 40gbps use a prefetch and verify it works in practice\n>  - Explain why we can't just do this after XDP program runs\n\nhow can we allow the ebpf program to access paged data if we do not do that?\n\n>  - How will we read data in the frag list if we need to parse headers\n>    inside the frags[].\n> \n> The above would be best to answer now rather than later IMO.\n> \n> Thanks,\n> John\n\nRegards,\nLorenzo","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming-netdev@ozlabs.org","Delivered-To":"patchwork-incoming-netdev@ozlabs.org","Authentication-Results":["ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org\n (client-ip=23.128.96.18; helo=vger.kernel.org;\n envelope-from=netdev-owner@vger.kernel.org; receiver=<UNKNOWN>)","ozlabs.org;\n dmarc=pass (p=none dis=none) header.from=kernel.org","ozlabs.org;\n\tdkim=pass (1024-bit key;\n unprotected) header.d=kernel.org header.i=@kernel.org header.a=rsa-sha256\n header.s=default header.b=ArooJQte;\n\tdkim-atps=neutral"],"Received":["from vger.kernel.org (vger.kernel.org [23.128.96.18])\n\tby ozlabs.org (Postfix) with ESMTP id 4C4wHX0SpDz9sSG\n\tfor <patchwork-incoming-netdev@ozlabs.org>;\n Tue,  6 Oct 2020 09:28:56 +1100 (AEDT)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n        id S1727060AbgJEWZw (ORCPT\n        <rfc822;patchwork-incoming-netdev@ozlabs.org>);\n        Mon, 5 Oct 2020 18:25:52 -0400","from mail.kernel.org ([198.145.29.99]:53208 \"EHLO mail.kernel.org\"\n        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP\n        id S1725861AbgJEWZw (ORCPT <rfc822;netdev@vger.kernel.org>);\n        Mon, 5 Oct 2020 18:25:52 -0400","from localhost (unknown [176.207.245.61])\n        (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))\n        (No client certificate requested)\n        by mail.kernel.org (Postfix) with ESMTPSA id 
A569F2076E;\n        Mon,  5 Oct 2020 22:25:50 +0000 (UTC)"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;\n        s=default; t=1601936751;\n        bh=xuF7KDkVMm7Tl30WZFYZAz39xWsAxyjtzrh1vSdyaQU=;\n        h=Date:From:To:Cc:Subject:References:In-Reply-To:From;\n        b=ArooJQte9Q5g4SHodvVdpePkjXQ4e6ugVRYDy4jHVlpgxL4kGwxj84VKO2Sta1XQp\n         oXy1TbxHpxgGSjeNlRMungUbDaNf7m4ac8wBLMv6UqP30xqMtIWiuhMGTgAAFMMtVJ\n         ynqalo8M8MPrMeZfmdY2ou4OzLB7hV3KMAP299cg=","Date":"Tue, 6 Oct 2020 00:24:54 +0200","From":"Lorenzo Bianconi <lorenzo@kernel.org>","To":"John Fastabend <john.fastabend@gmail.com>","Cc":"Jesper Dangaard Brouer <brouer@redhat.com>,\n        Lorenzo Bianconi <lorenzo.bianconi@redhat.com>,\n        bpf@vger.kernel.org, netdev@vger.kernel.org, davem@davemloft.net,\n        kuba@kernel.org, ast@kernel.org, daniel@iogearbox.net,\n        shayagr@amazon.com, sameehj@amazon.com, dsahern@kernel.org,\n        echaudro@redhat.com","Subject":"Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer\n support","Message-ID":"<20201005222454.GB3501@localhost.localdomain>","References":"<cover.1601648734.git.lorenzo@kernel.org>\n <5f77467dbc1_38b0208ef@john-XPS-13-9370.notmuch>\n <20201002160623.GA40027@lore-desk>\n <5f776c14d69b3_a6402087e@john-XPS-13-9370.notmuch>\n <20201005115247.72429157@carbon>\n <5f7b8e7a5ebfc_4f19a208ba@john-XPS-13-9370.notmuch>","MIME-Version":"1.0","Content-Type":"multipart/signed; micalg=pgp-sha256;\n        protocol=\"application/pgp-signature\"; boundary=\"bCsyhTFzCvuiizWE\"","Content-Disposition":"inline","In-Reply-To":"<5f7b8e7a5ebfc_4f19a208ba@john-XPS-13-9370.notmuch>","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":2546494,"web_url":"http://patchwork.ozlabs.org/comment/2546494/","msgid":"<5f7bf2b0bf899_4f19a2083f@john-XPS-13-9370.notmuch>","list_archive_url":null,"date":"2020-10-06T04:29:36","subject":"Re: [PATCH v4 bpf-next 00/13] mvneta: 
introduce XDP multi-buffer\n support","submitter":{"id":20028,"url":"http://patchwork.ozlabs.org/api/people/20028/","name":"John Fastabend","email":"john.fastabend@gmail.com"},"content":"Lorenzo Bianconi wrote:\n> [...]\n> \n> > \n> > In general I see no reason to populate these fields before the XDP\n> > program runs. Someone needs to convince me why having frags info before\n> > program runs is useful. In general headers should be preserved and first\n> > frag already included in the data pointers. If users start parsing further\n> > they might need it, but this series doesn't provide a way to do that\n> > so IMO without those helpers its a bit difficult to debate.\n> \n> We need to populate the skb_shared_info before running the xdp program in order to\n> allow the ebpf sanbox to access this data. If we restrict the access to the first\n> buffer only I guess we can avoid to do that but I think there is a value allowing\n> the xdp program to access this data.\n\nI agree. We could also only populate the fields if the program accesses\nthe fields.\n\n> A possible optimization can be access the shared_info only once before running\n> the ebpf program constructing the shared_info using a struct allocated on the\n> stack.\n\nSeems interesting, might be a good idea.\n\n> Moreover we can define a \"xdp_shared_info\" struct to alias the skb_shared_info\n> one in order to have most on frags elements in the first \"shared_info\" cache line.\n> \n> > \n> > Specifically for XDP_TX case we can just flip the descriptors from RX\n> > ring to TX ring and keep moving along. This is going to be ideal on\n> > 40/100Gbps nics.\n> > \n> > I'm not arguing that its likely possible to put some prefetch logic\n> > in there and keep the pipe full, but I would need to see that on\n> > a 100gbps nic to be convinced the details here are going to work. Or\n> > at minimum a 40gbps nic.\n> > \n> > > \n> > > \n> \n> [...]\n> \n> > Not against it, but these things are a bit tricky. 
Couple things I still\n> > want to see/understand\n> > \n> >  - Lets see a 40gbps use a prefetch and verify it works in practice\n> >  - Explain why we can't just do this after XDP program runs\n> \n> how can we allow the ebpf program to access paged data if we do not do that?\n\nI don't see an easy way, but also this series doesn't have the data\naccess support.\n\nIts hard to tell until we get at least a 40gbps nic if my concern about\nperformance is real or not. Prefetching smartly could resolve some of the\nissue I guess.\n\nIf the Intel folks are working on it I think waiting would be great. Otherwise\nat minimum drop the helpers and be prepared to revert things if needed.\n\n> \n> >  - How will we read data in the frag list if we need to parse headers\n> >    inside the frags[].\n> > \n> > The above would be best to answer now rather than later IMO.\n> > \n> > Thanks,\n> > John\n> \n> Regards,\n> Lorenzo","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming-netdev@ozlabs.org","Delivered-To":"patchwork-incoming-netdev@ozlabs.org","Authentication-Results":["ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org\n (client-ip=23.128.96.18; helo=vger.kernel.org;\n envelope-from=netdev-owner@vger.kernel.org; receiver=<UNKNOWN>)","ozlabs.org;\n dmarc=pass (p=none dis=none) header.from=gmail.com","ozlabs.org;\n\tdkim=pass (2048-bit key;\n unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256\n header.s=20161025 header.b=ouTfOSoe;\n\tdkim-atps=neutral"],"Received":["from vger.kernel.org (vger.kernel.org [23.128.96.18])\n\tby ozlabs.org (Postfix) with ESMTP id 4C54Ht39cPz9sS8\n\tfor <patchwork-incoming-netdev@ozlabs.org>;\n Tue,  6 Oct 2020 15:29:46 +1100 (AEDT)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n        id S1726878AbgJFE3p (ORCPT\n        <rfc822;patchwork-incoming-netdev@ozlabs.org>);\n        Tue, 6 Oct 2020 00:29:45 -0400","from lindbergh.monkeyblade.net 
([23.128.96.19]:56044 \"EHLO\n        lindbergh.monkeyblade.net\" rhost-flags-OK-OK-OK-OK) by vger.kernel.org\n        with ESMTP id S1725945AbgJFE3p (ORCPT\n        <rfc822;netdev@vger.kernel.org>); Tue, 6 Oct 2020 00:29:45 -0400","from mail-io1-xd43.google.com (mail-io1-xd43.google.com\n [IPv6:2607:f8b0:4864:20::d43])\n        by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2EBBBC0613A7;\n        Mon,  5 Oct 2020 21:29:45 -0700 (PDT)","by mail-io1-xd43.google.com with SMTP id u19so11662410ion.3;\n        Mon, 05 Oct 2020 21:29:45 -0700 (PDT)","from localhost ([184.63.162.180])\n        by smtp.gmail.com with ESMTPSA id\n m18sm1033864ili.85.2020.10.05.21.29.42\n        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);\n        Mon, 05 Oct 2020 21:29:43 -0700 (PDT)"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n        d=gmail.com; s=20161025;\n        h=date:from:to:cc:message-id:in-reply-to:references:subject\n         :mime-version:content-transfer-encoding;\n        bh=e3La3o/PaCpx3r0ht4G/d6LvHRWphH0B2pGDFYrXkpI=;\n        b=ouTfOSoegCuoJ63G5XuAAIot2C7Nt6oHxcRv05bweNVqa4OIcVdxI12KSsiQDsAAz2\n         XvpdvdvyER9SCP4iTFJiBhXrH7qQDTKdUaIgNLlYDZmRlpw33FWCa5mnC2GYWPnvE1Qo\n         Lw+Icf7N8rjbiSzywDlfvJyu+b/8AtLADSP0knadxRfPb+hjPuMAW5jkvrpr2liAebjw\n         LXtpDE7prUZBYH6lfgpW8sB9kxylnUj7fOvih6OYrdS45Nw18RtMEw5NdZ7iZI7gdggL\n         WkkCg97AzA2j9azGFSKaXaDRaczy09ECWjWTF3cqRhPWKprHcJVYP5Y87CgumhiNuYoL\n         C2Yg==","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n        d=1e100.net; s=20161025;\n        h=x-gm-message-state:date:from:to:cc:message-id:in-reply-to\n         :references:subject:mime-version:content-transfer-encoding;\n        bh=e3La3o/PaCpx3r0ht4G/d6LvHRWphH0B2pGDFYrXkpI=;\n        b=LTtTX4E7dWqbR/IxeNe9mQd1rJu/8zDAA65LbqrOP1uDymQHdGbxTlIgyEbpOaG/DM\n         ZTWQHmqydnD0LS555svzhx60kLyaL40nFFwA4k8b6O3qP7UoQDemui7MMDRy0M7KBkw9\n         
iFecxeqqjV4BjroyxWqMtak3NRE5VhmNF3tRO5k/lNbPoj0ZEbXHu+LG6eguP1w3kTIu\n         SkOsrQfWYUxs7KecGMRUTZdIsj2CVNO3l1kwb4U/TGVoj6Qu6ApWKftDJuvcDymzGQU/\n         VMQo6D1deMmOOZGozKYk0Br93nSJXeOQxEOtBTvuefSfi6/gI0heeBDwUldMyHVhd7ni\n         nZWw==","X-Gm-Message-State":"AOAM530ySssVC56LvpzVPaD+KwCJcYFvHnCPI7Y+XH0zFwyUu2/d5Myv\n        /pfMNM6D8+vtZ/tjLeTrUXU=","X-Google-Smtp-Source":"\n ABdhPJzPdPSN2LUhlBVng95AzZLUoQU+Epnodjx6nFq3/o5jti/GCGVqWjVwjbwqJ3XSnb0VDhK2Iw==","X-Received":"by 2002:a02:c785:: with SMTP id n5mr2975946jao.128.1601958584489;\n        Mon, 05 Oct 2020 21:29:44 -0700 (PDT)","Date":"Mon, 05 Oct 2020 21:29:36 -0700","From":"John Fastabend <john.fastabend@gmail.com>","To":"Lorenzo Bianconi <lorenzo@kernel.org>,\n        John Fastabend <john.fastabend@gmail.com>","Cc":"Jesper Dangaard Brouer <brouer@redhat.com>,\n        Lorenzo Bianconi <lorenzo.bianconi@redhat.com>,\n        bpf@vger.kernel.org, netdev@vger.kernel.org, davem@davemloft.net,\n        kuba@kernel.org, ast@kernel.org, daniel@iogearbox.net,\n        shayagr@amazon.com, sameehj@amazon.com, dsahern@kernel.org,\n        echaudro@redhat.com","Message-ID":"<5f7bf2b0bf899_4f19a2083f@john-XPS-13-9370.notmuch>","In-Reply-To":"<20201005222454.GB3501@localhost.localdomain>","References":"<cover.1601648734.git.lorenzo@kernel.org>\n <5f77467dbc1_38b0208ef@john-XPS-13-9370.notmuch>\n <20201002160623.GA40027@lore-desk>\n <5f776c14d69b3_a6402087e@john-XPS-13-9370.notmuch>\n <20201005115247.72429157@carbon>\n <5f7b8e7a5ebfc_4f19a208ba@john-XPS-13-9370.notmuch>\n <20201005222454.GB3501@localhost.localdomain>","Subject":"Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer\n support","Mime-Version":"1.0","Content-Type":"text/plain;\n 
charset=utf-8","Content-Transfer-Encoding":"7bit","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":2546645,"web_url":"http://patchwork.ozlabs.org/comment/2546645/","msgid":"<20201006093011.36375745@carbon>","list_archive_url":null,"date":"2020-10-06T07:30:11","subject":"Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer\n support","submitter":{"id":13625,"url":"http://patchwork.ozlabs.org/api/people/13625/","name":"Jesper Dangaard Brouer","email":"brouer@redhat.com"},"content":"On Mon, 05 Oct 2020 21:29:36 -0700\nJohn Fastabend <john.fastabend@gmail.com> wrote:\n\n> Lorenzo Bianconi wrote:\n> > [...]\n> >   \n> > > \n> > > In general I see no reason to populate these fields before the XDP\n> > > program runs. Someone needs to convince me why having frags info before\n> > > program runs is useful. In general headers should be preserved and first\n> > > frag already included in the data pointers. If users start parsing further\n> > > they might need it, but this series doesn't provide a way to do that\n> > > so IMO without those helpers its a bit difficult to debate.  \n> > \n> > We need to populate the skb_shared_info before running the xdp program in order to\n> > allow the ebpf sanbox to access this data. If we restrict the access to the first\n> > buffer only I guess we can avoid to do that but I think there is a value allowing\n> > the xdp program to access this data.  \n> \n> I agree. We could also only populate the fields if the program accesses\n> the fields.\n\nNotice, a driver will not initialize/use the shared_info area unless\nthere are more segments.  And (we have already established) the xdp->mb\nbit is guarding BPF-prog from accessing shared_info area. \n\n> > A possible optimization can be access the shared_info only once before running\n> > the ebpf program constructing the shared_info using a struct allocated on the\n> > stack.  
\n> \n> Seems interesting, might be a good idea.\n\nIt *might* be a good idea (\"alloc\" shared_info on stack), but we should\nbenchmark this.  The prefetch trick might be fast enough.  But also\nkeep in mind the performance target: with large frames the\npackets-per-second rate we need to handle drops dramatically.\n\n\nRegarding the TSO statement: I meant LRO (Large Receive Offload), but I want the\nability to XDP-redirect this frame out another netdev as TSO.  This\ndoes mean that we need more than 3 pages (2 frags slots) to store LRO\nframes.  Thus, if we store this shared_info on the stack it might need\nto be larger than we like.\n\n\n\n> > Moreover we can define a \"xdp_shared_info\" struct to alias the skb_shared_info\n> > one in order to have most on frags elements in the first \"shared_info\" cache line.\n> >   \n> > > \n> > > Specifically for XDP_TX case we can just flip the descriptors from RX\n> > > ring to TX ring and keep moving along. This is going to be ideal on\n> > > 40/100Gbps nics.\n\nI think both approaches will still allow these page-flips.\n\n> > > I'm not arguing that its likely possible to put some prefetch logic\n> > > in there and keep the pipe full, but I would need to see that on\n> > > a 100gbps nic to be convinced the details here are going to work. Or\n> > > at minimum a 40gbps nic.\n\nI'm looking forward to seeing how this performs on faster NICs.  Once we\nhave a high-speed NIC driver with this I can also start doing testing\nin my testlab.\n\n\n> > [...]\n> >   \n> > > Not against it, but these things are a bit tricky. Couple things I still\n> > > want to see/understand\n> > > \n> > >  - Lets see a 40gbps use a prefetch and verify it works in practice\n> > >  - Explain why we can't just do this after XDP program runs  \n> > \n> > how can we allow the ebpf program to access paged data if we do not do that?  
\n> \n> I don't see an easy way, but also this series doesn't have the data\n> access support.\n\nEelco (Cc'ed) is working on patches that allow access to data in these\nfragments; so far they are internal patches, which (sorry to mention) got\nshut down in internal review.\n\n\n> Its hard to tell until we get at least a 40gbps nic if my concern about\n> performance is real or not. Prefetching smartly could resolve some of the\n> issue I guess.\n> \n> If the Intel folks are working on it I think waiting would be great. Otherwise\n> at minimum drop the helpers and be prepared to revert things if needed.\n\nI do think it makes sense to drop the helpers for now, and focus on how\nthis new multi-buffer frame type is handled in the existing code, and do\nsome benchmarking on a higher speed NIC, before the BPF-helpers start to\nlock down/restrict what we can change/revert as they define UAPI.\n\nE.g. existing code that needs to handle this is the existing helper\nbpf_xdp_adjust_tail, which is something I have brought up before and\neven described in [1].  
Lets make sure existing code works with proposed\ndesign, before introducing new helpers (and this makes it easier to\nrevert).\n\n[1] https://github.com/xdp-project/xdp-project/blob/master/areas/core/xdp-multi-buffer01-design.org#xdp-tail-adjust","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming-netdev@ozlabs.org","Delivered-To":"patchwork-incoming-netdev@ozlabs.org","Authentication-Results":["ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org\n (client-ip=23.128.96.18; helo=vger.kernel.org;\n envelope-from=netdev-owner@vger.kernel.org; receiver=<UNKNOWN>)","ozlabs.org;\n dmarc=pass (p=none dis=none) header.from=redhat.com","ozlabs.org;\n\tdkim=pass (1024-bit key;\n unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256\n header.s=mimecast20190719 header.b=Y54jWPMT;\n\tdkim-atps=neutral"],"Received":["from vger.kernel.org (vger.kernel.org [23.128.96.18])\n\tby ozlabs.org (Postfix) with ESMTP id 4C58JX2qQrz9sS8\n\tfor <patchwork-incoming-netdev@ozlabs.org>;\n Tue,  6 Oct 2020 18:30:36 +1100 (AEDT)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n        id S1726875AbgJFHaf (ORCPT\n        <rfc822;patchwork-incoming-netdev@ozlabs.org>);\n        Tue, 6 Oct 2020 03:30:35 -0400","from us-smtp-delivery-124.mimecast.com ([63.128.21.124]:59716 \"EHLO\n        us-smtp-delivery-124.mimecast.com\" rhost-flags-OK-OK-OK-OK)\n        by vger.kernel.org with ESMTP id S1726769AbgJFHaf (ORCPT\n        <rfc822;netdev@vger.kernel.org>); Tue, 6 Oct 2020 03:30:35 -0400","from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com\n [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id\n us-mta-100-6_or9DhLMlu5tzwliT_lAQ-1; Tue, 06 Oct 2020 03:30:30 -0400","from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com\n [10.5.11.23])\n        (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))\n        (No client certificate requested)\n        by 
mimecast-mx01.redhat.com (Postfix) with ESMTPS id D1C461015CA1;\n        Tue,  6 Oct 2020 07:30:28 +0000 (UTC)","from carbon (unknown [10.36.110.30])\n        by smtp.corp.redhat.com (Postfix) with ESMTP id 490CE19C4F;\n        Tue,  6 Oct 2020 07:30:13 +0000 (UTC)"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;\n        s=mimecast20190719; t=1601969433;\n        h=from:from:reply-to:subject:subject:date:date:message-id:message-id:\n         to:to:cc:cc:mime-version:mime-version:content-type:content-type:\n         content-transfer-encoding:content-transfer-encoding:\n         in-reply-to:in-reply-to:references:references;\n        bh=kFgbWUxLO3HjEne5ObZQxZp4HFI6IlgnTKpMl2lgCeQ=;\n        b=Y54jWPMTX+HQ0oshlMaXW2ippj9MGzRrWB2KDRkez6EYM5xfHLEoh3otr7BUQgvjjIsiFz\n        Qqd3+/XnxxNDUvISpQOxb323k3bIc7TcQd1Jlrw+V17V/lBTcCqXJXv3I1KSybjNISVfsH\n        qQsSaf6yNlBBXwktZOzlV4zUfNmvl6g=","X-MC-Unique":"6_or9DhLMlu5tzwliT_lAQ-1","Date":"Tue, 6 Oct 2020 09:30:11 +0200","From":"Jesper Dangaard Brouer <brouer@redhat.com>","To":"John Fastabend <john.fastabend@gmail.com>","Cc":"Lorenzo Bianconi <lorenzo@kernel.org>,\n Lorenzo Bianconi <lorenzo.bianconi@redhat.com>, bpf@vger.kernel.org,\n netdev@vger.kernel.org, davem@davemloft.net, kuba@kernel.org, ast@kernel.org,\n daniel@iogearbox.net, shayagr@amazon.com, sameehj@amazon.com,\n dsahern@kernel.org, Eelco Chaudron <echaudro@redhat.com>, brouer@redhat.com,\n Tirthendu Sarkar <tirtha@gmail.com>, Toke =?utf-8?q?H=C3=B8iland-J=C3=B8rge?=\n\t=?utf-8?q?nsen?= <toke@redhat.com>","Subject":"Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer\n support","Message-ID":"<20201006093011.36375745@carbon>","In-Reply-To":"<5f7bf2b0bf899_4f19a2083f@john-XPS-13-9370.notmuch>","References":"<cover.1601648734.git.lorenzo@kernel.org>\n        <5f77467dbc1_38b0208ef@john-XPS-13-9370.notmuch>\n        <20201002160623.GA40027@lore-desk>\n        <5f776c14d69b3_a6402087e@john-XPS-13-9370.notmuch>\n        
<20201005115247.72429157@carbon>\n        <5f7b8e7a5ebfc_4f19a208ba@john-XPS-13-9370.notmuch>\n        <20201005222454.GB3501@localhost.localdomain>\n        <5f7bf2b0bf899_4f19a2083f@john-XPS-13-9370.notmuch>","MIME-Version":"1.0","Content-Type":"text/plain; charset=US-ASCII","Content-Transfer-Encoding":"7bit","X-Scanned-By":"MIMEDefang 2.84 on 10.5.11.23","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":2546924,"web_url":"http://patchwork.ozlabs.org/comment/2546924/","msgid":"<ba4b2f1ef9ea434292e14f03da6bf908@EX13D11EUB003.ant.amazon.com>","list_archive_url":null,"date":"2020-10-06T12:39:36","subject":"RE: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer\n support","submitter":{"id":76545,"url":"http://patchwork.ozlabs.org/api/people/76545/","name":"Jubran, Samih","email":"sameehj@amazon.com"},"content":"> -----Original Message-----\n> From: Daniel Borkmann <daniel@iogearbox.net>\n> Sent: Friday, October 2, 2020 10:53 PM\n> To: John Fastabend <john.fastabend@gmail.com>; Lorenzo Bianconi\n> <lorenzo@kernel.org>; bpf@vger.kernel.org; netdev@vger.kernel.org\n> Cc: davem@davemloft.net; kuba@kernel.org; ast@kernel.org; Agroskin,\n> Shay <shayagr@amazon.com>; Jubran, Samih <sameehj@amazon.com>;\n> dsahern@kernel.org; brouer@redhat.com; lorenzo.bianconi@redhat.com;\n> echaudro@redhat.com\n> Subject: RE: [EXTERNAL] [PATCH v4 bpf-next 00/13] mvneta: introduce XDP\n> multi-buffer support\n> \n> CAUTION: This email originated from outside of the organization. Do not click\n> links or open attachments unless you can confirm the sender and know the\n> content is safe.\n> \n> \n> \n> On 10/2/20 5:25 PM, John Fastabend wrote:\n> > Lorenzo Bianconi wrote:\n> >> This series introduce XDP multi-buffer support. 
The mvneta driver is\n> >> the first to support these new \"non-linear\" xdp_{buff,frame}.\n> >> Reviewers, please focus on how these new types of xdp_{buff,frame}\n> >> packets traverse the different layers and the layout design. It is on\n> >> purpose that BPF-helpers are kept simple, as we don't want to expose\n> >> the internal layout to allow later changes.\n> >>\n> >> For now, to keep the design simple and to maintain performance, the\n> >> XDP BPF-prog (still) only has access to the first buffer. It is left\n> >> for later (another patchset) to add payload access across multiple buffers.\n> >> This patchset should still allow for these future extensions. The\n> >> goal is to lift the XDP MTU restriction that comes with XDP, but\n> >> maintain the same performance as before.\n> >>\n> >> The main idea for the new multi-buffer layout is to reuse the same\n> >> layout used for non-linear SKBs. This relies on the \"skb_shared_info\"\n> >> struct at the end of the first buffer to link together subsequent\n> >> buffers. Keeping the layout compatible with SKBs is also done to ease\n> >> and speed up creating an SKB from an xdp_{buff,frame}. Converting an\n> >> xdp_frame to an SKB and delivering it to the network stack is shown in the\n> >> cpumap code (patch 13/13).\n> >\n> > Using the end of the buffer for the skb_shared_info struct is going to\n> > become driver API so unwinding it if it proves to be a performance\n> > issue is going to be ugly. So same question as before, for the use\n> > case where we receive a packet and do XDP_TX with it, how do we avoid\n> > cache miss overhead? This is not just a hypothetical use case, the\n> > Facebook load balancer is doing this as well as Cilium and allowing\n> > this with multi-buffer packets >1500B would be useful.\n> [...]\n> \n> Fully agree. My other question would be if someone else right now is in the\n> process of implementing this scheme for a 40G+ NIC? 
My concern is the\n> numbers below are rather on the lower end of the spectrum, so I would like\n> to see a comparison of XDP as-is today vs XDP multi-buff on a higher-end NIC\n> so that we have a picture of how well the currently designed scheme works there\n> and which performance issues we'll run into, e.g.\n> under a typical XDP L4 load balancer scenario with XDP_TX. I think this would\n> be crucial before the driver API becomes 'sort of' set in stone, where others\n> start adapting it and changing the design becomes painful. Do ENA folks have\n> an implementation ready as well? And what about virtio_net, for example,\n> anyone committing there too? Typically, for such features to land, we require\n> at least 2 drivers implementing it.\n>\n\nWe (ENA) expect to have an XDP MB implementation with performance results in around 4-6 weeks.\n\n> >> Typical use cases for this series are:\n> >> - Jumbo-frames\n> >> - Packet header split (please see Google's use-case @ NetDevConf\n> >> 0x14, [0])\n> >> - TSO\n> >>\n> >> More info about the main idea behind this approach can be found here\n> [1][2].\n> >>\n> >> We carried out some throughput tests in a standard linear frame\n> >> scenario in order to verify we did not introduce any performance\n> >> regression adding xdp multi-buff support to mvneta:\n> >>\n> >> offered load is ~ 1000Kpps, packet size is 64B, mvneta descriptor\n> >> size is one PAGE\n> >>\n> >> commit: 879456bedbe5 (\"net: mvneta: avoid possible cache misses in\n> mvneta_rx_swbm\")\n> >> - xdp-pass:      ~162Kpps\n> >> - xdp-drop:      ~701Kpps\n> >> - xdp-tx:        ~185Kpps\n> >> - xdp-redirect:  ~202Kpps\n> >>\n> >> mvneta xdp multi-buff:\n> >> - xdp-pass:      ~163Kpps\n> >> - xdp-drop:      ~739Kpps\n> >> - xdp-tx:        ~182Kpps\n> >> - xdp-redirect:  ~202Kpps\n> 
[...]","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming-netdev@ozlabs.org","Delivered-To":"patchwork-incoming-netdev@ozlabs.org","Authentication-Results":["ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org\n (client-ip=23.128.96.18; helo=vger.kernel.org;\n envelope-from=netdev-owner@vger.kernel.org; receiver=<UNKNOWN>)","ozlabs.org;\n dmarc=pass (p=quarantine dis=none) header.from=amazon.com","ozlabs.org;\n\tdkim=pass (1024-bit key;\n unprotected) header.d=amazon.com header.i=@amazon.com header.a=rsa-sha256\n header.s=amazon201209 header.b=WxG4j5Ku;\n\tdkim-atps=neutral"],"Received":["from vger.kernel.org (vger.kernel.org [23.128.96.18])\n\tby ozlabs.org (Postfix) with ESMTP id 4C5H9Q6FqVz9sTm\n\tfor <patchwork-incoming-netdev@ozlabs.org>;\n Tue,  6 Oct 2020 23:39:54 +1100 (AEDT)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n        id S1726497AbgJFMjy (ORCPT\n        <rfc822;patchwork-incoming-netdev@ozlabs.org>);\n        Tue, 6 Oct 2020 08:39:54 -0400","from smtp-fw-9102.amazon.com ([207.171.184.29]:13857 \"EHLO\n        smtp-fw-9102.amazon.com\" rhost-flags-OK-OK-OK-OK) by vger.kernel.org\n        with ESMTP id S1726362AbgJFMjx (ORCPT\n        <rfc822;netdev@vger.kernel.org>); Tue, 6 Oct 2020 08:39:53 -0400","from sea32-co-svc-lb4-vlan3.sea.corp.amazon.com (HELO\n email-inbound-relay-2b-5bdc5131.us-west-2.amazon.com) ([10.47.23.38])\n  by smtp-border-fw-out-9102.sea19.amazon.com with ESMTP;\n 06 Oct 2020 12:39:49 +0000","from EX13D28EUB002.ant.amazon.com\n (pdx4-ws-svc-p6-lb7-vlan3.pdx.amazon.com [10.170.41.166])\n        by email-inbound-relay-2b-5bdc5131.us-west-2.amazon.com (Postfix) with\n ESMTPS id D8C0DA1823;\n        Tue,  6 Oct 2020 12:39:47 +0000 (UTC)","from EX13D11EUB003.ant.amazon.com (10.43.166.58) by\n EX13D28EUB002.ant.amazon.com (10.43.166.97) with Microsoft SMTP Server (TLS)\n id 15.0.1497.2; Tue, 6 Oct 2020 12:39:46 +0000","from 
EX13D11EUB003.ant.amazon.com ([10.43.166.58]) by\n EX13D11EUB003.ant.amazon.com ([10.43.166.58]) with mapi id 15.00.1497.006;\n Tue, 6 Oct 2020 12:39:46 +0000"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n  d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209;\n  t=1601987993; x=1633523993;\n  h=from:to:cc:date:message-id:references:in-reply-to:\n   content-transfer-encoding:mime-version:subject;\n  bh=cWScEf2H0N07zMQ3DijjChI3ei1wf5ykE77iwjqAp9c=;\n  b=WxG4j5KuEaloLswczX2aXeQJ4rGJxrA/L0ExJg3Ik0zPvuuIe/7aZutE\n   XtYBfZQXKKtN0SQ1y2UaksGqoPyDCCjBL7oUOHga8wSeLeHJeEzCh0iVA\n   8qvEt5cqRe5I0l0DOaO9trl7fkinlk8Wjpt4O1Uty2ewJWuPZMlh6iaJW\n   Q=;","X-IronPort-AV":"E=Sophos;i=\"5.77,343,1596499200\";\n   d=\"scan'208\";a=\"81962596\"","Subject":"RE: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer\n support","Thread-Topic":"[PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer\n support","From":"\"Jubran, Samih\" <sameehj@amazon.com>","To":"Daniel Borkmann <daniel@iogearbox.net>,\n        John Fastabend <john.fastabend@gmail.com>,\n        Lorenzo Bianconi <lorenzo@kernel.org>,\n        \"bpf@vger.kernel.org\" <bpf@vger.kernel.org>,\n        \"netdev@vger.kernel.org\" <netdev@vger.kernel.org>","CC":"\"davem@davemloft.net\" <davem@davemloft.net>,\n        \"kuba@kernel.org\" <kuba@kernel.org>,\n        \"ast@kernel.org\" <ast@kernel.org>,\n        \"Agroskin, Shay\" <shayagr@amazon.com>,\n        \"dsahern@kernel.org\" <dsahern@kernel.org>,\n        \"brouer@redhat.com\" <brouer@redhat.com>,\n        \"lorenzo.bianconi@redhat.com\" <lorenzo.bianconi@redhat.com>,\n        \"echaudro@redhat.com\" <echaudro@redhat.com>","Thread-Index":"AQHWmMpYHdcS8+W5uEiUVfwUzD8TrKmEbwiAgABKxoCABc+m8A==","Date":"Tue, 6 Oct 2020 12:39:36 +0000","Deferred-Delivery":"Tue, 6 Oct 2020 12:39:29 +0000","Message-ID":"<ba4b2f1ef9ea434292e14f03da6bf908@EX13D11EUB003.ant.amazon.com>","References":"<cover.1601648734.git.lorenzo@kernel.org>\n 
<5f77467dbc1_38b0208ef@john-XPS-13-9370.notmuch>\n <5c22ee38-e2c3-0724-5033-603d19c4169f@iogearbox.net>","In-Reply-To":"<5c22ee38-e2c3-0724-5033-603d19c4169f@iogearbox.net>","Accept-Language":"en-US","Content-Language":"en-US","X-MS-Has-Attach":"","X-MS-TNEF-Correlator":"","x-ms-exchange-transport-fromentityheader":"Hosted","x-originating-ip":"[10.43.164.68]","Content-Type":"text/plain; charset=\"utf-8\"","Content-Transfer-Encoding":"base64","MIME-Version":"1.0","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":2547124,"web_url":"http://patchwork.ozlabs.org/comment/2547124/","msgid":"<20201006152845.GC43823@lore-desk>","list_archive_url":null,"date":"2020-10-06T15:28:45","subject":"Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer\n support","submitter":{"id":73083,"url":"http://patchwork.ozlabs.org/api/people/73083/","name":"Lorenzo Bianconi","email":"lorenzo.bianconi@redhat.com"},"content":"> On Mon, 05 Oct 2020 21:29:36 -0700\n> John Fastabend <john.fastabend@gmail.com> wrote:\n> \n> > Lorenzo Bianconi wrote:\n> > > [...]\n> > >   \n> > > > \n> > > > In general I see no reason to populate these fields before the XDP\n> > > > program runs. Someone needs to convince me why having frags info before\n> > > > program runs is useful. In general headers should be preserved and first\n> > > > frag already included in the data pointers. If users start parsing further\n> > > > they might need it, but this series doesn't provide a way to do that\n> > > > so IMO without those helpers its a bit difficult to debate.  \n> > > \n> > > We need to populate the skb_shared_info before running the xdp program in order to\n> > > allow the ebpf sanbox to access this data. If we restrict the access to the first\n> > > buffer only I guess we can avoid to do that but I think there is a value allowing\n> > > the xdp program to access this data.  \n> > \n> > I agree. 
We could also only populate the fields if the program accesses\n> > the fields.\n> \n> Notice, a driver will not initialize/use the shared_info area unless\n> there are more segments.  And (we have already established) the xdp->mb\n> bit is guarding the BPF-prog from accessing the shared_info area. \n> \n> > > A possible optimization could be to access the shared_info only once before running\n> > > the ebpf program, constructing the shared_info using a struct allocated on the\n> > > stack.  \n> > \n> > Seems interesting, might be a good idea.\n> \n> It *might* be a good idea (\"alloc\" shared_info on the stack), but we should\n> benchmark this.  The prefetch trick might be fast enough.  But also\n> keep in mind the performance target, as with large size frames the\n> packets-per-sec we need to handle dramatically drops.\n\nRight. I guess we need to define a workload we want to run for the\nxdp multi-buff use-case (e.g. if the MTU is 9K we will have ~3 frames\nfor each packet and the # of pps will be much lower)\n\n> \n> \n\n[...]\n\n> \n> I do think it makes sense to drop the helpers for now, and focus on how\n> this new multi-buffer frame type is handled in the existing code, and do\n> some benchmarking on a higher-speed NIC, before the BPF-helpers start to\n> lock down/restrict what we can change/revert, as they define UAPI.\n\nack, I will drop them in v5.\n\nRegards,\nLorenzo\n\n> \n> E.g. existing code that needs to handle this is the existing helper\n> bpf_xdp_adjust_tail, which is something I have brought up before and even\n> described in [1].  
Lets make sure existing code works with proposed\n> design, before introducing new helpers (and this makes it easier to\n> revert).\n> \n> [1] https://github.com/xdp-project/xdp-project/blob/master/areas/core/xdp-multi-buffer01-design.org#xdp-tail-adjust\n> -- \n> Best regards,\n>   Jesper Dangaard Brouer\n>   MSc.CS, Principal Kernel Engineer at Red Hat\n>   LinkedIn: http://www.linkedin.com/in/brouer\n>","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming-netdev@ozlabs.org","Delivered-To":"patchwork-incoming-netdev@ozlabs.org","Authentication-Results":["ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org\n (client-ip=23.128.96.18; helo=vger.kernel.org;\n envelope-from=netdev-owner@vger.kernel.org; receiver=<UNKNOWN>)","ozlabs.org;\n dmarc=pass (p=none dis=none) header.from=redhat.com","ozlabs.org;\n\tdkim=pass (1024-bit key;\n unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256\n header.s=mimecast20190719 header.b=fIz7A15K;\n\tdkim-atps=neutral"],"Received":["from vger.kernel.org (vger.kernel.org [23.128.96.18])\n\tby ozlabs.org (Postfix) with ESMTP id 4C5LwS4rZcz9sTL\n\tfor <patchwork-incoming-netdev@ozlabs.org>;\n Wed,  7 Oct 2020 02:28:56 +1100 (AEDT)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n        id S1726447AbgJFP2z (ORCPT\n        <rfc822;patchwork-incoming-netdev@ozlabs.org>);\n        Tue, 6 Oct 2020 11:28:55 -0400","from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:29311 \"EHLO\n        us-smtp-delivery-124.mimecast.com\" rhost-flags-OK-OK-OK-OK)\n        by vger.kernel.org with ESMTP id S1725996AbgJFP2z (ORCPT\n        <rfc822;netdev@vger.kernel.org>); Tue, 6 Oct 2020 11:28:55 -0400","from mail-wr1-f69.google.com (mail-wr1-f69.google.com\n [209.85.221.69]) (Using TLS) by relay.mimecast.com with ESMTP id\n us-mta-399-R_2YSyJ0ObKmWIylB6OVMg-1; Tue, 06 Oct 2020 11:28:50 -0400","by mail-wr1-f69.google.com with SMTP id b2so5450095wrs.7\n   
     for <netdev@vger.kernel.org>; Tue, 06 Oct 2020 08:28:50 -0700 (PDT)","from localhost ([176.207.245.61])\n        by smtp.gmail.com with ESMTPSA id\n i11sm5012094wre.32.2020.10.06.08.28.48\n        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);\n        Tue, 06 Oct 2020 08:28:48 -0700 (PDT)"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;\n        s=mimecast20190719; t=1601998133;\n        h=from:from:reply-to:subject:subject:date:date:message-id:message-id:\n         to:to:cc:cc:mime-version:mime-version:content-type:content-type:\n         in-reply-to:in-reply-to:references:references;\n        bh=/rdp1rxgsLSHWFsRmsO/gSgWUaSPdYRlDAynIxLveGo=;\n        b=fIz7A15KO0C070dG+sCsrIJMvGerQ4PXww4hPk7i3V2MnBnSOQFDtAbmoO+x9Y59kJCdla\n        /1LejrY0BvaOdpdnL7hOkyaOSZfbDkyd3qogVLjgXvg7zQuBe+iVuGJ6oHExKG9cfU3cqY\n        jf68xVWh2SFK0QIuH4gmXJCqv87qMGU=","X-MC-Unique":"R_2YSyJ0ObKmWIylB6OVMg-1","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n        d=1e100.net; s=20161025;\n        h=x-gm-message-state:date:from:to:cc:subject:message-id:references\n         :mime-version:content-disposition:in-reply-to;\n        bh=/rdp1rxgsLSHWFsRmsO/gSgWUaSPdYRlDAynIxLveGo=;\n        b=lXlmZY+H3eD85fuyNMPDlq4Aa5FgewPsamD/IjZ0qZLOpz3S1+EA8En80Wu2kFw9T/\n         5VDqCw+8D6GhKGyqjIR1XnDwk9Q/YVbrTF5Vxqx67Kx+yE+c3JRcTmnQY9tWMZj5GtB8\n         ApUkLWgH3rD1m4hQTgg45IOrRI/UKBztm0iYBonqr8E2sfXCESwmhS9L7qJp7XNAte92\n         aMFusqyTDS2OS9jceJ/86mFMiLwCEOfmd5PQ3J8YZXAtxnCtlFCM9ndLmUATa6XlcpeS\n         PycdVEQp3d6YRMUwLgZNZzdwhLfUlm8UcmzaRNc/uwFq+4lKPCE5RC+qp8Ms8dmyj+Uq\n         Xpng==","X-Gm-Message-State":"AOAM5315aQRe9N2rGAZn762AtxblC7X9+SPIz9FPwdYPBaCN0iJ264ns\n        lOorRgTcySKJz8xZYaP8bccTxYQs0Hu/RJndWF60HgA/vBJQANc3PAiWHNop7CfHqIgxGR4Yt5h\n        TC+vhzIHmRalwlaFR","X-Received":["by 2002:adf:d841:: with SMTP id k1mr5563929wrl.227.1601998129365;\n        Tue, 06 Oct 2020 08:28:49 -0700 (PDT)","by 2002:adf:d841:: with SMTP id 
k1mr5563895wrl.227.1601998129069;\n        Tue, 06 Oct 2020 08:28:49 -0700 (PDT)"],"X-Google-Smtp-Source":"\n ABdhPJxec4UfB0CGEg5/hujQVs3YwrNsB08CAyEAQK+3mUtbHLILGP9rB9x74qM/5zMopOjtzDtnew==","Date":"Tue, 6 Oct 2020 17:28:45 +0200","From":"Lorenzo Bianconi <lorenzo.bianconi@redhat.com>","To":"Jesper Dangaard Brouer <brouer@redhat.com>","Cc":"John Fastabend <john.fastabend@gmail.com>,\n Lorenzo Bianconi <lorenzo@kernel.org>, bpf@vger.kernel.org,\n netdev@vger.kernel.org, davem@davemloft.net, kuba@kernel.org, ast@kernel.org,\n daniel@iogearbox.net, shayagr@amazon.com, sameehj@amazon.com,\n dsahern@kernel.org, Eelco Chaudron <echaudro@redhat.com>,\n Tirthendu Sarkar <tirtha@gmail.com>, Toke =?iso-8859-1?q?H=F8iland-J=F8rgen?=\n\t=?iso-8859-1?q?sen?= <toke@redhat.com>","Subject":"Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer\n support","Message-ID":"<20201006152845.GC43823@lore-desk>","References":"<cover.1601648734.git.lorenzo@kernel.org>\n <5f77467dbc1_38b0208ef@john-XPS-13-9370.notmuch>\n <20201002160623.GA40027@lore-desk>\n <5f776c14d69b3_a6402087e@john-XPS-13-9370.notmuch>\n <20201005115247.72429157@carbon>\n <5f7b8e7a5ebfc_4f19a208ba@john-XPS-13-9370.notmuch>\n <20201005222454.GB3501@localhost.localdomain>\n <5f7bf2b0bf899_4f19a2083f@john-XPS-13-9370.notmuch>\n <20201006093011.36375745@carbon>","MIME-Version":"1.0","Content-Type":"multipart/signed; micalg=pgp-sha256;\n        protocol=\"application/pgp-signature\"; boundary=\"Clx92ZfkiYIKRjnr\"","Content-Disposition":"inline","In-Reply-To":"<20201006093011.36375745@carbon>","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":2549056,"web_url":"http://patchwork.ozlabs.org/comment/2549056/","msgid":"<5f7f247acf860_2007208c9@john-XPS-13-9370.notmuch>","list_archive_url":null,"date":"2020-10-08T14:38:50","subject":"Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer\n 
support","submitter":{"id":20028,"url":"http://patchwork.ozlabs.org/api/people/20028/","name":"John Fastabend","email":"john.fastabend@gmail.com"},"content":"Lorenzo Bianconi wrote:\n> > On Mon, 05 Oct 2020 21:29:36 -0700\n> > John Fastabend <john.fastabend@gmail.com> wrote:\n> > \n> > > Lorenzo Bianconi wrote:\n> > > > [...]\n> > > >   \n> > > > > \n> > > > > In general I see no reason to populate these fields before the XDP\n> > > > > program runs. Someone needs to convince me why having frags info before\n> > > > > program runs is useful. In general headers should be preserved and first\n> > > > > frag already included in the data pointers. If users start parsing further\n> > > > > they might need it, but this series doesn't provide a way to do that\n> > > > > so IMO without those helpers it's a bit difficult to debate.  \n> > > > \n> > > > We need to populate the skb_shared_info before running the xdp program in order to\n> > > > allow the ebpf sandbox to access this data. If we restrict the access to the first\n> > > > buffer only I guess we can avoid doing that but I think there is value in allowing\n> > > > the xdp program to access this data.  \n> > > \n> > > I agree. We could also only populate the fields if the program accesses\n> > > the fields.\n> > \n> > Notice, a driver will not initialize/use the shared_info area unless\n> > there are more segments.  And (we have already established) the xdp->mb\n> > bit is guarding the BPF-prog from accessing the shared_info area. \n> > \n> > > > A possible optimization could be to access the shared_info only once before running\n> > > > the ebpf program, constructing the shared_info using a struct allocated on the\n> > > > stack.  \n> > > \n> > > Seems interesting, might be a good idea.\n> > \n> > It *might* be a good idea (\"alloc\" shared_info on the stack), but we should\n> > benchmark this.  The prefetch trick might be fast enough.  
But also\n> > keep in mind the performance target, as with large size frames the\n> > packets-per-sec we need to handle dramatically drops.\n> \n> Right. I guess we need to define a workload we want to run for the\n> xdp multi-buff use-case (e.g. if the MTU is 9K we will have ~3 frames\n> for each packet and the # of pps will be much lower)\n\nRight. Or configuring header split, which would give 2 buffers with a much\nsmaller packet size. This would give some indication of the overhead. Then\nwe would likely want to look at the XDP_TX and XDP_REDIRECT cases. At least\nthose would be my use cases.\n\n> \n> > \n> > \n> \n> [...]\n> \n> > \n> > I do think it makes sense to drop the helpers for now, and focus on how\n> > this new multi-buffer frame type is handled in the existing code, and do\n> > some benchmarking on a higher-speed NIC, before the BPF-helpers start to\n> > lock down/restrict what we can change/revert, as they define UAPI.\n> \n> ack, I will drop them in v5.\n> \n> Regards,\n> Lorenzo\n> \n> > \n> > E.g. existing code that needs to handle this is the existing helper\n> > bpf_xdp_adjust_tail, which is something I have brought up before and even\n> > described in [1].  
Lets make sure existing code works with proposed\n> > design, before introducing new helpers (and this makes it easier to\n> > revert).\n> > \n> > [1] https://github.com/xdp-project/xdp-project/blob/master/areas/core/xdp-multi-buffer01-design.org#xdp-tail-adjust\n> > -- \n> > Best regards,\n> >   Jesper Dangaard Brouer\n> >   MSc.CS, Principal Kernel Engineer at Red Hat\n> >   LinkedIn: http://www.linkedin.com/in/brouer\n> >","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming-netdev@ozlabs.org","Delivered-To":"patchwork-incoming-netdev@ozlabs.org","Authentication-Results":["ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org\n (client-ip=23.128.96.18; helo=vger.kernel.org;\n envelope-from=netdev-owner@vger.kernel.org; receiver=<UNKNOWN>)","ozlabs.org;\n dmarc=pass (p=none dis=none) header.from=gmail.com","ozlabs.org;\n\tdkim=pass (2048-bit key;\n unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256\n header.s=20161025 header.b=uH9ooBSw;\n\tdkim-atps=neutral"],"Received":["from vger.kernel.org (vger.kernel.org [23.128.96.18])\n\tby ozlabs.org (Postfix) with ESMTP id 4C6Yk03P45z9sTs\n\tfor <patchwork-incoming-netdev@ozlabs.org>;\n Fri,  9 Oct 2020 01:39:04 +1100 (AEDT)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n        id S1730704AbgJHOjD (ORCPT\n        <rfc822;patchwork-incoming-netdev@ozlabs.org>);\n        Thu, 8 Oct 2020 10:39:03 -0400","from lindbergh.monkeyblade.net ([23.128.96.19]:58094 \"EHLO\n        lindbergh.monkeyblade.net\" rhost-flags-OK-OK-OK-OK) by vger.kernel.org\n        with ESMTP id S1730668AbgJHOjA (ORCPT\n        <rfc822;netdev@vger.kernel.org>); Thu, 8 Oct 2020 10:39:00 -0400","from mail-il1-x142.google.com (mail-il1-x142.google.com\n [IPv6:2607:f8b0:4864:20::142])\n        by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F2EBEC061755;\n        Thu,  8 Oct 2020 07:38:59 -0700 (PDT)","by mail-il1-x142.google.com with SMTP id 
q7so5878337ile.8;\n        Thu, 08 Oct 2020 07:38:59 -0700 (PDT)","from localhost ([184.63.162.180])\n        by smtp.gmail.com with ESMTPSA id\n v15sm2778539ile.37.2020.10.08.07.38.55\n        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);\n        Thu, 08 Oct 2020 07:38:58 -0700 (PDT)"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n        d=gmail.com; s=20161025;\n        h=date:from:to:cc:message-id:in-reply-to:references:subject\n         :mime-version:content-transfer-encoding;\n        bh=d2Racxx6iK0pNc2CLbu137mr5AeTeYJhXKRP7tG7ObI=;\n        b=uH9ooBSwFLCzXJjZmu3FhYNfCKZaVqLSsAWCOqMpGdMOW0H8I0+gIHcoBnmZLqXO80\n         3Fj2HVMRd6tZJFQMRTVXxi33HqPtenuyBrnDbXRrx9tV84dAFQBpmQXLltyyi2m6Pgog\n         ptprKAFE9hY8G05t76nAdjzTdbtZHU/V3MjVsZE92Y9P6hAV+OJQDCQv7mapxl3S8bvf\n         uLigFGztwl35HjM0SlloNibTtAfL3gd16/NzUeWpQ/A7rgbF2g+6r79llEIVwPzze31b\n         QMn7xseRV2wecfvli84FaJBgSe/K6XqQ8vTTyOY+xmKYyHTdejYLD29Gpfab8d6YtojU\n         Ir/A==","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n        d=1e100.net; s=20161025;\n        h=x-gm-message-state:date:from:to:cc:message-id:in-reply-to\n         :references:subject:mime-version:content-transfer-encoding;\n        bh=d2Racxx6iK0pNc2CLbu137mr5AeTeYJhXKRP7tG7ObI=;\n        b=pFa+O05Qsaz5U0N1r/Qml/btH903ibTOxCAn/emfgZxxdg7gEhzbdDDsFTGi0yTxRP\n         0/S9rF19KRRZKeyhAaSdAPUPs4H6Y1h8r6xDhA5h0Jwgy2DKAdgBx/dqq6S7Yha/ioS3\n         JN8bwIiLaG5M3LupyfY5i0hKwh1GibY+q8q73mfcD1tWQliqOHn3+q8I2OtfSGdI968E\n         7zmhL5yjCoB+7m32qf7jhy/TmSceInKJLqY9ZGP9VtJ7QC7IHzpvxDXycNhEFJKEg/Sw\n         I/A6XbPe8j0vFYkOxReSW2qrkqYY+gTlUxutk/wRSAN5GEGrNBCtBhCorUZBqDHOCJT4\n         5l/w==","X-Gm-Message-State":"AOAM530TI0px0EOzumPUqKUBjLvjkNem3DNTWd2NyGJHfZq2SizzylUu\n        2ugahmym7ip+Ny0GSuG8rig=","X-Google-Smtp-Source":"\n ABdhPJwJIX3vRYewkNikOIhqH0ZLyaCY/9UC2nV/gLUAr1kAjdu1C7LtGgbISgalrlhPJYey+09j1g==","X-Received":"by 2002:a92:a307:: with SMTP id a7mr6261969ili.97.1602167939157;\n     
   Thu, 08 Oct 2020 07:38:59 -0700 (PDT)","Date":"Thu, 08 Oct 2020 07:38:50 -0700","From":"John Fastabend <john.fastabend@gmail.com>","To":"Lorenzo Bianconi <lorenzo.bianconi@redhat.com>,\n        Jesper Dangaard Brouer <brouer@redhat.com>","Cc":"John Fastabend <john.fastabend@gmail.com>,\n Lorenzo Bianconi <lorenzo@kernel.org>, bpf@vger.kernel.org,\n netdev@vger.kernel.org, davem@davemloft.net, kuba@kernel.org, ast@kernel.org,\n daniel@iogearbox.net, shayagr@amazon.com, sameehj@amazon.com,\n dsahern@kernel.org, Eelco Chaudron <echaudro@redhat.com>,\n Tirthendu Sarkar <tirtha@gmail.com>, =?utf-8?q?Toke_H=C3=B8iland-J=C3=B8rge?=\n\t=?utf-8?q?nsen?= <toke@redhat.com>","Message-ID":"<5f7f247acf860_2007208c9@john-XPS-13-9370.notmuch>","In-Reply-To":"<20201006152845.GC43823@lore-desk>","References":"<cover.1601648734.git.lorenzo@kernel.org>\n <5f77467dbc1_38b0208ef@john-XPS-13-9370.notmuch>\n <20201002160623.GA40027@lore-desk>\n <5f776c14d69b3_a6402087e@john-XPS-13-9370.notmuch>\n <20201005115247.72429157@carbon>\n <5f7b8e7a5ebfc_4f19a208ba@john-XPS-13-9370.notmuch>\n <20201005222454.GB3501@localhost.localdomain>\n <5f7bf2b0bf899_4f19a2083f@john-XPS-13-9370.notmuch>\n <20201006093011.36375745@carbon>\n <20201006152845.GC43823@lore-desk>","Subject":"Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer\n support","Mime-Version":"1.0","Content-Type":"text/plain;\n charset=utf-8","Content-Transfer-Encoding":"7bit","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}}]