[{"id":1777318,"web_url":"http://patchwork.ozlabs.org/comment/1777318/","msgid":"<59CD83DD.4060603@iogearbox.net>","list_archive_url":null,"date":"2017-09-28T23:21:01","subject":"Re: [net-next PATCH 3/5] bpf: cpumap xdp_buff to skb conversion and\n\tallocation","submitter":{"id":65705,"url":"http://patchwork.ozlabs.org/api/people/65705/","name":"Daniel Borkmann","email":"daniel@iogearbox.net"},"content":"On 09/28/2017 02:57 PM, Jesper Dangaard Brouer wrote:\n[...]\n> +/* Convert xdp_buff to xdp_pkt */\n> +static struct xdp_pkt *convert_to_xdp_pkt(struct xdp_buff *xdp)\n> +{\n> +\tstruct xdp_pkt *xdp_pkt;\n> +\tint headroom;\n> +\n> +\t/* Assure headroom is available for storing info */\n> +\theadroom = xdp->data - xdp->data_hard_start;\n> +\tif (headroom < sizeof(*xdp_pkt))\n> +\t\treturn NULL;\n> +\n> +\t/* Store info in top of packet */\n> +\txdp_pkt = xdp->data_hard_start;\n\n(You'd also need to handle data_meta here if set, and for below\ncpu_map_build_skb(), e.g. headroom is data_meta-data_hard_start.)\n\n> +\txdp_pkt->data = xdp->data;\n> +\txdp_pkt->len  = xdp->data_end - xdp->data;\n> +\txdp_pkt->headroom = headroom - sizeof(*xdp_pkt);\n> +\n> +\treturn xdp_pkt;\n> +}\n> +\n> +static struct sk_buff *cpu_map_build_skb(struct bpf_cpu_map_entry *rcpu,\n> +\t\t\t\t\t struct xdp_pkt *xdp_pkt)\n> +{\n> +\tunsigned int frame_size;\n> +\tvoid *pkt_data_start;\n> +\tstruct sk_buff *skb;\n> +\n> +\t/* build_skb need to place skb_shared_info after SKB end, and\n> +\t * also want to know the memory \"truesize\".  
Thus, need to\n[...]\n>   static int cpu_map_kthread_run(void *data)\n>   {\n> +\tconst unsigned long busy_poll_jiffies = usecs_to_jiffies(2000);\n> +\tunsigned long time_limit = jiffies + busy_poll_jiffies;\n>   \tstruct bpf_cpu_map_entry *rcpu = data;\n> +\tunsigned int empty_cnt = 0;\n>\n>   \tset_current_state(TASK_INTERRUPTIBLE);\n>   \twhile (!kthread_should_stop()) {\n> +\t\tunsigned int processed = 0, drops = 0;\n>   \t\tstruct xdp_pkt *xdp_pkt;\n>\n> -\t\tschedule();\n> -\t\t/* Do work */\n> -\t\twhile ((xdp_pkt = ptr_ring_consume(rcpu->queue))) {\n> -\t\t\t/* For now just \"refcnt-free\" */\n> -\t\t\tpage_frag_free(xdp_pkt);\n> +\t\t/* Release CPU reschedule checks */\n> +\t\tif ((time_after_eq(jiffies, time_limit) || empty_cnt > 25) &&\n> +\t\t    __ptr_ring_empty(rcpu->queue)) {\n> +\t\t\tempty_cnt++;\n> +\t\t\tschedule();\n> +\t\t\ttime_limit = jiffies + busy_poll_jiffies;\n> +\t\t\tWARN_ON(smp_processor_id() != rcpu->cpu);\n> +\t\t} else {\n> +\t\t\tcond_resched();\n>   \t\t}\n> +\n> +\t\t/* Process packets in rcpu->queue */\n> +\t\tlocal_bh_disable();\n> +\t\t/*\n> +\t\t * The bpf_cpu_map_entry is single consumer, with this\n> +\t\t * kthread CPU pinned. 
Lockless access to ptr_ring\n> +\t\t * consume side valid as no-resize allowed of queue.\n> +\t\t */\n> +\t\twhile ((xdp_pkt = __ptr_ring_consume(rcpu->queue))) {\n> +\t\t\tstruct sk_buff *skb;\n> +\t\t\tint ret;\n> +\n> +\t\t\t/* Allow busy polling again */\n> +\t\t\tempty_cnt = 0;\n> +\n> +\t\t\tskb = cpu_map_build_skb(rcpu, xdp_pkt);\n> +\t\t\tif (!skb) {\n> +\t\t\t\tpage_frag_free(xdp_pkt);\n> +\t\t\t\tcontinue;\n> +\t\t\t}\n> +\n> +\t\t\t/* Inject into network stack */\n> +\t\t\tret = netif_receive_skb(skb);\n\nHave you looked into whether it's feasible to reuse GRO\nengine here as well?\n\n> +\t\t\tif (ret == NET_RX_DROP)\n> +\t\t\t\tdrops++;\n> +\n> +\t\t\t/* Limit BH-disable period */\n> +\t\t\tif (++processed == 8)\n> +\t\t\t\tbreak;\n> +\t\t}\n> +\t\tlocal_bh_enable();\n> +\n>   \t\t__set_current_state(TASK_INTERRUPTIBLE);\n>   \t}\n>   \tput_cpu_map_entry(rcpu);\n[...]","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":"ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3y39fr1JBXz9t1G\n\tfor <patchwork-incoming@ozlabs.org>;\n\tFri, 29 Sep 2017 09:21:08 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1751403AbdI1XVE (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tThu, 28 Sep 2017 19:21:04 -0400","from www62.your-server.de ([213.133.104.62]:56696 \"EHLO\n\twww62.your-server.de\" rhost-flags-OK-OK-OK-OK) by vger.kernel.org\n\twith ESMTP id S1750857AbdI1XVD (ORCPT\n\t<rfc822;netdev@vger.kernel.org>); Thu, 28 Sep 2017 19:21:03 -0400","from [85.7.161.218] (helo=localhost.localdomain)\n\tby www62.your-server.de with esmtpsa 
(TLSv1.2:DHE-RSA-AES256-SHA:256)\n\t(Exim 4.85_2) (envelope-from <daniel@iogearbox.net>)\n\tid 1dxi74-0001TQ-B6; Fri, 29 Sep 2017 01:21:02 +0200"],"Message-ID":"<59CD83DD.4060603@iogearbox.net>","Date":"Fri, 29 Sep 2017 01:21:01 +0200","From":"Daniel Borkmann <daniel@iogearbox.net>","User-Agent":"Mozilla/5.0 (X11; Linux x86_64;\n\trv:31.0) Gecko/20100101 Thunderbird/31.7.0","MIME-Version":"1.0","To":"Jesper Dangaard Brouer <brouer@redhat.com>, netdev@vger.kernel.org","CC":"jakub.kicinski@netronome.com, \"Michael S. Tsirkin\" <mst@redhat.com>,\n\tJason Wang <jasowang@redhat.com>, mchan@broadcom.com,\n\tJohn Fastabend <john.fastabend@gmail.com>, peter.waskiewicz.jr@intel.com,\n\tDaniel Borkmann <borkmann@iogearbox.net>,\n\tAlexei Starovoitov <alexei.starovoitov@gmail.com>,\n\tAndy Gospodarek <andy@greyhouse.net>","Subject":"Re: [net-next PATCH 3/5] bpf: cpumap xdp_buff to skb conversion and\n\tallocation","References":"<150660339205.2808.7084136789768233829.stgit@firesoul>\n\t<150660343811.2808.7680200486950101509.stgit@firesoul>","In-Reply-To":"<150660343811.2808.7680200486950101509.stgit@firesoul>","Content-Type":"text/plain; charset=utf-8; format=flowed","Content-Transfer-Encoding":"7bit","X-Authenticated-Sender":"daniel@iogearbox.net","X-Virus-Scanned":"Clear (ClamAV 0.99.2/23884/Thu Sep 28 22:46:49 2017)","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1777392,"web_url":"http://patchwork.ozlabs.org/comment/1777392/","msgid":"<20170929094643.4934f661@redhat.com>","list_archive_url":null,"date":"2017-09-29T07:46:43","subject":"Re: [net-next PATCH 3/5] bpf: cpumap xdp_buff to skb conversion and\n\tallocation","submitter":{"id":13625,"url":"http://patchwork.ozlabs.org/api/people/13625/","name":"Jesper Dangaard Brouer","email":"brouer@redhat.com"},"content":"On Fri, 29 Sep 2017 01:21:01 +0200\nDaniel Borkmann <daniel@iogearbox.net> wrote:\n\n> On 09/28/2017 02:57 PM, 
Jesper Dangaard Brouer wrote:\n> [...]\n> > +/* Convert xdp_buff to xdp_pkt */\n> > +static struct xdp_pkt *convert_to_xdp_pkt(struct xdp_buff *xdp)\n> > +{\n> > +\tstruct xdp_pkt *xdp_pkt;\n> > +\tint headroom;\n> > +\n> > +\t/* Assure headroom is available for storing info */\n> > +\theadroom = xdp->data - xdp->data_hard_start;\n> > +\tif (headroom < sizeof(*xdp_pkt))\n> > +\t\treturn NULL;\n> > +\n> > +\t/* Store info in top of packet */\n> > +\txdp_pkt = xdp->data_hard_start;  \n> \n> (You'd also need to handle data_meta here if set, and for below\n> cpu_map_build_skb(), e.g. headroom is data_meta-data_hard_start.)\n\nI'll look into this.  The data_meta patchset was in-flight while I\nrebased this.\n\n> > +\txdp_pkt->data = xdp->data;\n> > +\txdp_pkt->len  = xdp->data_end - xdp->data;\n> > +\txdp_pkt->headroom = headroom - sizeof(*xdp_pkt);\n> > +\n> > +\treturn xdp_pkt;\n> > +}\n> > +\n> > +static struct sk_buff *cpu_map_build_skb(struct bpf_cpu_map_entry *rcpu,\n> > +\t\t\t\t\t struct xdp_pkt *xdp_pkt)\n> > +{\n> > +\tunsigned int frame_size;\n> > +\tvoid *pkt_data_start;\n> > +\tstruct sk_buff *skb;\n> > +\n> > +\t/* build_skb need to place skb_shared_info after SKB end, and\n> > +\t * also want to know the memory \"truesize\".  
Thus, need to  \n> [...]\n> >   static int cpu_map_kthread_run(void *data)\n> >   {\n> > +\tconst unsigned long busy_poll_jiffies = usecs_to_jiffies(2000);\n> > +\tunsigned long time_limit = jiffies + busy_poll_jiffies;\n> >   \tstruct bpf_cpu_map_entry *rcpu = data;\n> > +\tunsigned int empty_cnt = 0;\n> >\n> >   \tset_current_state(TASK_INTERRUPTIBLE);\n> >   \twhile (!kthread_should_stop()) {\n> > +\t\tunsigned int processed = 0, drops = 0;\n> >   \t\tstruct xdp_pkt *xdp_pkt;\n> >\n> > -\t\tschedule();\n> > -\t\t/* Do work */\n> > -\t\twhile ((xdp_pkt = ptr_ring_consume(rcpu->queue))) {\n> > -\t\t\t/* For now just \"refcnt-free\" */\n> > -\t\t\tpage_frag_free(xdp_pkt);\n> > +\t\t/* Release CPU reschedule checks */\n> > +\t\tif ((time_after_eq(jiffies, time_limit) || empty_cnt > 25) &&\n> > +\t\t    __ptr_ring_empty(rcpu->queue)) {\n> > +\t\t\tempty_cnt++;\n> > +\t\t\tschedule();\n> > +\t\t\ttime_limit = jiffies + busy_poll_jiffies;\n> > +\t\t\tWARN_ON(smp_processor_id() != rcpu->cpu);\n> > +\t\t} else {\n> > +\t\t\tcond_resched();\n> >   \t\t}\n> > +\n> > +\t\t/* Process packets in rcpu->queue */\n> > +\t\tlocal_bh_disable();\n> > +\t\t/*\n> > +\t\t * The bpf_cpu_map_entry is single consumer, with this\n> > +\t\t * kthread CPU pinned. Lockless access to ptr_ring\n> > +\t\t * consume side valid as no-resize allowed of queue.\n> > +\t\t */\n> > +\t\twhile ((xdp_pkt = __ptr_ring_consume(rcpu->queue))) {\n> > +\t\t\tstruct sk_buff *skb;\n> > +\t\t\tint ret;\n> > +\n> > +\t\t\t/* Allow busy polling again */\n> > +\t\t\tempty_cnt = 0;\n> > +\n> > +\t\t\tskb = cpu_map_build_skb(rcpu, xdp_pkt);\n> > +\t\t\tif (!skb) {\n> > +\t\t\t\tpage_frag_free(xdp_pkt);\n> > +\t\t\t\tcontinue;\n> > +\t\t\t}\n> > +\n> > +\t\t\t/* Inject into network stack */\n> > +\t\t\tret = netif_receive_skb(skb);  \n> \n> Have you looked into whether it's feasible to reuse GRO\n> engine here as well?\n\nThis is the first step. I'll work on adding the GRO-engine later. And\nit should be feasible.  
There are plenty of optimizations in this area\nthat can be done later ;-)\n\n> \n> > +\t\t\tif (ret == NET_RX_DROP)\n> > +\t\t\t\tdrops++;\n> > +\n> > +\t\t\t/* Limit BH-disable period */\n> > +\t\t\tif (++processed == 8)\n> > +\t\t\t\tbreak;\n> > +\t\t}\n> > +\t\tlocal_bh_enable();\n> > +\n> >   \t\t__set_current_state(TASK_INTERRUPTIBLE);\n> >   \t}\n> >   \tput_cpu_map_entry(rcpu);  \n> [...]","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":["ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","ext-mx06.extmail.prod.ext.phx2.redhat.com;\n\tdmarc=none (p=none dis=none) header.from=redhat.com","ext-mx06.extmail.prod.ext.phx2.redhat.com;\n\tspf=fail smtp.mailfrom=brouer@redhat.com"],"Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3y3NtZ68z3z9t0F\n\tfor <patchwork-incoming@ozlabs.org>;\n\tFri, 29 Sep 2017 17:47:02 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1751951AbdI2Hq6 (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tFri, 29 Sep 2017 03:46:58 -0400","from mx1.redhat.com ([209.132.183.28]:50800 \"EHLO mx1.redhat.com\"\n\trhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP\n\tid S1751914AbdI2Hq5 (ORCPT <rfc822;netdev@vger.kernel.org>);\n\tFri, 29 Sep 2017 03:46:57 -0400","from smtp.corp.redhat.com\n\t(int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12])\n\t(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))\n\t(No client certificate requested)\n\tby mx1.redhat.com (Postfix) with ESMTPS id 58DF6356C0;\n\tFri, 29 Sep 2017 07:46:57 +0000 (UTC)","from localhost (ovpn-200-30.brq.redhat.com [10.40.200.30])\n\tby smtp.corp.redhat.com (Postfix) with ESMTP id 687E218A49;\n\tFri, 29 Sep 2017 
07:46:45 +0000 (UTC)"],"DMARC-Filter":"OpenDMARC Filter v1.3.2 mx1.redhat.com 58DF6356C0","Date":"Fri, 29 Sep 2017 09:46:43 +0200","From":"Jesper Dangaard Brouer <brouer@redhat.com>","To":"Daniel Borkmann <daniel@iogearbox.net>","Cc":"netdev@vger.kernel.org, jakub.kicinski@netronome.com,\n\t\"Michael S. Tsirkin\" <mst@redhat.com>,\n\tJason Wang <jasowang@redhat.com>, mchan@broadcom.com,\n\tJohn Fastabend <john.fastabend@gmail.com>, peter.waskiewicz.jr@intel.com,\n\tDaniel Borkmann <borkmann@iogearbox.net>,\n\tAlexei Starovoitov <alexei.starovoitov@gmail.com>,\n\tAndy Gospodarek <andy@greyhouse.net>, brouer@redhat.com","Subject":"Re: [net-next PATCH 3/5] bpf: cpumap xdp_buff to skb conversion and\n\tallocation","Message-ID":"<20170929094643.4934f661@redhat.com>","In-Reply-To":"<59CD83DD.4060603@iogearbox.net>","References":"<150660339205.2808.7084136789768233829.stgit@firesoul>\n\t<150660343811.2808.7680200486950101509.stgit@firesoul>\n\t<59CD83DD.4060603@iogearbox.net>","MIME-Version":"1.0","Content-Type":"text/plain; charset=US-ASCII","Content-Transfer-Encoding":"7bit","X-Scanned-By":"MIMEDefang 2.79 on 10.5.11.12","X-Greylist":"Sender IP whitelisted, not delayed by milter-greylist-4.5.16\n\t(mx1.redhat.com [10.5.110.30]);\n\tFri, 29 Sep 2017 07:46:57 +0000 (UTC)","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1777450,"web_url":"http://patchwork.ozlabs.org/comment/1777450/","msgid":"<1c37f945-0e2f-1eec-fe88-a740815026d3@redhat.com>","list_archive_url":null,"date":"2017-09-29T09:49:23","subject":"Re: [net-next PATCH 3/5] bpf: cpumap xdp_buff to skb conversion and\n\tallocation","submitter":{"id":5225,"url":"http://patchwork.ozlabs.org/api/people/5225/","name":"Jason Wang","email":"jasowang@redhat.com"},"content":"On 2017年09月28日 20:57, Jesper Dangaard Brouer wrote:\n> +};\n> +\n> +/* Convert xdp_buff to xdp_pkt */\n> +static struct xdp_pkt *convert_to_xdp_pkt(struct 
xdp_buff *xdp)\n> +{\n> +\tstruct xdp_pkt *xdp_pkt;\n> +\tint headroom;\n> +\n> +\t/* Assure headroom is available for storing info */\n> +\theadroom = xdp->data - xdp->data_hard_start;\n> +\tif (headroom < sizeof(*xdp_pkt))\n> +\t\treturn NULL;\n\nHi Jesper:\n\nDo you consider this as a trick or a long term solution? Is it better to \nstore XDP in a circular buffer? (I'm asking since I meet similar issue \nwhen doing xdp_xmit for tun).\n\n> +\n> +\t/* Store info in top of packet */\n> +\txdp_pkt = xdp->data_hard_start;\n> +\n> +\txdp_pkt->data = xdp->data;\n> +\txdp_pkt->len  = xdp->data_end - xdp->data;\n> +\txdp_pkt->headroom = headroom - sizeof(*xdp_pkt);\n> +\n\nIs wmb() needed here?\n\n> +\treturn xdp_pkt;\n> +}\n\nThanks","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":["ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","ext-mx01.extmail.prod.ext.phx2.redhat.com;\n\tdmarc=none (p=none dis=none) header.from=redhat.com","ext-mx01.extmail.prod.ext.phx2.redhat.com;\n\tspf=fail smtp.mailfrom=jasowang@redhat.com"],"Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3y3RcF052Dz9t0F\n\tfor <patchwork-incoming@ozlabs.org>;\n\tFri, 29 Sep 2017 19:49:49 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1751901AbdI2Jtr (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tFri, 29 Sep 2017 05:49:47 -0400","from mx1.redhat.com ([209.132.183.28]:48194 \"EHLO mx1.redhat.com\"\n\trhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP\n\tid S1750926AbdI2Jtp (ORCPT <rfc822;netdev@vger.kernel.org>);\n\tFri, 29 Sep 2017 05:49:45 -0400","from smtp.corp.redhat.com\n\t(int-mx06.intmail.prod.int.phx2.redhat.com 
[10.5.11.16])\n\t(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))\n\t(No client certificate requested)\n\tby mx1.redhat.com (Postfix) with ESMTPS id AD1AD85376;\n\tFri, 29 Sep 2017 09:49:45 +0000 (UTC)","from [10.72.12.103] (ovpn-12-103.pek2.redhat.com [10.72.12.103])\n\tby smtp.corp.redhat.com (Postfix) with ESMTPS id A8C175C88B;\n\tFri, 29 Sep 2017 09:49:32 +0000 (UTC)"],"DMARC-Filter":"OpenDMARC Filter v1.3.2 mx1.redhat.com AD1AD85376","Subject":"Re: [net-next PATCH 3/5] bpf: cpumap xdp_buff to skb conversion and\n\tallocation","To":"Jesper Dangaard Brouer <brouer@redhat.com>, netdev@vger.kernel.org","Cc":"jakub.kicinski@netronome.com,\n\t\"Michael S. Tsirkin\" <mst@redhat.com>, mchan@broadcom.com,\n\tJohn Fastabend <john.fastabend@gmail.com>, peter.waskiewicz.jr@intel.com,\n\tDaniel Borkmann <borkmann@iogearbox.net>,\n\tAlexei Starovoitov <alexei.starovoitov@gmail.com>,\n\tAndy Gospodarek <andy@greyhouse.net>","References":"<150660339205.2808.7084136789768233829.stgit@firesoul>\n\t<150660343811.2808.7680200486950101509.stgit@firesoul>","From":"Jason Wang <jasowang@redhat.com>","Message-ID":"<1c37f945-0e2f-1eec-fe88-a740815026d3@redhat.com>","Date":"Fri, 29 Sep 2017 17:49:23 +0800","User-Agent":"Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101\n\tThunderbird/52.3.0","MIME-Version":"1.0","In-Reply-To":"<150660343811.2808.7680200486950101509.stgit@firesoul>","Content-Type":"text/plain; charset=utf-8; format=flowed","Content-Transfer-Encoding":"8bit","Content-Language":"en-US","X-Scanned-By":"MIMEDefang 2.79 on 10.5.11.16","X-Greylist":"Sender IP whitelisted, not delayed by milter-greylist-4.5.16\n\t(mx1.redhat.com [10.5.110.25]);\n\tFri, 29 Sep 2017 09:49:45 +0000 
(UTC)","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1777533,"web_url":"http://patchwork.ozlabs.org/comment/1777533/","msgid":"<20170929150536.643019a3@redhat.com>","list_archive_url":null,"date":"2017-09-29T13:05:36","subject":"Re: [net-next PATCH 3/5] bpf: cpumap xdp_buff to skb conversion and\n\tallocation","submitter":{"id":13625,"url":"http://patchwork.ozlabs.org/api/people/13625/","name":"Jesper Dangaard Brouer","email":"brouer@redhat.com"},"content":"On Fri, 29 Sep 2017 17:49:23 +0800\nJason Wang <jasowang@redhat.com> wrote:\n\n> On 2017年09月28日 20:57, Jesper Dangaard Brouer wrote:\n> > +};\n> > +\n> > +/* Convert xdp_buff to xdp_pkt */\n> > +static struct xdp_pkt *convert_to_xdp_pkt(struct xdp_buff *xdp)\n> > +{\n> > +\tstruct xdp_pkt *xdp_pkt;\n> > +\tint headroom;\n> > +\n> > +\t/* Assure headroom is available for storing info */\n> > +\theadroom = xdp->data - xdp->data_hard_start;\n> > +\tif (headroom < sizeof(*xdp_pkt))\n> > +\t\treturn NULL;  \n> \n> Hi Jesper:\n> \n> Do you consider this as a trick or a long term solution? Is it better to \n> store XDP in a circular buffer? (I'm asking since I meet similar issue \n> when doing xdp_xmit for tun).\n\n(The way you ask the question is slightly ambiguous, but I hope I understand.)\n\nIMHO the best solution to allow queueing of XDP packets is to create a\nmeta-data structure, with the needed info.  For performance reasons, we\ndon't want to allocate a new memory area for this.  Thus, we simply use\nthe available headroom in the page that the packet is stored into.\nNotice that DPDK also use the first cache-line of the packet data, for\nits packet meta-data structure. (This is not a performance problem.\nI've done several PoC benchmarks, before choosing to do this)\n\nFor now, this \"trick\" is local to the cpumap, and thus not exposed as\nany API.  Thus we can evolve and change the contents easily.  
But I\nwould, in time, like to see this generalized. When/if more places need\nto queue XDP packets, this header meta-data format should be\nstandardized.\n\n\nPipe-dreaming: Taking this to the extreme... if I could get away with\nit, I would actually like to store the (232 bytes) SKB meta-data header\ninside headroom too.  That would eliminate any real SKB memory alloc.\n\n\n> > +\n> > +\t/* Store info in top of packet */\n> > +\txdp_pkt = xdp->data_hard_start;\n> > +\n> > +\txdp_pkt->data = xdp->data;\n> > +\txdp_pkt->len  = xdp->data_end - xdp->data;\n> > +\txdp_pkt->headroom = headroom - sizeof(*xdp_pkt);\n> > +  \n> \n> Is wmb() needed here?\n\nNo. This xdp_pkt is queued into a ptr_ring, which has a\nspin_lock on enqueue, and any atomic operation works as a full memory\nbarrier mb().","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":["ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","ext-mx10.extmail.prod.ext.phx2.redhat.com;\n\tdmarc=none (p=none dis=none) header.from=redhat.com","ext-mx10.extmail.prod.ext.phx2.redhat.com;\n\tspf=fail smtp.mailfrom=brouer@redhat.com"],"Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3y3WyP400Mz9ryk\n\tfor <patchwork-incoming@ozlabs.org>;\n\tFri, 29 Sep 2017 23:05:49 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1752065AbdI2NFr convert rfc822-to-8bit (ORCPT\n\t<rfc822;patchwork-incoming@ozlabs.org>);\n\tFri, 29 Sep 2017 09:05:47 -0400","from mx1.redhat.com ([209.132.183.28]:55972 \"EHLO mx1.redhat.com\"\n\trhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP\n\tid S1751349AbdI2NFq (ORCPT <rfc822;netdev@vger.kernel.org>);\n\tFri, 29 Sep 2017 09:05:46 
-0400","from smtp.corp.redhat.com\n\t(int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13])\n\t(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))\n\t(No client certificate requested)\n\tby mx1.redhat.com (Postfix) with ESMTPS id 6B31925783;\n\tFri, 29 Sep 2017 13:05:46 +0000 (UTC)","from localhost (ovpn-200-30.brq.redhat.com [10.40.200.30])\n\tby smtp.corp.redhat.com (Postfix) with ESMTP id ED09260600;\n\tFri, 29 Sep 2017 13:05:38 +0000 (UTC)"],"DMARC-Filter":"OpenDMARC Filter v1.3.2 mx1.redhat.com 6B31925783","Date":"Fri, 29 Sep 2017 15:05:36 +0200","From":"Jesper Dangaard Brouer <brouer@redhat.com>","To":"Jason Wang <jasowang@redhat.com>","Cc":"netdev@vger.kernel.org, jakub.kicinski@netronome.com,\n\t\"Michael S. Tsirkin\" <mst@redhat.com>, mchan@broadcom.com,\n\tJohn Fastabend <john.fastabend@gmail.com>, peter.waskiewicz.jr@intel.com,\n\tDaniel Borkmann <borkmann@iogearbox.net>,\n\tAlexei Starovoitov <alexei.starovoitov@gmail.com>,\n\tAndy Gospodarek <andy@greyhouse.net>, brouer@redhat.com","Subject":"Re: [net-next PATCH 3/5] bpf: cpumap xdp_buff to skb conversion and\n\tallocation","Message-ID":"<20170929150536.643019a3@redhat.com>","In-Reply-To":"<1c37f945-0e2f-1eec-fe88-a740815026d3@redhat.com>","References":"<150660339205.2808.7084136789768233829.stgit@firesoul>\n\t<150660343811.2808.7680200486950101509.stgit@firesoul>\n\t<1c37f945-0e2f-1eec-fe88-a740815026d3@redhat.com>","MIME-Version":"1.0","Content-Type":"text/plain; charset=UTF-8","Content-Transfer-Encoding":"8BIT","X-Scanned-By":"MIMEDefang 2.79 on 10.5.11.13","X-Greylist":"Sender IP whitelisted, not delayed by milter-greylist-4.5.16\n\t(mx1.redhat.com [10.5.110.39]);\n\tFri, 29 Sep 2017 13:05:46 +0000 (UTC)","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}}]