From patchwork Mon Nov 2 12:48:49 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: William Breathitt Gray X-Patchwork-Id: 1392246 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=lists.ubuntu.com (client-ip=91.189.94.19; helo=huckleberry.canonical.com; envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=canonical.com Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4CPt7M62vTz9sRR; Mon, 2 Nov 2020 23:50:39 +1100 (AEDT) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.86_2) (envelope-from ) id 1kZZIA-00077y-0p; Mon, 02 Nov 2020 12:50:34 +0000 Received: from youngberry.canonical.com ([91.189.89.112]) by huckleberry.canonical.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2) (envelope-from ) id 1kZZHG-0006K9-Q8 for kernel-team@lists.ubuntu.com; Mon, 02 Nov 2020 12:49:38 +0000 Received: from mail-qv1-f69.google.com ([209.85.219.69]) by youngberry.canonical.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2) (envelope-from ) id 1kZZHD-00048G-Pt for kernel-team@lists.ubuntu.com; Mon, 02 Nov 2020 12:49:35 +0000 Received: by mail-qv1-f69.google.com with SMTP id j17so4927706qvi.21 for ; Mon, 02 Nov 2020 04:49:35 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=WQUJKItgB9PJasJDVebnRPxKL5fTtJMmUyDrkRkn2hU=; b=Uljav/l9iYIp1AasB6pnK9Lp6eZiZToJ37RYJ+XK78LsAFjoPDyTqOMMnoAQRcbqfw 6jrMFQ+E6yGEZROYQdDbsPhRjOBI2gJRvu2NLZdDzKGKctBaiba3qkhHcX8LBD3eYEMX lDGZlI4Ts0QptRnVNEDMcfBqHglOiZbrPBxQp6RWymO3yxX4a3JRWCyTBy1OPbQxfzLd uJ6D+hLhbKk/sWKuVvHsxr7+xwidG8Ag22Znibd6wimJVQGdjotIuwbktl/qGgB6OBQu PdfBxPslVCc+sglxyYSx7SjuZwzHQmdyT9YVC7k6YiO3GWRK/KBGsIZ2ArfHC4g8KSkI tDMA== X-Gm-Message-State: AOAM531j+05c2o+Kztf3XmZCRhOwukFciSYOvJxV+GEbs+b9Bl63c15N AEAixb+GNEKA3uUa5YsLsjYgGSi1D+JiNy3sxqfMN0uQ+zQuIO3WJJcUHN6Q4CuRzeODbjT1g5Z pR/Jhw+gyoMpGayzrW2sndzS/mUxLvkmifMNsB7h5Tw== X-Received: by 2002:a0c:fdcb:: with SMTP id g11mr8823135qvs.58.1604321374641; Mon, 02 Nov 2020 04:49:34 -0800 (PST) X-Google-Smtp-Source: ABdhPJxa8yEuoqcTuW6c7OJijSYMhX+PhCwwtbLnIsA0xv5X4aw9dBOfy61srJ5B6+N7EZhRDIEdaQ== X-Received: by 2002:a0c:fdcb:: with SMTP id g11mr8823115qvs.58.1604321374424; Mon, 02 Nov 2020 04:49:34 -0800 (PST) Received: from localhost.localdomain (072-189-064-225.res.spectrum.com. [72.189.64.225]) by smtp.gmail.com with ESMTPSA id q7sm7666201qtd.49.2020.11.02.04.49.33 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 02 Nov 2020 04:49:33 -0800 (PST) From: William Breathitt Gray To: kernel-team@lists.ubuntu.com Subject: [SRU][B:linux-azure-4.15][PATCH 33/40] bpf: devmap prepare xdp frames for bulking Date: Mon, 2 Nov 2020 07:48:49 -0500 Message-Id: <20201102124856.4659-34-william.gray@canonical.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20201102124856.4659-1-william.gray@canonical.com> References: <20201102124856.4659-1-william.gray@canonical.com> MIME-Version: 1.0 X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.20 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: "kernel-team" From: Jesper Dangaard Brouer BugLink: https://bugs.launchpad.net/bugs/1877654 Like cpumap create queue for xdp frames that will be bulked. For now, this patch simply invoke ndo_xdp_xmit foreach frame. This happens, either when the map flush operation is envoked, or when the limit DEV_MAP_BULK_SIZE is reached. V5: Avoid memleak on error path in dev_map_update_elem() Signed-off-by: Jesper Dangaard Brouer Signed-off-by: Alexei Starovoitov (backported from commit 5d053f9da4311a86bc58be8588bb5660fb3f0724) [ vilhelmgray: context adjustment ] Signed-off-by: William Breathitt Gray --- kernel/bpf/devmap.c | 73 ++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 69 insertions(+), 4 deletions(-) diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c index 99240806ef71..7606c542b643 100644 --- a/kernel/bpf/devmap.c +++ b/kernel/bpf/devmap.c @@ -55,10 +55,17 @@ #define DEV_CREATE_FLAG_MASK \ (BPF_F_NUMA_NODE | BPF_F_RDONLY | BPF_F_WRONLY) +#define DEV_MAP_BULK_SIZE 16 +struct xdp_bulk_queue { + struct xdp_frame *q[DEV_MAP_BULK_SIZE]; + unsigned int count; +}; + struct bpf_dtab_netdev { struct net_device *dev; /* must be first member, due to tracepoint */ struct bpf_dtab *dtab; unsigned int bit; + struct xdp_bulk_queue __percpu *bulkq; struct rcu_head rcu; }; @@ -217,6 +224,34 @@ void __dev_map_insert_ctx(struct bpf_map *map, u32 bit) __set_bit(bit, bitmap); } +static int bq_xmit_all(struct bpf_dtab_netdev *obj, + struct xdp_bulk_queue *bq) +{ + struct net_device *dev = obj->dev; + int i; + + if (unlikely(!bq->count)) + return 0; + + for (i = 0; i < bq->count; i++) { + struct xdp_frame *xdpf = bq->q[i]; + + prefetch(xdpf); + } + + for (i = 0; i < bq->count; i++) { + struct xdp_frame *xdpf = bq->q[i]; + int err; + + err = dev->netdev_ops->ndo_xdp_xmit(dev, xdpf); + if (err) + xdp_return_frame(xdpf); + } + bq->count = 0; + + return 0; +} + /* __dev_map_flush is called from xdp_do_flush_map() which _must_ be signaled * from the driver before returning from its napi->poll() routine. The poll() * routine is called either from busy_poll context or net_rx_action signaled @@ -232,6 +267,7 @@ void __dev_map_flush(struct bpf_map *map) for_each_set_bit(bit, bitmap, map->max_entries) { struct bpf_dtab_netdev *dev = READ_ONCE(dtab->netdev_map[bit]); + struct xdp_bulk_queue *bq; struct net_device *netdev; /* This is possible if the dev entry is removed by user space @@ -240,6 +276,8 @@ void __dev_map_flush(struct bpf_map *map) if (unlikely(!dev)) continue; + bq = this_cpu_ptr(dev->bulkq); + bq_xmit_all(dev, bq); netdev = dev->dev; if (likely(netdev->netdev_ops->ndo_xdp_flush)) netdev->netdev_ops->ndo_xdp_flush(netdev); @@ -264,6 +302,20 @@ struct bpf_dtab_netdev *__dev_map_lookup_elem(struct bpf_map *map, u32 key) return obj; } +/* Runs under RCU-read-side, plus in softirq under NAPI protection. + * Thus, safe percpu variable access. + */ +static int bq_enqueue(struct bpf_dtab_netdev *obj, struct xdp_frame *xdpf) +{ + struct xdp_bulk_queue *bq = this_cpu_ptr(obj->bulkq); + + if (unlikely(bq->count == DEV_MAP_BULK_SIZE)) + bq_xmit_all(obj, bq); + + bq->q[bq->count++] = xdpf; + return 0; +} + int dev_map_enqueue(struct bpf_dtab_netdev *dst, struct xdp_buff *xdp) { struct net_device *dev = dst->dev; @@ -276,8 +328,7 @@ int dev_map_enqueue(struct bpf_dtab_netdev *dst, struct xdp_buff *xdp) if (unlikely(!xdpf)) return -EOVERFLOW; - /* TODO: implement a bulking/enqueue step later */ - return dev->netdev_ops->ndo_xdp_xmit(dev, xdpf); + return bq_enqueue(dst, xdpf); } static void *dev_map_lookup_elem(struct bpf_map *map, void *key) @@ -292,13 +343,18 @@ static void dev_map_flush_old(struct bpf_dtab_netdev *dev) { if (dev->dev->netdev_ops->ndo_xdp_flush) { struct net_device *fl = dev->dev; + struct xdp_bulk_queue *bq; unsigned long *bitmap; + int cpu; for_each_online_cpu(cpu) { bitmap = per_cpu_ptr(dev->dtab->flush_needed, cpu); __clear_bit(dev->bit, bitmap); + bq = per_cpu_ptr(dev->bulkq, cpu); + bq_xmit_all(dev, bq); + fl->netdev_ops->ndo_xdp_flush(dev->dev); } } @@ -310,6 +366,7 @@ static void __dev_map_entry_free(struct rcu_head *rcu) dev = container_of(rcu, struct bpf_dtab_netdev, rcu); dev_map_flush_old(dev); + free_percpu(dev->bulkq); dev_put(dev->dev); kfree(dev); } @@ -342,6 +399,7 @@ static int dev_map_update_elem(struct bpf_map *map, void *key, void *value, { struct bpf_dtab *dtab = container_of(map, struct bpf_dtab, map); struct net *net = current->nsproxy->net_ns; + gfp_t gfp = GFP_ATOMIC | __GFP_NOWARN; struct bpf_dtab_netdev *dev, *old_dev; u32 i = *(u32 *)key; u32 ifindex = *(u32 *)value; @@ -356,13 +414,20 @@ static int dev_map_update_elem(struct bpf_map *map, void *key, void *value, if (!ifindex) { dev = NULL; } else { - dev = kmalloc_node(sizeof(*dev), GFP_ATOMIC | __GFP_NOWARN, - map->numa_node); + dev = kmalloc_node(sizeof(*dev), gfp, map->numa_node); if (!dev) return -ENOMEM; + dev->bulkq = __alloc_percpu_gfp(sizeof(*dev->bulkq), + sizeof(void *), gfp); + if (!dev->bulkq) { + kfree(dev); + return -ENOMEM; + } + dev->dev = dev_get_by_index(net, ifindex); if (!dev->dev) { + free_percpu(dev->bulkq); kfree(dev); return -EINVAL; }