From patchwork Wed May 20 19:20:49 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= X-Patchwork-Id: 1294637 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=ZdA4LlDg; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 49S2fq53GBz9sSF for ; Thu, 21 May 2020 05:21:23 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726857AbgETTVX (ORCPT ); Wed, 20 May 2020 15:21:23 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45620 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726560AbgETTVW (ORCPT ); Wed, 20 May 2020 15:21:22 -0400 Received: from mail-pl1-x644.google.com (mail-pl1-x644.google.com [IPv6:2607:f8b0:4864:20::644]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A1BD6C061A0E; Wed, 20 May 2020 12:21:22 -0700 (PDT) Received: by mail-pl1-x644.google.com with SMTP id a13so1744298pls.8; Wed, 20 May 2020 12:21:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=l+ZitJMelLTFgi5ias/6H8gqKi08+ZfSOF9+XZodPuU=; b=ZdA4LlDgUHVtwhZZgXQ5m7TygnrD35L/BEda6rqVXiOKOK2/8Le1bw3igGk6vIbu6o KN3zDi9cSAlpVfCdIp8DiIFBKJHefdioK8do+zqXzk/9fyYHYQnFCJp/GU/VozWYD3h3 6hF637gza/GTK1/K+fmIRbAYYDLBvSe07YtQ8PP7WR/ErKvWIcB0i0aUgp8WHtaeyRli yxibsy31UrA74fEaCdquEWZc0z5yAoMQFrCqo0vE6lrXmLI7B9N3a4mj+pfkBtrVuDfb ZNeWHfsU6yO6SzNltZ9BMmTEYjRyRJM3nCqQQ9GyjUBaKTeBp2CD4IjL8iejzLmoKSUv 9Icg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=l+ZitJMelLTFgi5ias/6H8gqKi08+ZfSOF9+XZodPuU=; b=i4+ThEtoSw71ULyIJUiuxxa/pDkEDjElXvX27lT1nXLNbWEnSyIcYSlKgjJxY26FeR JZW47T7NG7ptJntvDUjKzC7Gc1+sVpy5lmEtITH6sfA+KUF4+EnePOz+mk1q9qq2HE9T h8HIgiOuEW9jmGHlj9quzlXqsUnVaAVlqWpPNzFTjgM8td5ymNFErSBjhemOZH+DIQub YpQ3NBRjDIRSffv9QW92KHt3oJoX2Hw22rKMjNRpIm6G8lFuKl/dgVA6dJY+CDrZXhDh Dp8+C+KA0ueTtRQyHc1O6mkuE5i11GXmT9kJ0yBr3vbU9CYdLogvNz2aMYNXeumzmrpp 7U9A== X-Gm-Message-State: AOAM533XA69ND/4Mvgr6RCFPBVUQ39+jMq8MIybpLeFhtCQMFAyFvb6B wRaONhXjLOGcid+B1DBqiR8= X-Google-Smtp-Source: ABdhPJx0bGMp8Tb9Ih5zXgtUX5nnU5iZwsFJ9lZx6rTZX5sDCDao9GcckvMe2YDtIfhACtbE+Ln/bw== X-Received: by 2002:a17:90a:68cd:: with SMTP id q13mr6981865pjj.177.1590002482259; Wed, 20 May 2020 12:21:22 -0700 (PDT) Received: from btopel-mobl.ger.intel.com (fmdmzpr03-ext.fm.intel.com. [192.55.54.38]) by smtp.gmail.com with ESMTPSA id 62sm2762424pfc.204.2020.05.20.12.21.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 May 2020 12:21:21 -0700 (PDT) From: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= To: ast@kernel.org, daniel@iogearbox.net, davem@davemloft.net, kuba@kernel.org, hawk@kernel.org, john.fastabend@gmail.com, netdev@vger.kernel.org, bpf@vger.kernel.org, magnus.karlsson@intel.com, jonathan.lemon@gmail.com, jeffrey.t.kirsher@intel.com Cc: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , maximmi@mellanox.com, maciej.fijalkowski@intel.com Subject: [PATCH bpf-next v5 01/15] xsk: fix xsk_umem_xdp_frame_sz() Date: Wed, 20 May 2020 21:20:49 +0200 Message-Id: <20200520192103.355233-2-bjorn.topel@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200520192103.355233-1-bjorn.topel@gmail.com> References: <20200520192103.355233-1-bjorn.topel@gmail.com> MIME-Version: 1.0 Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org From: Björn Töpel Calculating the "data_hard_end" for an XDP buffer coming from AF_XDP zero-copy mode, the return value of xsk_umem_xdp_frame_sz() is added to "data_hard_start". Currently, the chunk size of the UMEM is returned by xsk_umem_xdp_frame_sz(). This is not correct, if the fixed UMEM headroom is non-zero. Fix this by returning the chunk_size without the UMEM headroom. Fixes: 2a637c5b1aaf ("xdp: For Intel AF_XDP drivers add XDP frame_sz") Signed-off-by: Björn Töpel --- include/net/xdp_sock.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h index abd72de25fa4..6b1137ce1692 100644 --- a/include/net/xdp_sock.h +++ b/include/net/xdp_sock.h @@ -239,7 +239,7 @@ static inline u64 xsk_umem_adjust_offset(struct xdp_umem *umem, u64 address, static inline u32 xsk_umem_xdp_frame_sz(struct xdp_umem *umem) { - return umem->chunk_size_nohr + umem->headroom; + return umem->chunk_size_nohr; } #else From patchwork Wed May 20 19:20:50 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= X-Patchwork-Id: 1294639 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=M0Xw/x99; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 49S2fw5ZQhz9sSF for ; Thu, 21 May 2020 05:21:28 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726754AbgETTV2 (ORCPT ); Wed, 20 May 2020 15:21:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45636 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726560AbgETTV2 (ORCPT ); Wed, 20 May 2020 15:21:28 -0400 Received: from mail-pg1-x541.google.com (mail-pg1-x541.google.com [IPv6:2607:f8b0:4864:20::541]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 04AE5C061A0E; Wed, 20 May 2020 12:21:28 -0700 (PDT) Received: by mail-pg1-x541.google.com with SMTP id n11so1886665pgl.9; Wed, 20 May 2020 12:21:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=G1YteGz4CGkw/1QFx3GPFqvKGrYaD6ii2xE0mrFwzAc=; b=M0Xw/x99wHSWz5/ZRlF2wcBHZTI1Y6fYdxaspCzRJ1F29ic2FNlivy6KlsZuFnMKjS 5G3pzdeCRWxlGp43C4aJvgLmrAOjJ6/UfMfdtCjrKZ/hfCKxSOqilX1YqkPcQGp1nSW9 dYIItEk1yOlieDHeO8TMNGG4TAAt8bz8w/fP9PGGgyDiJIKH8nnPvHhXGJwkxLxglL5t Do6gtbdKzjYgLtFEKpZR1cwHGqPYiEgJjQfwTDhxQRwPzOH/2+yMd7vD4Hn5BlTLUB7x HiRaplqCG44LJanIZVSxZngaASsIyoxiBul58hDstKPiHdaO4lMjOcuFj++9ORMQPR11 GK+w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=G1YteGz4CGkw/1QFx3GPFqvKGrYaD6ii2xE0mrFwzAc=; b=oqO6bok2EYgOvcIgqI2xp9veXc7c1wYmTRCXZui60ELHgprY3RTC0max1AP7liAr2O syQnRGhE2xEh5xbQWtIfGuTOZ29QZBAkCLigXH3yT2fFQ4JyhOlhw6gYT7gkr8Xog6A+ IIhYzgNZ5GP/KCoD8krYggvHnVFRxwHglRumZxYtvjH3Pj//PgNm1tX1QXjMccyiTsHn qNRberZzrj12jUR7RWeg516qQk8oNai7ZIaPOvWVVsaTzW51t25VUDDJhjkKkwtdA5X3 afazd4MkLCsD41YhXQceCK6NE5QwJssN+oIjGoIIdYHdpqvmkg0Bb0F28P0PkVf/CiTB AwJg== X-Gm-Message-State: AOAM533Nyuj42LxKyouoUiKBVpzhmtNL9B3Lsjdik5mNFfhslAN6BPlf k8IHrGr1lY09Rik8ATY1KWU= X-Google-Smtp-Source: ABdhPJxeFHbjInJ526lKrcs8beM4IzRse9pMOn0JXEod82sdEzU597y1u0SMvi8GrBb8nfhSK8+FyQ== X-Received: by 2002:a62:c145:: with SMTP id i66mr5536242pfg.23.1590002487528; Wed, 20 May 2020 12:21:27 -0700 (PDT) Received: from btopel-mobl.ger.intel.com (fmdmzpr03-ext.fm.intel.com. [192.55.54.38]) by smtp.gmail.com with ESMTPSA id 62sm2762424pfc.204.2020.05.20.12.21.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 May 2020 12:21:26 -0700 (PDT) From: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= To: ast@kernel.org, daniel@iogearbox.net, davem@davemloft.net, kuba@kernel.org, hawk@kernel.org, john.fastabend@gmail.com, netdev@vger.kernel.org, bpf@vger.kernel.org, magnus.karlsson@intel.com, jonathan.lemon@gmail.com, jeffrey.t.kirsher@intel.com Cc: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , maximmi@mellanox.com, maciej.fijalkowski@intel.com Subject: [PATCH bpf-next v5 02/15] xsk: move xskmap.c to net/xdp/ Date: Wed, 20 May 2020 21:20:50 +0200 Message-Id: <20200520192103.355233-3-bjorn.topel@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200520192103.355233-1-bjorn.topel@gmail.com> References: <20200520192103.355233-1-bjorn.topel@gmail.com> MIME-Version: 1.0 Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org From: Björn Töpel The XSKMAP is partly implemented by net/xdp/xsk.c. Move xskmap.c from kernel/bpf/ to net/xdp/, which is the logical place for AF_XDP related code. Also, move AF_XDP struct definitions, and function declarations only used by AF_XDP internals into net/xdp/xsk.h. Signed-off-by: Björn Töpel --- include/net/xdp_sock.h | 20 -------------------- kernel/bpf/Makefile | 3 --- net/xdp/Makefile | 2 +- net/xdp/xsk.h | 16 ++++++++++++++++ {kernel/bpf => net/xdp}/xskmap.c | 2 ++ 5 files changed, 19 insertions(+), 24 deletions(-) rename {kernel/bpf => net/xdp}/xskmap.c (99%) diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h index 6b1137ce1692..8f3f6f5b0dfe 100644 --- a/include/net/xdp_sock.h +++ b/include/net/xdp_sock.h @@ -65,22 +65,12 @@ struct xdp_umem { struct list_head xsk_tx_list; }; -/* Nodes are linked in the struct xdp_sock map_list field, and used to - * track which maps a certain socket reside in. - */ - struct xsk_map { struct bpf_map map; spinlock_t lock; /* Synchronize map updates */ struct xdp_sock *xsk_map[]; }; -struct xsk_map_node { - struct list_head node; - struct xsk_map *map; - struct xdp_sock **map_entry; -}; - struct xdp_sock { /* struct sock must be the first member of struct xdp_sock */ struct sock sk; @@ -114,7 +104,6 @@ struct xdp_sock { struct xdp_buff; #ifdef CONFIG_XDP_SOCKETS int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp); -bool xsk_is_setup_for_bpf_map(struct xdp_sock *xs); /* Used from netdev driver */ bool xsk_umem_has_addrs(struct xdp_umem *umem, u32 cnt); bool xsk_umem_peek_addr(struct xdp_umem *umem, u64 *addr); @@ -133,10 +122,6 @@ void xsk_clear_rx_need_wakeup(struct xdp_umem *umem); void xsk_clear_tx_need_wakeup(struct xdp_umem *umem); bool xsk_umem_uses_need_wakeup(struct xdp_umem *umem); -void xsk_map_try_sock_delete(struct xsk_map *map, struct xdp_sock *xs, - struct xdp_sock **map_entry); -int xsk_map_inc(struct xsk_map *map); -void xsk_map_put(struct xsk_map *map); int __xsk_map_redirect(struct xdp_sock *xs, struct xdp_buff *xdp); void __xsk_map_flush(void); @@ -248,11 +233,6 @@ static inline int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp) return -ENOTSUPP; } -static inline bool xsk_is_setup_for_bpf_map(struct xdp_sock *xs) -{ - return false; -} - static inline bool xsk_umem_has_addrs(struct xdp_umem *umem, u32 cnt) { return false; diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile index 37b2d8620153..375b933010dd 100644 --- a/kernel/bpf/Makefile +++ b/kernel/bpf/Makefile @@ -12,9 +12,6 @@ obj-$(CONFIG_BPF_JIT) += dispatcher.o ifeq ($(CONFIG_NET),y) obj-$(CONFIG_BPF_SYSCALL) += devmap.o obj-$(CONFIG_BPF_SYSCALL) += cpumap.o -ifeq ($(CONFIG_XDP_SOCKETS),y) -obj-$(CONFIG_BPF_SYSCALL) += xskmap.o -endif obj-$(CONFIG_BPF_SYSCALL) += offload.o endif ifeq ($(CONFIG_PERF_EVENTS),y) diff --git a/net/xdp/Makefile b/net/xdp/Makefile index 71e2bdafb2ce..90b5460d6166 100644 --- a/net/xdp/Makefile +++ b/net/xdp/Makefile @@ -1,3 +1,3 @@ # SPDX-License-Identifier: GPL-2.0-only -obj-$(CONFIG_XDP_SOCKETS) += xsk.o xdp_umem.o xsk_queue.o +obj-$(CONFIG_XDP_SOCKETS) += xsk.o xdp_umem.o xsk_queue.o xskmap.o obj-$(CONFIG_XDP_SOCKETS_DIAG) += xsk_diag.o diff --git a/net/xdp/xsk.h b/net/xdp/xsk.h index 4cfd106bdb53..d6a0979050e6 100644 --- a/net/xdp/xsk.h +++ b/net/xdp/xsk.h @@ -17,9 +17,25 @@ struct xdp_mmap_offsets_v1 { struct xdp_ring_offset_v1 cr; }; +/* Nodes are linked in the struct xdp_sock map_list field, and used to + * track which maps a certain socket reside in. + */ + +struct xsk_map_node { + struct list_head node; + struct xsk_map *map; + struct xdp_sock **map_entry; +}; + static inline struct xdp_sock *xdp_sk(struct sock *sk) { return (struct xdp_sock *)sk; } +bool xsk_is_setup_for_bpf_map(struct xdp_sock *xs); +void xsk_map_try_sock_delete(struct xsk_map *map, struct xdp_sock *xs, + struct xdp_sock **map_entry); +int xsk_map_inc(struct xsk_map *map); +void xsk_map_put(struct xsk_map *map); + #endif /* XSK_H_ */ diff --git a/kernel/bpf/xskmap.c b/net/xdp/xskmap.c similarity index 99% rename from kernel/bpf/xskmap.c rename to net/xdp/xskmap.c index 2cc5c8f4c800..1dc7208c71ba 100644 --- a/kernel/bpf/xskmap.c +++ b/net/xdp/xskmap.c @@ -9,6 +9,8 @@ #include #include +#include "xsk.h" + int xsk_map_inc(struct xsk_map *map) { bpf_map_inc(&map->map); From patchwork Wed May 20 19:20:51 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= X-Patchwork-Id: 1294642 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=gVyVzmun; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 49S2g31NQkz9sSF for ; Thu, 21 May 2020 05:21:35 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726893AbgETTVe (ORCPT ); Wed, 20 May 2020 15:21:34 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45654 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726560AbgETTVe (ORCPT ); Wed, 20 May 2020 15:21:34 -0400 Received: from mail-pl1-x642.google.com (mail-pl1-x642.google.com [IPv6:2607:f8b0:4864:20::642]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 536C8C061A0E; Wed, 20 May 2020 12:21:34 -0700 (PDT) Received: by mail-pl1-x642.google.com with SMTP id b12so1735257plz.13; Wed, 20 May 2020 12:21:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=6B7YlWjJiU7epWUF8ZWr1X9bC9pqU/zyzV/VLX0N8d4=; b=gVyVzmunoIsqL9Ja8pA441+xGU+AkngWIy6Tmk8/nOLANZy/5fIYYfm1NMR5Q9NC4P pge4NZ5Ewyex1x/LDscyrwE10mUrqyRx0OiKl67E5ANR7VDivbmttzmjADgGHfhIyzLN pqlm4HE8XDyLCvy7s5/sFGQ8VpMW49Eap8P0vxXnQR9dJxm2KZTp/Itu0bo9QWHs2Zxk V7FNaaGdgqr89EinDmcZyh50Mq9yKw9oLLYXc6SsU2IN8IaQwI2I+7Ioe8dqIeF7NiAQ YqzXc3l7oYJTir/Tqp+ybaVQ9uUlh33k9uguaMLVyHiwyie8vuC1wvMz+qAna8h7LR7G sDMg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=6B7YlWjJiU7epWUF8ZWr1X9bC9pqU/zyzV/VLX0N8d4=; b=avI46kuHsKfGifbkZfq8srlvNyYid5xG8p/pBQx5Zf0nzDuWo2Czd1zvq9PXVdjcnr wKc3n/Fzn5qf+suMYsYaTtNHcIfI2xDRz/nyhyVqpZXhwiqihd6NN8tyf0mgD8+SG5vt WIvVO5LVOKa0HkuWHpbMx1/TLo7NtgCvtDg7Pl5uXGLfOGkmqib5uDX7tHcWDEULoKc5 EO094IrtoMXFQcrXq5UQxVD2bCUp4FVlEdXyT/1g/iuOGSNEefRYu/GPr5sEDR0Zv316 +nApKZebtFQ1O9JKz3eSEJHMm37KoTU+GDqQNhZ3No9GYoYMXSaLhW13WrzqWKZNCqBi i14A== X-Gm-Message-State: AOAM532S5PCbzMcvIR+TwK4KjUfNrnoGTHtJjCTBCQGKStuDu5fV7erq hiM6yxHKX0cw6AMRSqzoZR7FRIi4AXmanNHq X-Google-Smtp-Source: ABdhPJxaQZzdtHwScOxPAHJqetvO8zQvjOzc9/rDEZrUBmPmNfrxlxMxfIiACykS4RX5AGaoOcOP5Q== X-Received: by 2002:a17:90a:1aa3:: with SMTP id p32mr7414453pjp.4.1590002493597; Wed, 20 May 2020 12:21:33 -0700 (PDT) Received: from btopel-mobl.ger.intel.com (fmdmzpr03-ext.fm.intel.com. [192.55.54.38]) by smtp.gmail.com with ESMTPSA id 62sm2762424pfc.204.2020.05.20.12.21.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 May 2020 12:21:32 -0700 (PDT) From: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= To: ast@kernel.org, daniel@iogearbox.net, davem@davemloft.net, kuba@kernel.org, hawk@kernel.org, john.fastabend@gmail.com, netdev@vger.kernel.org, bpf@vger.kernel.org, magnus.karlsson@intel.com, jonathan.lemon@gmail.com, jeffrey.t.kirsher@intel.com Cc: maximmi@mellanox.com, maciej.fijalkowski@intel.com, bjorn.topel@intel.com Subject: [PATCH bpf-next v5 03/15] xsk: move driver interface to xdp_sock_drv.h Date: Wed, 20 May 2020 21:20:51 +0200 Message-Id: <20200520192103.355233-4-bjorn.topel@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200520192103.355233-1-bjorn.topel@gmail.com> References: <20200520192103.355233-1-bjorn.topel@gmail.com> MIME-Version: 1.0 Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org From: Magnus Karlsson Move the AF_XDP zero-copy driver interface to its own include file called xdp_sock_drv.h. This, hopefully, will make it more clear for NIC driver implementors to know what functions to use for zero-copy support. v4->v5: Fix -Wmissing-prototypes by include header file. (Jakub) Signed-off-by: Magnus Karlsson --- drivers/net/ethernet/intel/i40e/i40e_main.c | 2 +- drivers/net/ethernet/intel/i40e/i40e_xsk.c | 2 +- drivers/net/ethernet/intel/ice/ice_xsk.c | 2 +- drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 2 +- .../net/ethernet/mellanox/mlx5/core/en/xdp.c | 2 +- .../ethernet/mellanox/mlx5/core/en/xsk/rx.h | 2 +- .../ethernet/mellanox/mlx5/core/en/xsk/tx.h | 2 +- .../ethernet/mellanox/mlx5/core/en/xsk/umem.c | 2 +- include/net/xdp_sock.h | 214 +---------------- include/net/xdp_sock_drv.h | 217 ++++++++++++++++++ net/ethtool/channels.c | 2 +- net/ethtool/ioctl.c | 2 +- net/xdp/xdp_umem.h | 2 +- net/xdp/xsk.c | 2 +- net/xdp/xsk_queue.c | 1 + 15 files changed, 238 insertions(+), 218 deletions(-) create mode 100644 include/net/xdp_sock_drv.h diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c index 2a037ec244b9..d6b2db4f2c65 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -11,7 +11,7 @@ #include "i40e_diag.h" #include "i40e_xsk.h" #include -#include +#include /* All i40e tracepoints are defined by the include below, which * must be included exactly once across the whole kernel with * CREATE_TRACE_POINTS defined diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c index 2b9184aead5f..d8b0be29099a 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c @@ -2,7 +2,7 @@ /* Copyright(c) 2018 Intel Corporation. */ #include -#include +#include #include #include "i40e.h" diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c index 23e5515d4527..70e204307a93 100644 --- a/drivers/net/ethernet/intel/ice/ice_xsk.c +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c @@ -2,7 +2,7 @@ /* Copyright (c) 2019, Intel Corporation. */ #include -#include +#include #include #include "ice.h" #include "ice_base.h" diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c index a656ee9a1fae..82e4effae704 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c @@ -2,7 +2,7 @@ /* Copyright(c) 2018 Intel Corporation. */ #include -#include +#include #include #include "ixgbe.h" diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c index 761c8979bd41..3507d23f0eb8 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c @@ -31,7 +31,7 @@ */ #include -#include +#include #include "en/xdp.h" #include "en/params.h" diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h index cab0e93497ae..a8e11adbf426 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h @@ -5,7 +5,7 @@ #define __MLX5_EN_XSK_RX_H__ #include "en.h" -#include +#include /* RX data path */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.h b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.h index 79b487d89757..39fa0a705856 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.h @@ -5,7 +5,7 @@ #define __MLX5_EN_XSK_TX_H__ #include "en.h" -#include +#include /* TX data path */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/umem.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/umem.c index 4baaa5788320..5e49fdb564b3 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/umem.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/umem.c @@ -1,7 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB /* Copyright (c) 2019 Mellanox Technologies. */ -#include +#include #include "umem.h" #include "setup.h" #include "en/params.h" diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h index 8f3f6f5b0dfe..6a986dcbc336 100644 --- a/include/net/xdp_sock.h +++ b/include/net/xdp_sock.h @@ -15,6 +15,7 @@ struct net_device; struct xsk_queue; +struct xdp_buff; /* Masks for xdp_umem_page flags. * The low 12-bits of the addr will be 0 since this is the page address, so we @@ -101,27 +102,9 @@ struct xdp_sock { spinlock_t map_list_lock; }; -struct xdp_buff; #ifdef CONFIG_XDP_SOCKETS -int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp); -/* Used from netdev driver */ -bool xsk_umem_has_addrs(struct xdp_umem *umem, u32 cnt); -bool xsk_umem_peek_addr(struct xdp_umem *umem, u64 *addr); -void xsk_umem_release_addr(struct xdp_umem *umem); -void xsk_umem_complete_tx(struct xdp_umem *umem, u32 nb_entries); -bool xsk_umem_consume_tx(struct xdp_umem *umem, struct xdp_desc *desc); -void xsk_umem_consume_tx_done(struct xdp_umem *umem); -struct xdp_umem_fq_reuse *xsk_reuseq_prepare(u32 nentries); -struct xdp_umem_fq_reuse *xsk_reuseq_swap(struct xdp_umem *umem, - struct xdp_umem_fq_reuse *newq); -void xsk_reuseq_free(struct xdp_umem_fq_reuse *rq); -struct xdp_umem *xdp_get_umem_from_qid(struct net_device *dev, u16 queue_id); -void xsk_set_rx_need_wakeup(struct xdp_umem *umem); -void xsk_set_tx_need_wakeup(struct xdp_umem *umem); -void xsk_clear_rx_need_wakeup(struct xdp_umem *umem); -void xsk_clear_tx_need_wakeup(struct xdp_umem *umem); -bool xsk_umem_uses_need_wakeup(struct xdp_umem *umem); +int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp); int __xsk_map_redirect(struct xdp_sock *xs, struct xdp_buff *xdp); void __xsk_map_flush(void); @@ -153,131 +136,24 @@ static inline u64 xsk_umem_add_offset_to_addr(u64 addr) return xsk_umem_extract_addr(addr) + xsk_umem_extract_offset(addr); } -static inline char *xdp_umem_get_data(struct xdp_umem *umem, u64 addr) -{ - unsigned long page_addr; - - addr = xsk_umem_add_offset_to_addr(addr); - page_addr = (unsigned long)umem->pages[addr >> PAGE_SHIFT].addr; - - return (char *)(page_addr & PAGE_MASK) + (addr & ~PAGE_MASK); -} - -static inline dma_addr_t xdp_umem_get_dma(struct xdp_umem *umem, u64 addr) -{ - addr = xsk_umem_add_offset_to_addr(addr); - - return umem->pages[addr >> PAGE_SHIFT].dma + (addr & ~PAGE_MASK); -} - -/* Reuse-queue aware version of FILL queue helpers */ -static inline bool xsk_umem_has_addrs_rq(struct xdp_umem *umem, u32 cnt) -{ - struct xdp_umem_fq_reuse *rq = umem->fq_reuse; - - if (rq->length >= cnt) - return true; - - return xsk_umem_has_addrs(umem, cnt - rq->length); -} - -static inline bool xsk_umem_peek_addr_rq(struct xdp_umem *umem, u64 *addr) -{ - struct xdp_umem_fq_reuse *rq = umem->fq_reuse; - - if (!rq->length) - return xsk_umem_peek_addr(umem, addr); - - *addr = rq->handles[rq->length - 1]; - return addr; -} - -static inline void xsk_umem_release_addr_rq(struct xdp_umem *umem) -{ - struct xdp_umem_fq_reuse *rq = umem->fq_reuse; - - if (!rq->length) - xsk_umem_release_addr(umem); - else - rq->length--; -} - -static inline void xsk_umem_fq_reuse(struct xdp_umem *umem, u64 addr) -{ - struct xdp_umem_fq_reuse *rq = umem->fq_reuse; - - rq->handles[rq->length++] = addr; -} - -/* Handle the offset appropriately depending on aligned or unaligned mode. - * For unaligned mode, we store the offset in the upper 16-bits of the address. - * For aligned mode, we simply add the offset to the address. - */ -static inline u64 xsk_umem_adjust_offset(struct xdp_umem *umem, u64 address, - u64 offset) -{ - if (umem->flags & XDP_UMEM_UNALIGNED_CHUNK_FLAG) - return address + (offset << XSK_UNALIGNED_BUF_OFFSET_SHIFT); - else - return address + offset; -} - -static inline u32 xsk_umem_xdp_frame_sz(struct xdp_umem *umem) -{ - return umem->chunk_size_nohr; -} - #else + static inline int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp) { return -ENOTSUPP; } -static inline bool xsk_umem_has_addrs(struct xdp_umem *umem, u32 cnt) -{ - return false; -} - -static inline u64 *xsk_umem_peek_addr(struct xdp_umem *umem, u64 *addr) -{ - return NULL; -} - -static inline void xsk_umem_release_addr(struct xdp_umem *umem) -{ -} - -static inline void xsk_umem_complete_tx(struct xdp_umem *umem, u32 nb_entries) -{ -} - -static inline bool xsk_umem_consume_tx(struct xdp_umem *umem, - struct xdp_desc *desc) -{ - return false; -} - -static inline void xsk_umem_consume_tx_done(struct xdp_umem *umem) -{ -} - -static inline struct xdp_umem_fq_reuse *xsk_reuseq_prepare(u32 nentries) +static inline int __xsk_map_redirect(struct xdp_sock *xs, struct xdp_buff *xdp) { - return NULL; + return -EOPNOTSUPP; } -static inline struct xdp_umem_fq_reuse *xsk_reuseq_swap( - struct xdp_umem *umem, - struct xdp_umem_fq_reuse *newq) -{ - return NULL; -} -static inline void xsk_reuseq_free(struct xdp_umem_fq_reuse *rq) +static inline void __xsk_map_flush(void) { } -static inline struct xdp_umem *xdp_get_umem_from_qid(struct net_device *dev, - u16 queue_id) +static inline struct xdp_sock *__xsk_map_lookup_elem(struct bpf_map *map, + u32 key) { return NULL; } @@ -297,80 +173,6 @@ static inline u64 xsk_umem_add_offset_to_addr(u64 addr) return 0; } -static inline char *xdp_umem_get_data(struct xdp_umem *umem, u64 addr) -{ - return NULL; -} - -static inline dma_addr_t xdp_umem_get_dma(struct xdp_umem *umem, u64 addr) -{ - return 0; -} - -static inline bool xsk_umem_has_addrs_rq(struct xdp_umem *umem, u32 cnt) -{ - return false; -} - -static inline u64 *xsk_umem_peek_addr_rq(struct xdp_umem *umem, u64 *addr) -{ - return NULL; -} - -static inline void xsk_umem_release_addr_rq(struct xdp_umem *umem) -{ -} - -static inline void xsk_umem_fq_reuse(struct xdp_umem *umem, u64 addr) -{ -} - -static inline void xsk_set_rx_need_wakeup(struct xdp_umem *umem) -{ -} - -static inline void xsk_set_tx_need_wakeup(struct xdp_umem *umem) -{ -} - -static inline void xsk_clear_rx_need_wakeup(struct xdp_umem *umem) -{ -} - -static inline void xsk_clear_tx_need_wakeup(struct xdp_umem *umem) -{ -} - -static inline bool xsk_umem_uses_need_wakeup(struct xdp_umem *umem) -{ - return false; -} - -static inline u64 xsk_umem_adjust_offset(struct xdp_umem *umem, u64 handle, - u64 offset) -{ - return 0; -} - -static inline u32 xsk_umem_xdp_frame_sz(struct xdp_umem *umem) -{ - return 0; -} - -static inline int __xsk_map_redirect(struct xdp_sock *xs, struct xdp_buff *xdp) -{ - return -EOPNOTSUPP; -} - -static inline void __xsk_map_flush(void) -{ -} - -static inline struct xdp_sock *__xsk_map_lookup_elem(struct bpf_map *map, - u32 key) -{ - return NULL; -} #endif /* CONFIG_XDP_SOCKETS */ #endif /* _LINUX_XDP_SOCK_H */ diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h new file mode 100644 index 000000000000..d67f2361937a --- /dev/null +++ b/include/net/xdp_sock_drv.h @@ -0,0 +1,217 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Interface for implementing AF_XDP zero-copy support in drivers. + * Copyright(c) 2020 Intel Corporation. + */ + +#ifndef _LINUX_XDP_SOCK_DRV_H +#define _LINUX_XDP_SOCK_DRV_H + +#include + +#ifdef CONFIG_XDP_SOCKETS + +bool xsk_umem_has_addrs(struct xdp_umem *umem, u32 cnt); +bool xsk_umem_peek_addr(struct xdp_umem *umem, u64 *addr); +void xsk_umem_release_addr(struct xdp_umem *umem); +void xsk_umem_complete_tx(struct xdp_umem *umem, u32 nb_entries); +bool xsk_umem_consume_tx(struct xdp_umem *umem, struct xdp_desc *desc); +void xsk_umem_consume_tx_done(struct xdp_umem *umem); +struct xdp_umem_fq_reuse *xsk_reuseq_prepare(u32 nentries); +struct xdp_umem_fq_reuse *xsk_reuseq_swap(struct xdp_umem *umem, + struct xdp_umem_fq_reuse *newq); +void xsk_reuseq_free(struct xdp_umem_fq_reuse *rq); +struct xdp_umem *xdp_get_umem_from_qid(struct net_device *dev, u16 queue_id); +void xsk_set_rx_need_wakeup(struct xdp_umem *umem); +void xsk_set_tx_need_wakeup(struct xdp_umem *umem); +void xsk_clear_rx_need_wakeup(struct xdp_umem *umem); +void xsk_clear_tx_need_wakeup(struct xdp_umem *umem); +bool xsk_umem_uses_need_wakeup(struct xdp_umem *umem); + +static inline char *xdp_umem_get_data(struct xdp_umem *umem, u64 addr) +{ + unsigned long page_addr; + + addr = xsk_umem_add_offset_to_addr(addr); + page_addr = (unsigned long)umem->pages[addr >> PAGE_SHIFT].addr; + + return (char *)(page_addr & PAGE_MASK) + (addr & ~PAGE_MASK); +} + +static inline dma_addr_t xdp_umem_get_dma(struct xdp_umem *umem, u64 addr) +{ + addr = xsk_umem_add_offset_to_addr(addr); + + return umem->pages[addr >> PAGE_SHIFT].dma + (addr & ~PAGE_MASK); +} + +/* Reuse-queue aware version of FILL queue helpers */ +static inline bool xsk_umem_has_addrs_rq(struct xdp_umem *umem, u32 cnt) +{ + struct xdp_umem_fq_reuse *rq = umem->fq_reuse; + + if (rq->length >= cnt) + return true; + + return xsk_umem_has_addrs(umem, cnt - rq->length); +} + +static inline bool xsk_umem_peek_addr_rq(struct xdp_umem *umem, u64 *addr) +{ + struct xdp_umem_fq_reuse *rq = umem->fq_reuse; + + if (!rq->length) + return xsk_umem_peek_addr(umem, addr); + + *addr = rq->handles[rq->length - 1]; + return addr; +} + +static inline void xsk_umem_release_addr_rq(struct xdp_umem *umem) +{ + struct xdp_umem_fq_reuse *rq = umem->fq_reuse; + + if (!rq->length) + xsk_umem_release_addr(umem); + else + rq->length--; +} + +static inline void xsk_umem_fq_reuse(struct xdp_umem *umem, u64 addr) +{ + struct xdp_umem_fq_reuse *rq = umem->fq_reuse; + + rq->handles[rq->length++] = addr; +} + +/* Handle the offset appropriately depending on aligned or unaligned mode. + * For unaligned mode, we store the offset in the upper 16-bits of the address. + * For aligned mode, we simply add the offset to the address. + */ +static inline u64 xsk_umem_adjust_offset(struct xdp_umem *umem, u64 address, + u64 offset) +{ + if (umem->flags & XDP_UMEM_UNALIGNED_CHUNK_FLAG) + return address + (offset << XSK_UNALIGNED_BUF_OFFSET_SHIFT); + else + return address + offset; +} + +static inline u32 xsk_umem_xdp_frame_sz(struct xdp_umem *umem) +{ + return umem->chunk_size_nohr; +} + +#else + +static inline bool xsk_umem_has_addrs(struct xdp_umem *umem, u32 cnt) +{ + return false; +} + +static inline u64 *xsk_umem_peek_addr(struct xdp_umem *umem, u64 *addr) +{ + return NULL; +} + +static inline void xsk_umem_release_addr(struct xdp_umem *umem) +{ +} + +static inline void xsk_umem_complete_tx(struct xdp_umem *umem, u32 nb_entries) +{ +} + +static inline bool xsk_umem_consume_tx(struct xdp_umem *umem, + struct xdp_desc *desc) +{ + return false; +} + +static inline void xsk_umem_consume_tx_done(struct xdp_umem *umem) +{ +} + +static inline struct xdp_umem_fq_reuse *xsk_reuseq_prepare(u32 nentries) +{ + return NULL; +} + +static inline struct xdp_umem_fq_reuse *xsk_reuseq_swap( + struct xdp_umem *umem, struct xdp_umem_fq_reuse *newq) +{ + return NULL; +} + +static inline void xsk_reuseq_free(struct xdp_umem_fq_reuse *rq) +{ +} + +static inline struct xdp_umem *xdp_get_umem_from_qid(struct net_device *dev, + u16 queue_id) +{ + return NULL; +} + +static inline char *xdp_umem_get_data(struct xdp_umem *umem, u64 addr) +{ + return NULL; +} + +static inline dma_addr_t xdp_umem_get_dma(struct xdp_umem *umem, u64 addr) +{ + return 0; +} + +static inline bool xsk_umem_has_addrs_rq(struct xdp_umem *umem, u32 cnt) +{ + return false; +} + +static inline u64 *xsk_umem_peek_addr_rq(struct xdp_umem *umem, u64 *addr) +{ + return NULL; +} + +static inline void xsk_umem_release_addr_rq(struct xdp_umem *umem) +{ +} + +static inline void xsk_umem_fq_reuse(struct xdp_umem *umem, u64 addr) +{ +} + +static inline void xsk_set_rx_need_wakeup(struct xdp_umem *umem) +{ +} + +static inline void xsk_set_tx_need_wakeup(struct xdp_umem *umem) +{ +} + +static inline void xsk_clear_rx_need_wakeup(struct xdp_umem *umem) +{ +} + +static inline void xsk_clear_tx_need_wakeup(struct xdp_umem *umem) +{ +} + +static inline bool xsk_umem_uses_need_wakeup(struct xdp_umem *umem) +{ + return false; +} + +static inline u64 xsk_umem_adjust_offset(struct xdp_umem *umem, u64 handle, + u64 offset) +{ + return 0; +} + +static inline u32 xsk_umem_xdp_frame_sz(struct xdp_umem *umem) +{ + return 0; +} + +#endif /* CONFIG_XDP_SOCKETS */ + +#endif /* _LINUX_XDP_SOCK_DRV_H */ diff --git a/net/ethtool/channels.c b/net/ethtool/channels.c index 389924b65d05..658a8580b464 100644 --- a/net/ethtool/channels.c +++ b/net/ethtool/channels.c @@ -1,6 +1,6 @@ // SPDX-License-Identifier: GPL-2.0-only -#include +#include #include "netlink.h" #include "common.h" diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c index 52102ab1709b..74892623bacd 100644 --- a/net/ethtool/ioctl.c +++ b/net/ethtool/ioctl.c @@ -24,7 +24,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/net/xdp/xdp_umem.h b/net/xdp/xdp_umem.h index a63a9fb251f5..32067fe98f65 100644 --- a/net/xdp/xdp_umem.h +++ b/net/xdp/xdp_umem.h @@ -6,7 +6,7 @@ #ifndef XDP_UMEM_H_ #define XDP_UMEM_H_ -#include +#include int xdp_umem_assign_dev(struct xdp_umem *umem, struct net_device *dev, u16 queue_id, u16 flags); diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c index 45ffd67b367d..8bda654e82ec 100644 --- a/net/xdp/xsk.c +++ b/net/xdp/xsk.c @@ -22,7 +22,7 @@ #include #include #include -#include +#include #include #include "xsk_queue.h" diff --git a/net/xdp/xsk_queue.c b/net/xdp/xsk_queue.c index 57fb81bd593c..554b1ebb4d02 100644 --- a/net/xdp/xsk_queue.c +++ b/net/xdp/xsk_queue.c @@ -6,6 +6,7 @@ #include #include #include +#include #include "xsk_queue.h" From patchwork Wed May 20 19:20:52 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= X-Patchwork-Id: 1294645 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=tEHMSmiq; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 49S2g94lx7z9sTC for ; Thu, 21 May 2020 05:21:41 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726905AbgETTVl (ORCPT ); Wed, 20 May 2020 15:21:41 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45670 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726560AbgETTVk (ORCPT ); Wed, 20 May 2020 15:21:40 -0400 Received: from mail-pl1-x642.google.com (mail-pl1-x642.google.com [IPv6:2607:f8b0:4864:20::642]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 75FEFC061A0E; Wed, 20 May 2020 12:21:39 -0700 (PDT) Received: by mail-pl1-x642.google.com with SMTP id w19so1736366ply.11; Wed, 20 May 2020 12:21:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=VonCmuB+xYklsPqBGzAOZb7n4020/T4yIH57DB5Cs9w=; b=tEHMSmiqK4NVAI8piWGijTRVmY9iSVVIS7VVG4ZRBa8nRLDyjAmwvWbmMmIz/mOq4m uUE52maNEWPsKgQsy3DRR3GELareAU/njqF/yd5pUk0g1yOK1b8VIs8KM4ch3WeR2J1y iZZgf8dhNB4xGi9nNKGDM5jsZfakf6/yUi0UxUE59GhQ2NmsjBzGsctejdxX7N0+lvRr ukko9+b/mSbGgkbbI/0Cpi5D65bzZtXHRNg5TLDEI/URDo88TJL5kC/t/Lo8U80uC5Ul j+g1X8rE2529sjVgBnV7TeXDdaCHaCophWohQIZcDZGvLPY4aQjqp+X2+M2UHTKdKq0y lNHw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=VonCmuB+xYklsPqBGzAOZb7n4020/T4yIH57DB5Cs9w=; b=hu10Nk+G0hZ6WWICRoeIaVmCEoTbUPqoH8y/4ZtzZJm7l8TAZ+UMZK4s3WLbL435Lv 4rAeiGJnu/Xd7jAFNvDBwuOfBd16ElGRNg/J0+1B7worS0yWod9TetfwCk3jgNxvsy/A Vpxb+AaVBu332oX8zbXbCkuInEnjXBxecL0YKg90NO9Kw5PGoxrJFx5dsghlLVhHLz1B vFHFl0aCx9IMbFzQp0JAI92aOtr65gISMnOa1SH4ZWq0bW9lRT1BJVxm91GGK30EAWxv HMNdqpiCPtiMhUC/IPD83UVjhnNPE3Hx3ptT3TdM4hfM1Dm+iYq0ZFcDTGtEHjPpUgSz CjCQ== X-Gm-Message-State: AOAM5315LWMuZj0bXTUkN5usD3x7ZHt9xwOYKeX6VQJ4sUAKXNs3udRb ra+5wrOW0MVvrnEBY+81Y9A= X-Google-Smtp-Source: ABdhPJzG1aqPy5irzrkNKu6YcK4H7eFqXH78aiQnQ2f6AJgqHmg6B/NsvpVMlazrXtT623oN8fN2sg== X-Received: by 2002:a17:902:7787:: with SMTP id o7mr1219689pll.52.1590002498966; Wed, 20 May 2020 12:21:38 -0700 (PDT) Received: from btopel-mobl.ger.intel.com (fmdmzpr03-ext.fm.intel.com. [192.55.54.38]) by smtp.gmail.com with ESMTPSA id 62sm2762424pfc.204.2020.05.20.12.21.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 May 2020 12:21:38 -0700 (PDT) From: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= To: ast@kernel.org, daniel@iogearbox.net, davem@davemloft.net, kuba@kernel.org, hawk@kernel.org, john.fastabend@gmail.com, netdev@vger.kernel.org, bpf@vger.kernel.org, magnus.karlsson@intel.com, jonathan.lemon@gmail.com, jeffrey.t.kirsher@intel.com Cc: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , maximmi@mellanox.com, maciej.fijalkowski@intel.com Subject: [PATCH bpf-next v5 04/15] xsk: move defines only used by AF_XDP internals to xsk.h Date: Wed, 20 May 2020 21:20:52 +0200 Message-Id: <20200520192103.355233-5-bjorn.topel@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200520192103.355233-1-bjorn.topel@gmail.com> References: <20200520192103.355233-1-bjorn.topel@gmail.com> MIME-Version: 1.0 Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org From: Björn Töpel Move the XSK_NEXT_PG_CONTIG_{MASK,SHIFT}, and XDP_UMEM_USES_NEED_WAKEUP defines from xdp_sock.h to the AF_XDP internal xsk.h file. Also, start using the BIT{,_ULL} macro instead of explicit shifts. Signed-off-by: Björn Töpel --- include/net/xdp_sock.h | 14 -------------- net/xdp/xsk.h | 14 ++++++++++++++ net/xdp/xsk_queue.h | 2 ++ 3 files changed, 16 insertions(+), 14 deletions(-) diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h index 6a986dcbc336..fb7fe3060175 100644 --- a/include/net/xdp_sock.h +++ b/include/net/xdp_sock.h @@ -17,13 +17,6 @@ struct net_device; struct xsk_queue; struct xdp_buff; -/* Masks for xdp_umem_page flags. - * The low 12-bits of the addr will be 0 since this is the page address, so we - * can use them for flags. - */ -#define XSK_NEXT_PG_CONTIG_SHIFT 0 -#define XSK_NEXT_PG_CONTIG_MASK (1ULL << XSK_NEXT_PG_CONTIG_SHIFT) - struct xdp_umem_page { void *addr; dma_addr_t dma; @@ -35,13 +28,6 @@ struct xdp_umem_fq_reuse { u64 handles[]; }; -/* Flags for the umem flags field. - * - * The NEED_WAKEUP flag is 1 due to the reuse of the flags field for public - * flags. See inlude/uapi/include/linux/if_xdp.h. - */ -#define XDP_UMEM_USES_NEED_WAKEUP (1 << 1) - struct xdp_umem { struct xsk_queue *fq; struct xsk_queue *cq; diff --git a/net/xdp/xsk.h b/net/xdp/xsk.h index d6a0979050e6..455ddd480f3d 100644 --- a/net/xdp/xsk.h +++ b/net/xdp/xsk.h @@ -4,6 +4,20 @@ #ifndef XSK_H_ #define XSK_H_ +/* Masks for xdp_umem_page flags. + * The low 12-bits of the addr will be 0 since this is the page address, so we + * can use them for flags. + */ +#define XSK_NEXT_PG_CONTIG_SHIFT 0 +#define XSK_NEXT_PG_CONTIG_MASK BIT_ULL(XSK_NEXT_PG_CONTIG_SHIFT) + +/* Flags for the umem flags field. + * + * The NEED_WAKEUP flag is 1 due to the reuse of the flags field for public + * flags. See inlude/uapi/include/linux/if_xdp.h. + */ +#define XDP_UMEM_USES_NEED_WAKEUP BIT(1) + struct xdp_ring_offset_v1 { __u64 producer; __u64 consumer; diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h index 648733ec24ac..a322a7dac58c 100644 --- a/net/xdp/xsk_queue.h +++ b/net/xdp/xsk_queue.h @@ -10,6 +10,8 @@ #include #include +#include "xsk.h" + struct xdp_ring { u32 producer ____cacheline_aligned_in_smp; u32 consumer ____cacheline_aligned_in_smp; From patchwork Wed May 20 19:20:53 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= X-Patchwork-Id: 1294647 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=G8UYsMsu; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 49S2gH2LKrz9sSF for ; Thu, 21 May 2020 05:21:47 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726939AbgETTVq (ORCPT ); Wed, 20 May 2020 15:21:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45686 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726560AbgETTVp (ORCPT ); Wed, 20 May 2020 15:21:45 -0400 Received: from mail-pj1-x1041.google.com (mail-pj1-x1041.google.com [IPv6:2607:f8b0:4864:20::1041]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C35DAC061A0E; Wed, 20 May 2020 12:21:45 -0700 (PDT) Received: by mail-pj1-x1041.google.com with SMTP id s69so1769316pjb.4; Wed, 20 May 2020 12:21:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=131VnH5t6YRJv1BNbxog77k4I8ddoOTfiMRNbY7oPMQ=; b=G8UYsMsu2Nxj9wNcrGxkTWx67JB2Pwr44QDN1oUGi4dvEcbyGh+2MtbAVmB33v0LJs GLnuNKnzIfJcVgXad2pYDfbN2ACTCc3bWTzmZZb+PHGfl2ObD8m3qH4HbyYpOJ+faRRn JaZZG/iIXr8ToPgLPuhAxZ3sP/y/wWXF6C04p86ykSvl9zupqmXL8pULcmX9zywXVjDf OLN3b8+PiGZiRo4lOAmjC3vRuW2/Fj2OKf9bIPW/eHElkp3cM3tZqRHk5wzOf2U2HSob /dDmNxiHW47vYv2nLm6ljG3fU8mTrhnfJnWQb/eiDeYwg6wZwbL42psNXc4mYDJseyuZ Qgsg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=131VnH5t6YRJv1BNbxog77k4I8ddoOTfiMRNbY7oPMQ=; b=Xu2mNZTBoTWfmnjA+IYg1/+SNL4aG7NyEZjbq/NBfNc2ByGMLUgeVdljSxzFoeSy2Q YCH8LcWch61enUE66DeVzk6my04Pt09uLu5PyDHhLwZpD+AjFCeW/y12YtOdBL8S1i2i n/dFyZq3U+W/oY8JtPyAaOJboBjaWyhKC/8Xar5DGxjUx3Wg9+SeOaF4ZblqeP4QQ8fX x+x08X/9CToXvt3JF4r5CHovxsdwTYLim7+7lfgaiO3THUan5wNvS1wrip/vwAakIpIu nQv/pvU1UDCDJsr+DptbEVdLcRef3aOBcd0bJ/wnWEpXW+rRFKKxgy9y+YTv0a4Q+6aa lUag== X-Gm-Message-State: AOAM533cx8DgZC5yLbFN2RGHPbxWWgVetXChmscOzaSC5Daj8uXno16P qcbMhjRLkl3nKw3zkqAHhAc= X-Google-Smtp-Source: ABdhPJzfLIYZM7P6voKw/Mc4mBTWm5yH+HBN0TWm4RSJVDsVcqB3fwB9csA1C37waP4F0aHPraqCkA== X-Received: by 2002:a17:902:bd42:: with SMTP id b2mr5737987plx.219.1590002504985; Wed, 20 May 2020 12:21:44 -0700 (PDT) Received: from btopel-mobl.ger.intel.com (fmdmzpr03-ext.fm.intel.com. [192.55.54.38]) by smtp.gmail.com with ESMTPSA id 62sm2762424pfc.204.2020.05.20.12.21.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 May 2020 12:21:44 -0700 (PDT) From: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= To: ast@kernel.org, daniel@iogearbox.net, davem@davemloft.net, kuba@kernel.org, hawk@kernel.org, john.fastabend@gmail.com, netdev@vger.kernel.org, bpf@vger.kernel.org, magnus.karlsson@intel.com, jonathan.lemon@gmail.com, jeffrey.t.kirsher@intel.com Cc: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , maximmi@mellanox.com, maciej.fijalkowski@intel.com Subject: [PATCH bpf-next v5 05/15] xsk: introduce AF_XDP buffer allocation API Date: Wed, 20 May 2020 21:20:53 +0200 Message-Id: <20200520192103.355233-6-bjorn.topel@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200520192103.355233-1-bjorn.topel@gmail.com> References: <20200520192103.355233-1-bjorn.topel@gmail.com> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Björn Töpel In order to simplify AF_XDP zero-copy enablement for NIC driver developers, a new AF_XDP buffer allocation API is added. The implementation is based on a single core (single producer/consumer) buffer pool for the AF_XDP UMEM. A buffer is allocated using the xsk_buff_alloc() function, and returned using xsk_buff_free(). If a buffer is disassociated with the pool, e.g. when a buffer is passed to an AF_XDP socket, a buffer is said to be released. Currently, the release function is only used by the AF_XDP internals and not visible to the driver. Drivers using this API should register the XDP memory model with the new MEM_TYPE_XSK_BUFF_POOL type. The API is defined in net/xdp_sock_drv.h. The buffer type is struct xdp_buff, and follows the lifetime of regular xdp_buffs, i.e. the lifetime of an xdp_buff is restricted to a NAPI context. In other words, the API is not replacing xdp_frames. In addition to introducing the API and implementations, the AF_XDP core is migrated to use the new APIs. rfc->v1: Fixed build errors/warnings for m68k and riscv. (kbuild test robot) Added headroom/chunk size getter. (Maxim/Björn) v1->v2: Swapped SoBs. (Maxim) v2->v3: Initialize struct xdp_buff member frame_sz. (Björn) Add API to query the DMA address of a frame. (Maxim) Do DMA sync for CPU till the end of the frame to handle possible growth (frame_sz). (Maxim) Signed-off-by: Björn Töpel Signed-off-by: Maxim Mikityanskiy --- include/net/xdp.h | 4 +- include/net/xdp_sock.h | 2 + include/net/xdp_sock_drv.h | 164 +++++++++++++ include/net/xsk_buff_pool.h | 56 +++++ include/trace/events/xdp.h | 3 +- net/core/xdp.c | 14 +- net/xdp/Makefile | 1 + net/xdp/xdp_umem.c | 19 +- net/xdp/xsk.c | 147 +++++------- net/xdp/xsk_buff_pool.c | 467 ++++++++++++++++++++++++++++++++++++ net/xdp/xsk_diag.c | 2 +- net/xdp/xsk_queue.h | 59 +++-- 12 files changed, 819 insertions(+), 119 deletions(-) create mode 100644 include/net/xsk_buff_pool.h create mode 100644 net/xdp/xsk_buff_pool.c diff --git a/include/net/xdp.h b/include/net/xdp.h index 3094fccf5a88..f432134c7c00 100644 --- a/include/net/xdp.h +++ b/include/net/xdp.h @@ -40,6 +40,7 @@ enum xdp_mem_type { MEM_TYPE_PAGE_ORDER0, /* Orig XDP full page model */ MEM_TYPE_PAGE_POOL, MEM_TYPE_ZERO_COPY, + MEM_TYPE_XSK_BUFF_POOL, MEM_TYPE_MAX, }; @@ -119,7 +120,8 @@ struct xdp_frame *convert_to_xdp_frame(struct xdp_buff *xdp) int metasize; int headroom; - if (xdp->rxq->mem.type == MEM_TYPE_ZERO_COPY) + if (xdp->rxq->mem.type == MEM_TYPE_ZERO_COPY || + xdp->rxq->mem.type == MEM_TYPE_XSK_BUFF_POOL) return xdp_convert_zc_to_xdp_frame(xdp); /* Assure headroom is available for storing info */ diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h index fb7fe3060175..6e7265f63c04 100644 --- a/include/net/xdp_sock.h +++ b/include/net/xdp_sock.h @@ -31,11 +31,13 @@ struct xdp_umem_fq_reuse { struct xdp_umem { struct xsk_queue *fq; struct xsk_queue *cq; + struct xsk_buff_pool *pool; struct xdp_umem_page *pages; u64 chunk_mask; u64 size; u32 headroom; u32 chunk_size_nohr; + u32 chunk_size; struct user_struct *user; refcount_t users; struct work_struct work; diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h index d67f2361937a..7752c8663d1b 100644 --- a/include/net/xdp_sock_drv.h +++ b/include/net/xdp_sock_drv.h @@ -7,6 +7,7 @@ #define _LINUX_XDP_SOCK_DRV_H #include +#include #ifdef CONFIG_XDP_SOCKETS @@ -101,6 +102,94 @@ static inline u32 xsk_umem_xdp_frame_sz(struct xdp_umem *umem) return umem->chunk_size_nohr; } +static inline u32 xsk_umem_get_headroom(struct xdp_umem *umem) +{ + return XDP_PACKET_HEADROOM + umem->headroom; +} + +static inline u32 xsk_umem_get_chunk_size(struct xdp_umem *umem) +{ + return umem->chunk_size; +} + +static inline u32 xsk_umem_get_rx_frame_size(struct xdp_umem *umem) +{ + return xsk_umem_get_chunk_size(umem) - xsk_umem_get_headroom(umem); +} + +static inline void xsk_buff_set_rxq_info(struct xdp_umem *umem, + struct xdp_rxq_info *rxq) +{ + xp_set_rxq_info(umem->pool, rxq); +} + +static inline void xsk_buff_dma_unmap(struct xdp_umem *umem, + unsigned long attrs) +{ + xp_dma_unmap(umem->pool, attrs); +} + +static inline int xsk_buff_dma_map(struct xdp_umem *umem, struct device *dev, + unsigned long attrs) +{ + return xp_dma_map(umem->pool, dev, attrs, umem->pgs, umem->npgs); +} + +static inline dma_addr_t xsk_buff_xdp_get_dma(struct xdp_buff *xdp) +{ + struct xdp_buff_xsk *xskb = container_of(xdp, struct xdp_buff_xsk, xdp); + + return xp_get_dma(xskb); +} + +static inline dma_addr_t xsk_buff_xdp_get_frame_dma(struct xdp_buff *xdp) +{ + struct xdp_buff_xsk *xskb = container_of(xdp, struct xdp_buff_xsk, xdp); + + return xp_get_frame_dma(xskb); +} + +static inline struct xdp_buff *xsk_buff_alloc(struct xdp_umem *umem) +{ + return xp_alloc(umem->pool); +} + +static inline bool xsk_buff_can_alloc(struct xdp_umem *umem, u32 count) +{ + return xp_can_alloc(umem->pool, count); +} + +static inline void xsk_buff_free(struct xdp_buff *xdp) +{ + struct xdp_buff_xsk *xskb = container_of(xdp, struct xdp_buff_xsk, xdp); + + xp_free(xskb); +} + +static inline dma_addr_t xsk_buff_raw_get_dma(struct xdp_umem *umem, u64 addr) +{ + return xp_raw_get_dma(umem->pool, addr); +} + +static inline void *xsk_buff_raw_get_data(struct xdp_umem *umem, u64 addr) +{ + return xp_raw_get_data(umem->pool, addr); +} + +static inline void xsk_buff_dma_sync_for_cpu(struct xdp_buff *xdp) +{ + struct xdp_buff_xsk *xskb = container_of(xdp, struct xdp_buff_xsk, xdp); + + xp_dma_sync_for_cpu(xskb); +} + +static inline void xsk_buff_raw_dma_sync_for_device(struct xdp_umem *umem, + dma_addr_t dma, + size_t size) +{ + xp_dma_sync_for_device(umem->pool, dma, size); +} + #else static inline bool xsk_umem_has_addrs(struct xdp_umem *umem, u32 cnt) @@ -212,6 +301,81 @@ static inline u32 xsk_umem_xdp_frame_sz(struct xdp_umem *umem) return 0; } +static inline u32 xsk_umem_get_headroom(struct xdp_umem *umem) +{ + return 0; +} + +static inline u32 xsk_umem_get_chunk_size(struct xdp_umem *umem) +{ + return 0; +} + +static inline u32 xsk_umem_get_rx_frame_size(struct xdp_umem *umem) +{ + return 0; +} + +static inline void xsk_buff_set_rxq_info(struct xdp_umem *umem, + struct xdp_rxq_info *rxq) +{ +} + +static inline void xsk_buff_dma_unmap(struct xdp_umem *umem, + unsigned long attrs) +{ +} + +static inline int xsk_buff_dma_map(struct xdp_umem *umem, struct device *dev, + unsigned long attrs) +{ + return 0; +} + +static inline dma_addr_t xsk_buff_xdp_get_dma(struct xdp_buff *xdp) +{ + return 0; +} + +static inline dma_addr_t xsk_buff_xdp_get_frame_dma(struct xdp_buff *xdp) +{ + return 0; +} + +static inline struct xdp_buff *xsk_buff_alloc(struct xdp_umem *umem) +{ + return NULL; +} + +static inline bool xsk_buff_can_alloc(struct xdp_umem *umem, u32 count) +{ + return false; +} + +static inline void xsk_buff_free(struct xdp_buff *xdp) +{ +} + +static inline dma_addr_t xsk_buff_raw_get_dma(struct xdp_umem *umem, u64 addr) +{ + return 0; +} + +static inline void *xsk_buff_raw_get_data(struct xdp_umem *umem, u64 addr) +{ + return NULL; +} + +static inline void xsk_buff_dma_sync_for_cpu(struct xdp_buff *xdp) +{ +} + +static inline void xsk_buff_raw_dma_sync_for_device(struct xdp_umem *umem, + dma_addr_t dma, + size_t size) +{ +} + #endif /* CONFIG_XDP_SOCKETS */ #endif /* _LINUX_XDP_SOCK_DRV_H */ diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h new file mode 100644 index 000000000000..9f221b36e405 --- /dev/null +++ b/include/net/xsk_buff_pool.h @@ -0,0 +1,56 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Copyright(c) 2020 Intel Corporation. */ + +#ifndef XSK_BUFF_POOL_H_ +#define XSK_BUFF_POOL_H_ + +#include +#include +#include + +struct xsk_buff_pool; +struct xdp_rxq_info; +struct xsk_queue; +struct xdp_desc; +struct device; +struct page; + +struct xdp_buff_xsk { + struct xdp_buff xdp; + dma_addr_t dma; + dma_addr_t frame_dma; + struct xsk_buff_pool *pool; + bool unaligned; + u64 orig_addr; + struct list_head free_list_node; +}; + +/* AF_XDP core. */ +struct xsk_buff_pool *xp_create(struct page **pages, u32 nr_pages, u32 chunks, + u32 chunk_size, u32 headroom, u64 size, + bool unaligned); +void xp_set_fq(struct xsk_buff_pool *pool, struct xsk_queue *fq); +void xp_destroy(struct xsk_buff_pool *pool); +void xp_release(struct xdp_buff_xsk *xskb); +u64 xp_get_handle(struct xdp_buff_xsk *xskb); +bool xp_validate_desc(struct xsk_buff_pool *pool, struct xdp_desc *desc); + +/* AF_XDP, and XDP core. */ +void xp_free(struct xdp_buff_xsk *xskb); + +/* AF_XDP ZC drivers, via xdp_sock_buff.h */ +void xp_set_rxq_info(struct xsk_buff_pool *pool, struct xdp_rxq_info *rxq); +int xp_dma_map(struct xsk_buff_pool *pool, struct device *dev, + unsigned long attrs, struct page **pages, u32 nr_pages); +void xp_dma_unmap(struct xsk_buff_pool *pool, unsigned long attrs); +struct xdp_buff *xp_alloc(struct xsk_buff_pool *pool); +bool xp_can_alloc(struct xsk_buff_pool *pool, u32 count); +void *xp_raw_get_data(struct xsk_buff_pool *pool, u64 addr); +dma_addr_t xp_raw_get_dma(struct xsk_buff_pool *pool, u64 addr); +dma_addr_t xp_get_dma(struct xdp_buff_xsk *xskb); +dma_addr_t xp_get_frame_dma(struct xdp_buff_xsk *xskb); +void xp_dma_sync_for_cpu(struct xdp_buff_xsk *xskb); +void xp_dma_sync_for_device(struct xsk_buff_pool *pool, dma_addr_t dma, + size_t size); + +#endif /* XSK_BUFF_POOL_H_ */ diff --git a/include/trace/events/xdp.h b/include/trace/events/xdp.h index b95d65e8c628..48547a12fa27 100644 --- a/include/trace/events/xdp.h +++ b/include/trace/events/xdp.h @@ -287,7 +287,8 @@ TRACE_EVENT(xdp_devmap_xmit, FN(PAGE_SHARED) \ FN(PAGE_ORDER0) \ FN(PAGE_POOL) \ - FN(ZERO_COPY) + FN(ZERO_COPY) \ + FN(XSK_BUFF_POOL) #define __MEM_TYPE_TP_FN(x) \ TRACE_DEFINE_ENUM(MEM_TYPE_##x); diff --git a/net/core/xdp.c b/net/core/xdp.c index 490b8f5fa8ee..f0ce8b195193 100644 --- a/net/core/xdp.c +++ b/net/core/xdp.c @@ -17,6 +17,7 @@ #include #include /* struct xdp_mem_allocator */ #include +#include #define REG_STATE_NEW 0x0 #define REG_STATE_REGISTERED 0x1 @@ -361,7 +362,7 @@ EXPORT_SYMBOL_GPL(xdp_rxq_info_reg_mem_model); * of xdp_frames/pages in those cases. */ static void __xdp_return(void *data, struct xdp_mem_info *mem, bool napi_direct, - unsigned long handle) + unsigned long handle, struct xdp_buff *xdp) { struct xdp_mem_allocator *xa; struct page *page; @@ -390,6 +391,11 @@ static void __xdp_return(void *data, struct xdp_mem_info *mem, bool napi_direct, xa = rhashtable_lookup(mem_id_ht, &mem->id, mem_id_rht_params); xa->zc_alloc->free(xa->zc_alloc, handle); rcu_read_unlock(); + break; + case MEM_TYPE_XSK_BUFF_POOL: + /* NB! Only valid from an xdp_buff! */ + xsk_buff_free(xdp); + break; default: /* Not possible, checked in xdp_rxq_info_reg_mem_model() */ break; @@ -398,19 +404,19 @@ static void __xdp_return(void *data, struct xdp_mem_info *mem, bool napi_direct, void xdp_return_frame(struct xdp_frame *xdpf) { - __xdp_return(xdpf->data, &xdpf->mem, false, 0); + __xdp_return(xdpf->data, &xdpf->mem, false, 0, NULL); } EXPORT_SYMBOL_GPL(xdp_return_frame); void xdp_return_frame_rx_napi(struct xdp_frame *xdpf) { - __xdp_return(xdpf->data, &xdpf->mem, true, 0); + __xdp_return(xdpf->data, &xdpf->mem, true, 0, NULL); } EXPORT_SYMBOL_GPL(xdp_return_frame_rx_napi); void xdp_return_buff(struct xdp_buff *xdp) { - __xdp_return(xdp->data, &xdp->rxq->mem, true, xdp->handle); + __xdp_return(xdp->data, &xdp->rxq->mem, true, xdp->handle, xdp); } EXPORT_SYMBOL_GPL(xdp_return_buff); diff --git a/net/xdp/Makefile b/net/xdp/Makefile index 90b5460d6166..30cdc4315f42 100644 --- a/net/xdp/Makefile +++ b/net/xdp/Makefile @@ -1,3 +1,4 @@ # SPDX-License-Identifier: GPL-2.0-only obj-$(CONFIG_XDP_SOCKETS) += xsk.o xdp_umem.o xsk_queue.o xskmap.o +obj-$(CONFIG_XDP_SOCKETS) += xsk_buff_pool.o obj-$(CONFIG_XDP_SOCKETS_DIAG) += xsk_diag.o diff --git a/net/xdp/xdp_umem.c b/net/xdp/xdp_umem.c index 37ace3bc0d48..7f04688045d5 100644 --- a/net/xdp/xdp_umem.c +++ b/net/xdp/xdp_umem.c @@ -245,7 +245,7 @@ static void xdp_umem_release(struct xdp_umem *umem) } xsk_reuseq_destroy(umem); - + xp_destroy(umem->pool); xdp_umem_unmap_pages(umem); xdp_umem_unpin_pages(umem); @@ -390,6 +390,7 @@ static int xdp_umem_reg(struct xdp_umem *umem, struct xdp_umem_reg *mr) umem->size = size; umem->headroom = headroom; umem->chunk_size_nohr = chunk_size - headroom; + umem->chunk_size = chunk_size; umem->npgs = size / PAGE_SIZE; umem->pgs = NULL; umem->user = NULL; @@ -415,11 +416,21 @@ static int xdp_umem_reg(struct xdp_umem *umem, struct xdp_umem_reg *mr) } err = xdp_umem_map_pages(umem); - if (!err) - return 0; + if (err) + goto out_pages; - kvfree(umem->pages); + umem->pool = xp_create(umem->pgs, umem->npgs, chunks, chunk_size, + headroom, size, unaligned_chunks); + if (!umem->pool) { + err = -ENOMEM; + goto out_unmap; + } + return 0; +out_unmap: + xdp_umem_unmap_pages(umem); +out_pages: + kvfree(umem->pages); out_pin: xdp_umem_unpin_pages(umem); out_account: diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c index 8bda654e82ec..6933f0d494ba 100644 --- a/net/xdp/xsk.c +++ b/net/xdp/xsk.c @@ -117,76 +117,67 @@ bool xsk_umem_uses_need_wakeup(struct xdp_umem *umem) } EXPORT_SYMBOL(xsk_umem_uses_need_wakeup); -/* If a buffer crosses a page boundary, we need to do 2 memcpy's, one for - * each page. This is only required in copy mode. - */ -static void __xsk_rcv_memcpy(struct xdp_umem *umem, u64 addr, void *from_buf, - u32 len, u32 metalen) +static int __xsk_rcv_zc(struct xdp_sock *xs, struct xdp_buff *xdp, u32 len) { - void *to_buf = xdp_umem_get_data(umem, addr); - - addr = xsk_umem_add_offset_to_addr(addr); - if (xskq_cons_crosses_non_contig_pg(umem, addr, len + metalen)) { - void *next_pg_addr = umem->pages[(addr >> PAGE_SHIFT) + 1].addr; - u64 page_start = addr & ~(PAGE_SIZE - 1); - u64 first_len = PAGE_SIZE - (addr - page_start); - - memcpy(to_buf, from_buf, first_len); - memcpy(next_pg_addr, from_buf + first_len, - len + metalen - first_len); + struct xdp_buff_xsk *xskb = container_of(xdp, struct xdp_buff_xsk, xdp); + u64 addr; + int err; - return; + addr = xp_get_handle(xskb); + err = xskq_prod_reserve_desc(xs->rx, addr, len); + if (err) { + xs->rx_dropped++; + return err; } - memcpy(to_buf, from_buf, len + metalen); + xp_release(xskb); + return 0; } -static int __xsk_rcv(struct xdp_sock *xs, struct xdp_buff *xdp, u32 len) +static void xsk_copy_xdp(struct xdp_buff *to, struct xdp_buff *from, u32 len) { - u64 offset = xs->umem->headroom; - u64 addr, memcpy_addr; - void *from_buf; + void *from_buf, *to_buf; u32 metalen; - int err; - - if (!xskq_cons_peek_addr(xs->umem->fq, &addr, xs->umem) || - len > xs->umem->chunk_size_nohr - XDP_PACKET_HEADROOM) { - xs->rx_dropped++; - return -ENOSPC; - } - if (unlikely(xdp_data_meta_unsupported(xdp))) { - from_buf = xdp->data; + if (unlikely(xdp_data_meta_unsupported(from))) { + from_buf = from->data; + to_buf = to->data; metalen = 0; } else { - from_buf = xdp->data_meta; - metalen = xdp->data - xdp->data_meta; + from_buf = from->data_meta; + metalen = from->data - from->data_meta; + to_buf = to->data - metalen; } - memcpy_addr = xsk_umem_adjust_offset(xs->umem, addr, offset); - __xsk_rcv_memcpy(xs->umem, memcpy_addr, from_buf, len, metalen); - - offset += metalen; - addr = xsk_umem_adjust_offset(xs->umem, addr, offset); - err = xskq_prod_reserve_desc(xs->rx, addr, len); - if (!err) { - xskq_cons_release(xs->umem->fq); - xdp_return_buff(xdp); - return 0; - } - - xs->rx_dropped++; - return err; + memcpy(to_buf, from_buf, len + metalen); } -static int __xsk_rcv_zc(struct xdp_sock *xs, struct xdp_buff *xdp, u32 len) +static int __xsk_rcv(struct xdp_sock *xs, struct xdp_buff *xdp, u32 len, + bool explicit_free) { - int err = xskq_prod_reserve_desc(xs->rx, xdp->handle, len); + struct xdp_buff *xsk_xdp; + int err; - if (err) + if (len > xsk_umem_get_rx_frame_size(xs->umem)) { + xs->rx_dropped++; + return -ENOSPC; + } + + xsk_xdp = xsk_buff_alloc(xs->umem); + if (!xsk_xdp) { xs->rx_dropped++; + return -ENOSPC; + } - return err; + xsk_copy_xdp(xsk_xdp, xdp, len); + err = __xsk_rcv_zc(xs, xsk_xdp, len); + if (err) { + xsk_buff_free(xsk_xdp); + return err; + } + if (explicit_free) + xdp_return_buff(xdp); + return 0; } static bool xsk_is_bound(struct xdp_sock *xs) @@ -199,7 +190,8 @@ static bool xsk_is_bound(struct xdp_sock *xs) return false; } -static int xsk_rcv(struct xdp_sock *xs, struct xdp_buff *xdp) +static int xsk_rcv(struct xdp_sock *xs, struct xdp_buff *xdp, + bool explicit_free) { u32 len; @@ -211,8 +203,10 @@ static int xsk_rcv(struct xdp_sock *xs, struct xdp_buff *xdp) len = xdp->data_end - xdp->data; - return (xdp->rxq->mem.type == MEM_TYPE_ZERO_COPY) ? - __xsk_rcv_zc(xs, xdp, len) : __xsk_rcv(xs, xdp, len); + return xdp->rxq->mem.type == MEM_TYPE_ZERO_COPY || + xdp->rxq->mem.type == MEM_TYPE_XSK_BUFF_POOL ? + __xsk_rcv_zc(xs, xdp, len) : + __xsk_rcv(xs, xdp, len, explicit_free); } static void xsk_flush(struct xdp_sock *xs) @@ -224,46 +218,11 @@ static void xsk_flush(struct xdp_sock *xs) int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp) { - u32 metalen = xdp->data - xdp->data_meta; - u32 len = xdp->data_end - xdp->data; - u64 offset = xs->umem->headroom; - void *buffer; - u64 addr; int err; spin_lock_bh(&xs->rx_lock); - - if (xs->dev != xdp->rxq->dev || xs->queue_id != xdp->rxq->queue_index) { - err = -EINVAL; - goto out_unlock; - } - - if (!xskq_cons_peek_addr(xs->umem->fq, &addr, xs->umem) || - len > xs->umem->chunk_size_nohr - XDP_PACKET_HEADROOM) { - err = -ENOSPC; - goto out_drop; - } - - addr = xsk_umem_adjust_offset(xs->umem, addr, offset); - buffer = xdp_umem_get_data(xs->umem, addr); - memcpy(buffer, xdp->data_meta, len + metalen); - - addr = xsk_umem_adjust_offset(xs->umem, addr, metalen); - err = xskq_prod_reserve_desc(xs->rx, addr, len); - if (err) - goto out_drop; - - xskq_cons_release(xs->umem->fq); - xskq_prod_submit(xs->rx); - - spin_unlock_bh(&xs->rx_lock); - - xs->sk.sk_data_ready(&xs->sk); - return 0; - -out_drop: - xs->rx_dropped++; -out_unlock: + err = xsk_rcv(xs, xdp, false); + xsk_flush(xs); spin_unlock_bh(&xs->rx_lock); return err; } @@ -273,7 +232,7 @@ int __xsk_map_redirect(struct xdp_sock *xs, struct xdp_buff *xdp) struct list_head *flush_list = this_cpu_ptr(&xskmap_flush_list); int err; - err = xsk_rcv(xs, xdp); + err = xsk_rcv(xs, xdp, true); if (err) return err; @@ -404,7 +363,7 @@ static int xsk_generic_xmit(struct sock *sk) skb_put(skb, len); addr = desc.addr; - buffer = xdp_umem_get_data(xs->umem, addr); + buffer = xsk_buff_raw_get_data(xs->umem, addr); err = skb_store_bits(skb, 0, buffer, len); /* This is the backpressure mechanism for the Tx path. * Reserve space in the completion queue and only proceed @@ -860,6 +819,8 @@ static int xsk_setsockopt(struct socket *sock, int level, int optname, q = (optname == XDP_UMEM_FILL_RING) ? &xs->umem->fq : &xs->umem->cq; err = xsk_init_queue(entries, q, true); + if (optname == XDP_UMEM_FILL_RING) + xp_set_fq(xs->umem->pool, *q); mutex_unlock(&xs->mutex); return err; } diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c new file mode 100644 index 000000000000..e214a5795a62 --- /dev/null +++ b/net/xdp/xsk_buff_pool.c @@ -0,0 +1,467 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include +#include +#include +#include + +#include "xsk_queue.h" + +struct xsk_buff_pool { + struct xsk_queue *fq; + struct list_head free_list; + dma_addr_t *dma_pages; + struct xdp_buff_xsk *heads; + u64 chunk_mask; + u64 addrs_cnt; + u32 free_list_cnt; + u32 dma_pages_cnt; + u32 heads_cnt; + u32 free_heads_cnt; + u32 headroom; + u32 chunk_size; + u32 frame_len; + bool cheap_dma; + bool unaligned; + void *addrs; + struct device *dev; + struct xdp_buff_xsk *free_heads[]; +}; + +static void xp_addr_unmap(struct xsk_buff_pool *pool) +{ + vunmap(pool->addrs); +} + +static int xp_addr_map(struct xsk_buff_pool *pool, + struct page **pages, u32 nr_pages) +{ + pool->addrs = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL); + if (!pool->addrs) + return -ENOMEM; + return 0; +} + +void xp_destroy(struct xsk_buff_pool *pool) +{ + if (!pool) + return; + + xp_addr_unmap(pool); + kvfree(pool->heads); + kvfree(pool); +} + +struct xsk_buff_pool *xp_create(struct page **pages, u32 nr_pages, u32 chunks, + u32 chunk_size, u32 headroom, u64 size, + bool unaligned) +{ + struct xsk_buff_pool *pool; + struct xdp_buff_xsk *xskb; + int err; + u32 i; + + pool = kvzalloc(struct_size(pool, free_heads, chunks), GFP_KERNEL); + if (!pool) + goto out; + + pool->heads = kvcalloc(chunks, sizeof(*pool->heads), GFP_KERNEL); + if (!pool->heads) + goto out; + + pool->chunk_mask = ~((u64)chunk_size - 1); + pool->addrs_cnt = size; + pool->heads_cnt = chunks; + pool->free_heads_cnt = chunks; + pool->headroom = headroom; + pool->chunk_size = chunk_size; + pool->cheap_dma = true; + pool->unaligned = unaligned; + pool->frame_len = chunk_size - headroom - XDP_PACKET_HEADROOM; + INIT_LIST_HEAD(&pool->free_list); + + for (i = 0; i < pool->free_heads_cnt; i++) { + xskb = &pool->heads[i]; + xskb->pool = pool; + xskb->xdp.frame_sz = chunk_size - headroom; + pool->free_heads[i] = xskb; + } + + err = xp_addr_map(pool, pages, nr_pages); + if (!err) + return pool; + +out: + xp_destroy(pool); + return NULL; +} + +void xp_set_fq(struct xsk_buff_pool *pool, struct xsk_queue *fq) +{ + pool->fq = fq; +} + +void xp_set_rxq_info(struct xsk_buff_pool *pool, struct xdp_rxq_info *rxq) +{ + u32 i; + + for (i = 0; i < pool->heads_cnt; i++) + pool->heads[i].xdp.rxq = rxq; +} +EXPORT_SYMBOL(xp_set_rxq_info); + +void xp_dma_unmap(struct xsk_buff_pool *pool, unsigned long attrs) +{ + dma_addr_t *dma; + u32 i; + + if (pool->dma_pages_cnt == 0) + return; + + for (i = 0; i < pool->dma_pages_cnt; i++) { + dma = &pool->dma_pages[i]; + if (*dma) { + dma_unmap_page_attrs(pool->dev, *dma, PAGE_SIZE, + DMA_BIDIRECTIONAL, attrs); + *dma = 0; + } + } + + kvfree(pool->dma_pages); + pool->dma_pages_cnt = 0; + pool->dev = NULL; +} +EXPORT_SYMBOL(xp_dma_unmap); + +static void xp_check_dma_contiguity(struct xsk_buff_pool *pool) +{ + u32 i; + + for (i = 0; i < pool->dma_pages_cnt - 1; i++) { + if (pool->dma_pages[i] + PAGE_SIZE == pool->dma_pages[i + 1]) + pool->dma_pages[i] |= XSK_NEXT_PG_CONTIG_MASK; + else + pool->dma_pages[i] &= ~XSK_NEXT_PG_CONTIG_MASK; + } +} + +static bool __maybe_unused xp_check_swiotlb_dma(struct xsk_buff_pool *pool) +{ +#if defined(CONFIG_SWIOTLB) + phys_addr_t paddr; + u32 i; + + for (i = 0; i < pool->dma_pages_cnt; i++) { + paddr = dma_to_phys(pool->dev, pool->dma_pages[i]); + if (is_swiotlb_buffer(paddr)) + return false; + } +#endif + return true; +} + +static bool xp_check_cheap_dma(struct xsk_buff_pool *pool) +{ +#if defined(CONFIG_HAS_DMA) + const struct dma_map_ops *ops = get_dma_ops(pool->dev); + + if (ops) { + return !ops->sync_single_for_cpu && + !ops->sync_single_for_device; + } + + if (!dma_is_direct(ops)) + return false; + + if (!xp_check_swiotlb_dma(pool)) + return false; + + if (!dev_is_dma_coherent(pool->dev)) { +#if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) || \ + defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL) || \ + defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE) + return false; +#endif + } +#endif + return true; +} + +int xp_dma_map(struct xsk_buff_pool *pool, struct device *dev, + unsigned long attrs, struct page **pages, u32 nr_pages) +{ + dma_addr_t dma; + u32 i; + + pool->dma_pages = kvcalloc(nr_pages, sizeof(*pool->dma_pages), + GFP_KERNEL); + if (!pool->dma_pages) + return -ENOMEM; + + pool->dev = dev; + pool->dma_pages_cnt = nr_pages; + + for (i = 0; i < pool->dma_pages_cnt; i++) { + dma = dma_map_page_attrs(dev, pages[i], 0, PAGE_SIZE, + DMA_BIDIRECTIONAL, attrs); + if (dma_mapping_error(dev, dma)) { + xp_dma_unmap(pool, attrs); + return -ENOMEM; + } + pool->dma_pages[i] = dma; + } + + if (pool->unaligned) + xp_check_dma_contiguity(pool); + + pool->dev = dev; + pool->cheap_dma = xp_check_cheap_dma(pool); + return 0; +} +EXPORT_SYMBOL(xp_dma_map); + +static bool xp_desc_crosses_non_contig_pg(struct xsk_buff_pool *pool, + u64 addr, u32 len) +{ + bool cross_pg = (addr & (PAGE_SIZE - 1)) + len > PAGE_SIZE; + + if (pool->dma_pages_cnt && cross_pg) { + return !(pool->dma_pages[addr >> PAGE_SHIFT] & + XSK_NEXT_PG_CONTIG_MASK); + } + return false; +} + +static bool xp_addr_crosses_non_contig_pg(struct xsk_buff_pool *pool, + u64 addr) +{ + return xp_desc_crosses_non_contig_pg(pool, addr, pool->chunk_size); +} + +void xp_release(struct xdp_buff_xsk *xskb) +{ + xskb->pool->free_heads[xskb->pool->free_heads_cnt++] = xskb; +} + +static u64 xp_aligned_extract_addr(struct xsk_buff_pool *pool, u64 addr) +{ + return addr & pool->chunk_mask; +} + +static u64 xp_unaligned_extract_addr(u64 addr) +{ + return addr & XSK_UNALIGNED_BUF_ADDR_MASK; +} + +static u64 xp_unaligned_extract_offset(u64 addr) +{ + return addr >> XSK_UNALIGNED_BUF_OFFSET_SHIFT; +} + +static u64 xp_unaligned_add_offset_to_addr(u64 addr) +{ + return xp_unaligned_extract_addr(addr) + + xp_unaligned_extract_offset(addr); +} + +static bool xp_check_unaligned(struct xsk_buff_pool *pool, u64 *addr) +{ + *addr = xp_unaligned_extract_addr(*addr); + if (*addr >= pool->addrs_cnt || + *addr + pool->chunk_size > pool->addrs_cnt || + xp_addr_crosses_non_contig_pg(pool, *addr)) + return false; + return true; +} + +static bool xp_check_aligned(struct xsk_buff_pool *pool, u64 *addr) +{ + *addr = xp_aligned_extract_addr(pool, *addr); + return *addr < pool->addrs_cnt; +} + +static struct xdp_buff_xsk *__xp_alloc(struct xsk_buff_pool *pool) +{ + struct xdp_buff_xsk *xskb; + u64 addr; + bool ok; + + if (pool->free_heads_cnt == 0) + return NULL; + + xskb = pool->free_heads[--pool->free_heads_cnt]; + + for (;;) { + if (!xskq_cons_peek_addr_unchecked(pool->fq, &addr)) { + xp_release(xskb); + return NULL; + } + + ok = pool->unaligned ? xp_check_unaligned(pool, &addr) : + xp_check_aligned(pool, &addr); + if (!ok) { + pool->fq->invalid_descs++; + xskq_cons_release(pool->fq); + continue; + } + break; + } + xskq_cons_release(pool->fq); + + xskb->orig_addr = addr; + xskb->xdp.data_hard_start = pool->addrs + addr + pool->headroom; + if (pool->dma_pages_cnt) { + xskb->frame_dma = (pool->dma_pages[addr >> PAGE_SHIFT] & + ~XSK_NEXT_PG_CONTIG_MASK) + + (addr & ~PAGE_MASK); + xskb->dma = xskb->frame_dma + pool->headroom + + XDP_PACKET_HEADROOM; + } + return xskb; +} + +struct xdp_buff *xp_alloc(struct xsk_buff_pool *pool) +{ + struct xdp_buff_xsk *xskb; + + if (!pool->free_list_cnt) { + xskb = __xp_alloc(pool); + if (!xskb) + return NULL; + } else { + pool->free_list_cnt--; + xskb = list_first_entry(&pool->free_list, struct xdp_buff_xsk, + free_list_node); + list_del(&xskb->free_list_node); + } + + xskb->xdp.data = xskb->xdp.data_hard_start + XDP_PACKET_HEADROOM; + xskb->xdp.data_meta = xskb->xdp.data; + + if (!pool->cheap_dma) { + dma_sync_single_range_for_device(pool->dev, xskb->dma, 0, + pool->frame_len, + DMA_BIDIRECTIONAL); + } + return &xskb->xdp; +} +EXPORT_SYMBOL(xp_alloc); + +bool xp_can_alloc(struct xsk_buff_pool *pool, u32 count) +{ + if (pool->free_list_cnt >= count) + return true; + return xskq_cons_has_entries(pool->fq, count - pool->free_list_cnt); +} +EXPORT_SYMBOL(xp_can_alloc); + +void xp_free(struct xdp_buff_xsk *xskb) +{ + xskb->pool->free_list_cnt++; + list_add(&xskb->free_list_node, &xskb->pool->free_list); +} +EXPORT_SYMBOL(xp_free); + +static bool xp_aligned_validate_desc(struct xsk_buff_pool *pool, + struct xdp_desc *desc) +{ + u64 chunk, chunk_end; + + chunk = xp_aligned_extract_addr(pool, desc->addr); + chunk_end = xp_aligned_extract_addr(pool, desc->addr + desc->len); + if (chunk != chunk_end) + return false; + + if (chunk >= pool->addrs_cnt) + return false; + + if (desc->options) + return false; + return true; +} + +static bool xp_unaligned_validate_desc(struct xsk_buff_pool *pool, + struct xdp_desc *desc) +{ + u64 addr, base_addr; + + base_addr = xp_unaligned_extract_addr(desc->addr); + addr = xp_unaligned_add_offset_to_addr(desc->addr); + + if (desc->len > pool->chunk_size) + return false; + + if (base_addr >= pool->addrs_cnt || addr >= pool->addrs_cnt || + xp_desc_crosses_non_contig_pg(pool, addr, desc->len)) + return false; + + if (desc->options) + return false; + return true; +} + +bool xp_validate_desc(struct xsk_buff_pool *pool, struct xdp_desc *desc) +{ + return pool->unaligned ? xp_unaligned_validate_desc(pool, desc) : + xp_aligned_validate_desc(pool, desc); +} + +u64 xp_get_handle(struct xdp_buff_xsk *xskb) +{ + u64 offset = xskb->xdp.data - xskb->xdp.data_hard_start; + + offset += xskb->pool->headroom; + if (!xskb->pool->unaligned) + return xskb->orig_addr + offset; + return xskb->orig_addr + (offset << XSK_UNALIGNED_BUF_OFFSET_SHIFT); +} + +void *xp_raw_get_data(struct xsk_buff_pool *pool, u64 addr) +{ + addr = pool->unaligned ? xp_unaligned_add_offset_to_addr(addr) : addr; + return pool->addrs + addr; +} +EXPORT_SYMBOL(xp_raw_get_data); + +dma_addr_t xp_raw_get_dma(struct xsk_buff_pool *pool, u64 addr) +{ + addr = pool->unaligned ? xp_unaligned_add_offset_to_addr(addr) : addr; + return (pool->dma_pages[addr >> PAGE_SHIFT] & + ~XSK_NEXT_PG_CONTIG_MASK) + + (addr & ~PAGE_MASK); +} +EXPORT_SYMBOL(xp_raw_get_dma); + +dma_addr_t xp_get_dma(struct xdp_buff_xsk *xskb) +{ + return xskb->dma; +} +EXPORT_SYMBOL(xp_get_dma); + +dma_addr_t xp_get_frame_dma(struct xdp_buff_xsk *xskb) +{ + return xskb->frame_dma; +} +EXPORT_SYMBOL(xp_get_frame_dma); + +void xp_dma_sync_for_cpu(struct xdp_buff_xsk *xskb) +{ + if (xskb->pool->cheap_dma) + return; + + dma_sync_single_range_for_cpu(xskb->pool->dev, xskb->dma, 0, + xskb->pool->frame_len, DMA_BIDIRECTIONAL); +} +EXPORT_SYMBOL(xp_dma_sync_for_cpu); + +void xp_dma_sync_for_device(struct xsk_buff_pool *pool, dma_addr_t dma, + size_t size) +{ + if (pool->cheap_dma) + return; + + dma_sync_single_range_for_device(pool->dev, dma, 0, + size, DMA_BIDIRECTIONAL); +} +EXPORT_SYMBOL(xp_dma_sync_for_device); diff --git a/net/xdp/xsk_diag.c b/net/xdp/xsk_diag.c index f59791ba43a0..0163b26aaf63 100644 --- a/net/xdp/xsk_diag.c +++ b/net/xdp/xsk_diag.c @@ -56,7 +56,7 @@ static int xsk_diag_put_umem(const struct xdp_sock *xs, struct sk_buff *nlskb) du.id = umem->id; du.size = umem->size; du.num_pages = umem->npgs; - du.chunk_size = umem->chunk_size_nohr + umem->headroom; + du.chunk_size = umem->chunk_size; du.headroom = umem->headroom; du.ifindex = umem->dev ? umem->dev->ifindex : 0; du.queue_id = umem->queue_id; diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h index a322a7dac58c..9151aef7dbca 100644 --- a/net/xdp/xsk_queue.h +++ b/net/xdp/xsk_queue.h @@ -9,6 +9,7 @@ #include #include #include +#include #include "xsk.h" @@ -172,31 +173,45 @@ static inline bool xskq_cons_read_addr(struct xsk_queue *q, u64 *addr, return false; } -static inline bool xskq_cons_is_valid_desc(struct xsk_queue *q, - struct xdp_desc *d, - struct xdp_umem *umem) +static inline bool xskq_cons_read_addr_aligned(struct xsk_queue *q, u64 *addr) { - if (umem->flags & XDP_UMEM_UNALIGNED_CHUNK_FLAG) { - if (!xskq_cons_is_valid_unaligned(q, d->addr, d->len, umem)) - return false; + struct xdp_umem_ring *ring = (struct xdp_umem_ring *)q->ring; - if (d->len > umem->chunk_size_nohr || d->options) { - q->invalid_descs++; - return false; - } + while (q->cached_cons != q->cached_prod) { + u32 idx = q->cached_cons & q->ring_mask; + + *addr = ring->desc[idx]; + if (xskq_cons_is_valid_addr(q, *addr)) + return true; + q->cached_cons++; + } + + return false; +} + +static inline bool xskq_cons_read_addr_unchecked(struct xsk_queue *q, u64 *addr) +{ + struct xdp_umem_ring *ring = (struct xdp_umem_ring *)q->ring; + + if (q->cached_cons != q->cached_prod) { + u32 idx = q->cached_cons & q->ring_mask; + + *addr = ring->desc[idx]; return true; } - if (!xskq_cons_is_valid_addr(q, d->addr)) - return false; + return false; +} - if (((d->addr + d->len) & q->chunk_mask) != (d->addr & q->chunk_mask) || - d->options) { +static inline bool xskq_cons_is_valid_desc(struct xsk_queue *q, + struct xdp_desc *d, + struct xdp_umem *umem) +{ + if (!xp_validate_desc(umem->pool, d)) { q->invalid_descs++; return false; } - return true; } @@ -260,6 +275,20 @@ static inline bool xskq_cons_peek_addr(struct xsk_queue *q, u64 *addr, return xskq_cons_read_addr(q, addr, umem); } +static inline bool xskq_cons_peek_addr_aligned(struct xsk_queue *q, u64 *addr) +{ + if (q->cached_prod == q->cached_cons) + xskq_cons_get_entries(q); + return xskq_cons_read_addr_aligned(q, addr); +} + +static inline bool xskq_cons_peek_addr_unchecked(struct xsk_queue *q, u64 *addr) +{ + if (q->cached_prod == q->cached_cons) + xskq_cons_get_entries(q); + return xskq_cons_read_addr_unchecked(q, addr); +} + static inline bool xskq_cons_peek_desc(struct xsk_queue *q, struct xdp_desc *desc, struct xdp_umem *umem) From patchwork Wed May 20 19:20:54 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= X-Patchwork-Id: 1294650 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=SPFMqWrC; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 49S2gN5gTBz9sPK for ; Thu, 21 May 2020 05:21:52 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726940AbgETTVw (ORCPT ); Wed, 20 May 2020 15:21:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45704 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726560AbgETTVw (ORCPT ); Wed, 20 May 2020 15:21:52 -0400 Received: from mail-pg1-x542.google.com (mail-pg1-x542.google.com [IPv6:2607:f8b0:4864:20::542]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1066BC061A0E; Wed, 20 May 2020 12:21:52 -0700 (PDT) Received: by mail-pg1-x542.google.com with SMTP id p30so1884878pgl.11; Wed, 20 May 2020 12:21:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=4krTzV2u35Z3NG+JAyiWOTySv4f3topMKqbx3Je2xYo=; b=SPFMqWrCwxgyO4UkpK3BImeOXAcNFRidCtwasUw9CebKXnzQDdV0tVSkvrU+4DlRIn /SnE2KCIU5bMjg28RvJKK23mcWlXi6i2cWrvWURKQKDYAMF/cb1ubaD3jF1II9OdAneO t51VGDVpPFvGiEnlwFdYbrFWsZBvHuZiFY7td/W8zutV3MogzoA6HgDAbsyIiQXLhA6Q OWGayh0Xx1A2MwfbHMoUhuyTnUfP+LdOWT6pCwjfGVbgI6/7mpyNp+xMjQRrsGf1FOXJ GlARxm9De58vm47Ae80CH4YoE9VyFMDpmIoZFNnmUwjrGIA7/5qW37PD6YhIjQ4qXENr G2eg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=4krTzV2u35Z3NG+JAyiWOTySv4f3topMKqbx3Je2xYo=; b=b/mcIsO90UmklpfYpQPDYvYYIhVAm2ZIiC3kr996G3QNWg5IlyARfYw2cUSWohmJoN Xvm9dSKSvN0D72fazNjWwAQgoVGjdIlBYpzfTvlHjLcmnzRm36WxBAmVA/CYnYPJ+1vQ /hyytR/Himgs9r7SeLjoF45kf20yl1DNwjfp/DRX7svwvyOBqIfuCCfJBp0HDXfBA1fy 2eQYOqL4fVZyv+mvn8Tz1MQkLfZPcXYHepPKugO8BdZkybVXQM9kWk5xUorrsRUAC9Zv CWeAUwZ+czzigNhRQvj0y0/7D4BsszYbF/ZyK6WTmG80D5dE9IgsNtRVx1HKM1fVK3hY Yfgg== X-Gm-Message-State: AOAM532dNkE71Kkbi8IiKsPSPvbU9IAUahgGnoZQIUAqNT45HJ3t1bUY GM2OFEODVE8jf5YkL2lLv7E= X-Google-Smtp-Source: ABdhPJxCGZff0/tSIS9qt2CJBzbWNNKhz7XosIfn6Gw8Y0zw78hYYZ2ICzKzSmGZX5lNujE9LnEQaA== X-Received: by 2002:aa7:94a9:: with SMTP id a9mr5744101pfl.312.1590002510748; Wed, 20 May 2020 12:21:50 -0700 (PDT) Received: from btopel-mobl.ger.intel.com (fmdmzpr03-ext.fm.intel.com. [192.55.54.38]) by smtp.gmail.com with ESMTPSA id 62sm2762424pfc.204.2020.05.20.12.21.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 May 2020 12:21:50 -0700 (PDT) From: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= To: ast@kernel.org, daniel@iogearbox.net, davem@davemloft.net, kuba@kernel.org, hawk@kernel.org, john.fastabend@gmail.com, netdev@vger.kernel.org, bpf@vger.kernel.org, magnus.karlsson@intel.com, jonathan.lemon@gmail.com, jeffrey.t.kirsher@intel.com Cc: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , maximmi@mellanox.com, maciej.fijalkowski@intel.com, intel-wired-lan@lists.osuosl.org Subject: [PATCH bpf-next v5 06/15] i40e: refactor rx_bi accesses Date: Wed, 20 May 2020 21:20:54 +0200 Message-Id: <20200520192103.355233-7-bjorn.topel@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200520192103.355233-1-bjorn.topel@gmail.com> References: <20200520192103.355233-1-bjorn.topel@gmail.com> MIME-Version: 1.0 Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org From: Björn Töpel As a first step to migrate i40e to the new MEM_TYPE_XSK_BUFF_POOL APIs, code that accesses the rx_bi (SW/shadow ring) is refactored to use an accessor function. Cc: intel-wired-lan@lists.osuosl.org Signed-off-by: Björn Töpel --- drivers/net/ethernet/intel/i40e/i40e_txrx.c | 17 +++++++++++------ drivers/net/ethernet/intel/i40e/i40e_xsk.c | 18 ++++++++++++------ 2 files changed, 23 insertions(+), 12 deletions(-) diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c index a3772beffe02..9b9ef951f9ce 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c @@ -1195,6 +1195,11 @@ static void i40e_update_itr(struct i40e_q_vector *q_vector, rc->total_packets = 0; } +static struct i40e_rx_buffer *i40e_rx_bi(struct i40e_ring *rx_ring, u32 idx) +{ + return &rx_ring->rx_bi[idx]; +} + /** * i40e_reuse_rx_page - page flip buffer and store it back on the ring * @rx_ring: rx descriptor ring to store buffers on @@ -1208,7 +1213,7 @@ static void i40e_reuse_rx_page(struct i40e_ring *rx_ring, struct i40e_rx_buffer *new_buff; u16 nta = rx_ring->next_to_alloc; - new_buff = &rx_ring->rx_bi[nta]; + new_buff = i40e_rx_bi(rx_ring, nta); /* update, and store next to alloc */ nta++; @@ -1272,7 +1277,7 @@ struct i40e_rx_buffer *i40e_clean_programming_status( ntc = rx_ring->next_to_clean; /* fetch, update, and store next to clean */ - rx_buffer = &rx_ring->rx_bi[ntc++]; + rx_buffer = i40e_rx_bi(rx_ring, ntc++); ntc = (ntc < rx_ring->count) ? ntc : 0; rx_ring->next_to_clean = ntc; @@ -1361,7 +1366,7 @@ void i40e_clean_rx_ring(struct i40e_ring *rx_ring) /* Free all the Rx ring sk_buffs */ for (i = 0; i < rx_ring->count; i++) { - struct i40e_rx_buffer *rx_bi = &rx_ring->rx_bi[i]; + struct i40e_rx_buffer *rx_bi = i40e_rx_bi(rx_ring, i); if (!rx_bi->page) continue; @@ -1592,7 +1597,7 @@ bool i40e_alloc_rx_buffers(struct i40e_ring *rx_ring, u16 cleaned_count) return false; rx_desc = I40E_RX_DESC(rx_ring, ntu); - bi = &rx_ring->rx_bi[ntu]; + bi = i40e_rx_bi(rx_ring, ntu); do { if (!i40e_alloc_mapped_page(rx_ring, bi)) @@ -1614,7 +1619,7 @@ bool i40e_alloc_rx_buffers(struct i40e_ring *rx_ring, u16 cleaned_count) ntu++; if (unlikely(ntu == rx_ring->count)) { rx_desc = I40E_RX_DESC(rx_ring, 0); - bi = rx_ring->rx_bi; + bi = i40e_rx_bi(rx_ring, 0); ntu = 0; } @@ -1981,7 +1986,7 @@ static struct i40e_rx_buffer *i40e_get_rx_buffer(struct i40e_ring *rx_ring, { struct i40e_rx_buffer *rx_buffer; - rx_buffer = &rx_ring->rx_bi[rx_ring->next_to_clean]; + rx_buffer = i40e_rx_bi(rx_ring, rx_ring->next_to_clean); prefetchw(rx_buffer->page); /* we are reusing so sync this buffer for CPU use */ diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c index d8b0be29099a..d84ec92f8538 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c @@ -9,6 +9,11 @@ #include "i40e_txrx_common.h" #include "i40e_xsk.h" +static struct i40e_rx_buffer *i40e_rx_bi(struct i40e_ring *rx_ring, u32 idx) +{ + return &rx_ring->rx_bi[idx]; +} + /** * i40e_xsk_umem_dma_map - DMA maps all UMEM memory for the netdev * @vsi: Current VSI @@ -321,7 +326,7 @@ __i40e_alloc_rx_buffers_zc(struct i40e_ring *rx_ring, u16 count, bool ok = true; rx_desc = I40E_RX_DESC(rx_ring, ntu); - bi = &rx_ring->rx_bi[ntu]; + bi = i40e_rx_bi(rx_ring, ntu); do { if (!alloc(rx_ring, bi)) { ok = false; @@ -340,7 +345,7 @@ __i40e_alloc_rx_buffers_zc(struct i40e_ring *rx_ring, u16 count, if (unlikely(ntu == rx_ring->count)) { rx_desc = I40E_RX_DESC(rx_ring, 0); - bi = rx_ring->rx_bi; + bi = i40e_rx_bi(rx_ring, 0); ntu = 0; } @@ -402,7 +407,7 @@ static struct i40e_rx_buffer *i40e_get_rx_buffer_zc(struct i40e_ring *rx_ring, { struct i40e_rx_buffer *bi; - bi = &rx_ring->rx_bi[rx_ring->next_to_clean]; + bi = i40e_rx_bi(rx_ring, rx_ring->next_to_clean); /* we are reusing so sync this buffer for CPU use */ dma_sync_single_range_for_cpu(rx_ring->dev, @@ -424,7 +429,8 @@ static struct i40e_rx_buffer *i40e_get_rx_buffer_zc(struct i40e_ring *rx_ring, static void i40e_reuse_rx_buffer_zc(struct i40e_ring *rx_ring, struct i40e_rx_buffer *old_bi) { - struct i40e_rx_buffer *new_bi = &rx_ring->rx_bi[rx_ring->next_to_alloc]; + struct i40e_rx_buffer *new_bi = i40e_rx_bi(rx_ring, + rx_ring->next_to_alloc); u16 nta = rx_ring->next_to_alloc; /* update, and store next to alloc */ @@ -456,7 +462,7 @@ void i40e_zca_free(struct zero_copy_allocator *alloc, unsigned long handle) mask = rx_ring->xsk_umem->chunk_mask; nta = rx_ring->next_to_alloc; - bi = &rx_ring->rx_bi[nta]; + bi = i40e_rx_bi(rx_ring, nta); nta++; rx_ring->next_to_alloc = (nta < rx_ring->count) ? nta : 0; @@ -826,7 +832,7 @@ void i40e_xsk_clean_rx_ring(struct i40e_ring *rx_ring) u16 i; for (i = 0; i < rx_ring->count; i++) { - struct i40e_rx_buffer *rx_bi = &rx_ring->rx_bi[i]; + struct i40e_rx_buffer *rx_bi = i40e_rx_bi(rx_ring, i); if (!rx_bi->addr) continue; From patchwork Wed May 20 19:20:55 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= X-Patchwork-Id: 1294652 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=buoUp+5p; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 49S2gW4JwJz9sTC for ; Thu, 21 May 2020 05:21:59 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727000AbgETTV6 (ORCPT ); Wed, 20 May 2020 15:21:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45720 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726560AbgETTV5 (ORCPT ); Wed, 20 May 2020 15:21:57 -0400 Received: from mail-pl1-x641.google.com (mail-pl1-x641.google.com [IPv6:2607:f8b0:4864:20::641]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 54534C061A0E; Wed, 20 May 2020 12:21:57 -0700 (PDT) Received: by mail-pl1-x641.google.com with SMTP id a13so1745004pls.8; Wed, 20 May 2020 12:21:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=aopS+Xt534w3ri8BrZUDdiHB6f7YGZmwlsFupZRetAc=; b=buoUp+5pdTX31xDltlE5Qlrm3mS3TX+D59PW7mao3UMsUDt6TkL56lJ92uC79T1lcq RXEO3cF2HdiDcO/Sh8p0GJhMVEMc0BHisF8EFs8hBl4/q33PNm0TFRpw1XgDrkFdOm9R LtlSPgH/1CD8tKOEb5Qj6zTv6QCbWQ97Qjy0rQxNPO2jimKWkdKorzBbjzKNck/tY2C6 o0rF/diBn2KYsh+7MvmzyyMO4RpX5czU7SSkk95DnYw03tADyFWwxWJmSZFXSCeIr/dZ /M07Mpf4s5khZogHczId4JgT4uHlUxtYrm0rRYeEzVjtwmJdEjO50XPt1gRX3N+Q+O3o cQhg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=aopS+Xt534w3ri8BrZUDdiHB6f7YGZmwlsFupZRetAc=; b=YG/LiP2tpF/dfh1jYo8VJkEDje6n9H1uI8pjaYe/HYris4rNJMvsqH0jrghf3WbiSV /1u7X0lcJXiDfVRPjEQq5kxnHILII2AZrBYu8UKayArSCWSMq3vlZkv9vfwuiObrHY4j TT8xq7GQvcNy0FFysAX9IrdeSkeOXCAWOdwgIoEslUHMDMjfOKIZsm6IqsP5Eh33xxvl Ybi9pqBFNb8B/SHz0GcEEdRnwBFu2xNSrWg8+dVdweQjgc1GAC3ZidPglrcEVtI/QqqD 2boP/gDdl09Zr7KYk7uKIyTTu88FqZipBIcW2YWeHyk5ftjmk5af4cAoF3pPe0j7m+CS y+4A== X-Gm-Message-State: AOAM5336udk1U+3Cs5kdvWFQKHtih0r60PSpMOcn/yZznV9IWTnHY0Gw Qc9DKHimFRsq7aXIVJitK1c= X-Google-Smtp-Source: ABdhPJx1pG96yTNUJKQyrs1Cs8O7EPld5dDdD8LoebkbfBP8RPh82wWDjxGKxUwlQJQ2gOSy89V3Xw== X-Received: by 2002:a17:902:465:: with SMTP id 92mr5835044ple.227.1590002516697; Wed, 20 May 2020 12:21:56 -0700 (PDT) Received: from btopel-mobl.ger.intel.com (fmdmzpr03-ext.fm.intel.com. [192.55.54.38]) by smtp.gmail.com with ESMTPSA id 62sm2762424pfc.204.2020.05.20.12.21.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 May 2020 12:21:55 -0700 (PDT) From: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= To: ast@kernel.org, daniel@iogearbox.net, davem@davemloft.net, kuba@kernel.org, hawk@kernel.org, john.fastabend@gmail.com, netdev@vger.kernel.org, bpf@vger.kernel.org, magnus.karlsson@intel.com, jonathan.lemon@gmail.com, jeffrey.t.kirsher@intel.com Cc: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , maximmi@mellanox.com, maciej.fijalkowski@intel.com, intel-wired-lan@lists.osuosl.org Subject: [PATCH bpf-next v5 07/15] i40e: separate kernel allocated rx_bi rings from AF_XDP rings Date: Wed, 20 May 2020 21:20:55 +0200 Message-Id: <20200520192103.355233-8-bjorn.topel@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200520192103.355233-1-bjorn.topel@gmail.com> References: <20200520192103.355233-1-bjorn.topel@gmail.com> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Björn Töpel Continuing the path to support MEM_TYPE_XSK_BUFF_POOL, the AF_XDP zero-copy/sk_buff rx_bi rings are now separate. Functions to properly allocate the different rings are added as well. v3->v4: Made i40e_fd_handle_status() static. (kbuild test robot) v4->v5: Fix kdoc for i40e_clean_programming_status(). (Jakub) Cc: intel-wired-lan@lists.osuosl.org Signed-off-by: Björn Töpel --- drivers/net/ethernet/intel/i40e/i40e_main.c | 7 ++ drivers/net/ethernet/intel/i40e/i40e_txrx.c | 119 +++++++----------- drivers/net/ethernet/intel/i40e/i40e_txrx.h | 22 ++-- .../ethernet/intel/i40e/i40e_txrx_common.h | 40 +++++- drivers/net/ethernet/intel/i40e/i40e_type.h | 5 +- drivers/net/ethernet/intel/i40e/i40e_xsk.c | 74 ++++++----- drivers/net/ethernet/intel/i40e/i40e_xsk.h | 2 + 7 files changed, 142 insertions(+), 127 deletions(-) diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c index d6b2db4f2c65..3e1695bb8262 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -3260,8 +3260,12 @@ static int i40e_configure_rx_ring(struct i40e_ring *ring) if (ring->vsi->type == I40E_VSI_MAIN) xdp_rxq_info_unreg_mem_model(&ring->xdp_rxq); + kfree(ring->rx_bi); ring->xsk_umem = i40e_xsk_umem(ring); if (ring->xsk_umem) { + ret = i40e_alloc_rx_bi_zc(ring); + if (ret) + return ret; ring->rx_buf_len = ring->xsk_umem->chunk_size_nohr - XDP_PACKET_HEADROOM; /* For AF_XDP ZC, we disallow packets to span on @@ -3280,6 +3284,9 @@ static int i40e_configure_rx_ring(struct i40e_ring *ring) ring->queue_index); } else { + ret = i40e_alloc_rx_bi(ring); + if (ret) + return ret; ring->rx_buf_len = vsi->rx_buf_len; if (ring->vsi->type == I40E_VSI_MAIN) { ret = xdp_rxq_info_reg_mem_model(&ring->xdp_rxq, diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c index 9b9ef951f9ce..f613782f2f56 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c @@ -521,28 +521,29 @@ int i40e_add_del_fdir(struct i40e_vsi *vsi, /** * i40e_fd_handle_status - check the Programming Status for FD * @rx_ring: the Rx ring for this descriptor - * @rx_desc: the Rx descriptor for programming Status, not a packet descriptor. + * @qword0_raw: qword0 + * @qword1: qword1 after le_to_cpu * @prog_id: the id originally used for programming * * This is used to verify if the FD programming or invalidation * requested by SW to the HW is successful or not and take actions accordingly. **/ -void i40e_fd_handle_status(struct i40e_ring *rx_ring, - union i40e_rx_desc *rx_desc, u8 prog_id) +static void i40e_fd_handle_status(struct i40e_ring *rx_ring, u64 qword0_raw, + u64 qword1, u8 prog_id) { struct i40e_pf *pf = rx_ring->vsi->back; struct pci_dev *pdev = pf->pdev; + struct i40e_32b_rx_wb_qw0 *qw0; u32 fcnt_prog, fcnt_avail; u32 error; - u64 qw; - qw = le64_to_cpu(rx_desc->wb.qword1.status_error_len); - error = (qw & I40E_RX_PROG_STATUS_DESC_QW1_ERROR_MASK) >> + qw0 = (struct i40e_32b_rx_wb_qw0 *)&qword0_raw; + error = (qword1 & I40E_RX_PROG_STATUS_DESC_QW1_ERROR_MASK) >> I40E_RX_PROG_STATUS_DESC_QW1_ERROR_SHIFT; if (error == BIT(I40E_RX_PROG_STATUS_DESC_FD_TBL_FULL_SHIFT)) { - pf->fd_inv = le32_to_cpu(rx_desc->wb.qword0.hi_dword.fd_id); - if ((rx_desc->wb.qword0.hi_dword.fd_id != 0) || + pf->fd_inv = le32_to_cpu(qw0->hi_dword.fd_id); + if (qw0->hi_dword.fd_id != 0 || (I40E_DEBUG_FD & pf->hw.debug_mask)) dev_warn(&pdev->dev, "ntuple filter loc = %d, could not be added\n", pf->fd_inv); @@ -560,7 +561,7 @@ void i40e_fd_handle_status(struct i40e_ring *rx_ring, /* store the current atr filter count */ pf->fd_atr_cnt = i40e_get_current_atr_cnt(pf); - if ((rx_desc->wb.qword0.hi_dword.fd_id == 0) && + if (qw0->hi_dword.fd_id == 0 && test_bit(__I40E_FD_SB_AUTO_DISABLED, pf->state)) { /* These set_bit() calls aren't atomic with the * test_bit() here, but worse case we potentially @@ -589,7 +590,7 @@ void i40e_fd_handle_status(struct i40e_ring *rx_ring, } else if (error == BIT(I40E_RX_PROG_STATUS_DESC_NO_FD_ENTRY_SHIFT)) { if (I40E_DEBUG_FD & pf->hw.debug_mask) dev_info(&pdev->dev, "ntuple filter fd_id = %d, could not be removed\n", - rx_desc->wb.qword0.hi_dword.fd_id); + qw0->hi_dword.fd_id); } } @@ -1232,29 +1233,10 @@ static void i40e_reuse_rx_page(struct i40e_ring *rx_ring, } /** - * i40e_rx_is_programming_status - check for programming status descriptor - * @qw: qword representing status_error_len in CPU ordering - * - * The value of in the descriptor length field indicate if this - * is a programming status descriptor for flow director or FCoE - * by the value of I40E_RX_PROG_STATUS_DESC_LENGTH, otherwise - * it is a packet descriptor. - **/ -static inline bool i40e_rx_is_programming_status(u64 qw) -{ - /* The Rx filter programming status and SPH bit occupy the same - * spot in the descriptor. Since we don't support packet split we - * can just reuse the bit as an indication that this is a - * programming status descriptor. - */ - return qw & I40E_RXD_QW1_LENGTH_SPH_MASK; -} - -/** - * i40e_clean_programming_status - try clean the programming status descriptor + * i40e_clean_programming_status - clean the programming status descriptor * @rx_ring: the rx ring that has this descriptor - * @rx_desc: the rx descriptor written back by HW - * @qw: qword representing status_error_len in CPU ordering + * @qword0_raw: qword0 + * @qword1: qword1 representing status_error_len in CPU ordering * * Flow director should handle FD_FILTER_STATUS to check its filter programming * status being successful or not and take actions accordingly. FCoE should @@ -1262,34 +1244,16 @@ static inline bool i40e_rx_is_programming_status(u64 qw) * * Returns an i40e_rx_buffer to reuse if the cleanup occurred, otherwise NULL. **/ -struct i40e_rx_buffer *i40e_clean_programming_status( - struct i40e_ring *rx_ring, - union i40e_rx_desc *rx_desc, - u64 qw) +void i40e_clean_programming_status(struct i40e_ring *rx_ring, u64 qword0_raw, + u64 qword1) { - struct i40e_rx_buffer *rx_buffer; - u32 ntc; u8 id; - if (!i40e_rx_is_programming_status(qw)) - return NULL; - - ntc = rx_ring->next_to_clean; - - /* fetch, update, and store next to clean */ - rx_buffer = i40e_rx_bi(rx_ring, ntc++); - ntc = (ntc < rx_ring->count) ? ntc : 0; - rx_ring->next_to_clean = ntc; - - prefetch(I40E_RX_DESC(rx_ring, ntc)); - - id = (qw & I40E_RX_PROG_STATUS_DESC_QW1_PROGID_MASK) >> + id = (qword1 & I40E_RX_PROG_STATUS_DESC_QW1_PROGID_MASK) >> I40E_RX_PROG_STATUS_DESC_QW1_PROGID_SHIFT; if (id == I40E_RX_PROG_STATUS_DESC_FD_FILTER_STATUS) - i40e_fd_handle_status(rx_ring, rx_desc, id); - - return rx_buffer; + i40e_fd_handle_status(rx_ring, qword0_raw, qword1, id); } /** @@ -1341,13 +1305,25 @@ int i40e_setup_tx_descriptors(struct i40e_ring *tx_ring) return -ENOMEM; } +int i40e_alloc_rx_bi(struct i40e_ring *rx_ring) +{ + unsigned long sz = sizeof(*rx_ring->rx_bi) * rx_ring->count; + + rx_ring->rx_bi = kzalloc(sz, GFP_KERNEL); + return rx_ring->rx_bi ? 0 : -ENOMEM; +} + +static void i40e_clear_rx_bi(struct i40e_ring *rx_ring) +{ + memset(rx_ring->rx_bi, 0, sizeof(*rx_ring->rx_bi) * rx_ring->count); +} + /** * i40e_clean_rx_ring - Free Rx buffers * @rx_ring: ring to be cleaned **/ void i40e_clean_rx_ring(struct i40e_ring *rx_ring) { - unsigned long bi_size; u16 i; /* ring already cleared, nothing to do */ @@ -1393,8 +1369,10 @@ void i40e_clean_rx_ring(struct i40e_ring *rx_ring) } skip_free: - bi_size = sizeof(struct i40e_rx_buffer) * rx_ring->count; - memset(rx_ring->rx_bi, 0, bi_size); + if (rx_ring->xsk_umem) + i40e_clear_rx_bi_zc(rx_ring); + else + i40e_clear_rx_bi(rx_ring); /* Zero out the descriptor ring */ memset(rx_ring->desc, 0, rx_ring->size); @@ -1435,15 +1413,7 @@ void i40e_free_rx_resources(struct i40e_ring *rx_ring) int i40e_setup_rx_descriptors(struct i40e_ring *rx_ring) { struct device *dev = rx_ring->dev; - int err = -ENOMEM; - int bi_size; - - /* warn if we are about to overwrite the pointer */ - WARN_ON(rx_ring->rx_bi); - bi_size = sizeof(struct i40e_rx_buffer) * rx_ring->count; - rx_ring->rx_bi = kzalloc(bi_size, GFP_KERNEL); - if (!rx_ring->rx_bi) - goto err; + int err; u64_stats_init(&rx_ring->syncp); @@ -1456,7 +1426,7 @@ int i40e_setup_rx_descriptors(struct i40e_ring *rx_ring) if (!rx_ring->desc) { dev_info(dev, "Unable to allocate memory for the Rx descriptor ring, size=%d\n", rx_ring->size); - goto err; + return -ENOMEM; } rx_ring->next_to_alloc = 0; @@ -1468,16 +1438,12 @@ int i40e_setup_rx_descriptors(struct i40e_ring *rx_ring) err = xdp_rxq_info_reg(&rx_ring->xdp_rxq, rx_ring->netdev, rx_ring->queue_index); if (err < 0) - goto err; + return err; } rx_ring->xdp_prog = rx_ring->vsi->xdp_prog; return 0; -err: - kfree(rx_ring->rx_bi); - rx_ring->rx_bi = NULL; - return err; } /** @@ -2387,9 +2353,12 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget) */ dma_rmb(); - rx_buffer = i40e_clean_programming_status(rx_ring, rx_desc, - qword); - if (unlikely(rx_buffer)) { + if (i40e_rx_is_programming_status(qword)) { + i40e_clean_programming_status(rx_ring, + rx_desc->raw.qword[0], + qword); + rx_buffer = i40e_rx_bi(rx_ring, rx_ring->next_to_clean); + i40e_inc_ntc(rx_ring); i40e_reuse_rx_page(rx_ring, rx_buffer); cleaned_count++; continue; diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.h b/drivers/net/ethernet/intel/i40e/i40e_txrx.h index 36d37f31a287..d343498e8de5 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.h +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.h @@ -296,17 +296,15 @@ struct i40e_tx_buffer { struct i40e_rx_buffer { dma_addr_t dma; - union { - struct { - struct page *page; - __u32 page_offset; - __u16 pagecnt_bias; - }; - struct { - void *addr; - u64 handle; - }; - }; + struct page *page; + __u32 page_offset; + __u16 pagecnt_bias; +}; + +struct i40e_rx_buffer_zc { + dma_addr_t dma; + void *addr; + u64 handle; }; struct i40e_queue_stats { @@ -358,6 +356,7 @@ struct i40e_ring { union { struct i40e_tx_buffer *tx_bi; struct i40e_rx_buffer *rx_bi; + struct i40e_rx_buffer_zc *rx_bi_zc; }; DECLARE_BITMAP(state, __I40E_RING_STATE_NBITS); u16 queue_index; /* Queue number of ring */ @@ -495,6 +494,7 @@ int __i40e_maybe_stop_tx(struct i40e_ring *tx_ring, int size); bool __i40e_chk_linearize(struct sk_buff *skb); int i40e_xdp_xmit(struct net_device *dev, int n, struct xdp_frame **frames, u32 flags); +int i40e_alloc_rx_bi(struct i40e_ring *rx_ring); /** * i40e_get_head - Retrieve head from head writeback diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx_common.h b/drivers/net/ethernet/intel/i40e/i40e_txrx_common.h index 8af0e99c6c0d..667c4dc4b39f 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_txrx_common.h +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx_common.h @@ -4,13 +4,9 @@ #ifndef I40E_TXRX_COMMON_ #define I40E_TXRX_COMMON_ -void i40e_fd_handle_status(struct i40e_ring *rx_ring, - union i40e_rx_desc *rx_desc, u8 prog_id); int i40e_xmit_xdp_tx_ring(struct xdp_buff *xdp, struct i40e_ring *xdp_ring); -struct i40e_rx_buffer *i40e_clean_programming_status( - struct i40e_ring *rx_ring, - union i40e_rx_desc *rx_desc, - u64 qw); +void i40e_clean_programming_status(struct i40e_ring *rx_ring, u64 qword0_raw, + u64 qword1); void i40e_process_skb_fields(struct i40e_ring *rx_ring, union i40e_rx_desc *rx_desc, struct sk_buff *skb); void i40e_xdp_ring_update_tail(struct i40e_ring *xdp_ring); @@ -84,6 +80,38 @@ static inline void i40e_arm_wb(struct i40e_ring *tx_ring, } } +/** + * i40e_rx_is_programming_status - check for programming status descriptor + * @qword1: qword1 representing status_error_len in CPU ordering + * + * The value of in the descriptor length field indicate if this + * is a programming status descriptor for flow director or FCoE + * by the value of I40E_RX_PROG_STATUS_DESC_LENGTH, otherwise + * it is a packet descriptor. + **/ +static inline bool i40e_rx_is_programming_status(u64 qword1) +{ + /* The Rx filter programming status and SPH bit occupy the same + * spot in the descriptor. Since we don't support packet split we + * can just reuse the bit as an indication that this is a + * programming status descriptor. + */ + return qword1 & I40E_RXD_QW1_LENGTH_SPH_MASK; +} + +/** + * i40e_inc_ntc: Advance the next_to_clean index + * @rx_ring: Rx ring + **/ +static inline void i40e_inc_ntc(struct i40e_ring *rx_ring) +{ + u32 ntc = rx_ring->next_to_clean + 1; + + ntc = (ntc < rx_ring->count) ? ntc : 0; + rx_ring->next_to_clean = ntc; + prefetch(I40E_RX_DESC(rx_ring, ntc)); +} + void i40e_xsk_clean_rx_ring(struct i40e_ring *rx_ring); void i40e_xsk_clean_tx_ring(struct i40e_ring *tx_ring); bool i40e_xsk_any_rx_ring_enabled(struct i40e_vsi *vsi); diff --git a/drivers/net/ethernet/intel/i40e/i40e_type.h b/drivers/net/ethernet/intel/i40e/i40e_type.h index 6ea2867ff60f..63e098f7cb63 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_type.h +++ b/drivers/net/ethernet/intel/i40e/i40e_type.h @@ -689,7 +689,7 @@ union i40e_32byte_rx_desc { __le64 rsvd2; } read; struct { - struct { + struct i40e_32b_rx_wb_qw0 { struct { union { __le16 mirroring_status; @@ -727,6 +727,9 @@ union i40e_32byte_rx_desc { } hi_dword; } qword3; } wb; /* writeback */ + struct { + u64 qword[4]; + } raw; }; enum i40e_rx_desc_status_bits { diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c index d84ec92f8538..4fca52a30ea4 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c @@ -9,9 +9,23 @@ #include "i40e_txrx_common.h" #include "i40e_xsk.h" -static struct i40e_rx_buffer *i40e_rx_bi(struct i40e_ring *rx_ring, u32 idx) +int i40e_alloc_rx_bi_zc(struct i40e_ring *rx_ring) { - return &rx_ring->rx_bi[idx]; + unsigned long sz = sizeof(*rx_ring->rx_bi_zc) * rx_ring->count; + + rx_ring->rx_bi_zc = kzalloc(sz, GFP_KERNEL); + return rx_ring->rx_bi_zc ? 0 : -ENOMEM; +} + +void i40e_clear_rx_bi_zc(struct i40e_ring *rx_ring) +{ + memset(rx_ring->rx_bi_zc, 0, + sizeof(*rx_ring->rx_bi_zc) * rx_ring->count); +} + +static struct i40e_rx_buffer_zc *i40e_rx_bi(struct i40e_ring *rx_ring, u32 idx) +{ + return &rx_ring->rx_bi_zc[idx]; } /** @@ -238,7 +252,7 @@ static int i40e_run_xdp_zc(struct i40e_ring *rx_ring, struct xdp_buff *xdp) } /** - * i40e_alloc_buffer_zc - Allocates an i40e_rx_buffer + * i40e_alloc_buffer_zc - Allocates an i40e_rx_buffer_zc * @rx_ring: Rx ring * @bi: Rx buffer to populate * @@ -248,7 +262,7 @@ static int i40e_run_xdp_zc(struct i40e_ring *rx_ring, struct xdp_buff *xdp) * Returns true for a successful allocation, false otherwise **/ static bool i40e_alloc_buffer_zc(struct i40e_ring *rx_ring, - struct i40e_rx_buffer *bi) + struct i40e_rx_buffer_zc *bi) { struct xdp_umem *umem = rx_ring->xsk_umem; void *addr = bi->addr; @@ -279,7 +293,7 @@ static bool i40e_alloc_buffer_zc(struct i40e_ring *rx_ring, } /** - * i40e_alloc_buffer_slow_zc - Allocates an i40e_rx_buffer + * i40e_alloc_buffer_slow_zc - Allocates an i40e_rx_buffer_zc * @rx_ring: Rx ring * @bi: Rx buffer to populate * @@ -289,7 +303,7 @@ static bool i40e_alloc_buffer_zc(struct i40e_ring *rx_ring, * Returns true for a successful allocation, false otherwise **/ static bool i40e_alloc_buffer_slow_zc(struct i40e_ring *rx_ring, - struct i40e_rx_buffer *bi) + struct i40e_rx_buffer_zc *bi) { struct xdp_umem *umem = rx_ring->xsk_umem; u64 handle, hr; @@ -318,11 +332,11 @@ static bool i40e_alloc_buffer_slow_zc(struct i40e_ring *rx_ring, static __always_inline bool __i40e_alloc_rx_buffers_zc(struct i40e_ring *rx_ring, u16 count, bool alloc(struct i40e_ring *rx_ring, - struct i40e_rx_buffer *bi)) + struct i40e_rx_buffer_zc *bi)) { u16 ntu = rx_ring->next_to_use; union i40e_rx_desc *rx_desc; - struct i40e_rx_buffer *bi; + struct i40e_rx_buffer_zc *bi; bool ok = true; rx_desc = I40E_RX_DESC(rx_ring, ntu); @@ -402,10 +416,11 @@ static bool i40e_alloc_rx_buffers_fast_zc(struct i40e_ring *rx_ring, u16 count) * * Returns the received Rx buffer **/ -static struct i40e_rx_buffer *i40e_get_rx_buffer_zc(struct i40e_ring *rx_ring, - const unsigned int size) +static struct i40e_rx_buffer_zc *i40e_get_rx_buffer_zc( + struct i40e_ring *rx_ring, + const unsigned int size) { - struct i40e_rx_buffer *bi; + struct i40e_rx_buffer_zc *bi; bi = i40e_rx_bi(rx_ring, rx_ring->next_to_clean); @@ -427,10 +442,10 @@ static struct i40e_rx_buffer *i40e_get_rx_buffer_zc(struct i40e_ring *rx_ring, * recycle queue (next_to_alloc). **/ static void i40e_reuse_rx_buffer_zc(struct i40e_ring *rx_ring, - struct i40e_rx_buffer *old_bi) + struct i40e_rx_buffer_zc *old_bi) { - struct i40e_rx_buffer *new_bi = i40e_rx_bi(rx_ring, - rx_ring->next_to_alloc); + struct i40e_rx_buffer_zc *new_bi = i40e_rx_bi(rx_ring, + rx_ring->next_to_alloc); u16 nta = rx_ring->next_to_alloc; /* update, and store next to alloc */ @@ -452,7 +467,7 @@ static void i40e_reuse_rx_buffer_zc(struct i40e_ring *rx_ring, **/ void i40e_zca_free(struct zero_copy_allocator *alloc, unsigned long handle) { - struct i40e_rx_buffer *bi; + struct i40e_rx_buffer_zc *bi; struct i40e_ring *rx_ring; u64 hr, mask; u16 nta; @@ -490,7 +505,7 @@ void i40e_zca_free(struct zero_copy_allocator *alloc, unsigned long handle) * Returns the skb, or NULL on failure. **/ static struct sk_buff *i40e_construct_skb_zc(struct i40e_ring *rx_ring, - struct i40e_rx_buffer *bi, + struct i40e_rx_buffer_zc *bi, struct xdp_buff *xdp) { unsigned int metasize = xdp->data - xdp->data_meta; @@ -513,19 +528,6 @@ static struct sk_buff *i40e_construct_skb_zc(struct i40e_ring *rx_ring, return skb; } -/** - * i40e_inc_ntc: Advance the next_to_clean index - * @rx_ring: Rx ring - **/ -static void i40e_inc_ntc(struct i40e_ring *rx_ring) -{ - u32 ntc = rx_ring->next_to_clean + 1; - - ntc = (ntc < rx_ring->count) ? ntc : 0; - rx_ring->next_to_clean = ntc; - prefetch(I40E_RX_DESC(rx_ring, ntc)); -} - /** * i40e_clean_rx_irq_zc - Consumes Rx packets from the hardware ring * @rx_ring: Rx ring @@ -547,7 +549,7 @@ int i40e_clean_rx_irq_zc(struct i40e_ring *rx_ring, int budget) xdp.frame_sz = xsk_umem_xdp_frame_sz(umem); while (likely(total_rx_packets < (unsigned int)budget)) { - struct i40e_rx_buffer *bi; + struct i40e_rx_buffer_zc *bi; union i40e_rx_desc *rx_desc; unsigned int size; u64 qword; @@ -568,14 +570,18 @@ int i40e_clean_rx_irq_zc(struct i40e_ring *rx_ring, int budget) */ dma_rmb(); - bi = i40e_clean_programming_status(rx_ring, rx_desc, - qword); - if (unlikely(bi)) { + if (i40e_rx_is_programming_status(qword)) { + i40e_clean_programming_status(rx_ring, + rx_desc->raw.qword[0], + qword); + bi = i40e_rx_bi(rx_ring, rx_ring->next_to_clean); + i40e_inc_ntc(rx_ring); i40e_reuse_rx_buffer_zc(rx_ring, bi); cleaned_count++; continue; } + bi = i40e_rx_bi(rx_ring, rx_ring->next_to_clean); size = (qword & I40E_RXD_QW1_LENGTH_PBUF_MASK) >> I40E_RXD_QW1_LENGTH_PBUF_SHIFT; if (!size) @@ -832,7 +838,7 @@ void i40e_xsk_clean_rx_ring(struct i40e_ring *rx_ring) u16 i; for (i = 0; i < rx_ring->count; i++) { - struct i40e_rx_buffer *rx_bi = i40e_rx_bi(rx_ring, i); + struct i40e_rx_buffer_zc *rx_bi = i40e_rx_bi(rx_ring, i); if (!rx_bi->addr) continue; diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.h b/drivers/net/ethernet/intel/i40e/i40e_xsk.h index 9ed59c14eb55..f5e292c218ee 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.h +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.h @@ -19,5 +19,7 @@ int i40e_clean_rx_irq_zc(struct i40e_ring *rx_ring, int budget); bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring, int napi_budget); int i40e_xsk_wakeup(struct net_device *dev, u32 queue_id, u32 flags); +int i40e_alloc_rx_bi_zc(struct i40e_ring *rx_ring); +void i40e_clear_rx_bi_zc(struct i40e_ring *rx_ring); #endif /* _I40E_XSK_H_ */ From patchwork Wed May 20 19:20:56 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= X-Patchwork-Id: 1294655 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=BMGaITPK; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 49S2gc68s2z9sTC for ; Thu, 21 May 2020 05:22:04 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727011AbgETTWE (ORCPT ); Wed, 20 May 2020 15:22:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45734 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726560AbgETTWD (ORCPT ); Wed, 20 May 2020 15:22:03 -0400 Received: from mail-pl1-x644.google.com (mail-pl1-x644.google.com [IPv6:2607:f8b0:4864:20::644]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0440AC061A0E; Wed, 20 May 2020 12:22:03 -0700 (PDT) Received: by mail-pl1-x644.google.com with SMTP id u22so1734406plq.12; Wed, 20 May 2020 12:22:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=WWsnXbuy3qB4i4s/QctZchmNsMWt5QFLd5KYA2p1+K0=; b=BMGaITPK1WvJl+fkghOnq+xA10wyLMMzYajcxC60l7/EJi6vkOtpoX5hcmQAhjuy2m p9JW4YhMF7UmoQgdl8RTE0oUm6dxUfwS5tY3HJS0udGnDCO+M+LVag4jQjTULrrGojtA Gty+A4J67C2saTyVVPJAQ27+YKzJ159+y//xP7sJuql1tWYUiF5n5kQtZVI5jwDIZ2kQ 7eK4MFLcHzAZ5caE7w+dv/KASdr3lqkBeGY3RTc9L5dnOVWiE1+DaWKlyUptO+rcQJed sgW/fBjQkYOquTnQvHlkvEbIkTUYOfRMA7qoAeh9iDQ7Xs/NjrARwxBqTSkZI0/3H93z VWDg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=WWsnXbuy3qB4i4s/QctZchmNsMWt5QFLd5KYA2p1+K0=; b=ena3YgncLgCBdQR8pP6J3YFuC44tfRjXl1C9isuXVAEs0UYWAuTv7OUDqZ2V1hAsYs mVGAU9qIM/EMEsJYhxYCmhGiLwtJjgIsbxbcl0GwXu7uLqAaZsCnLs1+roxTvvzzSmtL iLPfpBbL42wXh+s2aKGZQ4OEvLlx+dLWAAA/5UhmJ5+yCdT0RNEqG7a1aqHt4akXTfSF YllSNAnUS0fuoGqcB4+lphO+i/mSiS4zIR9YF1UGnBdfDJxslIrv8DXcqLm8SbvAqSlV J0gkBjkrmZGi0ysZSpiaRw6vNsuWT1/zNr0SzoMNvHptNhKpUyyAOeZQ5RO4vbrjRIG1 AARw== X-Gm-Message-State: AOAM531YAMap5+NDK+OkyJTw7XzfXLleVnDgAl1O602KgHazOjgcFzWP pNpsqmUk9Fasm7+WWcMS6Gk= X-Google-Smtp-Source: ABdhPJy6uH7JFKNEMFNk763LA0Hv3+8GSiDbpMVAFHxstYITT55crEFrv1efQQOUx9BxG8xB3L3e4Q== X-Received: by 2002:a17:90a:68cd:: with SMTP id q13mr6984306pjj.177.1590002522255; Wed, 20 May 2020 12:22:02 -0700 (PDT) Received: from btopel-mobl.ger.intel.com (fmdmzpr03-ext.fm.intel.com. [192.55.54.38]) by smtp.gmail.com with ESMTPSA id 62sm2762424pfc.204.2020.05.20.12.21.57 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 May 2020 12:22:01 -0700 (PDT) From: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= To: ast@kernel.org, daniel@iogearbox.net, davem@davemloft.net, kuba@kernel.org, hawk@kernel.org, john.fastabend@gmail.com, netdev@vger.kernel.org, bpf@vger.kernel.org, magnus.karlsson@intel.com, jonathan.lemon@gmail.com, jeffrey.t.kirsher@intel.com Cc: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , maximmi@mellanox.com, maciej.fijalkowski@intel.com, intel-wired-lan@lists.osuosl.org Subject: [PATCH bpf-next v5 08/15] i40e, xsk: migrate to new MEM_TYPE_XSK_BUFF_POOL Date: Wed, 20 May 2020 21:20:56 +0200 Message-Id: <20200520192103.355233-9-bjorn.topel@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200520192103.355233-1-bjorn.topel@gmail.com> References: <20200520192103.355233-1-bjorn.topel@gmail.com> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Björn Töpel Remove MEM_TYPE_ZERO_COPY in favor of the new MEM_TYPE_XSK_BUFF_POOL APIs. The AF_XDP zero-copy rx_bi ring is now simply a struct xdp_buff pointer. v4->v5: Fixed "warning: Excess function parameter 'bi' description in 'i40e_construct_skb_zc'". (Jakub) Cc: intel-wired-lan@lists.osuosl.org Signed-off-by: Björn Töpel --- drivers/net/ethernet/intel/i40e/i40e_main.c | 19 +- drivers/net/ethernet/intel/i40e/i40e_txrx.h | 9 +- drivers/net/ethernet/intel/i40e/i40e_xsk.c | 353 ++------------------ drivers/net/ethernet/intel/i40e/i40e_xsk.h | 1 - 4 files changed, 47 insertions(+), 335 deletions(-) diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c index 3e1695bb8262..ea7395b391e5 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -3266,21 +3266,19 @@ static int i40e_configure_rx_ring(struct i40e_ring *ring) ret = i40e_alloc_rx_bi_zc(ring); if (ret) return ret; - ring->rx_buf_len = ring->xsk_umem->chunk_size_nohr - - XDP_PACKET_HEADROOM; + ring->rx_buf_len = xsk_umem_get_rx_frame_size(ring->xsk_umem); /* For AF_XDP ZC, we disallow packets to span on * multiple buffers, thus letting us skip that * handling in the fast-path. */ chain_len = 1; - ring->zca.free = i40e_zca_free; ret = xdp_rxq_info_reg_mem_model(&ring->xdp_rxq, - MEM_TYPE_ZERO_COPY, - &ring->zca); + MEM_TYPE_XSK_BUFF_POOL, + NULL); if (ret) return ret; dev_info(&vsi->back->pdev->dev, - "Registered XDP mem model MEM_TYPE_ZERO_COPY on Rx ring %d\n", + "Registered XDP mem model MEM_TYPE_XSK_BUFF_POOL on Rx ring %d\n", ring->queue_index); } else { @@ -3351,9 +3349,12 @@ static int i40e_configure_rx_ring(struct i40e_ring *ring) ring->tail = hw->hw_addr + I40E_QRX_TAIL(pf_q); writel(0, ring->tail); - ok = ring->xsk_umem ? - i40e_alloc_rx_buffers_zc(ring, I40E_DESC_UNUSED(ring)) : - !i40e_alloc_rx_buffers(ring, I40E_DESC_UNUSED(ring)); + if (ring->xsk_umem) { + xsk_buff_set_rxq_info(ring->xsk_umem, &ring->xdp_rxq); + ok = i40e_alloc_rx_buffers_zc(ring, I40E_DESC_UNUSED(ring)); + } else { + ok = !i40e_alloc_rx_buffers(ring, I40E_DESC_UNUSED(ring)); + } if (!ok) { /* Log this in case the user has forgotten to give the kernel * any buffers, even later in the application. diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.h b/drivers/net/ethernet/intel/i40e/i40e_txrx.h index d343498e8de5..5c255977fd58 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.h +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.h @@ -301,12 +301,6 @@ struct i40e_rx_buffer { __u16 pagecnt_bias; }; -struct i40e_rx_buffer_zc { - dma_addr_t dma; - void *addr; - u64 handle; -}; - struct i40e_queue_stats { u64 packets; u64 bytes; @@ -356,7 +350,7 @@ struct i40e_ring { union { struct i40e_tx_buffer *tx_bi; struct i40e_rx_buffer *rx_bi; - struct i40e_rx_buffer_zc *rx_bi_zc; + struct xdp_buff **rx_bi_zc; }; DECLARE_BITMAP(state, __I40E_RING_STATE_NBITS); u16 queue_index; /* Queue number of ring */ @@ -418,7 +412,6 @@ struct i40e_ring { struct i40e_channel *ch; struct xdp_rxq_info xdp_rxq; struct xdp_umem *xsk_umem; - struct zero_copy_allocator zca; /* ZC allocator anchor */ } ____cacheline_internodealigned_in_smp; static inline bool ring_uses_build_skb(struct i40e_ring *ring) diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c index 4fca52a30ea4..f3953744c505 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c @@ -23,68 +23,11 @@ void i40e_clear_rx_bi_zc(struct i40e_ring *rx_ring) sizeof(*rx_ring->rx_bi_zc) * rx_ring->count); } -static struct i40e_rx_buffer_zc *i40e_rx_bi(struct i40e_ring *rx_ring, u32 idx) +static struct xdp_buff **i40e_rx_bi(struct i40e_ring *rx_ring, u32 idx) { return &rx_ring->rx_bi_zc[idx]; } -/** - * i40e_xsk_umem_dma_map - DMA maps all UMEM memory for the netdev - * @vsi: Current VSI - * @umem: UMEM to DMA map - * - * Returns 0 on success, <0 on failure - **/ -static int i40e_xsk_umem_dma_map(struct i40e_vsi *vsi, struct xdp_umem *umem) -{ - struct i40e_pf *pf = vsi->back; - struct device *dev; - unsigned int i, j; - dma_addr_t dma; - - dev = &pf->pdev->dev; - for (i = 0; i < umem->npgs; i++) { - dma = dma_map_page_attrs(dev, umem->pgs[i], 0, PAGE_SIZE, - DMA_BIDIRECTIONAL, I40E_RX_DMA_ATTR); - if (dma_mapping_error(dev, dma)) - goto out_unmap; - - umem->pages[i].dma = dma; - } - - return 0; - -out_unmap: - for (j = 0; j < i; j++) { - dma_unmap_page_attrs(dev, umem->pages[i].dma, PAGE_SIZE, - DMA_BIDIRECTIONAL, I40E_RX_DMA_ATTR); - umem->pages[i].dma = 0; - } - - return -1; -} - -/** - * i40e_xsk_umem_dma_unmap - DMA unmaps all UMEM memory for the netdev - * @vsi: Current VSI - * @umem: UMEM to DMA map - **/ -static void i40e_xsk_umem_dma_unmap(struct i40e_vsi *vsi, struct xdp_umem *umem) -{ - struct i40e_pf *pf = vsi->back; - struct device *dev; - unsigned int i; - - dev = &pf->pdev->dev; - - for (i = 0; i < umem->npgs; i++) { - dma_unmap_page_attrs(dev, umem->pages[i].dma, PAGE_SIZE, - DMA_BIDIRECTIONAL, I40E_RX_DMA_ATTR); - - umem->pages[i].dma = 0; - } -} - /** * i40e_xsk_umem_enable - Enable/associate a UMEM to a certain ring/qid * @vsi: Current VSI @@ -97,7 +40,6 @@ static int i40e_xsk_umem_enable(struct i40e_vsi *vsi, struct xdp_umem *umem, u16 qid) { struct net_device *netdev = vsi->netdev; - struct xdp_umem_fq_reuse *reuseq; bool if_running; int err; @@ -111,13 +53,7 @@ static int i40e_xsk_umem_enable(struct i40e_vsi *vsi, struct xdp_umem *umem, qid >= netdev->real_num_tx_queues) return -EINVAL; - reuseq = xsk_reuseq_prepare(vsi->rx_rings[0]->count); - if (!reuseq) - return -ENOMEM; - - xsk_reuseq_free(xsk_reuseq_swap(umem, reuseq)); - - err = i40e_xsk_umem_dma_map(vsi, umem); + err = xsk_buff_dma_map(umem, &vsi->back->pdev->dev, I40E_RX_DMA_ATTR); if (err) return err; @@ -170,7 +106,7 @@ static int i40e_xsk_umem_disable(struct i40e_vsi *vsi, u16 qid) } clear_bit(qid, vsi->af_xdp_zc_qps); - i40e_xsk_umem_dma_unmap(vsi, umem); + xsk_buff_dma_unmap(umem, I40E_RX_DMA_ATTR); if (if_running) { err = i40e_queue_pair_enable(vsi, qid); @@ -209,11 +145,9 @@ int i40e_xsk_umem_setup(struct i40e_vsi *vsi, struct xdp_umem *umem, **/ static int i40e_run_xdp_zc(struct i40e_ring *rx_ring, struct xdp_buff *xdp) { - struct xdp_umem *umem = rx_ring->xsk_umem; int err, result = I40E_XDP_PASS; struct i40e_ring *xdp_ring; struct bpf_prog *xdp_prog; - u64 offset; u32 act; rcu_read_lock(); @@ -222,9 +156,6 @@ static int i40e_run_xdp_zc(struct i40e_ring *rx_ring, struct xdp_buff *xdp) */ xdp_prog = READ_ONCE(rx_ring->xdp_prog); act = bpf_prog_run_xdp(xdp_prog, xdp); - offset = xdp->data - xdp->data_hard_start; - - xdp->handle = xsk_umem_adjust_offset(umem, xdp->handle, offset); switch (act) { case XDP_PASS: @@ -251,107 +182,26 @@ static int i40e_run_xdp_zc(struct i40e_ring *rx_ring, struct xdp_buff *xdp) return result; } -/** - * i40e_alloc_buffer_zc - Allocates an i40e_rx_buffer_zc - * @rx_ring: Rx ring - * @bi: Rx buffer to populate - * - * This function allocates an Rx buffer. The buffer can come from fill - * queue, or via the recycle queue (next_to_alloc). - * - * Returns true for a successful allocation, false otherwise - **/ -static bool i40e_alloc_buffer_zc(struct i40e_ring *rx_ring, - struct i40e_rx_buffer_zc *bi) -{ - struct xdp_umem *umem = rx_ring->xsk_umem; - void *addr = bi->addr; - u64 handle, hr; - - if (addr) { - rx_ring->rx_stats.page_reuse_count++; - return true; - } - - if (!xsk_umem_peek_addr(umem, &handle)) { - rx_ring->rx_stats.alloc_page_failed++; - return false; - } - - hr = umem->headroom + XDP_PACKET_HEADROOM; - - bi->dma = xdp_umem_get_dma(umem, handle); - bi->dma += hr; - - bi->addr = xdp_umem_get_data(umem, handle); - bi->addr += hr; - - bi->handle = xsk_umem_adjust_offset(umem, handle, umem->headroom); - - xsk_umem_release_addr(umem); - return true; -} - -/** - * i40e_alloc_buffer_slow_zc - Allocates an i40e_rx_buffer_zc - * @rx_ring: Rx ring - * @bi: Rx buffer to populate - * - * This function allocates an Rx buffer. The buffer can come from fill - * queue, or via the reuse queue. - * - * Returns true for a successful allocation, false otherwise - **/ -static bool i40e_alloc_buffer_slow_zc(struct i40e_ring *rx_ring, - struct i40e_rx_buffer_zc *bi) -{ - struct xdp_umem *umem = rx_ring->xsk_umem; - u64 handle, hr; - - if (!xsk_umem_peek_addr_rq(umem, &handle)) { - rx_ring->rx_stats.alloc_page_failed++; - return false; - } - - handle &= rx_ring->xsk_umem->chunk_mask; - - hr = umem->headroom + XDP_PACKET_HEADROOM; - - bi->dma = xdp_umem_get_dma(umem, handle); - bi->dma += hr; - - bi->addr = xdp_umem_get_data(umem, handle); - bi->addr += hr; - - bi->handle = xsk_umem_adjust_offset(umem, handle, umem->headroom); - - xsk_umem_release_addr_rq(umem); - return true; -} - -static __always_inline bool -__i40e_alloc_rx_buffers_zc(struct i40e_ring *rx_ring, u16 count, - bool alloc(struct i40e_ring *rx_ring, - struct i40e_rx_buffer_zc *bi)) +bool i40e_alloc_rx_buffers_zc(struct i40e_ring *rx_ring, u16 count) { u16 ntu = rx_ring->next_to_use; union i40e_rx_desc *rx_desc; - struct i40e_rx_buffer_zc *bi; + struct xdp_buff **bi, *xdp; + dma_addr_t dma; bool ok = true; rx_desc = I40E_RX_DESC(rx_ring, ntu); bi = i40e_rx_bi(rx_ring, ntu); do { - if (!alloc(rx_ring, bi)) { + xdp = xsk_buff_alloc(rx_ring->xsk_umem); + if (!xdp) { ok = false; goto no_buffers; } - - dma_sync_single_range_for_device(rx_ring->dev, bi->dma, 0, - rx_ring->rx_buf_len, - DMA_BIDIRECTIONAL); - - rx_desc->read.pkt_addr = cpu_to_le64(bi->dma); + *bi = xdp; + dma = xsk_buff_xdp_get_dma(xdp); + rx_desc->read.pkt_addr = cpu_to_le64(dma); + rx_desc->read.hdr_addr = 0; rx_desc++; bi++; @@ -363,7 +213,6 @@ __i40e_alloc_rx_buffers_zc(struct i40e_ring *rx_ring, u16 count, ntu = 0; } - rx_desc->wb.qword1.status_error_len = 0; count--; } while (count); @@ -374,130 +223,9 @@ __i40e_alloc_rx_buffers_zc(struct i40e_ring *rx_ring, u16 count, return ok; } -/** - * i40e_alloc_rx_buffers_zc - Allocates a number of Rx buffers - * @rx_ring: Rx ring - * @count: The number of buffers to allocate - * - * This function allocates a number of Rx buffers from the reuse queue - * or fill ring and places them on the Rx ring. - * - * Returns true for a successful allocation, false otherwise - **/ -bool i40e_alloc_rx_buffers_zc(struct i40e_ring *rx_ring, u16 count) -{ - return __i40e_alloc_rx_buffers_zc(rx_ring, count, - i40e_alloc_buffer_slow_zc); -} - -/** - * i40e_alloc_rx_buffers_fast_zc - Allocates a number of Rx buffers - * @rx_ring: Rx ring - * @count: The number of buffers to allocate - * - * This function allocates a number of Rx buffers from the fill ring - * or the internal recycle mechanism and places them on the Rx ring. - * - * Returns true for a successful allocation, false otherwise - **/ -static bool i40e_alloc_rx_buffers_fast_zc(struct i40e_ring *rx_ring, u16 count) -{ - return __i40e_alloc_rx_buffers_zc(rx_ring, count, - i40e_alloc_buffer_zc); -} - -/** - * i40e_get_rx_buffer_zc - Return the current Rx buffer - * @rx_ring: Rx ring - * @size: The size of the rx buffer (read from descriptor) - * - * This function returns the current, received Rx buffer, and also - * does DMA synchronization. the Rx ring. - * - * Returns the received Rx buffer - **/ -static struct i40e_rx_buffer_zc *i40e_get_rx_buffer_zc( - struct i40e_ring *rx_ring, - const unsigned int size) -{ - struct i40e_rx_buffer_zc *bi; - - bi = i40e_rx_bi(rx_ring, rx_ring->next_to_clean); - - /* we are reusing so sync this buffer for CPU use */ - dma_sync_single_range_for_cpu(rx_ring->dev, - bi->dma, 0, - size, - DMA_BIDIRECTIONAL); - - return bi; -} - -/** - * i40e_reuse_rx_buffer_zc - Recycle an Rx buffer - * @rx_ring: Rx ring - * @old_bi: The Rx buffer to recycle - * - * This function recycles a finished Rx buffer, and places it on the - * recycle queue (next_to_alloc). - **/ -static void i40e_reuse_rx_buffer_zc(struct i40e_ring *rx_ring, - struct i40e_rx_buffer_zc *old_bi) -{ - struct i40e_rx_buffer_zc *new_bi = i40e_rx_bi(rx_ring, - rx_ring->next_to_alloc); - u16 nta = rx_ring->next_to_alloc; - - /* update, and store next to alloc */ - nta++; - rx_ring->next_to_alloc = (nta < rx_ring->count) ? nta : 0; - - /* transfer page from old buffer to new buffer */ - new_bi->dma = old_bi->dma; - new_bi->addr = old_bi->addr; - new_bi->handle = old_bi->handle; - - old_bi->addr = NULL; -} - -/** - * i40e_zca_free - Free callback for MEM_TYPE_ZERO_COPY allocations - * @alloc: Zero-copy allocator - * @handle: Buffer handle - **/ -void i40e_zca_free(struct zero_copy_allocator *alloc, unsigned long handle) -{ - struct i40e_rx_buffer_zc *bi; - struct i40e_ring *rx_ring; - u64 hr, mask; - u16 nta; - - rx_ring = container_of(alloc, struct i40e_ring, zca); - hr = rx_ring->xsk_umem->headroom + XDP_PACKET_HEADROOM; - mask = rx_ring->xsk_umem->chunk_mask; - - nta = rx_ring->next_to_alloc; - bi = i40e_rx_bi(rx_ring, nta); - - nta++; - rx_ring->next_to_alloc = (nta < rx_ring->count) ? nta : 0; - - handle &= mask; - - bi->dma = xdp_umem_get_dma(rx_ring->xsk_umem, handle); - bi->dma += hr; - - bi->addr = xdp_umem_get_data(rx_ring->xsk_umem, handle); - bi->addr += hr; - - bi->handle = xsk_umem_adjust_offset(rx_ring->xsk_umem, (u64)handle, - rx_ring->xsk_umem->headroom); -} - /** * i40e_construct_skb_zc - Create skbufff from zero-copy Rx buffer * @rx_ring: Rx ring - * @bi: Rx buffer * @xdp: xdp_buff * * This functions allocates a new skb from a zero-copy Rx buffer. @@ -505,7 +233,6 @@ void i40e_zca_free(struct zero_copy_allocator *alloc, unsigned long handle) * Returns the skb, or NULL on failure. **/ static struct sk_buff *i40e_construct_skb_zc(struct i40e_ring *rx_ring, - struct i40e_rx_buffer_zc *bi, struct xdp_buff *xdp) { unsigned int metasize = xdp->data - xdp->data_meta; @@ -524,7 +251,7 @@ static struct sk_buff *i40e_construct_skb_zc(struct i40e_ring *rx_ring, if (metasize) skb_metadata_set(skb, metasize); - i40e_reuse_rx_buffer_zc(rx_ring, bi); + xsk_buff_free(xdp); return skb; } @@ -539,25 +266,20 @@ int i40e_clean_rx_irq_zc(struct i40e_ring *rx_ring, int budget) { unsigned int total_rx_bytes = 0, total_rx_packets = 0; u16 cleaned_count = I40E_DESC_UNUSED(rx_ring); - struct xdp_umem *umem = rx_ring->xsk_umem; unsigned int xdp_res, xdp_xmit = 0; bool failure = false; struct sk_buff *skb; - struct xdp_buff xdp; - - xdp.rxq = &rx_ring->xdp_rxq; - xdp.frame_sz = xsk_umem_xdp_frame_sz(umem); while (likely(total_rx_packets < (unsigned int)budget)) { - struct i40e_rx_buffer_zc *bi; union i40e_rx_desc *rx_desc; + struct xdp_buff **bi; unsigned int size; u64 qword; if (cleaned_count >= I40E_RX_BUFFER_WRITE) { failure = failure || - !i40e_alloc_rx_buffers_fast_zc(rx_ring, - cleaned_count); + !i40e_alloc_rx_buffers_zc(rx_ring, + cleaned_count); cleaned_count = 0; } @@ -575,9 +297,10 @@ int i40e_clean_rx_irq_zc(struct i40e_ring *rx_ring, int budget) rx_desc->raw.qword[0], qword); bi = i40e_rx_bi(rx_ring, rx_ring->next_to_clean); - i40e_inc_ntc(rx_ring); - i40e_reuse_rx_buffer_zc(rx_ring, bi); + xsk_buff_free(*bi); + *bi = NULL; cleaned_count++; + i40e_inc_ntc(rx_ring); continue; } @@ -587,22 +310,18 @@ int i40e_clean_rx_irq_zc(struct i40e_ring *rx_ring, int budget) if (!size) break; - bi = i40e_get_rx_buffer_zc(rx_ring, size); - xdp.data = bi->addr; - xdp.data_meta = xdp.data; - xdp.data_hard_start = xdp.data - XDP_PACKET_HEADROOM; - xdp.data_end = xdp.data + size; - xdp.handle = bi->handle; + bi = i40e_rx_bi(rx_ring, rx_ring->next_to_clean); + (*bi)->data_end = (*bi)->data + size; + xsk_buff_dma_sync_for_cpu(*bi); - xdp_res = i40e_run_xdp_zc(rx_ring, &xdp); + xdp_res = i40e_run_xdp_zc(rx_ring, *bi); if (xdp_res) { - if (xdp_res & (I40E_XDP_TX | I40E_XDP_REDIR)) { + if (xdp_res & (I40E_XDP_TX | I40E_XDP_REDIR)) xdp_xmit |= xdp_res; - bi->addr = NULL; - } else { - i40e_reuse_rx_buffer_zc(rx_ring, bi); - } + else + xsk_buff_free(*bi); + *bi = NULL; total_rx_bytes += size; total_rx_packets++; @@ -618,7 +337,8 @@ int i40e_clean_rx_irq_zc(struct i40e_ring *rx_ring, int budget) * BIT(I40E_RXD_QW1_ERROR_SHIFT). This is due to that * SBP is *not* set in PRT_SBPVSI (default not set). */ - skb = i40e_construct_skb_zc(rx_ring, bi, &xdp); + skb = i40e_construct_skb_zc(rx_ring, *bi); + *bi = NULL; if (!skb) { rx_ring->rx_stats.alloc_buff_failed++; break; @@ -676,10 +396,9 @@ static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget) if (!xsk_umem_consume_tx(xdp_ring->xsk_umem, &desc)) break; - dma = xdp_umem_get_dma(xdp_ring->xsk_umem, desc.addr); - - dma_sync_single_for_device(xdp_ring->dev, dma, desc.len, - DMA_BIDIRECTIONAL); + dma = xsk_buff_raw_get_dma(xdp_ring->xsk_umem, desc.addr); + xsk_buff_raw_dma_sync_for_device(xdp_ring->xsk_umem, dma, + desc.len); tx_bi = &xdp_ring->tx_bi[xdp_ring->next_to_use]; tx_bi->bytecount = desc.len; @@ -838,13 +557,13 @@ void i40e_xsk_clean_rx_ring(struct i40e_ring *rx_ring) u16 i; for (i = 0; i < rx_ring->count; i++) { - struct i40e_rx_buffer_zc *rx_bi = i40e_rx_bi(rx_ring, i); + struct xdp_buff *rx_bi = *i40e_rx_bi(rx_ring, i); - if (!rx_bi->addr) + if (!rx_bi) continue; - xsk_umem_fq_reuse(rx_ring->xsk_umem, rx_bi->handle); - rx_bi->addr = NULL; + xsk_buff_free(rx_bi); + rx_bi = NULL; } } diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.h b/drivers/net/ethernet/intel/i40e/i40e_xsk.h index f5e292c218ee..ea919a7d60ec 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.h +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.h @@ -12,7 +12,6 @@ int i40e_queue_pair_disable(struct i40e_vsi *vsi, int queue_pair); int i40e_queue_pair_enable(struct i40e_vsi *vsi, int queue_pair); int i40e_xsk_umem_setup(struct i40e_vsi *vsi, struct xdp_umem *umem, u16 qid); -void i40e_zca_free(struct zero_copy_allocator *alloc, unsigned long handle); bool i40e_alloc_rx_buffers_zc(struct i40e_ring *rx_ring, u16 cleaned_count); int i40e_clean_rx_irq_zc(struct i40e_ring *rx_ring, int budget); From patchwork Wed May 20 19:20:57 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= X-Patchwork-Id: 1294657 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=kPutYhKX; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 49S2gj6bXJz9sSF for ; Thu, 21 May 2020 05:22:09 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726748AbgETTWJ (ORCPT ); Wed, 20 May 2020 15:22:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45754 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726560AbgETTWI (ORCPT ); Wed, 20 May 2020 15:22:08 -0400 Received: from mail-pf1-x444.google.com (mail-pf1-x444.google.com [IPv6:2607:f8b0:4864:20::444]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B0384C061A0E; Wed, 20 May 2020 12:22:08 -0700 (PDT) Received: by mail-pf1-x444.google.com with SMTP id y198so2053415pfb.4; Wed, 20 May 2020 12:22:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=nk0bdDqtdV1QcTA60jsKu+zUHamRghokn4yIOEdpje0=; b=kPutYhKXHsuoexpWbHiPlNbOvYqDzk1xhXmsdbS8uzldHuipM+6uhBwSVjKLzQcFqu qH+gg5bZzqeDFtucgVCiMAYrq385+cwOoAZga4+i8AqTsbS/x1eBz236msOSNPgB5zuo j3yBP26CbxrBbv+O/D5VEFirOMqhAVDkJ3C8WIB98jJZx84LZxl9YFT074Sy0JhDs9YN O9TAucxxT++vZGk10PV/KUC1LqkHrmc0YSgbe2ppP2QzYAUsRoOnyM2PX8tV0Jvi0WJl /XLyrFjx4n+1XGDwSesetkjWYdeEFiwCQFz2bDkn5Bh+/4kRhL8fSERx1r36IkuUl78Q rTfg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=nk0bdDqtdV1QcTA60jsKu+zUHamRghokn4yIOEdpje0=; b=fGzvO6RilQF/ouDUmOIdNM0nXTA0KibNzjCLuhvJKrte3VTpUrfyKj+/w1a+rWUqqc WAhOM9eoJBDpaLPWfhyCgxREz1r10WLFODigCqyyP4AUXbnLk066+KyUunVd6CCrcg6q NADLPmeM2S3N/OlZRI8mBEhleFZtCJnAUC5fBf6jxRR1jlcEsK8IFxcXLHNt9KFIbVqK lLl2oWAgh8YDOJTx28qkDmg2d9agaM7EvklQhtlujMJPZcDsMKt2027KtfzQY0LNX1a8 DD8fckUJNv0BlIJAz96c4+UoHgssoZjqizcHh3ao3QBIEm+S7s/zDcqpyWP8Gb66k/vv TLRQ== X-Gm-Message-State: AOAM530WTxZQQDZzpXnHGKZMyJy+Z0qWvjvBfwYl4wt6mVNav0oQp8AK Q7KjGtYgyEt40DL2gOlM2WwnI3+Pg1Fpfygf X-Google-Smtp-Source: ABdhPJyIVFP67k7ozgM1RbfgcmWgrLoXt8WSVG+QbuvzXmRpiArVYTnUZN7At8rpevgTtTyDaQhB5w== X-Received: by 2002:a63:2216:: with SMTP id i22mr5228978pgi.359.1590002528088; Wed, 20 May 2020 12:22:08 -0700 (PDT) Received: from btopel-mobl.ger.intel.com (fmdmzpr03-ext.fm.intel.com. [192.55.54.38]) by smtp.gmail.com with ESMTPSA id 62sm2762424pfc.204.2020.05.20.12.22.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 May 2020 12:22:07 -0700 (PDT) From: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= To: ast@kernel.org, daniel@iogearbox.net, davem@davemloft.net, kuba@kernel.org, hawk@kernel.org, john.fastabend@gmail.com, netdev@vger.kernel.org, bpf@vger.kernel.org, magnus.karlsson@intel.com, jonathan.lemon@gmail.com, jeffrey.t.kirsher@intel.com Cc: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , maximmi@mellanox.com, maciej.fijalkowski@intel.com, intel-wired-lan@lists.osuosl.org Subject: [PATCH bpf-next v5 09/15] ice, xsk: migrate to new MEM_TYPE_XSK_BUFF_POOL Date: Wed, 20 May 2020 21:20:57 +0200 Message-Id: <20200520192103.355233-10-bjorn.topel@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200520192103.355233-1-bjorn.topel@gmail.com> References: <20200520192103.355233-1-bjorn.topel@gmail.com> MIME-Version: 1.0 Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org From: Björn Töpel Remove MEM_TYPE_ZERO_COPY in favor of the new MEM_TYPE_XSK_BUFF_POOL APIs. v4->v5: Fixed "warning: Excess function parameter 'alloc' description in 'ice_alloc_rx_bufs_zc'" and "warning: Excess function parameter 'xdp' description in 'ice_construct_skb_zc'". (Jakub) Cc: intel-wired-lan@lists.osuosl.org Signed-off-by: Maciej Fijalkowski Signed-off-by: Björn Töpel --- drivers/net/ethernet/intel/ice/ice_base.c | 16 +- drivers/net/ethernet/intel/ice/ice_txrx.h | 8 +- drivers/net/ethernet/intel/ice/ice_xsk.c | 376 +++------------------- drivers/net/ethernet/intel/ice/ice_xsk.h | 13 +- 4 files changed, 54 insertions(+), 359 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_base.c b/drivers/net/ethernet/intel/ice/ice_base.c index a19cd6f5436b..433eb72b1c85 100644 --- a/drivers/net/ethernet/intel/ice/ice_base.c +++ b/drivers/net/ethernet/intel/ice/ice_base.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 /* Copyright (c) 2019, Intel Corporation. */ +#include #include "ice_base.h" #include "ice_dcb_lib.h" @@ -308,24 +309,23 @@ int ice_setup_rx_ctx(struct ice_ring *ring) if (ring->xsk_umem) { xdp_rxq_info_unreg_mem_model(&ring->xdp_rxq); - ring->rx_buf_len = ring->xsk_umem->chunk_size_nohr - - XDP_PACKET_HEADROOM; + ring->rx_buf_len = + xsk_umem_get_rx_frame_size(ring->xsk_umem); /* For AF_XDP ZC, we disallow packets to span on * multiple buffers, thus letting us skip that * handling in the fast-path. */ chain_len = 1; - ring->zca.free = ice_zca_free; err = xdp_rxq_info_reg_mem_model(&ring->xdp_rxq, - MEM_TYPE_ZERO_COPY, - &ring->zca); + MEM_TYPE_XSK_BUFF_POOL, + NULL); if (err) return err; + xsk_buff_set_rxq_info(ring->xsk_umem, &ring->xdp_rxq); - dev_info(ice_pf_to_dev(vsi->back), "Registered XDP mem model MEM_TYPE_ZERO_COPY on Rx ring %d\n", + dev_info(ice_pf_to_dev(vsi->back), "Registered XDP mem model MEM_TYPE_XSK_BUFF_POOL on Rx ring %d\n", ring->q_index); } else { - ring->zca.free = NULL; if (!xdp_rxq_info_is_reg(&ring->xdp_rxq)) /* coverity[check_return] */ xdp_rxq_info_reg(&ring->xdp_rxq, @@ -426,7 +426,7 @@ int ice_setup_rx_ctx(struct ice_ring *ring) writel(0, ring->tail); err = ring->xsk_umem ? - ice_alloc_rx_bufs_slow_zc(ring, ICE_DESC_UNUSED(ring)) : + ice_alloc_rx_bufs_zc(ring, ICE_DESC_UNUSED(ring)) : ice_alloc_rx_bufs(ring, ICE_DESC_UNUSED(ring)); if (err) dev_info(ice_pf_to_dev(vsi->back), "Failed allocate some buffers on %sRx ring %d (pf_q %d)\n", diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.h b/drivers/net/ethernet/intel/ice/ice_txrx.h index 7ee00a128663..d0fd2173854f 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx.h +++ b/drivers/net/ethernet/intel/ice/ice_txrx.h @@ -155,17 +155,16 @@ struct ice_tx_offload_params { }; struct ice_rx_buf { - struct sk_buff *skb; - dma_addr_t dma; union { struct { + struct sk_buff *skb; + dma_addr_t dma; struct page *page; unsigned int page_offset; u16 pagecnt_bias; }; struct { - void *addr; - u64 handle; + struct xdp_buff *xdp; }; }; }; @@ -289,7 +288,6 @@ struct ice_ring { struct rcu_head rcu; /* to avoid race on free */ struct bpf_prog *xdp_prog; struct xdp_umem *xsk_umem; - struct zero_copy_allocator zca; /* CL3 - 3rd cacheline starts here */ struct xdp_rxq_info xdp_rxq; /* CLX - the below items are only accessed infrequently and should be diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c index 70e204307a93..a73f6c3c70a4 100644 --- a/drivers/net/ethernet/intel/ice/ice_xsk.c +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c @@ -279,28 +279,6 @@ static int ice_xsk_alloc_umems(struct ice_vsi *vsi) return 0; } -/** - * ice_xsk_add_umem - add a UMEM region for XDP sockets - * @vsi: VSI to which the UMEM will be added - * @umem: pointer to a requested UMEM region - * @qid: queue ID - * - * Returns 0 on success, negative on error - */ -static int ice_xsk_add_umem(struct ice_vsi *vsi, struct xdp_umem *umem, u16 qid) -{ - int err; - - err = ice_xsk_alloc_umems(vsi); - if (err) - return err; - - vsi->xsk_umems[qid] = umem; - vsi->num_xsk_umems_used++; - - return 0; -} - /** * ice_xsk_remove_umem - Remove an UMEM for a certain ring/qid * @vsi: VSI from which the VSI will be removed @@ -318,65 +296,6 @@ static void ice_xsk_remove_umem(struct ice_vsi *vsi, u16 qid) } } -/** - * ice_xsk_umem_dma_map - DMA map UMEM region for XDP sockets - * @vsi: VSI to map the UMEM region - * @umem: UMEM to map - * - * Returns 0 on success, negative on error - */ -static int ice_xsk_umem_dma_map(struct ice_vsi *vsi, struct xdp_umem *umem) -{ - struct ice_pf *pf = vsi->back; - struct device *dev; - unsigned int i; - - dev = ice_pf_to_dev(pf); - for (i = 0; i < umem->npgs; i++) { - dma_addr_t dma = dma_map_page_attrs(dev, umem->pgs[i], 0, - PAGE_SIZE, - DMA_BIDIRECTIONAL, - ICE_RX_DMA_ATTR); - if (dma_mapping_error(dev, dma)) { - dev_dbg(dev, "XSK UMEM DMA mapping error on page num %d\n", - i); - goto out_unmap; - } - - umem->pages[i].dma = dma; - } - - return 0; - -out_unmap: - for (; i > 0; i--) { - dma_unmap_page_attrs(dev, umem->pages[i].dma, PAGE_SIZE, - DMA_BIDIRECTIONAL, ICE_RX_DMA_ATTR); - umem->pages[i].dma = 0; - } - - return -EFAULT; -} - -/** - * ice_xsk_umem_dma_unmap - DMA unmap UMEM region for XDP sockets - * @vsi: VSI from which the UMEM will be unmapped - * @umem: UMEM to unmap - */ -static void ice_xsk_umem_dma_unmap(struct ice_vsi *vsi, struct xdp_umem *umem) -{ - struct ice_pf *pf = vsi->back; - struct device *dev; - unsigned int i; - - dev = ice_pf_to_dev(pf); - for (i = 0; i < umem->npgs; i++) { - dma_unmap_page_attrs(dev, umem->pages[i].dma, PAGE_SIZE, - DMA_BIDIRECTIONAL, ICE_RX_DMA_ATTR); - - umem->pages[i].dma = 0; - } -} /** * ice_xsk_umem_disable - disable a UMEM region @@ -391,7 +310,7 @@ static int ice_xsk_umem_disable(struct ice_vsi *vsi, u16 qid) !vsi->xsk_umems[qid]) return -EINVAL; - ice_xsk_umem_dma_unmap(vsi, vsi->xsk_umems[qid]); + xsk_buff_dma_unmap(vsi->xsk_umems[qid], ICE_RX_DMA_ATTR); ice_xsk_remove_umem(vsi, qid); return 0; @@ -408,7 +327,6 @@ static int ice_xsk_umem_disable(struct ice_vsi *vsi, u16 qid) static int ice_xsk_umem_enable(struct ice_vsi *vsi, struct xdp_umem *umem, u16 qid) { - struct xdp_umem_fq_reuse *reuseq; int err; if (vsi->type != ICE_VSI_PF) @@ -419,20 +337,18 @@ ice_xsk_umem_enable(struct ice_vsi *vsi, struct xdp_umem *umem, u16 qid) if (qid >= vsi->num_xsk_umems) return -EINVAL; + err = ice_xsk_alloc_umems(vsi); + if (err) + return err; + if (vsi->xsk_umems && vsi->xsk_umems[qid]) return -EBUSY; - reuseq = xsk_reuseq_prepare(vsi->rx_rings[0]->count); - if (!reuseq) - return -ENOMEM; - - xsk_reuseq_free(xsk_reuseq_swap(umem, reuseq)); - - err = ice_xsk_umem_dma_map(vsi, umem); - if (err) - return err; + vsi->xsk_umems[qid] = umem; + vsi->num_xsk_umems_used++; - err = ice_xsk_add_umem(vsi, umem, qid); + err = xsk_buff_dma_map(vsi->xsk_umems[qid], ice_pf_to_dev(vsi->back), + ICE_RX_DMA_ATTR); if (err) return err; @@ -483,138 +399,23 @@ int ice_xsk_umem_setup(struct ice_vsi *vsi, struct xdp_umem *umem, u16 qid) return ret; } -/** - * ice_zca_free - Callback for MEM_TYPE_ZERO_COPY allocations - * @zca: zero-cpoy allocator - * @handle: Buffer handle - */ -void ice_zca_free(struct zero_copy_allocator *zca, unsigned long handle) -{ - struct ice_rx_buf *rx_buf; - struct ice_ring *rx_ring; - struct xdp_umem *umem; - u64 hr, mask; - u16 nta; - - rx_ring = container_of(zca, struct ice_ring, zca); - umem = rx_ring->xsk_umem; - hr = umem->headroom + XDP_PACKET_HEADROOM; - - mask = umem->chunk_mask; - - nta = rx_ring->next_to_alloc; - rx_buf = &rx_ring->rx_buf[nta]; - - nta++; - rx_ring->next_to_alloc = (nta < rx_ring->count) ? nta : 0; - - handle &= mask; - - rx_buf->dma = xdp_umem_get_dma(umem, handle); - rx_buf->dma += hr; - - rx_buf->addr = xdp_umem_get_data(umem, handle); - rx_buf->addr += hr; - - rx_buf->handle = (u64)handle + umem->headroom; -} - -/** - * ice_alloc_buf_fast_zc - Retrieve buffer address from XDP umem - * @rx_ring: ring with an xdp_umem bound to it - * @rx_buf: buffer to which xsk page address will be assigned - * - * This function allocates an Rx buffer in the hot path. - * The buffer can come from fill queue or recycle queue. - * - * Returns true if an assignment was successful, false if not. - */ -static __always_inline bool -ice_alloc_buf_fast_zc(struct ice_ring *rx_ring, struct ice_rx_buf *rx_buf) -{ - struct xdp_umem *umem = rx_ring->xsk_umem; - void *addr = rx_buf->addr; - u64 handle, hr; - - if (addr) { - rx_ring->rx_stats.page_reuse_count++; - return true; - } - - if (!xsk_umem_peek_addr(umem, &handle)) { - rx_ring->rx_stats.alloc_page_failed++; - return false; - } - - hr = umem->headroom + XDP_PACKET_HEADROOM; - - rx_buf->dma = xdp_umem_get_dma(umem, handle); - rx_buf->dma += hr; - - rx_buf->addr = xdp_umem_get_data(umem, handle); - rx_buf->addr += hr; - - rx_buf->handle = handle + umem->headroom; - - xsk_umem_release_addr(umem); - return true; -} - -/** - * ice_alloc_buf_slow_zc - Retrieve buffer address from XDP umem - * @rx_ring: ring with an xdp_umem bound to it - * @rx_buf: buffer to which xsk page address will be assigned - * - * This function allocates an Rx buffer in the slow path. - * The buffer can come from fill queue or recycle queue. - * - * Returns true if an assignment was successful, false if not. - */ -static __always_inline bool -ice_alloc_buf_slow_zc(struct ice_ring *rx_ring, struct ice_rx_buf *rx_buf) -{ - struct xdp_umem *umem = rx_ring->xsk_umem; - u64 handle, headroom; - - if (!xsk_umem_peek_addr_rq(umem, &handle)) { - rx_ring->rx_stats.alloc_page_failed++; - return false; - } - - handle &= umem->chunk_mask; - headroom = umem->headroom + XDP_PACKET_HEADROOM; - - rx_buf->dma = xdp_umem_get_dma(umem, handle); - rx_buf->dma += headroom; - - rx_buf->addr = xdp_umem_get_data(umem, handle); - rx_buf->addr += headroom; - - rx_buf->handle = handle + umem->headroom; - - xsk_umem_release_addr_rq(umem); - return true; -} - /** * ice_alloc_rx_bufs_zc - allocate a number of Rx buffers * @rx_ring: Rx ring * @count: The number of buffers to allocate - * @alloc: the function pointer to call for allocation * * This function allocates a number of Rx buffers from the fill ring * or the internal recycle mechanism and places them on the Rx ring. * * Returns false if all allocations were successful, true if any fail. */ -static bool -ice_alloc_rx_bufs_zc(struct ice_ring *rx_ring, int count, - bool (*alloc)(struct ice_ring *, struct ice_rx_buf *)) +bool ice_alloc_rx_bufs_zc(struct ice_ring *rx_ring, u16 count) { union ice_32b_rx_flex_desc *rx_desc; u16 ntu = rx_ring->next_to_use; struct ice_rx_buf *rx_buf; bool ret = false; + dma_addr_t dma; if (!count) return false; @@ -623,16 +424,14 @@ ice_alloc_rx_bufs_zc(struct ice_ring *rx_ring, int count, rx_buf = &rx_ring->rx_buf[ntu]; do { - if (!alloc(rx_ring, rx_buf)) { + rx_buf->xdp = xsk_buff_alloc(rx_ring->xsk_umem); + if (!rx_buf->xdp) { ret = true; break; } - dma_sync_single_range_for_device(rx_ring->dev, rx_buf->dma, 0, - rx_ring->rx_buf_len, - DMA_BIDIRECTIONAL); - - rx_desc->read.pkt_addr = cpu_to_le64(rx_buf->dma); + dma = xsk_buff_xdp_get_dma(rx_buf->xdp); + rx_desc->read.pkt_addr = cpu_to_le64(dma); rx_desc->wb.status_error0 = 0; rx_desc++; @@ -652,32 +451,6 @@ ice_alloc_rx_bufs_zc(struct ice_ring *rx_ring, int count, return ret; } -/** - * ice_alloc_rx_bufs_fast_zc - allocate zero copy bufs in the hot path - * @rx_ring: Rx ring - * @count: number of bufs to allocate - * - * Returns false on success, true on failure. - */ -static bool ice_alloc_rx_bufs_fast_zc(struct ice_ring *rx_ring, u16 count) -{ - return ice_alloc_rx_bufs_zc(rx_ring, count, - ice_alloc_buf_fast_zc); -} - -/** - * ice_alloc_rx_bufs_slow_zc - allocate zero copy bufs in the slow path - * @rx_ring: Rx ring - * @count: number of bufs to allocate - * - * Returns false on success, true on failure. - */ -bool ice_alloc_rx_bufs_slow_zc(struct ice_ring *rx_ring, u16 count) -{ - return ice_alloc_rx_bufs_zc(rx_ring, count, - ice_alloc_buf_slow_zc); -} - /** * ice_bump_ntc - Bump the next_to_clean counter of an Rx ring * @rx_ring: Rx ring @@ -691,77 +464,22 @@ static void ice_bump_ntc(struct ice_ring *rx_ring) prefetch(ICE_RX_DESC(rx_ring, ntc)); } -/** - * ice_get_rx_buf_zc - Fetch the current Rx buffer - * @rx_ring: Rx ring - * @size: size of a buffer - * - * This function returns the current, received Rx buffer and does - * DMA synchronization. - * - * Returns a pointer to the received Rx buffer. - */ -static struct ice_rx_buf *ice_get_rx_buf_zc(struct ice_ring *rx_ring, int size) -{ - struct ice_rx_buf *rx_buf; - - rx_buf = &rx_ring->rx_buf[rx_ring->next_to_clean]; - - dma_sync_single_range_for_cpu(rx_ring->dev, rx_buf->dma, 0, - size, DMA_BIDIRECTIONAL); - - return rx_buf; -} - -/** - * ice_reuse_rx_buf_zc - reuse an Rx buffer - * @rx_ring: Rx ring - * @old_buf: The buffer to recycle - * - * This function recycles a finished Rx buffer, and places it on the recycle - * queue (next_to_alloc). - */ -static void -ice_reuse_rx_buf_zc(struct ice_ring *rx_ring, struct ice_rx_buf *old_buf) -{ - unsigned long mask = (unsigned long)rx_ring->xsk_umem->chunk_mask; - u64 hr = rx_ring->xsk_umem->headroom + XDP_PACKET_HEADROOM; - u16 nta = rx_ring->next_to_alloc; - struct ice_rx_buf *new_buf; - - new_buf = &rx_ring->rx_buf[nta++]; - rx_ring->next_to_alloc = (nta < rx_ring->count) ? nta : 0; - - new_buf->dma = old_buf->dma & mask; - new_buf->dma += hr; - - new_buf->addr = (void *)((unsigned long)old_buf->addr & mask); - new_buf->addr += hr; - - new_buf->handle = old_buf->handle & mask; - new_buf->handle += rx_ring->xsk_umem->headroom; - - old_buf->addr = NULL; -} - /** * ice_construct_skb_zc - Create an sk_buff from zero-copy buffer * @rx_ring: Rx ring * @rx_buf: zero-copy Rx buffer - * @xdp: XDP buffer * * This function allocates a new skb from a zero-copy Rx buffer. * * Returns the skb on success, NULL on failure. */ static struct sk_buff * -ice_construct_skb_zc(struct ice_ring *rx_ring, struct ice_rx_buf *rx_buf, - struct xdp_buff *xdp) +ice_construct_skb_zc(struct ice_ring *rx_ring, struct ice_rx_buf *rx_buf) { - unsigned int metasize = xdp->data - xdp->data_meta; - unsigned int datasize = xdp->data_end - xdp->data; - unsigned int datasize_hard = xdp->data_end - - xdp->data_hard_start; + unsigned int metasize = rx_buf->xdp->data - rx_buf->xdp->data_meta; + unsigned int datasize = rx_buf->xdp->data_end - rx_buf->xdp->data; + unsigned int datasize_hard = rx_buf->xdp->data_end - + rx_buf->xdp->data_hard_start; struct sk_buff *skb; skb = __napi_alloc_skb(&rx_ring->q_vector->napi, datasize_hard, @@ -769,13 +487,13 @@ ice_construct_skb_zc(struct ice_ring *rx_ring, struct ice_rx_buf *rx_buf, if (unlikely(!skb)) return NULL; - skb_reserve(skb, xdp->data - xdp->data_hard_start); - memcpy(__skb_put(skb, datasize), xdp->data, datasize); + skb_reserve(skb, rx_buf->xdp->data - rx_buf->xdp->data_hard_start); + memcpy(__skb_put(skb, datasize), rx_buf->xdp->data, datasize); if (metasize) skb_metadata_set(skb, metasize); - ice_reuse_rx_buf_zc(rx_ring, rx_buf); - + xsk_buff_free(rx_buf->xdp); + rx_buf->xdp = NULL; return skb; } @@ -802,7 +520,6 @@ ice_run_xdp_zc(struct ice_ring *rx_ring, struct xdp_buff *xdp) } act = bpf_prog_run_xdp(xdp_prog, xdp); - xdp->handle += xdp->data - xdp->data_hard_start; switch (act) { case XDP_PASS: break; @@ -840,13 +557,8 @@ int ice_clean_rx_irq_zc(struct ice_ring *rx_ring, int budget) { unsigned int total_rx_bytes = 0, total_rx_packets = 0; u16 cleaned_count = ICE_DESC_UNUSED(rx_ring); - struct xdp_umem *umem = rx_ring->xsk_umem; unsigned int xdp_xmit = 0; bool failure = false; - struct xdp_buff xdp; - - xdp.rxq = &rx_ring->xdp_rxq; - xdp.frame_sz = xsk_umem_xdp_frame_sz(umem); while (likely(total_rx_packets < (unsigned int)budget)) { union ice_32b_rx_flex_desc *rx_desc; @@ -858,8 +570,8 @@ int ice_clean_rx_irq_zc(struct ice_ring *rx_ring, int budget) u8 rx_ptype; if (cleaned_count >= ICE_RX_BUF_WRITE) { - failure |= ice_alloc_rx_bufs_fast_zc(rx_ring, - cleaned_count); + failure |= ice_alloc_rx_bufs_zc(rx_ring, + cleaned_count); cleaned_count = 0; } @@ -880,25 +592,19 @@ int ice_clean_rx_irq_zc(struct ice_ring *rx_ring, int budget) if (!size) break; - rx_buf = ice_get_rx_buf_zc(rx_ring, size); - if (!rx_buf->addr) - break; - xdp.data = rx_buf->addr; - xdp.data_meta = xdp.data; - xdp.data_hard_start = xdp.data - XDP_PACKET_HEADROOM; - xdp.data_end = xdp.data + size; - xdp.handle = rx_buf->handle; + rx_buf = &rx_ring->rx_buf[rx_ring->next_to_clean]; + rx_buf->xdp->data_end = rx_buf->xdp->data + size; + xsk_buff_dma_sync_for_cpu(rx_buf->xdp); - xdp_res = ice_run_xdp_zc(rx_ring, &xdp); + xdp_res = ice_run_xdp_zc(rx_ring, rx_buf->xdp); if (xdp_res) { - if (xdp_res & (ICE_XDP_TX | ICE_XDP_REDIR)) { + if (xdp_res & (ICE_XDP_TX | ICE_XDP_REDIR)) xdp_xmit |= xdp_res; - rx_buf->addr = NULL; - } else { - ice_reuse_rx_buf_zc(rx_ring, rx_buf); - } + else + xsk_buff_free(rx_buf->xdp); + rx_buf->xdp = NULL; total_rx_bytes += size; total_rx_packets++; cleaned_count++; @@ -908,7 +614,7 @@ int ice_clean_rx_irq_zc(struct ice_ring *rx_ring, int budget) } /* XDP_PASS path */ - skb = ice_construct_skb_zc(rx_ring, rx_buf, &xdp); + skb = ice_construct_skb_zc(rx_ring, rx_buf); if (!skb) { rx_ring->rx_stats.alloc_buf_failed++; break; @@ -979,10 +685,9 @@ static bool ice_xmit_zc(struct ice_ring *xdp_ring, int budget) if (!xsk_umem_consume_tx(xdp_ring->xsk_umem, &desc)) break; - dma = xdp_umem_get_dma(xdp_ring->xsk_umem, desc.addr); - - dma_sync_single_for_device(xdp_ring->dev, dma, desc.len, - DMA_BIDIRECTIONAL); + dma = xsk_buff_raw_get_dma(xdp_ring->xsk_umem, desc.addr); + xsk_buff_raw_dma_sync_for_device(xdp_ring->xsk_umem, dma, + desc.len); tx_buf->bytecount = desc.len; @@ -1165,11 +870,10 @@ void ice_xsk_clean_rx_ring(struct ice_ring *rx_ring) for (i = 0; i < rx_ring->count; i++) { struct ice_rx_buf *rx_buf = &rx_ring->rx_buf[i]; - if (!rx_buf->addr) + if (!rx_buf->xdp) continue; - xsk_umem_fq_reuse(rx_ring->xsk_umem, rx_buf->handle); - rx_buf->addr = NULL; + rx_buf->xdp = NULL; } } diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.h b/drivers/net/ethernet/intel/ice/ice_xsk.h index 8a4ba7c6d549..fc1a06b4df36 100644 --- a/drivers/net/ethernet/intel/ice/ice_xsk.h +++ b/drivers/net/ethernet/intel/ice/ice_xsk.h @@ -10,11 +10,10 @@ struct ice_vsi; #ifdef CONFIG_XDP_SOCKETS int ice_xsk_umem_setup(struct ice_vsi *vsi, struct xdp_umem *umem, u16 qid); -void ice_zca_free(struct zero_copy_allocator *zca, unsigned long handle); int ice_clean_rx_irq_zc(struct ice_ring *rx_ring, int budget); bool ice_clean_tx_irq_zc(struct ice_ring *xdp_ring, int budget); int ice_xsk_wakeup(struct net_device *netdev, u32 queue_id, u32 flags); -bool ice_alloc_rx_bufs_slow_zc(struct ice_ring *rx_ring, u16 count); +bool ice_alloc_rx_bufs_zc(struct ice_ring *rx_ring, u16 count); bool ice_xsk_any_rx_ring_ena(struct ice_vsi *vsi); void ice_xsk_clean_rx_ring(struct ice_ring *rx_ring); void ice_xsk_clean_xdp_ring(struct ice_ring *xdp_ring); @@ -27,12 +26,6 @@ ice_xsk_umem_setup(struct ice_vsi __always_unused *vsi, return -EOPNOTSUPP; } -static inline void -ice_zca_free(struct zero_copy_allocator __always_unused *zca, - unsigned long __always_unused handle) -{ -} - static inline int ice_clean_rx_irq_zc(struct ice_ring __always_unused *rx_ring, int __always_unused budget) @@ -48,8 +41,8 @@ ice_clean_tx_irq_zc(struct ice_ring __always_unused *xdp_ring, } static inline bool -ice_alloc_rx_bufs_slow_zc(struct ice_ring __always_unused *rx_ring, - u16 __always_unused count) +ice_alloc_rx_bufs_zc(struct ice_ring __always_unused *rx_ring, + u16 __always_unused count) { return false; } From patchwork Wed May 20 19:20:58 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= X-Patchwork-Id: 1294660 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=HL+9kGm3; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 49S2gr54j4z9sSF for ; Thu, 21 May 2020 05:22:16 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726932AbgETTWO (ORCPT ); Wed, 20 May 2020 15:22:14 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45772 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726560AbgETTWO (ORCPT ); Wed, 20 May 2020 15:22:14 -0400 Received: from mail-pj1-x1043.google.com (mail-pj1-x1043.google.com [IPv6:2607:f8b0:4864:20::1043]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2836DC061A0E; Wed, 20 May 2020 12:22:14 -0700 (PDT) Received: by mail-pj1-x1043.google.com with SMTP id ci21so1775601pjb.3; Wed, 20 May 2020 12:22:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=S1dqQiCUAKey5IKmhS8Rr5u0JayAKZpb//8QP4uIvcM=; b=HL+9kGm3Ro9k/CsaMSzZm3LR2VKebZht7UlouHdZ+U2T53N+VEgy1XynksS3QjVlr8 RHv0TR1A+NV7KcPYugu810kiAQoNojwPbkFqiV2ZedbmexKvL74y4iYaXNaFlrxKxtrP 3MnJDZ1OKIT/S6cUIXOiQaAjq/xvutNyyZmRR6l6UqbzaaERw7ZP7lU1h0+6x4s6NR6s VXDVuZ8fmlxO55hh2eYt69uRT5dW1M6J9N+AOnYnOvNjHSrGn7EBQyQlsA/PtqPxMI9I yhH8HoQz7UCFRjoCz9aey41DLA/TmrsJhSyboC+aX8nmHFFnOjDQ9NpE9VyqAWctJrdR 7qww== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=S1dqQiCUAKey5IKmhS8Rr5u0JayAKZpb//8QP4uIvcM=; b=puvBkXGFIcYrSpodk7sApctT2wpNk4Lz1yFzYOdwRnB2fK8dpbBCO9qoNltN0XFEJV NwJwtoJ33DVqXcOXpPbrcXe7ywYACARV9rks+cpD/8Pa3AmM5VFAUZq0Be/XsgMYz7ad QqJv697KBRY/kOBXE4h9CZim3FEmbM1N5aesgVp9nRdcmvaqyjnOZPPCBvOfSA2u+Dae HDpYVG1ham/3udTaNOoB6besUzio5vfhJ7aCVJPD/2rg5lEn2yBtdJeyvCeE3OmK0dBh XDW4eAV9PjBhgkL0qTMD0ujy1Lp+wsy/cYVq0NrfJKq4fGU6AT1dTuhYZAWZOeOyshds G8FQ== X-Gm-Message-State: AOAM5320Lq8F873gEmEiVouS1TWtDq3/7GlgCBcyWPPbM/oj2roDVMmg b/GEe6RjDHZWjeWUuS32aP0= X-Google-Smtp-Source: ABdhPJwjExJwGugOqAbI3uevP3/UCEwewOCy3vNQk+j06tVuINE6VvsenMC10SUXOI5oYVzUez+oOg== X-Received: by 2002:a17:902:ea8a:: with SMTP id x10mr5962239plb.61.1590002533582; Wed, 20 May 2020 12:22:13 -0700 (PDT) Received: from btopel-mobl.ger.intel.com (fmdmzpr03-ext.fm.intel.com. [192.55.54.38]) by smtp.gmail.com with ESMTPSA id 62sm2762424pfc.204.2020.05.20.12.22.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 May 2020 12:22:12 -0700 (PDT) From: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= To: ast@kernel.org, daniel@iogearbox.net, davem@davemloft.net, kuba@kernel.org, hawk@kernel.org, john.fastabend@gmail.com, netdev@vger.kernel.org, bpf@vger.kernel.org, magnus.karlsson@intel.com, jonathan.lemon@gmail.com, jeffrey.t.kirsher@intel.com Cc: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , maximmi@mellanox.com, maciej.fijalkowski@intel.com, intel-wired-lan@lists.osuosl.org Subject: [PATCH bpf-next v5 10/15] ixgbe, xsk: migrate to new MEM_TYPE_XSK_BUFF_POOL Date: Wed, 20 May 2020 21:20:58 +0200 Message-Id: <20200520192103.355233-11-bjorn.topel@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200520192103.355233-1-bjorn.topel@gmail.com> References: <20200520192103.355233-1-bjorn.topel@gmail.com> MIME-Version: 1.0 Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org From: Björn Töpel Remove MEM_TYPE_ZERO_COPY in favor of the new MEM_TYPE_XSK_BUFF_POOL APIs. v1->v2: Fixed xdp_buff data_end update. (Björn) Cc: intel-wired-lan@lists.osuosl.org Signed-off-by: Björn Töpel --- drivers/net/ethernet/intel/ixgbe/ixgbe.h | 9 +- drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 15 +- .../ethernet/intel/ixgbe/ixgbe_txrx_common.h | 2 +- drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 307 +++--------------- 4 files changed, 62 insertions(+), 271 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h index 2833e4f041ce..5ddfc83a1e46 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h @@ -224,17 +224,17 @@ struct ixgbe_tx_buffer { }; struct ixgbe_rx_buffer { - struct sk_buff *skb; - dma_addr_t dma; union { struct { + struct sk_buff *skb; + dma_addr_t dma; struct page *page; __u32 page_offset; __u16 pagecnt_bias; }; struct { - void *addr; - u64 handle; + bool discard; + struct xdp_buff *xdp; }; }; }; @@ -351,7 +351,6 @@ struct ixgbe_ring { }; struct xdp_rxq_info xdp_rxq; struct xdp_umem *xsk_umem; - struct zero_copy_allocator zca; /* ZC allocator anchor */ u16 ring_idx; /* {rx,tx,xdp}_ring back reference idx */ u16 rx_buf_len; } ____cacheline_internodealigned_in_smp; diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c index eab5934b04f5..45fc7ce1a543 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c @@ -35,7 +35,7 @@ #include #include #include -#include +#include #include #include "ixgbe.h" @@ -3745,8 +3745,7 @@ static void ixgbe_configure_srrctl(struct ixgbe_adapter *adapter, /* configure the packet buffer length */ if (rx_ring->xsk_umem) { - u32 xsk_buf_len = rx_ring->xsk_umem->chunk_size_nohr - - XDP_PACKET_HEADROOM; + u32 xsk_buf_len = xsk_umem_get_rx_frame_size(rx_ring->xsk_umem); /* If the MAC support setting RXDCTL.RLPML, the * SRRCTL[n].BSIZEPKT is set to PAGE_SIZE and @@ -4093,11 +4092,10 @@ void ixgbe_configure_rx_ring(struct ixgbe_adapter *adapter, xdp_rxq_info_unreg_mem_model(&ring->xdp_rxq); ring->xsk_umem = ixgbe_xsk_umem(adapter, ring); if (ring->xsk_umem) { - ring->zca.free = ixgbe_zca_free; WARN_ON(xdp_rxq_info_reg_mem_model(&ring->xdp_rxq, - MEM_TYPE_ZERO_COPY, - &ring->zca)); - + MEM_TYPE_XSK_BUFF_POOL, + NULL)); + xsk_buff_set_rxq_info(ring->xsk_umem, &ring->xdp_rxq); } else { WARN_ON(xdp_rxq_info_reg_mem_model(&ring->xdp_rxq, MEM_TYPE_PAGE_SHARED, NULL)); @@ -4153,8 +4151,7 @@ void ixgbe_configure_rx_ring(struct ixgbe_adapter *adapter, } if (ring->xsk_umem && hw->mac.type != ixgbe_mac_82599EB) { - u32 xsk_buf_len = ring->xsk_umem->chunk_size_nohr - - XDP_PACKET_HEADROOM; + u32 xsk_buf_len = xsk_umem_get_rx_frame_size(ring->xsk_umem); rxdctl &= ~(IXGBE_RXDCTL_RLPMLMASK | IXGBE_RXDCTL_RLPML_EN); diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_txrx_common.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_txrx_common.h index 6d01700b46bc..7887ae4aaf4f 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_txrx_common.h +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_txrx_common.h @@ -35,7 +35,7 @@ int ixgbe_xsk_umem_setup(struct ixgbe_adapter *adapter, struct xdp_umem *umem, void ixgbe_zca_free(struct zero_copy_allocator *alloc, unsigned long handle); -void ixgbe_alloc_rx_buffers_zc(struct ixgbe_ring *rx_ring, u16 cleaned_count); +bool ixgbe_alloc_rx_buffers_zc(struct ixgbe_ring *rx_ring, u16 cleaned_count); int ixgbe_clean_rx_irq_zc(struct ixgbe_q_vector *q_vector, struct ixgbe_ring *rx_ring, const int budget); diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c index 82e4effae704..86add9fbd36c 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c @@ -20,54 +20,11 @@ struct xdp_umem *ixgbe_xsk_umem(struct ixgbe_adapter *adapter, return xdp_get_umem_from_qid(adapter->netdev, qid); } -static int ixgbe_xsk_umem_dma_map(struct ixgbe_adapter *adapter, - struct xdp_umem *umem) -{ - struct device *dev = &adapter->pdev->dev; - unsigned int i, j; - dma_addr_t dma; - - for (i = 0; i < umem->npgs; i++) { - dma = dma_map_page_attrs(dev, umem->pgs[i], 0, PAGE_SIZE, - DMA_BIDIRECTIONAL, IXGBE_RX_DMA_ATTR); - if (dma_mapping_error(dev, dma)) - goto out_unmap; - - umem->pages[i].dma = dma; - } - - return 0; - -out_unmap: - for (j = 0; j < i; j++) { - dma_unmap_page_attrs(dev, umem->pages[i].dma, PAGE_SIZE, - DMA_BIDIRECTIONAL, IXGBE_RX_DMA_ATTR); - umem->pages[i].dma = 0; - } - - return -1; -} - -static void ixgbe_xsk_umem_dma_unmap(struct ixgbe_adapter *adapter, - struct xdp_umem *umem) -{ - struct device *dev = &adapter->pdev->dev; - unsigned int i; - - for (i = 0; i < umem->npgs; i++) { - dma_unmap_page_attrs(dev, umem->pages[i].dma, PAGE_SIZE, - DMA_BIDIRECTIONAL, IXGBE_RX_DMA_ATTR); - - umem->pages[i].dma = 0; - } -} - static int ixgbe_xsk_umem_enable(struct ixgbe_adapter *adapter, struct xdp_umem *umem, u16 qid) { struct net_device *netdev = adapter->netdev; - struct xdp_umem_fq_reuse *reuseq; bool if_running; int err; @@ -78,13 +35,7 @@ static int ixgbe_xsk_umem_enable(struct ixgbe_adapter *adapter, qid >= netdev->real_num_tx_queues) return -EINVAL; - reuseq = xsk_reuseq_prepare(adapter->rx_ring[0]->count); - if (!reuseq) - return -ENOMEM; - - xsk_reuseq_free(xsk_reuseq_swap(umem, reuseq)); - - err = ixgbe_xsk_umem_dma_map(adapter, umem); + err = xsk_buff_dma_map(umem, &adapter->pdev->dev, IXGBE_RX_DMA_ATTR); if (err) return err; @@ -124,7 +75,7 @@ static int ixgbe_xsk_umem_disable(struct ixgbe_adapter *adapter, u16 qid) ixgbe_txrx_ring_disable(adapter, qid); clear_bit(qid, adapter->af_xdp_zc_qps); - ixgbe_xsk_umem_dma_unmap(adapter, umem); + xsk_buff_dma_unmap(umem, IXGBE_RX_DMA_ATTR); if (if_running) ixgbe_txrx_ring_enable(adapter, qid); @@ -143,19 +94,14 @@ static int ixgbe_run_xdp_zc(struct ixgbe_adapter *adapter, struct ixgbe_ring *rx_ring, struct xdp_buff *xdp) { - struct xdp_umem *umem = rx_ring->xsk_umem; int err, result = IXGBE_XDP_PASS; struct bpf_prog *xdp_prog; struct xdp_frame *xdpf; - u64 offset; u32 act; rcu_read_lock(); xdp_prog = READ_ONCE(rx_ring->xdp_prog); act = bpf_prog_run_xdp(xdp_prog, xdp); - offset = xdp->data - xdp->data_hard_start; - - xdp->handle = xsk_umem_adjust_offset(umem, xdp->handle, offset); switch (act) { case XDP_PASS: @@ -186,140 +132,16 @@ static int ixgbe_run_xdp_zc(struct ixgbe_adapter *adapter, return result; } -static struct -ixgbe_rx_buffer *ixgbe_get_rx_buffer_zc(struct ixgbe_ring *rx_ring, - unsigned int size) -{ - struct ixgbe_rx_buffer *bi; - - bi = &rx_ring->rx_buffer_info[rx_ring->next_to_clean]; - - /* we are reusing so sync this buffer for CPU use */ - dma_sync_single_range_for_cpu(rx_ring->dev, - bi->dma, 0, - size, - DMA_BIDIRECTIONAL); - - return bi; -} - -static void ixgbe_reuse_rx_buffer_zc(struct ixgbe_ring *rx_ring, - struct ixgbe_rx_buffer *obi) -{ - u16 nta = rx_ring->next_to_alloc; - struct ixgbe_rx_buffer *nbi; - - nbi = &rx_ring->rx_buffer_info[rx_ring->next_to_alloc]; - /* update, and store next to alloc */ - nta++; - rx_ring->next_to_alloc = (nta < rx_ring->count) ? nta : 0; - - /* transfer page from old buffer to new buffer */ - nbi->dma = obi->dma; - nbi->addr = obi->addr; - nbi->handle = obi->handle; - - obi->addr = NULL; - obi->skb = NULL; -} - -void ixgbe_zca_free(struct zero_copy_allocator *alloc, unsigned long handle) -{ - struct ixgbe_rx_buffer *bi; - struct ixgbe_ring *rx_ring; - u64 hr, mask; - u16 nta; - - rx_ring = container_of(alloc, struct ixgbe_ring, zca); - hr = rx_ring->xsk_umem->headroom + XDP_PACKET_HEADROOM; - mask = rx_ring->xsk_umem->chunk_mask; - - nta = rx_ring->next_to_alloc; - bi = rx_ring->rx_buffer_info; - - nta++; - rx_ring->next_to_alloc = (nta < rx_ring->count) ? nta : 0; - - handle &= mask; - - bi->dma = xdp_umem_get_dma(rx_ring->xsk_umem, handle); - bi->dma += hr; - - bi->addr = xdp_umem_get_data(rx_ring->xsk_umem, handle); - bi->addr += hr; - - bi->handle = xsk_umem_adjust_offset(rx_ring->xsk_umem, (u64)handle, - rx_ring->xsk_umem->headroom); -} - -static bool ixgbe_alloc_buffer_zc(struct ixgbe_ring *rx_ring, - struct ixgbe_rx_buffer *bi) -{ - struct xdp_umem *umem = rx_ring->xsk_umem; - void *addr = bi->addr; - u64 handle, hr; - - if (addr) - return true; - - if (!xsk_umem_peek_addr(umem, &handle)) { - rx_ring->rx_stats.alloc_rx_page_failed++; - return false; - } - - hr = umem->headroom + XDP_PACKET_HEADROOM; - - bi->dma = xdp_umem_get_dma(umem, handle); - bi->dma += hr; - - bi->addr = xdp_umem_get_data(umem, handle); - bi->addr += hr; - - bi->handle = xsk_umem_adjust_offset(umem, handle, umem->headroom); - - xsk_umem_release_addr(umem); - return true; -} - -static bool ixgbe_alloc_buffer_slow_zc(struct ixgbe_ring *rx_ring, - struct ixgbe_rx_buffer *bi) -{ - struct xdp_umem *umem = rx_ring->xsk_umem; - u64 handle, hr; - - if (!xsk_umem_peek_addr_rq(umem, &handle)) { - rx_ring->rx_stats.alloc_rx_page_failed++; - return false; - } - - handle &= rx_ring->xsk_umem->chunk_mask; - - hr = umem->headroom + XDP_PACKET_HEADROOM; - - bi->dma = xdp_umem_get_dma(umem, handle); - bi->dma += hr; - - bi->addr = xdp_umem_get_data(umem, handle); - bi->addr += hr; - - bi->handle = xsk_umem_adjust_offset(umem, handle, umem->headroom); - - xsk_umem_release_addr_rq(umem); - return true; -} - -static __always_inline bool -__ixgbe_alloc_rx_buffers_zc(struct ixgbe_ring *rx_ring, u16 cleaned_count, - bool alloc(struct ixgbe_ring *rx_ring, - struct ixgbe_rx_buffer *bi)) +bool ixgbe_alloc_rx_buffers_zc(struct ixgbe_ring *rx_ring, u16 count) { union ixgbe_adv_rx_desc *rx_desc; struct ixgbe_rx_buffer *bi; u16 i = rx_ring->next_to_use; + dma_addr_t dma; bool ok = true; /* nothing to do */ - if (!cleaned_count) + if (!count) return true; rx_desc = IXGBE_RX_DESC(rx_ring, i); @@ -327,21 +149,18 @@ __ixgbe_alloc_rx_buffers_zc(struct ixgbe_ring *rx_ring, u16 cleaned_count, i -= rx_ring->count; do { - if (!alloc(rx_ring, bi)) { + bi->xdp = xsk_buff_alloc(rx_ring->xsk_umem); + if (!bi->xdp) { ok = false; break; } - /* sync the buffer for use by the device */ - dma_sync_single_range_for_device(rx_ring->dev, bi->dma, - bi->page_offset, - rx_ring->rx_buf_len, - DMA_BIDIRECTIONAL); + dma = xsk_buff_xdp_get_dma(bi->xdp); /* Refresh the desc even if buffer_addrs didn't change * because each write-back erases this info. */ - rx_desc->read.pkt_addr = cpu_to_le64(bi->dma); + rx_desc->read.pkt_addr = cpu_to_le64(dma); rx_desc++; bi++; @@ -355,17 +174,14 @@ __ixgbe_alloc_rx_buffers_zc(struct ixgbe_ring *rx_ring, u16 cleaned_count, /* clear the length for the next_to_use descriptor */ rx_desc->wb.upper.length = 0; - cleaned_count--; - } while (cleaned_count); + count--; + } while (count); i += rx_ring->count; if (rx_ring->next_to_use != i) { rx_ring->next_to_use = i; - /* update next to alloc since we have filled the ring */ - rx_ring->next_to_alloc = i; - /* Force memory writes to complete before letting h/w * know there are new descriptors to fetch. (Only * applicable for weak-ordered memory model archs, @@ -378,40 +194,27 @@ __ixgbe_alloc_rx_buffers_zc(struct ixgbe_ring *rx_ring, u16 cleaned_count, return ok; } -void ixgbe_alloc_rx_buffers_zc(struct ixgbe_ring *rx_ring, u16 count) -{ - __ixgbe_alloc_rx_buffers_zc(rx_ring, count, - ixgbe_alloc_buffer_slow_zc); -} - -static bool ixgbe_alloc_rx_buffers_fast_zc(struct ixgbe_ring *rx_ring, - u16 count) -{ - return __ixgbe_alloc_rx_buffers_zc(rx_ring, count, - ixgbe_alloc_buffer_zc); -} - static struct sk_buff *ixgbe_construct_skb_zc(struct ixgbe_ring *rx_ring, - struct ixgbe_rx_buffer *bi, - struct xdp_buff *xdp) + struct ixgbe_rx_buffer *bi) { - unsigned int metasize = xdp->data - xdp->data_meta; - unsigned int datasize = xdp->data_end - xdp->data; + unsigned int metasize = bi->xdp->data - bi->xdp->data_meta; + unsigned int datasize = bi->xdp->data_end - bi->xdp->data; struct sk_buff *skb; /* allocate a skb to store the frags */ skb = __napi_alloc_skb(&rx_ring->q_vector->napi, - xdp->data_end - xdp->data_hard_start, + bi->xdp->data_end - bi->xdp->data_hard_start, GFP_ATOMIC | __GFP_NOWARN); if (unlikely(!skb)) return NULL; - skb_reserve(skb, xdp->data - xdp->data_hard_start); - memcpy(__skb_put(skb, datasize), xdp->data, datasize); + skb_reserve(skb, bi->xdp->data - bi->xdp->data_hard_start); + memcpy(__skb_put(skb, datasize), bi->xdp->data, datasize); if (metasize) skb_metadata_set(skb, metasize); - ixgbe_reuse_rx_buffer_zc(rx_ring, bi); + xsk_buff_free(bi->xdp); + bi->xdp = NULL; return skb; } @@ -431,14 +234,9 @@ int ixgbe_clean_rx_irq_zc(struct ixgbe_q_vector *q_vector, unsigned int total_rx_bytes = 0, total_rx_packets = 0; struct ixgbe_adapter *adapter = q_vector->adapter; u16 cleaned_count = ixgbe_desc_unused(rx_ring); - struct xdp_umem *umem = rx_ring->xsk_umem; unsigned int xdp_res, xdp_xmit = 0; bool failure = false; struct sk_buff *skb; - struct xdp_buff xdp; - - xdp.rxq = &rx_ring->xdp_rxq; - xdp.frame_sz = xsk_umem_xdp_frame_sz(umem); while (likely(total_rx_packets < budget)) { union ixgbe_adv_rx_desc *rx_desc; @@ -448,8 +246,8 @@ int ixgbe_clean_rx_irq_zc(struct ixgbe_q_vector *q_vector, /* return some buffers to hardware, one at a time is too slow */ if (cleaned_count >= IXGBE_RX_BUFFER_WRITE) { failure = failure || - !ixgbe_alloc_rx_buffers_fast_zc(rx_ring, - cleaned_count); + !ixgbe_alloc_rx_buffers_zc(rx_ring, + cleaned_count); cleaned_count = 0; } @@ -464,42 +262,40 @@ int ixgbe_clean_rx_irq_zc(struct ixgbe_q_vector *q_vector, */ dma_rmb(); - bi = ixgbe_get_rx_buffer_zc(rx_ring, size); + bi = &rx_ring->rx_buffer_info[rx_ring->next_to_clean]; if (unlikely(!ixgbe_test_staterr(rx_desc, IXGBE_RXD_STAT_EOP))) { struct ixgbe_rx_buffer *next_bi; - ixgbe_reuse_rx_buffer_zc(rx_ring, bi); + xsk_buff_free(bi->xdp); + bi->xdp = NULL; ixgbe_inc_ntc(rx_ring); next_bi = &rx_ring->rx_buffer_info[rx_ring->next_to_clean]; - next_bi->skb = ERR_PTR(-EINVAL); + next_bi->discard = true; continue; } - if (unlikely(bi->skb)) { - ixgbe_reuse_rx_buffer_zc(rx_ring, bi); + if (unlikely(bi->discard)) { + xsk_buff_free(bi->xdp); + bi->xdp = NULL; + bi->discard = false; ixgbe_inc_ntc(rx_ring); continue; } - xdp.data = bi->addr; - xdp.data_meta = xdp.data; - xdp.data_hard_start = xdp.data - XDP_PACKET_HEADROOM; - xdp.data_end = xdp.data + size; - xdp.handle = bi->handle; - - xdp_res = ixgbe_run_xdp_zc(adapter, rx_ring, &xdp); + bi->xdp->data_end = bi->xdp->data + size; + xsk_buff_dma_sync_for_cpu(bi->xdp); + xdp_res = ixgbe_run_xdp_zc(adapter, rx_ring, bi->xdp); if (xdp_res) { - if (xdp_res & (IXGBE_XDP_TX | IXGBE_XDP_REDIR)) { + if (xdp_res & (IXGBE_XDP_TX | IXGBE_XDP_REDIR)) xdp_xmit |= xdp_res; - bi->addr = NULL; - bi->skb = NULL; - } else { - ixgbe_reuse_rx_buffer_zc(rx_ring, bi); - } + else + xsk_buff_free(bi->xdp); + + bi->xdp = NULL; total_rx_packets++; total_rx_bytes += size; @@ -509,7 +305,7 @@ int ixgbe_clean_rx_irq_zc(struct ixgbe_q_vector *q_vector, } /* XDP_PASS path */ - skb = ixgbe_construct_skb_zc(rx_ring, bi, &xdp); + skb = ixgbe_construct_skb_zc(rx_ring, bi); if (!skb) { rx_ring->rx_stats.alloc_rx_buff_failed++; break; @@ -561,17 +357,17 @@ int ixgbe_clean_rx_irq_zc(struct ixgbe_q_vector *q_vector, void ixgbe_xsk_clean_rx_ring(struct ixgbe_ring *rx_ring) { - u16 i = rx_ring->next_to_clean; - struct ixgbe_rx_buffer *bi = &rx_ring->rx_buffer_info[i]; + struct ixgbe_rx_buffer *bi; + u16 i; - while (i != rx_ring->next_to_alloc) { - xsk_umem_fq_reuse(rx_ring->xsk_umem, bi->handle); - i++; - bi++; - if (i == rx_ring->count) { - i = 0; - bi = rx_ring->rx_buffer_info; - } + for (i = 0; i < rx_ring->count; i++) { + bi = &rx_ring->rx_buffer_info[i]; + + if (!bi->xdp) + continue; + + xsk_buff_free(bi->xdp); + bi->xdp = NULL; } } @@ -594,10 +390,9 @@ static bool ixgbe_xmit_zc(struct ixgbe_ring *xdp_ring, unsigned int budget) if (!xsk_umem_consume_tx(xdp_ring->xsk_umem, &desc)) break; - dma = xdp_umem_get_dma(xdp_ring->xsk_umem, desc.addr); - - dma_sync_single_for_device(xdp_ring->dev, dma, desc.len, - DMA_BIDIRECTIONAL); + dma = xsk_buff_raw_get_dma(xdp_ring->xsk_umem, desc.addr); + xsk_buff_raw_dma_sync_for_device(xdp_ring->xsk_umem, dma, + desc.len); tx_bi = &xdp_ring->tx_buffer_info[xdp_ring->next_to_use]; tx_bi->bytecount = desc.len; From patchwork Wed May 20 19:20:59 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= X-Patchwork-Id: 1294662 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=rwnVHvXJ; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 49S2gy6S5hz9sSF for ; Thu, 21 May 2020 05:22:22 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727033AbgETTWW (ORCPT ); Wed, 20 May 2020 15:22:22 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45790 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726560AbgETTWU (ORCPT ); Wed, 20 May 2020 15:22:20 -0400 Received: from mail-pj1-x1043.google.com (mail-pj1-x1043.google.com [IPv6:2607:f8b0:4864:20::1043]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F3EAFC061A0E; Wed, 20 May 2020 12:22:19 -0700 (PDT) Received: by mail-pj1-x1043.google.com with SMTP id s69so1769958pjb.4; Wed, 20 May 2020 12:22:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=+Voq+MaCZ7huOclIkUT4YsCVMq0fMlP7h2jK8PRiTeo=; b=rwnVHvXJ+jUatsyriao9JXlcxesRnj1lKu4U7cxPIjKZXUVa4hLGU/5aTqRH/Zn0RW +B9KAU21c2somqoScMWsLv0PPYcCCyoC9bGaFbSo4PCEnn8nh+FxO8cl2Q3wlsSBrVZ8 fEuPWG6qVRFn3JOOyr2wSETKOJ+zNhnS/efUq4SPU5yeIGS+i3Ym/IMSlOkLNrUroVFc D8r/10tsvb39Gi426cakElJWhGG6QKV772qCHMhETMR8zMBbResLDmHgKSxs91N0CDSr ZJA5f9GgyeLE7IB+lhshQfs4K1UM32AYApN3MYMmCElDT27bYjVaWu26K81EXh3yoiwH m5zw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=+Voq+MaCZ7huOclIkUT4YsCVMq0fMlP7h2jK8PRiTeo=; b=gG7dnHDVIGSyHq1hjBES1gTJsBFf7Uikggp5QpK3fSrXOnR7yTV7Pame8XAbuhdFhi nuuHkMDVKp1XuB+Y/2LqNIVVwvM+lzeHpznhNLWpioJ6+0MYiHDjLchu4INlAC6e8y34 XOCUOStwMuGspepzHZARGVpJNbY3jN+lBl7MCHU/LnvFMZ2cnXejdf5UObKGiINCAZ01 v7kIVMRAnR07quZ8sK4wIFzX0hSp4VwxarFKbGR447qLI5pkYhmTPnIM6US9cAtkdywv Fiuxa7Xp2VkXFkEwQ6z2wh0TGK3/aDCpNmgyKL60CjDuB081M/SBPFvlBkFj+KOGf14C LSGg== X-Gm-Message-State: AOAM5333apG5YKRgh38WhkqLH0rQVi+2UskUTrvsvTyZajtrPVgFFgYr sN5ueSi8mjzMve8Q5CvPr70= X-Google-Smtp-Source: ABdhPJxWWl6bSI5o/mzXviOMp1FWpyA/Vr2ahRlOCBAr/0VlYuqi6VR3mhgufLC18va1u7VKLgHrsA== X-Received: by 2002:a17:902:7787:: with SMTP id o7mr1222098pll.52.1590002539226; Wed, 20 May 2020 12:22:19 -0700 (PDT) Received: from btopel-mobl.ger.intel.com (fmdmzpr03-ext.fm.intel.com. [192.55.54.38]) by smtp.gmail.com with ESMTPSA id 62sm2762424pfc.204.2020.05.20.12.22.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 May 2020 12:22:18 -0700 (PDT) From: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= To: ast@kernel.org, daniel@iogearbox.net, davem@davemloft.net, kuba@kernel.org, hawk@kernel.org, john.fastabend@gmail.com, netdev@vger.kernel.org, bpf@vger.kernel.org, magnus.karlsson@intel.com, jonathan.lemon@gmail.com, jeffrey.t.kirsher@intel.com Cc: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , maximmi@mellanox.com, maciej.fijalkowski@intel.com Subject: [PATCH bpf-next v5 11/15] mlx5, xsk: migrate to new MEM_TYPE_XSK_BUFF_POOL Date: Wed, 20 May 2020 21:20:59 +0200 Message-Id: <20200520192103.355233-12-bjorn.topel@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200520192103.355233-1-bjorn.topel@gmail.com> References: <20200520192103.355233-1-bjorn.topel@gmail.com> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Björn Töpel Use the new MEM_TYPE_XSK_BUFF_POOL API in lieu of MEM_TYPE_ZERO_COPY in mlx5e. It allows to drop a lot of code from the driver (which is now common in AF_XDP core and was related to XSK RX frame allocation, DMA mapping, etc.) and slightly improve performance (RX +0.8 Mpps, TX +0.4 Mpps). rfc->v1: Put back the sanity check for XSK params, use XSK API to get the total headroom size. (Maxim) v1->v2: Fix DMA address handling, set XDP metadata to invalid. (Maxim) v2->v3: Handle frame_sz, use xsk_buff_xdp_get_frame_dma, use xsk_buff API for DMA sync on TX, add performance numbers. (Maxim) v3->v4: Remove unused variable num_xsk_frames. (Jakub) Signed-off-by: Björn Töpel Signed-off-by: Maxim Mikityanskiy --- drivers/net/ethernet/mellanox/mlx5/core/en.h | 7 +- .../ethernet/mellanox/mlx5/core/en/params.c | 13 +- .../net/ethernet/mellanox/mlx5/core/en/xdp.c | 31 ++--- .../net/ethernet/mellanox/mlx5/core/en/xdp.h | 2 +- .../ethernet/mellanox/mlx5/core/en/xsk/rx.c | 113 ++++-------------- .../ethernet/mellanox/mlx5/core/en/xsk/rx.h | 23 +++- .../ethernet/mellanox/mlx5/core/en/xsk/tx.c | 9 +- .../ethernet/mellanox/mlx5/core/en/xsk/umem.c | 49 +------- .../net/ethernet/mellanox/mlx5/core/en_main.c | 25 +--- .../net/ethernet/mellanox/mlx5/core/en_rx.c | 34 +++++- 10 files changed, 96 insertions(+), 210 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h index 26911b15f8fe..0a02b804b2fe 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h @@ -407,10 +407,7 @@ struct mlx5e_dma_info { dma_addr_t addr; union { struct page *page; - struct { - u64 handle; - void *data; - } xsk; + struct xdp_buff *xsk; }; }; @@ -623,7 +620,6 @@ struct mlx5e_rq { } mpwqe; }; struct { - u16 umem_headroom; u16 headroom; u32 frame0_sz; u8 map_dir; /* dma map direction */ @@ -656,7 +652,6 @@ struct mlx5e_rq { struct page_pool *page_pool; /* AF_XDP zero-copy */ - struct zero_copy_allocator zca; struct xdp_umem *umem; struct work_struct recover_work; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c index eb2e1f2138e4..38e4f19d69f8 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c @@ -12,15 +12,16 @@ static inline bool mlx5e_rx_is_xdp(struct mlx5e_params *params, u16 mlx5e_get_linear_rq_headroom(struct mlx5e_params *params, struct mlx5e_xsk_param *xsk) { - u16 headroom = NET_IP_ALIGN; + u16 headroom; - if (mlx5e_rx_is_xdp(params, xsk)) { + if (xsk) + return xsk->headroom; + + headroom = NET_IP_ALIGN; + if (mlx5e_rx_is_xdp(params, xsk)) headroom += XDP_PACKET_HEADROOM; - if (xsk) - headroom += xsk->headroom; - } else { + else headroom += MLX5_RX_HEADROOM; - } return headroom; } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c index 3507d23f0eb8..a2a194525b15 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c @@ -71,7 +71,7 @@ mlx5e_xmit_xdp_buff(struct mlx5e_xdpsq *sq, struct mlx5e_rq *rq, xdptxd.data = xdpf->data; xdptxd.len = xdpf->len; - if (xdp->rxq->mem.type == MEM_TYPE_ZERO_COPY) { + if (xdp->rxq->mem.type == MEM_TYPE_XSK_BUFF_POOL) { /* The xdp_buff was in the UMEM and was copied into a newly * allocated page. The UMEM page was returned via the ZCA, and * this new page has to be mapped at this point and has to be @@ -119,50 +119,33 @@ mlx5e_xmit_xdp_buff(struct mlx5e_xdpsq *sq, struct mlx5e_rq *rq, /* returns true if packet was consumed by xdp */ bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct mlx5e_dma_info *di, - void *va, u16 *rx_headroom, u32 *len, bool xsk) + u32 *len, struct xdp_buff *xdp) { struct bpf_prog *prog = READ_ONCE(rq->xdp_prog); - struct xdp_umem *umem = rq->umem; - struct xdp_buff xdp; u32 act; int err; if (!prog) return false; - xdp.data = va + *rx_headroom; - xdp_set_data_meta_invalid(&xdp); - xdp.data_end = xdp.data + *len; - xdp.data_hard_start = va; - if (xsk) - xdp.handle = di->xsk.handle; - xdp.rxq = &rq->xdp_rxq; - xdp.frame_sz = rq->buff.frame0_sz; - - act = bpf_prog_run_xdp(prog, &xdp); - if (xsk) { - u64 off = xdp.data - xdp.data_hard_start; - - xdp.handle = xsk_umem_adjust_offset(umem, xdp.handle, off); - } + act = bpf_prog_run_xdp(prog, xdp); switch (act) { case XDP_PASS: - *rx_headroom = xdp.data - xdp.data_hard_start; - *len = xdp.data_end - xdp.data; + *len = xdp->data_end - xdp->data; return false; case XDP_TX: - if (unlikely(!mlx5e_xmit_xdp_buff(rq->xdpsq, rq, di, &xdp))) + if (unlikely(!mlx5e_xmit_xdp_buff(rq->xdpsq, rq, di, xdp))) goto xdp_abort; __set_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags); /* non-atomic */ return true; case XDP_REDIRECT: /* When XDP enabled then page-refcnt==1 here */ - err = xdp_do_redirect(rq->netdev, &xdp, prog); + err = xdp_do_redirect(rq->netdev, xdp, prog); if (unlikely(err)) goto xdp_abort; __set_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags); __set_bit(MLX5E_RQ_FLAG_XDP_REDIRECT, rq->flags); - if (!xsk) + if (xdp->rxq->mem.type != MEM_TYPE_XSK_BUFF_POOL) mlx5e_page_dma_unmap(rq, di); rq->stats->xdp_redirect++; return true; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h index e2e01f064c1e..2e4e117aeb49 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h @@ -63,7 +63,7 @@ struct mlx5e_xsk_param; int mlx5e_xdp_max_mtu(struct mlx5e_params *params, struct mlx5e_xsk_param *xsk); bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct mlx5e_dma_info *di, - void *va, u16 *rx_headroom, u32 *len, bool xsk); + u32 *len, struct xdp_buff *xdp); void mlx5e_xdp_mpwqe_complete(struct mlx5e_xdpsq *sq); bool mlx5e_poll_xdpsq_cq(struct mlx5e_cq *cq); void mlx5e_free_xdpsq_descs(struct mlx5e_xdpsq *sq); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c index 62fc8a128a8d..a33a1f762c70 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c @@ -3,71 +3,10 @@ #include "rx.h" #include "en/xdp.h" -#include +#include /* RX data path */ -bool mlx5e_xsk_pages_enough_umem(struct mlx5e_rq *rq, int count) -{ - /* Check in advance that we have enough frames, instead of allocating - * one-by-one, failing and moving frames to the Reuse Ring. - */ - return xsk_umem_has_addrs_rq(rq->umem, count); -} - -int mlx5e_xsk_page_alloc_umem(struct mlx5e_rq *rq, - struct mlx5e_dma_info *dma_info) -{ - struct xdp_umem *umem = rq->umem; - u64 handle; - - if (!xsk_umem_peek_addr_rq(umem, &handle)) - return -ENOMEM; - - dma_info->xsk.handle = xsk_umem_adjust_offset(umem, handle, - rq->buff.umem_headroom); - dma_info->xsk.data = xdp_umem_get_data(umem, dma_info->xsk.handle); - - /* No need to add headroom to the DMA address. In striding RQ case, we - * just provide pages for UMR, and headroom is counted at the setup - * stage when creating a WQE. In non-striding RQ case, headroom is - * accounted in mlx5e_alloc_rx_wqe. - */ - dma_info->addr = xdp_umem_get_dma(umem, handle); - - xsk_umem_release_addr_rq(umem); - - dma_sync_single_for_device(rq->pdev, dma_info->addr, PAGE_SIZE, - DMA_BIDIRECTIONAL); - - return 0; -} - -static inline void mlx5e_xsk_recycle_frame(struct mlx5e_rq *rq, u64 handle) -{ - xsk_umem_fq_reuse(rq->umem, handle & rq->umem->chunk_mask); -} - -/* XSKRQ uses pages from UMEM, they must not be released. They are returned to - * the userspace if possible, and if not, this function is called to reuse them - * in the driver. - */ -void mlx5e_xsk_page_release(struct mlx5e_rq *rq, - struct mlx5e_dma_info *dma_info) -{ - mlx5e_xsk_recycle_frame(rq, dma_info->xsk.handle); -} - -/* Return a frame back to the hardware to fill in again. It is used by XDP when - * the XDP program returns XDP_TX or XDP_REDIRECT not to an XSKMAP. - */ -void mlx5e_xsk_zca_free(struct zero_copy_allocator *zca, unsigned long handle) -{ - struct mlx5e_rq *rq = container_of(zca, struct mlx5e_rq, zca); - - mlx5e_xsk_recycle_frame(rq, handle); -} - static struct sk_buff *mlx5e_xsk_construct_skb(struct mlx5e_rq *rq, void *data, u32 cqe_bcnt) { @@ -90,11 +29,8 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, u32 head_offset, u32 page_idx) { - struct mlx5e_dma_info *di = &wi->umr.dma_info[page_idx]; - u16 rx_headroom = rq->buff.headroom - rq->buff.umem_headroom; + struct xdp_buff *xdp = wi->umr.dma_info[page_idx].xsk; u32 cqe_bcnt32 = cqe_bcnt; - void *va, *data; - u32 frag_size; bool consumed; /* Check packet size. Note LRO doesn't use linear SKB */ @@ -103,22 +39,20 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, return NULL; } - /* head_offset is not used in this function, because di->xsk.data and - * di->addr point directly to the necessary place. Furthermore, in the - * current implementation, UMR pages are mapped to XSK frames, so + /* head_offset is not used in this function, because xdp->data and the + * DMA address point directly to the necessary place. Furthermore, in + * the current implementation, UMR pages are mapped to XSK frames, so * head_offset should always be 0. */ WARN_ON_ONCE(head_offset); - va = di->xsk.data; - data = va + rx_headroom; - frag_size = rq->buff.headroom + cqe_bcnt32; - - dma_sync_single_for_cpu(rq->pdev, di->addr, frag_size, DMA_BIDIRECTIONAL); - prefetch(data); + xdp->data_end = xdp->data + cqe_bcnt32; + xdp_set_data_meta_invalid(xdp); + xsk_buff_dma_sync_for_cpu(xdp); + prefetch(xdp->data); rcu_read_lock(); - consumed = mlx5e_xdp_handle(rq, di, va, &rx_headroom, &cqe_bcnt32, true); + consumed = mlx5e_xdp_handle(rq, NULL, &cqe_bcnt32, xdp); rcu_read_unlock(); /* Possible flows: @@ -145,7 +79,7 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, /* XDP_PASS: copy the data from the UMEM to a new SKB and reuse the * frame. On SKB allocation failure, NULL is returned. */ - return mlx5e_xsk_construct_skb(rq, data, cqe_bcnt32); + return mlx5e_xsk_construct_skb(rq, xdp->data, cqe_bcnt32); } struct sk_buff *mlx5e_xsk_skb_from_cqe_linear(struct mlx5e_rq *rq, @@ -153,25 +87,20 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi, u32 cqe_bcnt) { - struct mlx5e_dma_info *di = wi->di; - u16 rx_headroom = rq->buff.headroom - rq->buff.umem_headroom; - void *va, *data; + struct xdp_buff *xdp = wi->di->xsk; bool consumed; - u32 frag_size; - /* wi->offset is not used in this function, because di->xsk.data and - * di->addr point directly to the necessary place. Furthermore, in the - * current implementation, one page = one packet = one frame, so + /* wi->offset is not used in this function, because xdp->data and the + * DMA address point directly to the necessary place. Furthermore, the + * XSK allocator allocates frames per packet, instead of pages, so * wi->offset should always be 0. */ WARN_ON_ONCE(wi->offset); - va = di->xsk.data; - data = va + rx_headroom; - frag_size = rq->buff.headroom + cqe_bcnt; - - dma_sync_single_for_cpu(rq->pdev, di->addr, frag_size, DMA_BIDIRECTIONAL); - prefetch(data); + xdp->data_end = xdp->data + cqe_bcnt; + xdp_set_data_meta_invalid(xdp); + xsk_buff_dma_sync_for_cpu(xdp); + prefetch(xdp->data); if (unlikely(get_cqe_opcode(cqe) != MLX5_CQE_RESP_SEND)) { rq->stats->wqe_err++; @@ -179,7 +108,7 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_linear(struct mlx5e_rq *rq, } rcu_read_lock(); - consumed = mlx5e_xdp_handle(rq, di, va, &rx_headroom, &cqe_bcnt, true); + consumed = mlx5e_xdp_handle(rq, NULL, &cqe_bcnt, xdp); rcu_read_unlock(); if (likely(consumed)) @@ -189,5 +118,5 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_linear(struct mlx5e_rq *rq, * will be handled by mlx5e_put_rx_frag. * On SKB allocation failure, NULL is returned. */ - return mlx5e_xsk_construct_skb(rq, data, cqe_bcnt); + return mlx5e_xsk_construct_skb(rq, xdp->data, cqe_bcnt); } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h index a8e11adbf426..d147b2f13b54 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h @@ -9,12 +9,6 @@ /* RX data path */ -bool mlx5e_xsk_pages_enough_umem(struct mlx5e_rq *rq, int count); -int mlx5e_xsk_page_alloc_umem(struct mlx5e_rq *rq, - struct mlx5e_dma_info *dma_info); -void mlx5e_xsk_page_release(struct mlx5e_rq *rq, - struct mlx5e_dma_info *dma_info); -void mlx5e_xsk_zca_free(struct zero_copy_allocator *zca, unsigned long handle); struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, u16 cqe_bcnt, @@ -25,6 +19,23 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi, u32 cqe_bcnt); +static inline int mlx5e_xsk_page_alloc_umem(struct mlx5e_rq *rq, + struct mlx5e_dma_info *dma_info) +{ + dma_info->xsk = xsk_buff_alloc(rq->umem); + if (!dma_info->xsk) + return -ENOMEM; + + /* Store the DMA address without headroom. In striding RQ case, we just + * provide pages for UMR, and headroom is counted at the setup stage + * when creating a WQE. In non-striding RQ case, headroom is accounted + * in mlx5e_alloc_rx_wqe. + */ + dma_info->addr = xsk_buff_xdp_get_frame_dma(dma_info->xsk); + + return 0; +} + static inline bool mlx5e_xsk_update_rx_wakeup(struct mlx5e_rq *rq, bool alloc_err) { if (!xsk_umem_uses_need_wakeup(rq->umem)) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.c index 3bcdb5b2fc20..83dce9cdb8c2 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.c @@ -5,7 +5,7 @@ #include "umem.h" #include "en/xdp.h" #include "en/params.h" -#include +#include int mlx5e_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags) { @@ -92,12 +92,11 @@ bool mlx5e_xsk_tx(struct mlx5e_xdpsq *sq, unsigned int budget) break; } - xdptxd.dma_addr = xdp_umem_get_dma(umem, desc.addr); - xdptxd.data = xdp_umem_get_data(umem, desc.addr); + xdptxd.dma_addr = xsk_buff_raw_get_dma(umem, desc.addr); + xdptxd.data = xsk_buff_raw_get_data(umem, desc.addr); xdptxd.len = desc.len; - dma_sync_single_for_device(sq->pdev, xdptxd.dma_addr, - xdptxd.len, DMA_BIDIRECTIONAL); + xsk_buff_raw_dma_sync_for_device(umem, xdptxd.dma_addr, xdptxd.len); if (unlikely(!sq->xmit_xdp_frame(sq, &xdptxd, &xdpi, check_result))) { if (sq->mpwqe.wqe) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/umem.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/umem.c index 5e49fdb564b3..7b17fcd0a56d 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/umem.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/umem.c @@ -10,40 +10,14 @@ static int mlx5e_xsk_map_umem(struct mlx5e_priv *priv, struct xdp_umem *umem) { struct device *dev = priv->mdev->device; - u32 i; - for (i = 0; i < umem->npgs; i++) { - dma_addr_t dma = dma_map_page(dev, umem->pgs[i], 0, PAGE_SIZE, - DMA_BIDIRECTIONAL); - - if (unlikely(dma_mapping_error(dev, dma))) - goto err_unmap; - umem->pages[i].dma = dma; - } - - return 0; - -err_unmap: - while (i--) { - dma_unmap_page(dev, umem->pages[i].dma, PAGE_SIZE, - DMA_BIDIRECTIONAL); - umem->pages[i].dma = 0; - } - - return -ENOMEM; + return xsk_buff_dma_map(umem, dev, 0); } static void mlx5e_xsk_unmap_umem(struct mlx5e_priv *priv, struct xdp_umem *umem) { - struct device *dev = priv->mdev->device; - u32 i; - - for (i = 0; i < umem->npgs; i++) { - dma_unmap_page(dev, umem->pages[i].dma, PAGE_SIZE, - DMA_BIDIRECTIONAL); - umem->pages[i].dma = 0; - } + return xsk_buff_dma_unmap(umem, 0); } static int mlx5e_xsk_get_umems(struct mlx5e_xsk *xsk) @@ -90,13 +64,14 @@ static void mlx5e_xsk_remove_umem(struct mlx5e_xsk *xsk, u16 ix) static bool mlx5e_xsk_is_umem_sane(struct xdp_umem *umem) { - return umem->headroom <= 0xffff && umem->chunk_size_nohr <= 0xffff; + return xsk_umem_get_headroom(umem) <= 0xffff && + xsk_umem_get_chunk_size(umem) <= 0xffff; } void mlx5e_build_xsk_param(struct xdp_umem *umem, struct mlx5e_xsk_param *xsk) { - xsk->headroom = umem->headroom; - xsk->chunk_size = umem->chunk_size_nohr + umem->headroom; + xsk->headroom = xsk_umem_get_headroom(umem); + xsk->chunk_size = xsk_umem_get_chunk_size(umem); } static int mlx5e_xsk_enable_locked(struct mlx5e_priv *priv, @@ -241,18 +216,6 @@ int mlx5e_xsk_setup_umem(struct net_device *dev, struct xdp_umem *umem, u16 qid) mlx5e_xsk_disable_umem(priv, ix); } -int mlx5e_xsk_resize_reuseq(struct xdp_umem *umem, u32 nentries) -{ - struct xdp_umem_fq_reuse *reuseq; - - reuseq = xsk_reuseq_prepare(nentries); - if (unlikely(!reuseq)) - return -ENOMEM; - xsk_reuseq_free(xsk_reuseq_swap(umem, reuseq)); - - return 0; -} - u16 mlx5e_xsk_first_unused_channel(struct mlx5e_params *params, struct mlx5e_xsk *xsk) { u16 res = xsk->refcnt ? params->num_channels : 0; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index 0e4ca08ddca9..4041132723a3 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -38,7 +38,7 @@ #include #include #include -#include +#include #include "eswitch.h" #include "en.h" #include "en/txrx.h" @@ -374,7 +374,6 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c, struct mlx5_core_dev *mdev = c->mdev; void *rqc = rqp->rqc; void *rqc_wq = MLX5_ADDR_OF(rqc, rqc, wq); - u32 num_xsk_frames = 0; u32 rq_xdp_ix; u32 pool_size; int wq_sz; @@ -414,7 +413,6 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c, rq->buff.map_dir = rq->xdp_prog ? DMA_BIDIRECTIONAL : DMA_FROM_DEVICE; rq->buff.headroom = mlx5e_get_rq_headroom(mdev, params, xsk); - rq->buff.umem_headroom = xsk ? xsk->headroom : 0; pool_size = 1 << params->log_rq_mtu_frames; switch (rq->wq_type) { @@ -428,10 +426,6 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c, wq_sz = mlx5_wq_ll_get_size(&rq->mpwqe.wq); - if (xsk) - num_xsk_frames = wq_sz << - mlx5e_mpwqe_get_log_num_strides(mdev, params, xsk); - pool_size = MLX5_MPWRQ_PAGES_PER_WQE << mlx5e_mpwqe_get_log_rq_size(params, xsk); @@ -483,9 +477,6 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c, wq_sz = mlx5_wq_cyc_get_size(&rq->wqe.wq); - if (xsk) - num_xsk_frames = wq_sz << rq->wqe.info.log_num_frags; - rq->wqe.info = rqp->frags_info; rq->buff.frame0_sz = rq->wqe.info.arr[0].frag_stride; @@ -526,19 +517,9 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c, } if (xsk) { - rq->buff.frame0_sz = xsk_umem_xdp_frame_sz(umem); - - err = mlx5e_xsk_resize_reuseq(umem, num_xsk_frames); - if (unlikely(err)) { - mlx5_core_err(mdev, "Unable to allocate the Reuse Ring for %u frames\n", - num_xsk_frames); - goto err_free; - } - - rq->zca.free = mlx5e_xsk_zca_free; err = xdp_rxq_info_reg_mem_model(&rq->xdp_rxq, - MEM_TYPE_ZERO_COPY, - &rq->zca); + MEM_TYPE_XSK_BUFF_POOL, NULL); + xsk_buff_set_rxq_info(rq->umem, &rq->xdp_rxq); } else { /* Create a page_pool and register it with rxq */ pp_params.order = 0; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c index 821f94beda7a..d7b24e8905f1 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c @@ -300,7 +300,7 @@ static inline void mlx5e_page_release(struct mlx5e_rq *rq, * put into the Reuse Ring, because there is no way to return * the page to the userspace when the interface goes down. */ - mlx5e_xsk_page_release(rq, dma_info); + xsk_buff_free(dma_info->xsk); else mlx5e_page_release_dynamic(rq, dma_info, recycle); } @@ -385,7 +385,11 @@ static int mlx5e_alloc_rx_wqes(struct mlx5e_rq *rq, u16 ix, u8 wqe_bulk) if (rq->umem) { int pages_desired = wqe_bulk << rq->wqe.info.log_num_frags; - if (unlikely(!mlx5e_xsk_pages_enough_umem(rq, pages_desired))) + /* Check in advance that we have enough frames, instead of + * allocating one-by-one, failing and moving frames to the + * Reuse Ring. + */ + if (unlikely(!xsk_buff_can_alloc(rq->umem, pages_desired))) return -ENOMEM; } @@ -480,8 +484,11 @@ static int mlx5e_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix) int err; int i; + /* Check in advance that we have enough frames, instead of allocating + * one-by-one, failing and moving frames to the Reuse Ring. + */ if (rq->umem && - unlikely(!mlx5e_xsk_pages_enough_umem(rq, MLX5_MPWRQ_PAGES_PER_WQE))) { + unlikely(!xsk_buff_can_alloc(rq->umem, MLX5_MPWRQ_PAGES_PER_WQE))) { err = -ENOMEM; goto err; } @@ -1044,12 +1051,24 @@ struct sk_buff *mlx5e_build_linear_skb(struct mlx5e_rq *rq, void *va, return skb; } +static void mlx5e_fill_xdp_buff(struct mlx5e_rq *rq, void *va, u16 headroom, + u32 len, struct xdp_buff *xdp) +{ + xdp->data_hard_start = va; + xdp_set_data_meta_invalid(xdp); + xdp->data = va + headroom; + xdp->data_end = xdp->data + len; + xdp->rxq = &rq->xdp_rxq; + xdp->frame_sz = rq->buff.frame0_sz; +} + struct sk_buff * mlx5e_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe, struct mlx5e_wqe_frag_info *wi, u32 cqe_bcnt) { struct mlx5e_dma_info *di = wi->di; u16 rx_headroom = rq->buff.headroom; + struct xdp_buff xdp; struct sk_buff *skb; void *va, *data; bool consumed; @@ -1065,11 +1084,13 @@ mlx5e_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe, prefetch(data); rcu_read_lock(); - consumed = mlx5e_xdp_handle(rq, di, va, &rx_headroom, &cqe_bcnt, false); + mlx5e_fill_xdp_buff(rq, va, rx_headroom, cqe_bcnt, &xdp); + consumed = mlx5e_xdp_handle(rq, di, &cqe_bcnt, &xdp); rcu_read_unlock(); if (consumed) return NULL; /* page/packet was consumed by XDP */ + rx_headroom = xdp.data - xdp.data_hard_start; frag_size = MLX5_SKB_FRAG_SZ(rx_headroom + cqe_bcnt); skb = mlx5e_build_linear_skb(rq, va, frag_size, rx_headroom, cqe_bcnt); if (unlikely(!skb)) @@ -1343,6 +1364,7 @@ mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, struct mlx5e_dma_info *di = &wi->umr.dma_info[page_idx]; u16 rx_headroom = rq->buff.headroom; u32 cqe_bcnt32 = cqe_bcnt; + struct xdp_buff xdp; struct sk_buff *skb; void *va, *data; u32 frag_size; @@ -1364,7 +1386,8 @@ mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, prefetch(data); rcu_read_lock(); - consumed = mlx5e_xdp_handle(rq, di, va, &rx_headroom, &cqe_bcnt32, false); + mlx5e_fill_xdp_buff(rq, va, rx_headroom, cqe_bcnt32, &xdp); + consumed = mlx5e_xdp_handle(rq, di, &cqe_bcnt32, &xdp); rcu_read_unlock(); if (consumed) { if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) @@ -1372,6 +1395,7 @@ mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, return NULL; /* page/packet was consumed by XDP */ } + rx_headroom = xdp.data - xdp.data_hard_start; frag_size = MLX5_SKB_FRAG_SZ(rx_headroom + cqe_bcnt32); skb = mlx5e_build_linear_skb(rq, va, frag_size, rx_headroom, cqe_bcnt32); if (unlikely(!skb)) From patchwork Wed May 20 19:21:00 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= X-Patchwork-Id: 1294664 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=Sq1A3t3e; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 49S2h33m8Kz9sTC for ; Thu, 21 May 2020 05:22:27 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727040AbgETTW0 (ORCPT ); Wed, 20 May 2020 15:22:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45806 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726560AbgETTWZ (ORCPT ); Wed, 20 May 2020 15:22:25 -0400 Received: from mail-pl1-x643.google.com (mail-pl1-x643.google.com [IPv6:2607:f8b0:4864:20::643]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 88A36C061A0E; Wed, 20 May 2020 12:22:25 -0700 (PDT) Received: by mail-pl1-x643.google.com with SMTP id f15so1753549plr.3; Wed, 20 May 2020 12:22:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=0eUlONg/KJsAH4dzXoWh0oOD0IA/3oWEtQayezuLRu0=; b=Sq1A3t3ezl6KRIJv9Ug4T0ZcqxKnrjK/D3u546HoC+PJBpSJ/y81TqZEa7KCi66nfg 8ib69kQgDoKdmdmBMeqkJryQXPpm1xdKpTjE8Hmg5fnEjklCkLtZF6w552dDB7hBEsPf cZHKYt3ayeDdqhCuASflqmOJFVRrG+lzZJIMlLvm0coFb8Vg3bJDxes1hZMEnocb8/PL xiYnuad4Vj24oxndKCUrZ6Py2/FlvptIiVAztZfJgkOgC4GZd7/F6yQvQKkkqc1u6eiH APPdn5Ee2yDIa83vyp1vQ///JBqXoPhwpr05uhbWWtgegh2tppIp+c2c+j539PH7FH96 808A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=0eUlONg/KJsAH4dzXoWh0oOD0IA/3oWEtQayezuLRu0=; b=DkisrxQZZTDdYRZRWKXj1WpykiLx+NYTF4CG4/FRGad10yPJseqvPVg0APnb30MVqi Rllyi3jKuaQvd3WQeODtNgKbP/9dvYZOQYK90Gz/ay+UuygrSCCKy/LWP7CxT3KGR96s qa17SMvTERA68HNDGGJuV2sZyHqM0ynjb1QhmMRpL0sBJYeC1cWSE6ayCVQLvbhqTRZ3 cNKHTysgrMISQuhGkw1znFElgL1Pd5cJrNvs1+8MjPj0gvUGzQ9GjASrZEMgWryqzYzK F8f0up99q37YogKTI8w+noHCEuZCYcyY9cTtU/dsuIUK6pQHvtRabziquq8rR43YU5Kj uQbg== X-Gm-Message-State: AOAM5314H+luJKmUUj+Ny9N/kHx8OLH61QPD8mUOZ7T3IoU40P2aqtjV XmFYmGmSaus5c7voNG1DYDTJGov4eKZCOL8F X-Google-Smtp-Source: ABdhPJzyfb14c57QDhvnNlQ5ZsL5zZXGxoSRDQvQ7OcaQsQeh26Fku7CK77Z4QmlGQEetRC8wecwMg== X-Received: by 2002:a17:90a:332e:: with SMTP id m43mr6997909pjb.236.1590002544878; Wed, 20 May 2020 12:22:24 -0700 (PDT) Received: from btopel-mobl.ger.intel.com (fmdmzpr03-ext.fm.intel.com. [192.55.54.38]) by smtp.gmail.com with ESMTPSA id 62sm2762424pfc.204.2020.05.20.12.22.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 May 2020 12:22:24 -0700 (PDT) From: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= To: ast@kernel.org, daniel@iogearbox.net, davem@davemloft.net, kuba@kernel.org, hawk@kernel.org, john.fastabend@gmail.com, netdev@vger.kernel.org, bpf@vger.kernel.org, magnus.karlsson@intel.com, jonathan.lemon@gmail.com, jeffrey.t.kirsher@intel.com Cc: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , maximmi@mellanox.com, maciej.fijalkowski@intel.com Subject: [PATCH bpf-next v5 12/15] xsk: remove MEM_TYPE_ZERO_COPY and corresponding code Date: Wed, 20 May 2020 21:21:00 +0200 Message-Id: <20200520192103.355233-13-bjorn.topel@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200520192103.355233-1-bjorn.topel@gmail.com> References: <20200520192103.355233-1-bjorn.topel@gmail.com> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Björn Töpel There are no users of MEM_TYPE_ZERO_COPY. Remove all corresponding code, including the "handle" member of struct xdp_buff. rfc->v1: Fixed spelling in commit message. (Björn) Signed-off-by: Björn Töpel --- drivers/net/hyperv/netvsc_bpf.c | 1 - include/net/xdp.h | 9 +- include/net/xdp_sock.h | 45 ---------- include/net/xdp_sock_drv.h | 149 -------------------------------- include/trace/events/xdp.h | 1 - net/core/xdp.c | 42 ++------- net/xdp/xdp_umem.c | 56 +----------- net/xdp/xsk.c | 48 +--------- net/xdp/xsk_buff_pool.c | 7 ++ net/xdp/xsk_queue.c | 62 ------------- net/xdp/xsk_queue.h | 105 ---------------------- 11 files changed, 15 insertions(+), 510 deletions(-) diff --git a/drivers/net/hyperv/netvsc_bpf.c b/drivers/net/hyperv/netvsc_bpf.c index 1e0c024b0a93..8e4141552423 100644 --- a/drivers/net/hyperv/netvsc_bpf.c +++ b/drivers/net/hyperv/netvsc_bpf.c @@ -50,7 +50,6 @@ u32 netvsc_run_xdp(struct net_device *ndev, struct netvsc_channel *nvchan, xdp->data_end = xdp->data + len; xdp->rxq = &nvchan->xdp_rxq; xdp->frame_sz = PAGE_SIZE; - xdp->handle = 0; memcpy(xdp->data, data, len); diff --git a/include/net/xdp.h b/include/net/xdp.h index f432134c7c00..90f11760bd12 100644 --- a/include/net/xdp.h +++ b/include/net/xdp.h @@ -39,7 +39,6 @@ enum xdp_mem_type { MEM_TYPE_PAGE_SHARED = 0, /* Split-page refcnt based model */ MEM_TYPE_PAGE_ORDER0, /* Orig XDP full page model */ MEM_TYPE_PAGE_POOL, - MEM_TYPE_ZERO_COPY, MEM_TYPE_XSK_BUFF_POOL, MEM_TYPE_MAX, }; @@ -55,10 +54,6 @@ struct xdp_mem_info { struct page_pool; -struct zero_copy_allocator { - void (*free)(struct zero_copy_allocator *zca, unsigned long handle); -}; - struct xdp_rxq_info { struct net_device *dev; u32 queue_index; @@ -71,7 +66,6 @@ struct xdp_buff { void *data_end; void *data_meta; void *data_hard_start; - unsigned long handle; struct xdp_rxq_info *rxq; u32 frame_sz; /* frame size to deduce data_hard_end/reserved tailroom*/ }; @@ -120,8 +114,7 @@ struct xdp_frame *convert_to_xdp_frame(struct xdp_buff *xdp) int metasize; int headroom; - if (xdp->rxq->mem.type == MEM_TYPE_ZERO_COPY || - xdp->rxq->mem.type == MEM_TYPE_XSK_BUFF_POOL) + if (xdp->rxq->mem.type == MEM_TYPE_XSK_BUFF_POOL) return xdp_convert_zc_to_xdp_frame(xdp); /* Assure headroom is available for storing info */ diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h index 6e7265f63c04..96bfc5f5f24e 100644 --- a/include/net/xdp_sock.h +++ b/include/net/xdp_sock.h @@ -17,26 +17,12 @@ struct net_device; struct xsk_queue; struct xdp_buff; -struct xdp_umem_page { - void *addr; - dma_addr_t dma; -}; - -struct xdp_umem_fq_reuse { - u32 nentries; - u32 length; - u64 handles[]; -}; - struct xdp_umem { struct xsk_queue *fq; struct xsk_queue *cq; struct xsk_buff_pool *pool; - struct xdp_umem_page *pages; - u64 chunk_mask; u64 size; u32 headroom; - u32 chunk_size_nohr; u32 chunk_size; struct user_struct *user; refcount_t users; @@ -48,7 +34,6 @@ struct xdp_umem { u8 flags; int id; struct net_device *dev; - struct xdp_umem_fq_reuse *fq_reuse; bool zc; spinlock_t xsk_tx_list_lock; struct list_head xsk_tx_list; @@ -109,21 +94,6 @@ static inline struct xdp_sock *__xsk_map_lookup_elem(struct bpf_map *map, return xs; } -static inline u64 xsk_umem_extract_addr(u64 addr) -{ - return addr & XSK_UNALIGNED_BUF_ADDR_MASK; -} - -static inline u64 xsk_umem_extract_offset(u64 addr) -{ - return addr >> XSK_UNALIGNED_BUF_OFFSET_SHIFT; -} - -static inline u64 xsk_umem_add_offset_to_addr(u64 addr) -{ - return xsk_umem_extract_addr(addr) + xsk_umem_extract_offset(addr); -} - #else static inline int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp) @@ -146,21 +116,6 @@ static inline struct xdp_sock *__xsk_map_lookup_elem(struct bpf_map *map, return NULL; } -static inline u64 xsk_umem_extract_addr(u64 addr) -{ - return 0; -} - -static inline u64 xsk_umem_extract_offset(u64 addr) -{ - return 0; -} - -static inline u64 xsk_umem_add_offset_to_addr(u64 addr) -{ - return 0; -} - #endif /* CONFIG_XDP_SOCKETS */ #endif /* _LINUX_XDP_SOCK_H */ diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h index 7752c8663d1b..ccf848f7efa4 100644 --- a/include/net/xdp_sock_drv.h +++ b/include/net/xdp_sock_drv.h @@ -11,16 +11,9 @@ #ifdef CONFIG_XDP_SOCKETS -bool xsk_umem_has_addrs(struct xdp_umem *umem, u32 cnt); -bool xsk_umem_peek_addr(struct xdp_umem *umem, u64 *addr); -void xsk_umem_release_addr(struct xdp_umem *umem); void xsk_umem_complete_tx(struct xdp_umem *umem, u32 nb_entries); bool xsk_umem_consume_tx(struct xdp_umem *umem, struct xdp_desc *desc); void xsk_umem_consume_tx_done(struct xdp_umem *umem); -struct xdp_umem_fq_reuse *xsk_reuseq_prepare(u32 nentries); -struct xdp_umem_fq_reuse *xsk_reuseq_swap(struct xdp_umem *umem, - struct xdp_umem_fq_reuse *newq); -void xsk_reuseq_free(struct xdp_umem_fq_reuse *rq); struct xdp_umem *xdp_get_umem_from_qid(struct net_device *dev, u16 queue_id); void xsk_set_rx_need_wakeup(struct xdp_umem *umem); void xsk_set_tx_need_wakeup(struct xdp_umem *umem); @@ -28,80 +21,6 @@ void xsk_clear_rx_need_wakeup(struct xdp_umem *umem); void xsk_clear_tx_need_wakeup(struct xdp_umem *umem); bool xsk_umem_uses_need_wakeup(struct xdp_umem *umem); -static inline char *xdp_umem_get_data(struct xdp_umem *umem, u64 addr) -{ - unsigned long page_addr; - - addr = xsk_umem_add_offset_to_addr(addr); - page_addr = (unsigned long)umem->pages[addr >> PAGE_SHIFT].addr; - - return (char *)(page_addr & PAGE_MASK) + (addr & ~PAGE_MASK); -} - -static inline dma_addr_t xdp_umem_get_dma(struct xdp_umem *umem, u64 addr) -{ - addr = xsk_umem_add_offset_to_addr(addr); - - return umem->pages[addr >> PAGE_SHIFT].dma + (addr & ~PAGE_MASK); -} - -/* Reuse-queue aware version of FILL queue helpers */ -static inline bool xsk_umem_has_addrs_rq(struct xdp_umem *umem, u32 cnt) -{ - struct xdp_umem_fq_reuse *rq = umem->fq_reuse; - - if (rq->length >= cnt) - return true; - - return xsk_umem_has_addrs(umem, cnt - rq->length); -} - -static inline bool xsk_umem_peek_addr_rq(struct xdp_umem *umem, u64 *addr) -{ - struct xdp_umem_fq_reuse *rq = umem->fq_reuse; - - if (!rq->length) - return xsk_umem_peek_addr(umem, addr); - - *addr = rq->handles[rq->length - 1]; - return addr; -} - -static inline void xsk_umem_release_addr_rq(struct xdp_umem *umem) -{ - struct xdp_umem_fq_reuse *rq = umem->fq_reuse; - - if (!rq->length) - xsk_umem_release_addr(umem); - else - rq->length--; -} - -static inline void xsk_umem_fq_reuse(struct xdp_umem *umem, u64 addr) -{ - struct xdp_umem_fq_reuse *rq = umem->fq_reuse; - - rq->handles[rq->length++] = addr; -} - -/* Handle the offset appropriately depending on aligned or unaligned mode. - * For unaligned mode, we store the offset in the upper 16-bits of the address. - * For aligned mode, we simply add the offset to the address. - */ -static inline u64 xsk_umem_adjust_offset(struct xdp_umem *umem, u64 address, - u64 offset) -{ - if (umem->flags & XDP_UMEM_UNALIGNED_CHUNK_FLAG) - return address + (offset << XSK_UNALIGNED_BUF_OFFSET_SHIFT); - else - return address + offset; -} - -static inline u32 xsk_umem_xdp_frame_sz(struct xdp_umem *umem) -{ - return umem->chunk_size_nohr; -} - static inline u32 xsk_umem_get_headroom(struct xdp_umem *umem) { return XDP_PACKET_HEADROOM + umem->headroom; @@ -192,20 +111,6 @@ static inline void xsk_buff_raw_dma_sync_for_device(struct xdp_umem *umem, #else -static inline bool xsk_umem_has_addrs(struct xdp_umem *umem, u32 cnt) -{ - return false; -} - -static inline u64 *xsk_umem_peek_addr(struct xdp_umem *umem, u64 *addr) -{ - return NULL; -} - -static inline void xsk_umem_release_addr(struct xdp_umem *umem) -{ -} - static inline void xsk_umem_complete_tx(struct xdp_umem *umem, u32 nb_entries) { } @@ -220,55 +125,12 @@ static inline void xsk_umem_consume_tx_done(struct xdp_umem *umem) { } -static inline struct xdp_umem_fq_reuse *xsk_reuseq_prepare(u32 nentries) -{ - return NULL; -} - -static inline struct xdp_umem_fq_reuse *xsk_reuseq_swap( - struct xdp_umem *umem, struct xdp_umem_fq_reuse *newq) -{ - return NULL; -} - -static inline void xsk_reuseq_free(struct xdp_umem_fq_reuse *rq) -{ -} - static inline struct xdp_umem *xdp_get_umem_from_qid(struct net_device *dev, u16 queue_id) { return NULL; } -static inline char *xdp_umem_get_data(struct xdp_umem *umem, u64 addr) -{ - return NULL; -} - -static inline dma_addr_t xdp_umem_get_dma(struct xdp_umem *umem, u64 addr) -{ - return 0; -} - -static inline bool xsk_umem_has_addrs_rq(struct xdp_umem *umem, u32 cnt) -{ - return false; -} - -static inline u64 *xsk_umem_peek_addr_rq(struct xdp_umem *umem, u64 *addr) -{ - return NULL; -} - -static inline void xsk_umem_release_addr_rq(struct xdp_umem *umem) -{ -} - -static inline void xsk_umem_fq_reuse(struct xdp_umem *umem, u64 addr) -{ -} - static inline void xsk_set_rx_need_wakeup(struct xdp_umem *umem) { } @@ -290,17 +152,6 @@ static inline bool xsk_umem_uses_need_wakeup(struct xdp_umem *umem) return false; } -static inline u64 xsk_umem_adjust_offset(struct xdp_umem *umem, u64 handle, - u64 offset) -{ - return 0; -} - -static inline u32 xsk_umem_xdp_frame_sz(struct xdp_umem *umem) -{ - return 0; -} - static inline u32 xsk_umem_get_headroom(struct xdp_umem *umem) { return 0; diff --git a/include/trace/events/xdp.h b/include/trace/events/xdp.h index 48547a12fa27..b73d3e141323 100644 --- a/include/trace/events/xdp.h +++ b/include/trace/events/xdp.h @@ -287,7 +287,6 @@ TRACE_EVENT(xdp_devmap_xmit, FN(PAGE_SHARED) \ FN(PAGE_ORDER0) \ FN(PAGE_POOL) \ - FN(ZERO_COPY) \ FN(XSK_BUFF_POOL) #define __MEM_TYPE_TP_FN(x) \ diff --git a/net/core/xdp.c b/net/core/xdp.c index f0ce8b195193..a8c2f243367d 100644 --- a/net/core/xdp.c +++ b/net/core/xdp.c @@ -110,27 +110,6 @@ static void mem_allocator_disconnect(void *allocator) mutex_unlock(&mem_id_lock); } -static void mem_id_disconnect(int id) -{ - struct xdp_mem_allocator *xa; - - mutex_lock(&mem_id_lock); - - xa = rhashtable_lookup_fast(mem_id_ht, &id, mem_id_rht_params); - if (!xa) { - mutex_unlock(&mem_id_lock); - WARN(1, "Request remove non-existing id(%d), driver bug?", id); - return; - } - - trace_mem_disconnect(xa); - - if (!rhashtable_remove_fast(mem_id_ht, &xa->node, mem_id_rht_params)) - call_rcu(&xa->rcu, __xdp_mem_allocator_rcu_free); - - mutex_unlock(&mem_id_lock); -} - void xdp_rxq_info_unreg_mem_model(struct xdp_rxq_info *xdp_rxq) { struct xdp_mem_allocator *xa; @@ -144,9 +123,6 @@ void xdp_rxq_info_unreg_mem_model(struct xdp_rxq_info *xdp_rxq) if (id == 0) return; - if (xdp_rxq->mem.type == MEM_TYPE_ZERO_COPY) - return mem_id_disconnect(id); - if (xdp_rxq->mem.type == MEM_TYPE_PAGE_POOL) { rcu_read_lock(); xa = rhashtable_lookup(mem_id_ht, &id, mem_id_rht_params); @@ -302,7 +278,7 @@ int xdp_rxq_info_reg_mem_model(struct xdp_rxq_info *xdp_rxq, xdp_rxq->mem.type = type; if (!allocator) { - if (type == MEM_TYPE_PAGE_POOL || type == MEM_TYPE_ZERO_COPY) + if (type == MEM_TYPE_PAGE_POOL) return -EINVAL; /* Setup time check page_pool req */ return 0; } @@ -362,7 +338,7 @@ EXPORT_SYMBOL_GPL(xdp_rxq_info_reg_mem_model); * of xdp_frames/pages in those cases. */ static void __xdp_return(void *data, struct xdp_mem_info *mem, bool napi_direct, - unsigned long handle, struct xdp_buff *xdp) + struct xdp_buff *xdp) { struct xdp_mem_allocator *xa; struct page *page; @@ -384,14 +360,6 @@ static void __xdp_return(void *data, struct xdp_mem_info *mem, bool napi_direct, page = virt_to_page(data); /* Assumes order0 page*/ put_page(page); break; - case MEM_TYPE_ZERO_COPY: - /* NB! Only valid from an xdp_buff! */ - rcu_read_lock(); - /* mem->id is valid, checked in xdp_rxq_info_reg_mem_model() */ - xa = rhashtable_lookup(mem_id_ht, &mem->id, mem_id_rht_params); - xa->zc_alloc->free(xa->zc_alloc, handle); - rcu_read_unlock(); - break; case MEM_TYPE_XSK_BUFF_POOL: /* NB! Only valid from an xdp_buff! */ xsk_buff_free(xdp); @@ -404,19 +372,19 @@ static void __xdp_return(void *data, struct xdp_mem_info *mem, bool napi_direct, void xdp_return_frame(struct xdp_frame *xdpf) { - __xdp_return(xdpf->data, &xdpf->mem, false, 0, NULL); + __xdp_return(xdpf->data, &xdpf->mem, false, NULL); } EXPORT_SYMBOL_GPL(xdp_return_frame); void xdp_return_frame_rx_napi(struct xdp_frame *xdpf) { - __xdp_return(xdpf->data, &xdpf->mem, true, 0, NULL); + __xdp_return(xdpf->data, &xdpf->mem, true, NULL); } EXPORT_SYMBOL_GPL(xdp_return_frame_rx_napi); void xdp_return_buff(struct xdp_buff *xdp) { - __xdp_return(xdp->data, &xdp->rxq->mem, true, xdp->handle, xdp); + __xdp_return(xdp->data, &xdp->rxq->mem, true, xdp); } EXPORT_SYMBOL_GPL(xdp_return_buff); diff --git a/net/xdp/xdp_umem.c b/net/xdp/xdp_umem.c index 7f04688045d5..19e59d1a5e9f 100644 --- a/net/xdp/xdp_umem.c +++ b/net/xdp/xdp_umem.c @@ -179,37 +179,6 @@ void xdp_umem_clear_dev(struct xdp_umem *umem) umem->zc = false; } -static void xdp_umem_unmap_pages(struct xdp_umem *umem) -{ - unsigned int i; - - for (i = 0; i < umem->npgs; i++) - if (PageHighMem(umem->pgs[i])) - vunmap(umem->pages[i].addr); -} - -static int xdp_umem_map_pages(struct xdp_umem *umem) -{ - unsigned int i; - void *addr; - - for (i = 0; i < umem->npgs; i++) { - if (PageHighMem(umem->pgs[i])) - addr = vmap(&umem->pgs[i], 1, VM_MAP, PAGE_KERNEL); - else - addr = page_address(umem->pgs[i]); - - if (!addr) { - xdp_umem_unmap_pages(umem); - return -ENOMEM; - } - - umem->pages[i].addr = addr; - } - - return 0; -} - static void xdp_umem_unpin_pages(struct xdp_umem *umem) { unpin_user_pages_dirty_lock(umem->pgs, umem->npgs, true); @@ -244,14 +213,9 @@ static void xdp_umem_release(struct xdp_umem *umem) umem->cq = NULL; } - xsk_reuseq_destroy(umem); xp_destroy(umem->pool); - xdp_umem_unmap_pages(umem); xdp_umem_unpin_pages(umem); - kvfree(umem->pages); - umem->pages = NULL; - xdp_umem_unaccount_pages(umem); kfree(umem); } @@ -385,11 +349,8 @@ static int xdp_umem_reg(struct xdp_umem *umem, struct xdp_umem_reg *mr) if (headroom >= chunk_size - XDP_PACKET_HEADROOM) return -EINVAL; - umem->chunk_mask = unaligned_chunks ? XSK_UNALIGNED_BUF_ADDR_MASK - : ~((u64)chunk_size - 1); umem->size = size; umem->headroom = headroom; - umem->chunk_size_nohr = chunk_size - headroom; umem->chunk_size = chunk_size; umem->npgs = size / PAGE_SIZE; umem->pgs = NULL; @@ -408,29 +369,14 @@ static int xdp_umem_reg(struct xdp_umem *umem, struct xdp_umem_reg *mr) if (err) goto out_account; - umem->pages = kvcalloc(umem->npgs, sizeof(*umem->pages), - GFP_KERNEL_ACCOUNT); - if (!umem->pages) { - err = -ENOMEM; - goto out_pin; - } - - err = xdp_umem_map_pages(umem); - if (err) - goto out_pages; - umem->pool = xp_create(umem->pgs, umem->npgs, chunks, chunk_size, headroom, size, unaligned_chunks); if (!umem->pool) { err = -ENOMEM; - goto out_unmap; + goto out_pin; } return 0; -out_unmap: - xdp_umem_unmap_pages(umem); -out_pages: - kvfree(umem->pages); out_pin: xdp_umem_unpin_pages(umem); out_account: diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c index 6933f0d494ba..3f2ab732ab8b 100644 --- a/net/xdp/xsk.c +++ b/net/xdp/xsk.c @@ -39,24 +39,6 @@ bool xsk_is_setup_for_bpf_map(struct xdp_sock *xs) READ_ONCE(xs->umem->fq); } -bool xsk_umem_has_addrs(struct xdp_umem *umem, u32 cnt) -{ - return xskq_cons_has_entries(umem->fq, cnt); -} -EXPORT_SYMBOL(xsk_umem_has_addrs); - -bool xsk_umem_peek_addr(struct xdp_umem *umem, u64 *addr) -{ - return xskq_cons_peek_addr(umem->fq, addr, umem); -} -EXPORT_SYMBOL(xsk_umem_peek_addr); - -void xsk_umem_release_addr(struct xdp_umem *umem) -{ - xskq_cons_release(umem->fq); -} -EXPORT_SYMBOL(xsk_umem_release_addr); - void xsk_set_rx_need_wakeup(struct xdp_umem *umem) { if (umem->need_wakeup & XDP_WAKEUP_RX) @@ -203,8 +185,7 @@ static int xsk_rcv(struct xdp_sock *xs, struct xdp_buff *xdp, len = xdp->data_end - xdp->data; - return xdp->rxq->mem.type == MEM_TYPE_ZERO_COPY || - xdp->rxq->mem.type == MEM_TYPE_XSK_BUFF_POOL ? + return xdp->rxq->mem.type == MEM_TYPE_XSK_BUFF_POOL ? __xsk_rcv_zc(xs, xdp, len) : __xsk_rcv(xs, xdp, len, explicit_free); } @@ -588,24 +569,6 @@ static struct socket *xsk_lookup_xsk_from_fd(int fd) return sock; } -/* Check if umem pages are contiguous. - * If zero-copy mode, use the DMA address to do the page contiguity check - * For all other modes we use addr (kernel virtual address) - * Store the result in the low bits of addr. - */ -static void xsk_check_page_contiguity(struct xdp_umem *umem, u32 flags) -{ - struct xdp_umem_page *pgs = umem->pages; - int i, is_contig; - - for (i = 0; i < umem->npgs - 1; i++) { - is_contig = (flags & XDP_ZEROCOPY) ? - (pgs[i].dma + PAGE_SIZE == pgs[i + 1].dma) : - (pgs[i].addr + PAGE_SIZE == pgs[i + 1].addr); - pgs[i].addr += is_contig << XSK_NEXT_PG_CONTIG_SHIFT; - } -} - static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len) { struct sockaddr_xdp *sxdp = (struct sockaddr_xdp *)addr; @@ -688,23 +651,14 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len) goto out_unlock; } else { /* This xsk has its own umem. */ - xskq_set_umem(xs->umem->fq, xs->umem->size, - xs->umem->chunk_mask); - xskq_set_umem(xs->umem->cq, xs->umem->size, - xs->umem->chunk_mask); - err = xdp_umem_assign_dev(xs->umem, dev, qid, flags); if (err) goto out_unlock; - - xsk_check_page_contiguity(xs->umem, flags); } xs->dev = dev; xs->zc = xs->umem->zc; xs->queue_id = qid; - xskq_set_umem(xs->rx, xs->umem->size, xs->umem->chunk_mask); - xskq_set_umem(xs->tx, xs->umem->size, xs->umem->chunk_mask); xdp_add_sk_umem(xs->umem, xs); out_unlock: diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c index e214a5795a62..89dae78865e7 100644 --- a/net/xdp/xsk_buff_pool.c +++ b/net/xdp/xsk_buff_pool.c @@ -8,6 +8,13 @@ #include "xsk_queue.h" +/* Masks for xdp_umem_page flags. + * The low 12-bits of the addr will be 0 since this is the page address, so we + * can use them for flags. + */ +#define XSK_NEXT_PG_CONTIG_SHIFT 0 +#define XSK_NEXT_PG_CONTIG_MASK BIT_ULL(XSK_NEXT_PG_CONTIG_SHIFT) + struct xsk_buff_pool { struct xsk_queue *fq; struct list_head free_list; diff --git a/net/xdp/xsk_queue.c b/net/xdp/xsk_queue.c index 554b1ebb4d02..6cf9586e5027 100644 --- a/net/xdp/xsk_queue.c +++ b/net/xdp/xsk_queue.c @@ -10,15 +10,6 @@ #include "xsk_queue.h" -void xskq_set_umem(struct xsk_queue *q, u64 umem_size, u64 chunk_mask) -{ - if (!q) - return; - - q->umem_size = umem_size; - q->chunk_mask = chunk_mask; -} - static size_t xskq_get_ring_size(struct xsk_queue *q, bool umem_queue) { struct xdp_umem_ring *umem_ring; @@ -64,56 +55,3 @@ void xskq_destroy(struct xsk_queue *q) page_frag_free(q->ring); kfree(q); } - -struct xdp_umem_fq_reuse *xsk_reuseq_prepare(u32 nentries) -{ - struct xdp_umem_fq_reuse *newq; - - /* Check for overflow */ - if (nentries > (u32)roundup_pow_of_two(nentries)) - return NULL; - nentries = roundup_pow_of_two(nentries); - - newq = kvmalloc(struct_size(newq, handles, nentries), GFP_KERNEL); - if (!newq) - return NULL; - memset(newq, 0, offsetof(typeof(*newq), handles)); - - newq->nentries = nentries; - return newq; -} -EXPORT_SYMBOL_GPL(xsk_reuseq_prepare); - -struct xdp_umem_fq_reuse *xsk_reuseq_swap(struct xdp_umem *umem, - struct xdp_umem_fq_reuse *newq) -{ - struct xdp_umem_fq_reuse *oldq = umem->fq_reuse; - - if (!oldq) { - umem->fq_reuse = newq; - return NULL; - } - - if (newq->nentries < oldq->length) - return newq; - - memcpy(newq->handles, oldq->handles, - array_size(oldq->length, sizeof(u64))); - newq->length = oldq->length; - - umem->fq_reuse = newq; - return oldq; -} -EXPORT_SYMBOL_GPL(xsk_reuseq_swap); - -void xsk_reuseq_free(struct xdp_umem_fq_reuse *rq) -{ - kvfree(rq); -} -EXPORT_SYMBOL_GPL(xsk_reuseq_free); - -void xsk_reuseq_destroy(struct xdp_umem *umem) -{ - xsk_reuseq_free(umem->fq_reuse); - umem->fq_reuse = NULL; -} diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h index 9151aef7dbca..16bf15864788 100644 --- a/net/xdp/xsk_queue.h +++ b/net/xdp/xsk_queue.h @@ -32,8 +32,6 @@ struct xdp_umem_ring { }; struct xsk_queue { - u64 chunk_mask; - u64 umem_size; u32 ring_mask; u32 nentries; u32 cached_prod; @@ -106,90 +104,6 @@ struct xsk_queue { /* Functions that read and validate content from consumer rings. */ -static inline bool xskq_cons_crosses_non_contig_pg(struct xdp_umem *umem, - u64 addr, - u64 length) -{ - bool cross_pg = (addr & (PAGE_SIZE - 1)) + length > PAGE_SIZE; - bool next_pg_contig = - (unsigned long)umem->pages[(addr >> PAGE_SHIFT)].addr & - XSK_NEXT_PG_CONTIG_MASK; - - return cross_pg && !next_pg_contig; -} - -static inline bool xskq_cons_is_valid_unaligned(struct xsk_queue *q, - u64 addr, - u64 length, - struct xdp_umem *umem) -{ - u64 base_addr = xsk_umem_extract_addr(addr); - - addr = xsk_umem_add_offset_to_addr(addr); - if (base_addr >= q->umem_size || addr >= q->umem_size || - xskq_cons_crosses_non_contig_pg(umem, addr, length)) { - q->invalid_descs++; - return false; - } - - return true; -} - -static inline bool xskq_cons_is_valid_addr(struct xsk_queue *q, u64 addr) -{ - if (addr >= q->umem_size) { - q->invalid_descs++; - return false; - } - - return true; -} - -static inline bool xskq_cons_read_addr(struct xsk_queue *q, u64 *addr, - struct xdp_umem *umem) -{ - struct xdp_umem_ring *ring = (struct xdp_umem_ring *)q->ring; - - while (q->cached_cons != q->cached_prod) { - u32 idx = q->cached_cons & q->ring_mask; - - *addr = ring->desc[idx] & q->chunk_mask; - - if (umem->flags & XDP_UMEM_UNALIGNED_CHUNK_FLAG) { - if (xskq_cons_is_valid_unaligned(q, *addr, - umem->chunk_size_nohr, - umem)) - return true; - goto out; - } - - if (xskq_cons_is_valid_addr(q, *addr)) - return true; - -out: - q->cached_cons++; - } - - return false; -} - -static inline bool xskq_cons_read_addr_aligned(struct xsk_queue *q, u64 *addr) -{ - struct xdp_umem_ring *ring = (struct xdp_umem_ring *)q->ring; - - while (q->cached_cons != q->cached_prod) { - u32 idx = q->cached_cons & q->ring_mask; - - *addr = ring->desc[idx]; - if (xskq_cons_is_valid_addr(q, *addr)) - return true; - - q->cached_cons++; - } - - return false; -} - static inline bool xskq_cons_read_addr_unchecked(struct xsk_queue *q, u64 *addr) { struct xdp_umem_ring *ring = (struct xdp_umem_ring *)q->ring; @@ -267,21 +181,6 @@ static inline bool xskq_cons_has_entries(struct xsk_queue *q, u32 cnt) return entries >= cnt; } -static inline bool xskq_cons_peek_addr(struct xsk_queue *q, u64 *addr, - struct xdp_umem *umem) -{ - if (q->cached_prod == q->cached_cons) - xskq_cons_get_entries(q); - return xskq_cons_read_addr(q, addr, umem); -} - -static inline bool xskq_cons_peek_addr_aligned(struct xsk_queue *q, u64 *addr) -{ - if (q->cached_prod == q->cached_cons) - xskq_cons_get_entries(q); - return xskq_cons_read_addr_aligned(q, addr); -} - static inline bool xskq_cons_peek_addr_unchecked(struct xsk_queue *q, u64 *addr) { if (q->cached_prod == q->cached_cons) @@ -410,11 +309,7 @@ static inline u64 xskq_nb_invalid_descs(struct xsk_queue *q) return q ? q->invalid_descs : 0; } -void xskq_set_umem(struct xsk_queue *q, u64 umem_size, u64 chunk_mask); struct xsk_queue *xskq_create(u32 nentries, bool umem_queue); void xskq_destroy(struct xsk_queue *q_ops); -/* Executed by the core when the entire UMEM gets freed */ -void xsk_reuseq_destroy(struct xdp_umem *umem); - #endif /* _LINUX_XSK_QUEUE_H */ From patchwork Wed May 20 19:21:01 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= X-Patchwork-Id: 1294666 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=OMobGvX6; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 49S2h75Ds2z9sSF for ; Thu, 21 May 2020 05:22:31 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726940AbgETTWb (ORCPT ); Wed, 20 May 2020 15:22:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45820 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726560AbgETTWa (ORCPT ); Wed, 20 May 2020 15:22:30 -0400 Received: from mail-pj1-x1043.google.com (mail-pj1-x1043.google.com [IPv6:2607:f8b0:4864:20::1043]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CB010C061A0E; Wed, 20 May 2020 12:22:30 -0700 (PDT) Received: by mail-pj1-x1043.google.com with SMTP id ci21so1775903pjb.3; Wed, 20 May 2020 12:22:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Di5GhhIVk5+nln0iwOVokfXWhv8Y+Qgt7aCSLcBIvXI=; b=OMobGvX6GuqmTRQPro3ahC3dGlOzclyXLO2yxPghXuQ9W+1URuCc29EFet3q1w6YO2 dFFHDwKXzmFT7LNb6l0qUA1iE4ORfSXr1sAjA5hrNSbUllrgrKIk4jUWXjMEbINtpx1E Gd+tdPYsAGOhcB5ix8AngfA424w0udgtWXa3bGb4/9SZS9VA82gw2rtql4Kt9W2WysF1 yQkNn0ST6nHEZ7TgcD/WPdUBYX5wtUzu/bfkRI/hSUgHsyoQ01nn3wze6OtaePRyTMCX +IrJo0T8AO49MyJP+FSWdh2B5QlMBfU144+YNyc6m415w1Hzacnh4MLnBKEA1o22MjNb TrmA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Di5GhhIVk5+nln0iwOVokfXWhv8Y+Qgt7aCSLcBIvXI=; b=ZxEaCrczOLiizfkvts2XbDajUbhsja3xZrvVj6fTvtBgqSY7hq0xLEi0gDrnW5Ior/ keAbmNDKNm7es4fkiTH/7HWGHoA/U6XqHnRDZn0MNaJc0CyprDmpzLsCRdvmgOelmAxm nIWSM8FVcP8fN4up5KcXfzxINry6rdgdi8B0seDMFz0hKpc+G0ijiPJZO0oFTZnrd541 8ZC33swE0WnOApjv4n5rrKhqW/Z5NtuAlQ0gGCPuwlxiasDHyzvJsVQzkLmUE5mM4sKS OGOIIII1u2ePmZ7PfImlPOMW4ve1cVS5W5pFV0tJ3P1Q0evfajQ3EXaYbjp5DV8N/kEZ r/3w== X-Gm-Message-State: AOAM53092hCbeBveKEfCKLpzJv82yi6T8SleSU5fp8J6FbHY/4mUtLFE CGfI4NQbI+OhuoxITGRW8rc= X-Google-Smtp-Source: ABdhPJyMF5iHn2a51GFWDNoICicGKSj5Zj5BnzcCLs/U/64onI9lZLPfUp4pcA0rfnYeQyXLM+tCZQ== X-Received: by 2002:a17:902:8b84:: with SMTP id ay4mr6052074plb.167.1590002550296; Wed, 20 May 2020 12:22:30 -0700 (PDT) Received: from btopel-mobl.ger.intel.com (fmdmzpr03-ext.fm.intel.com. [192.55.54.38]) by smtp.gmail.com with ESMTPSA id 62sm2762424pfc.204.2020.05.20.12.22.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 May 2020 12:22:29 -0700 (PDT) From: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= To: ast@kernel.org, daniel@iogearbox.net, davem@davemloft.net, kuba@kernel.org, hawk@kernel.org, john.fastabend@gmail.com, netdev@vger.kernel.org, bpf@vger.kernel.org, magnus.karlsson@intel.com, jonathan.lemon@gmail.com, jeffrey.t.kirsher@intel.com Cc: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , maximmi@mellanox.com, maciej.fijalkowski@intel.com Subject: [PATCH bpf-next v5 13/15] xdp: simplify xdp_return_{frame, frame_rx_napi, buff} Date: Wed, 20 May 2020 21:21:01 +0200 Message-Id: <20200520192103.355233-14-bjorn.topel@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200520192103.355233-1-bjorn.topel@gmail.com> References: <20200520192103.355233-1-bjorn.topel@gmail.com> MIME-Version: 1.0 Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org From: Björn Töpel The xdp_return_{frame,frame_rx_napi,buff} function are never used, except in xdp_convert_zc_to_xdp_frame(), by the MEM_TYPE_XSK_BUFF_POOL memory type. To simplify and reduce code, change so that xdp_convert_zc_to_xdp_frame() calls xsk_buff_free() directly since the type is know, and remove MEM_TYPE_XSK_BUFF_POOL from the switch statement in __xdp_return() function. Suggested-by: Maxim Mikityanskiy Signed-off-by: Björn Töpel --- net/core/xdp.c | 21 +++++++++------------ 1 file changed, 9 insertions(+), 12 deletions(-) diff --git a/net/core/xdp.c b/net/core/xdp.c index a8c2f243367d..90f44f382115 100644 --- a/net/core/xdp.c +++ b/net/core/xdp.c @@ -335,10 +335,11 @@ EXPORT_SYMBOL_GPL(xdp_rxq_info_reg_mem_model); * scenarios (e.g. queue full), it is possible to return the xdp_frame * while still leveraging this protection. The @napi_direct boolean * is used for those calls sites. Thus, allowing for faster recycling - * of xdp_frames/pages in those cases. + * of xdp_frames/pages in those cases. This path is never used by the + * MEM_TYPE_XSK_BUFF_POOL memory type, so it's explicitly not part of + * the switch-statement. */ -static void __xdp_return(void *data, struct xdp_mem_info *mem, bool napi_direct, - struct xdp_buff *xdp) +static void __xdp_return(void *data, struct xdp_mem_info *mem, bool napi_direct) { struct xdp_mem_allocator *xa; struct page *page; @@ -360,33 +361,29 @@ static void __xdp_return(void *data, struct xdp_mem_info *mem, bool napi_direct, page = virt_to_page(data); /* Assumes order0 page*/ put_page(page); break; - case MEM_TYPE_XSK_BUFF_POOL: - /* NB! Only valid from an xdp_buff! */ - xsk_buff_free(xdp); - break; default: /* Not possible, checked in xdp_rxq_info_reg_mem_model() */ + WARN(1, "Incorrect XDP memory type (%d) usage", mem->type); break; } } void xdp_return_frame(struct xdp_frame *xdpf) { - __xdp_return(xdpf->data, &xdpf->mem, false, NULL); + __xdp_return(xdpf->data, &xdpf->mem, false); } EXPORT_SYMBOL_GPL(xdp_return_frame); void xdp_return_frame_rx_napi(struct xdp_frame *xdpf) { - __xdp_return(xdpf->data, &xdpf->mem, true, NULL); + __xdp_return(xdpf->data, &xdpf->mem, true); } EXPORT_SYMBOL_GPL(xdp_return_frame_rx_napi); void xdp_return_buff(struct xdp_buff *xdp) { - __xdp_return(xdp->data, &xdp->rxq->mem, true, xdp); + __xdp_return(xdp->data, &xdp->rxq->mem, true); } -EXPORT_SYMBOL_GPL(xdp_return_buff); /* Only called for MEM_TYPE_PAGE_POOL see xdp.h */ void __xdp_release_frame(void *data, struct xdp_mem_info *mem) @@ -467,7 +464,7 @@ struct xdp_frame *xdp_convert_zc_to_xdp_frame(struct xdp_buff *xdp) xdpf->metasize = metasize; xdpf->mem.type = MEM_TYPE_PAGE_ORDER0; - xdp_return_buff(xdp); + xsk_buff_free(xdp); return xdpf; } EXPORT_SYMBOL_GPL(xdp_convert_zc_to_xdp_frame); From patchwork Wed May 20 19:21:02 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= X-Patchwork-Id: 1294668 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=t3tAWeyy; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 49S2hF0XpCz9sSF for ; Thu, 21 May 2020 05:22:37 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727055AbgETTWg (ORCPT ); Wed, 20 May 2020 15:22:36 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45836 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726560AbgETTWg (ORCPT ); Wed, 20 May 2020 15:22:36 -0400 Received: from mail-pf1-x443.google.com (mail-pf1-x443.google.com [IPv6:2607:f8b0:4864:20::443]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1A29DC061A0E; Wed, 20 May 2020 12:22:36 -0700 (PDT) Received: by mail-pf1-x443.google.com with SMTP id 145so2031125pfw.13; Wed, 20 May 2020 12:22:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=U2553eIKItwn1wRDgfDsyXrZewnOzINl0PnHhFRyV4w=; b=t3tAWeyyZTo3oh6WmJb1HB2U4AEbUZCwVLjXC3d9UPvQqJEyhVfOe8g7DPZkSy+MWN p/IcogTnvZVNh8Ao8csDenSnOs3yaE4+qVZkMm23maBR/R4NAhX/gt+ul7VQLizi0XU/ r0+tAXOu3Ml6/RAtx1yUy7PtwIHEBw/VMPZPrtkxHQs/IOQq/06fav1Zk1vr8nCubple 7Jb3iriCPGYeKam4QFz/Z5cxvNlmumMEMH8FutafRYF2irzF84B/FJi4576yx9iRd0U7 BDDRvOOuis3CvZtLyrbGP3gP+SeeXnRi7xOQ+lcbagFYPWcaW+UtypUuSUbtHsVcbhmZ nsWg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=U2553eIKItwn1wRDgfDsyXrZewnOzINl0PnHhFRyV4w=; b=FWCE37q7A7F3AIX2lsWXxVy+kgjF0I98zuFxjc9tZAgBvRh+1YV/g1dURmeNzXbxd5 oy51FB6AwYThmpctwmI8nqkfvNbn2SEt1wQsKXAo2jxKwheVBR0ZCmStKRUjFjBLf0dL 2/2V4jjjn8pmiPr5hA67X/QgmQ81bnWoS7sNSwNGjd/D3CKai+0uxHQalzPqVHMYSvqa MeDYw30AzNUlfj8pHKrpTOyTVI+zC5pdVSJXj/LIH5Qu9vHStY4xEzw6pbGZAvzYzrvC /tC0UHkpqKhTxQGTQy3tw5w899iIdse+s1dIRXia4Six//ErRjVcJqhGtx9RupMlxZEy N1xQ== X-Gm-Message-State: AOAM531asRae4Kgr4T68tUDFxzobvb/f2IgDilntZVNOb6bDle5CAFVd j7spoKYAysbuikaHkFmi4cM= X-Google-Smtp-Source: ABdhPJx2TJIhyjEeN/UPpS17xyv1+jQYM1e/qjUT9+lciRAbt1Gdxd+F2+tP6zrqciM6yctU5HhZow== X-Received: by 2002:a62:3606:: with SMTP id d6mr5652612pfa.196.1590002555557; Wed, 20 May 2020 12:22:35 -0700 (PDT) Received: from btopel-mobl.ger.intel.com (fmdmzpr03-ext.fm.intel.com. [192.55.54.38]) by smtp.gmail.com with ESMTPSA id 62sm2762424pfc.204.2020.05.20.12.22.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 May 2020 12:22:34 -0700 (PDT) From: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= To: ast@kernel.org, daniel@iogearbox.net, davem@davemloft.net, kuba@kernel.org, hawk@kernel.org, john.fastabend@gmail.com, netdev@vger.kernel.org, bpf@vger.kernel.org, magnus.karlsson@intel.com, jonathan.lemon@gmail.com, jeffrey.t.kirsher@intel.com Cc: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , maximmi@mellanox.com, maciej.fijalkowski@intel.com Subject: [PATCH bpf-next v5 14/15] xsk: explicitly inline functions and move definitions Date: Wed, 20 May 2020 21:21:02 +0200 Message-Id: <20200520192103.355233-15-bjorn.topel@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200520192103.355233-1-bjorn.topel@gmail.com> References: <20200520192103.355233-1-bjorn.topel@gmail.com> MIME-Version: 1.0 Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org From: Björn Töpel In order to reduce the number of function calls, the struct xsk_buff_pool definition is moved to xsk_buff_pool.h. The functions xp_get_dma(), xp_dma_sync_for_cpu(), xp_dma_sync_for_device(), xp_validate_desc() and various helper functions are explicitly inlined. Further, move xp_get_handle() and xp_release() to xsk.c, to allow for the compiler to perform inlining. rfc->v1: Make sure xp_validate_desc() is inlined for Tx perf. (Maxim) Signed-off-by: Björn Töpel --- include/net/xsk_buff_pool.h | 98 ++++++++++++++++++++++-- net/xdp/xsk.c | 15 ++++ net/xdp/xsk_buff_pool.c | 148 ++---------------------------------- net/xdp/xsk_queue.h | 45 +++++++++++ 4 files changed, 156 insertions(+), 150 deletions(-) diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h index 9f221b36e405..a4ff226505c9 100644 --- a/include/net/xsk_buff_pool.h +++ b/include/net/xsk_buff_pool.h @@ -4,6 +4,7 @@ #ifndef XSK_BUFF_POOL_H_ #define XSK_BUFF_POOL_H_ +#include #include #include #include @@ -25,6 +26,27 @@ struct xdp_buff_xsk { struct list_head free_list_node; }; +struct xsk_buff_pool { + struct xsk_queue *fq; + struct list_head free_list; + dma_addr_t *dma_pages; + struct xdp_buff_xsk *heads; + u64 chunk_mask; + u64 addrs_cnt; + u32 free_list_cnt; + u32 dma_pages_cnt; + u32 heads_cnt; + u32 free_heads_cnt; + u32 headroom; + u32 chunk_size; + u32 frame_len; + bool cheap_dma; + bool unaligned; + void *addrs; + struct device *dev; + struct xdp_buff_xsk *free_heads[]; +}; + /* AF_XDP core. */ struct xsk_buff_pool *xp_create(struct page **pages, u32 nr_pages, u32 chunks, u32 chunk_size, u32 headroom, u64 size, @@ -32,8 +54,6 @@ struct xsk_buff_pool *xp_create(struct page **pages, u32 nr_pages, u32 chunks, void xp_set_fq(struct xsk_buff_pool *pool, struct xsk_queue *fq); void xp_destroy(struct xsk_buff_pool *pool); void xp_release(struct xdp_buff_xsk *xskb); -u64 xp_get_handle(struct xdp_buff_xsk *xskb); -bool xp_validate_desc(struct xsk_buff_pool *pool, struct xdp_desc *desc); /* AF_XDP, and XDP core. */ void xp_free(struct xdp_buff_xsk *xskb); @@ -47,10 +67,74 @@ struct xdp_buff *xp_alloc(struct xsk_buff_pool *pool); bool xp_can_alloc(struct xsk_buff_pool *pool, u32 count); void *xp_raw_get_data(struct xsk_buff_pool *pool, u64 addr); dma_addr_t xp_raw_get_dma(struct xsk_buff_pool *pool, u64 addr); -dma_addr_t xp_get_dma(struct xdp_buff_xsk *xskb); -dma_addr_t xp_get_frame_dma(struct xdp_buff_xsk *xskb); -void xp_dma_sync_for_cpu(struct xdp_buff_xsk *xskb); -void xp_dma_sync_for_device(struct xsk_buff_pool *pool, dma_addr_t dma, - size_t size); +static inline dma_addr_t xp_get_dma(struct xdp_buff_xsk *xskb) +{ + return xskb->dma; +} + +static inline dma_addr_t xp_get_frame_dma(struct xdp_buff_xsk *xskb) +{ + return xskb->frame_dma; +} + +void xp_dma_sync_for_cpu_slow(struct xdp_buff_xsk *xskb); +static inline void xp_dma_sync_for_cpu(struct xdp_buff_xsk *xskb) +{ + if (xskb->pool->cheap_dma) + return; + + xp_dma_sync_for_cpu_slow(xskb); +} + +void xp_dma_sync_for_device_slow(struct xsk_buff_pool *pool, dma_addr_t dma, + size_t size); +static inline void xp_dma_sync_for_device(struct xsk_buff_pool *pool, + dma_addr_t dma, size_t size) +{ + if (pool->cheap_dma) + return; + + xp_dma_sync_for_device_slow(pool, dma, size); +} + +/* Masks for xdp_umem_page flags. + * The low 12-bits of the addr will be 0 since this is the page address, so we + * can use them for flags. + */ +#define XSK_NEXT_PG_CONTIG_SHIFT 0 +#define XSK_NEXT_PG_CONTIG_MASK BIT_ULL(XSK_NEXT_PG_CONTIG_SHIFT) + +static inline bool xp_desc_crosses_non_contig_pg(struct xsk_buff_pool *pool, + u64 addr, u32 len) +{ + bool cross_pg = (addr & (PAGE_SIZE - 1)) + len > PAGE_SIZE; + + if (pool->dma_pages_cnt && cross_pg) { + return !(pool->dma_pages[addr >> PAGE_SHIFT] & + XSK_NEXT_PG_CONTIG_MASK); + } + return false; +} + +static inline u64 xp_aligned_extract_addr(struct xsk_buff_pool *pool, u64 addr) +{ + return addr & pool->chunk_mask; +} + +static inline u64 xp_unaligned_extract_addr(u64 addr) +{ + return addr & XSK_UNALIGNED_BUF_ADDR_MASK; +} + +static inline u64 xp_unaligned_extract_offset(u64 addr) +{ + return addr >> XSK_UNALIGNED_BUF_OFFSET_SHIFT; +} + +static inline u64 xp_unaligned_add_offset_to_addr(u64 addr) +{ + return xp_unaligned_extract_addr(addr) + + xp_unaligned_extract_offset(addr); +} #endif /* XSK_BUFF_POOL_H_ */ diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c index 3f2ab732ab8b..b6c0f08bd80d 100644 --- a/net/xdp/xsk.c +++ b/net/xdp/xsk.c @@ -99,6 +99,21 @@ bool xsk_umem_uses_need_wakeup(struct xdp_umem *umem) } EXPORT_SYMBOL(xsk_umem_uses_need_wakeup); +void xp_release(struct xdp_buff_xsk *xskb) +{ + xskb->pool->free_heads[xskb->pool->free_heads_cnt++] = xskb; +} + +static u64 xp_get_handle(struct xdp_buff_xsk *xskb) +{ + u64 offset = xskb->xdp.data - xskb->xdp.data_hard_start; + + offset += xskb->pool->headroom; + if (!xskb->pool->unaligned) + return xskb->orig_addr + offset; + return xskb->orig_addr + (offset << XSK_UNALIGNED_BUF_OFFSET_SHIFT); +} + static int __xsk_rcv_zc(struct xdp_sock *xs, struct xdp_buff *xdp, u32 len) { struct xdp_buff_xsk *xskb = container_of(xdp, struct xdp_buff_xsk, xdp); diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c index 89dae78865e7..540ed75e4482 100644 --- a/net/xdp/xsk_buff_pool.c +++ b/net/xdp/xsk_buff_pool.c @@ -8,34 +8,6 @@ #include "xsk_queue.h" -/* Masks for xdp_umem_page flags. - * The low 12-bits of the addr will be 0 since this is the page address, so we - * can use them for flags. - */ -#define XSK_NEXT_PG_CONTIG_SHIFT 0 -#define XSK_NEXT_PG_CONTIG_MASK BIT_ULL(XSK_NEXT_PG_CONTIG_SHIFT) - -struct xsk_buff_pool { - struct xsk_queue *fq; - struct list_head free_list; - dma_addr_t *dma_pages; - struct xdp_buff_xsk *heads; - u64 chunk_mask; - u64 addrs_cnt; - u32 free_list_cnt; - u32 dma_pages_cnt; - u32 heads_cnt; - u32 free_heads_cnt; - u32 headroom; - u32 chunk_size; - u32 frame_len; - bool cheap_dma; - bool unaligned; - void *addrs; - struct device *dev; - struct xdp_buff_xsk *free_heads[]; -}; - static void xp_addr_unmap(struct xsk_buff_pool *pool) { vunmap(pool->addrs); @@ -228,50 +200,12 @@ int xp_dma_map(struct xsk_buff_pool *pool, struct device *dev, } EXPORT_SYMBOL(xp_dma_map); -static bool xp_desc_crosses_non_contig_pg(struct xsk_buff_pool *pool, - u64 addr, u32 len) -{ - bool cross_pg = (addr & (PAGE_SIZE - 1)) + len > PAGE_SIZE; - - if (pool->dma_pages_cnt && cross_pg) { - return !(pool->dma_pages[addr >> PAGE_SHIFT] & - XSK_NEXT_PG_CONTIG_MASK); - } - return false; -} - static bool xp_addr_crosses_non_contig_pg(struct xsk_buff_pool *pool, u64 addr) { return xp_desc_crosses_non_contig_pg(pool, addr, pool->chunk_size); } -void xp_release(struct xdp_buff_xsk *xskb) -{ - xskb->pool->free_heads[xskb->pool->free_heads_cnt++] = xskb; -} - -static u64 xp_aligned_extract_addr(struct xsk_buff_pool *pool, u64 addr) -{ - return addr & pool->chunk_mask; -} - -static u64 xp_unaligned_extract_addr(u64 addr) -{ - return addr & XSK_UNALIGNED_BUF_ADDR_MASK; -} - -static u64 xp_unaligned_extract_offset(u64 addr) -{ - return addr >> XSK_UNALIGNED_BUF_OFFSET_SHIFT; -} - -static u64 xp_unaligned_add_offset_to_addr(u64 addr) -{ - return xp_unaligned_extract_addr(addr) + - xp_unaligned_extract_offset(addr); -} - static bool xp_check_unaligned(struct xsk_buff_pool *pool, u64 *addr) { *addr = xp_unaligned_extract_addr(*addr); @@ -370,60 +304,6 @@ void xp_free(struct xdp_buff_xsk *xskb) } EXPORT_SYMBOL(xp_free); -static bool xp_aligned_validate_desc(struct xsk_buff_pool *pool, - struct xdp_desc *desc) -{ - u64 chunk, chunk_end; - - chunk = xp_aligned_extract_addr(pool, desc->addr); - chunk_end = xp_aligned_extract_addr(pool, desc->addr + desc->len); - if (chunk != chunk_end) - return false; - - if (chunk >= pool->addrs_cnt) - return false; - - if (desc->options) - return false; - return true; -} - -static bool xp_unaligned_validate_desc(struct xsk_buff_pool *pool, - struct xdp_desc *desc) -{ - u64 addr, base_addr; - - base_addr = xp_unaligned_extract_addr(desc->addr); - addr = xp_unaligned_add_offset_to_addr(desc->addr); - - if (desc->len > pool->chunk_size) - return false; - - if (base_addr >= pool->addrs_cnt || addr >= pool->addrs_cnt || - xp_desc_crosses_non_contig_pg(pool, addr, desc->len)) - return false; - - if (desc->options) - return false; - return true; -} - -bool xp_validate_desc(struct xsk_buff_pool *pool, struct xdp_desc *desc) -{ - return pool->unaligned ? xp_unaligned_validate_desc(pool, desc) : - xp_aligned_validate_desc(pool, desc); -} - -u64 xp_get_handle(struct xdp_buff_xsk *xskb) -{ - u64 offset = xskb->xdp.data - xskb->xdp.data_hard_start; - - offset += xskb->pool->headroom; - if (!xskb->pool->unaligned) - return xskb->orig_addr + offset; - return xskb->orig_addr + (offset << XSK_UNALIGNED_BUF_OFFSET_SHIFT); -} - void *xp_raw_get_data(struct xsk_buff_pool *pool, u64 addr) { addr = pool->unaligned ? xp_unaligned_add_offset_to_addr(addr) : addr; @@ -440,35 +320,17 @@ dma_addr_t xp_raw_get_dma(struct xsk_buff_pool *pool, u64 addr) } EXPORT_SYMBOL(xp_raw_get_dma); -dma_addr_t xp_get_dma(struct xdp_buff_xsk *xskb) -{ - return xskb->dma; -} -EXPORT_SYMBOL(xp_get_dma); - -dma_addr_t xp_get_frame_dma(struct xdp_buff_xsk *xskb) +void xp_dma_sync_for_cpu_slow(struct xdp_buff_xsk *xskb) { - return xskb->frame_dma; -} -EXPORT_SYMBOL(xp_get_frame_dma); - -void xp_dma_sync_for_cpu(struct xdp_buff_xsk *xskb) -{ - if (xskb->pool->cheap_dma) - return; - dma_sync_single_range_for_cpu(xskb->pool->dev, xskb->dma, 0, xskb->pool->frame_len, DMA_BIDIRECTIONAL); } -EXPORT_SYMBOL(xp_dma_sync_for_cpu); +EXPORT_SYMBOL(xp_dma_sync_for_cpu_slow); -void xp_dma_sync_for_device(struct xsk_buff_pool *pool, dma_addr_t dma, - size_t size) +void xp_dma_sync_for_device_slow(struct xsk_buff_pool *pool, dma_addr_t dma, + size_t size) { - if (pool->cheap_dma) - return; - dma_sync_single_range_for_device(pool->dev, dma, 0, size, DMA_BIDIRECTIONAL); } -EXPORT_SYMBOL(xp_dma_sync_for_device); +EXPORT_SYMBOL(xp_dma_sync_for_device_slow); diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h index 16bf15864788..5b5d24d2dd37 100644 --- a/net/xdp/xsk_queue.h +++ b/net/xdp/xsk_queue.h @@ -118,6 +118,51 @@ static inline bool xskq_cons_read_addr_unchecked(struct xsk_queue *q, u64 *addr) return false; } +static inline bool xp_aligned_validate_desc(struct xsk_buff_pool *pool, + struct xdp_desc *desc) +{ + u64 chunk, chunk_end; + + chunk = xp_aligned_extract_addr(pool, desc->addr); + chunk_end = xp_aligned_extract_addr(pool, desc->addr + desc->len); + if (chunk != chunk_end) + return false; + + if (chunk >= pool->addrs_cnt) + return false; + + if (desc->options) + return false; + return true; +} + +static inline bool xp_unaligned_validate_desc(struct xsk_buff_pool *pool, + struct xdp_desc *desc) +{ + u64 addr, base_addr; + + base_addr = xp_unaligned_extract_addr(desc->addr); + addr = xp_unaligned_add_offset_to_addr(desc->addr); + + if (desc->len > pool->chunk_size) + return false; + + if (base_addr >= pool->addrs_cnt || addr >= pool->addrs_cnt || + xp_desc_crosses_non_contig_pg(pool, addr, desc->len)) + return false; + + if (desc->options) + return false; + return true; +} + +static inline bool xp_validate_desc(struct xsk_buff_pool *pool, + struct xdp_desc *desc) +{ + return pool->unaligned ? xp_unaligned_validate_desc(pool, desc) : + xp_aligned_validate_desc(pool, desc); +} + static inline bool xskq_cons_is_valid_desc(struct xsk_queue *q, struct xdp_desc *d, struct xdp_umem *umem) From patchwork Wed May 20 19:21:03 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= X-Patchwork-Id: 1294670 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=vZ8IO5Jv; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 49S2hM32h6z9sTR for ; Thu, 21 May 2020 05:22:43 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726932AbgETTWm (ORCPT ); Wed, 20 May 2020 15:22:42 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45852 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726560AbgETTWl (ORCPT ); Wed, 20 May 2020 15:22:41 -0400 Received: from mail-pg1-x543.google.com (mail-pg1-x543.google.com [IPv6:2607:f8b0:4864:20::543]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 74EC0C061A0E; Wed, 20 May 2020 12:22:41 -0700 (PDT) Received: by mail-pg1-x543.google.com with SMTP id p30so1886026pgl.11; Wed, 20 May 2020 12:22:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=OVMLRI/qYuKHi4wFXwj/y3J+J9HuBJ9GEwbF/G1h7vs=; b=vZ8IO5JvkRxbPEL5oIae2LPKd4XKGs0oq6d9USfSK9MbFesburkTyZANdAFl2/YkZb m9mIVcA5FPwrpIo3W5NjuIqNAmUDjC5x8dYlUYDcDJb3nrkaT+GxVkNBHKOVX87U49PP prMp2CVSQelpUMk0Gg3oY4At+STHy85+PzYR/THKXqL25+UIBad1lu18qT7mWCieV87W H8HcFQwF56gZv50eYEcWQb+ckVgjsel6lBzAqKesCERsAmRLUMUxyk/4K9sPNEHJVtmV BzqTLNpqMevZr8KANwNxrA5ILhxxtQIP7yr8kHEFYAnFC9kmjYMJseu7z5MBUUyVKlVM dvag== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=OVMLRI/qYuKHi4wFXwj/y3J+J9HuBJ9GEwbF/G1h7vs=; b=UStHcV+Fs830YS2pslFqTprmJ0eFydYXUq6SIcnTlJjfC7J0pisff8HGUcOXHlrCGd SZsZ9vxa+2aGWAOZunrag68G2z00grcYXEe44hYcEuBnk/TNGcafNE+cQRadvK8Jqo1y jxtmcHNTYlAl9mTmNrpBLhRRfvTToyOpz7vV4/G6+ZkNthcksQtUvPkelhXVbqjzm/Ct tRmSvlh0Fjakhj61eMfvbjaGtZ5/mX/Kw6SbQv32ACwlvDm7GpdYqRrlYRYChkgBpHhj WjwZqdnXedh7KvyW190RyxLGY3QvRkaecjQDer/R8pS6Rfezu3Yl3wBERO1PTM4I0Ktt LQww== X-Gm-Message-State: AOAM530iLQrY6gD38emI9wU31VJaeIAoVR76/1WuAWHoSJZTUxlpJSCo qfcXFEXQh/2+kicKi9AzYgU= X-Google-Smtp-Source: ABdhPJyKrkGByROGfhVaaKME8Vy/7DHu4vmtEFkG/CHCI2prtWxpATDGqiv8/kxH+eYbqS0iCcnfkQ== X-Received: by 2002:a63:f316:: with SMTP id l22mr5204505pgh.38.1590002561093; Wed, 20 May 2020 12:22:41 -0700 (PDT) Received: from btopel-mobl.ger.intel.com (fmdmzpr03-ext.fm.intel.com. [192.55.54.38]) by smtp.gmail.com with ESMTPSA id 62sm2762424pfc.204.2020.05.20.12.22.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 May 2020 12:22:40 -0700 (PDT) From: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= To: ast@kernel.org, daniel@iogearbox.net, davem@davemloft.net, kuba@kernel.org, hawk@kernel.org, john.fastabend@gmail.com, netdev@vger.kernel.org, bpf@vger.kernel.org, magnus.karlsson@intel.com, jonathan.lemon@gmail.com, jeffrey.t.kirsher@intel.com Cc: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , maximmi@mellanox.com, maciej.fijalkowski@intel.com, Joe Perches Subject: [PATCH bpf-next v5 15/15] MAINTAINERS, xsk: update AF_XDP section after moves/adds Date: Wed, 20 May 2020 21:21:03 +0200 Message-Id: <20200520192103.355233-16-bjorn.topel@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200520192103.355233-1-bjorn.topel@gmail.com> References: <20200520192103.355233-1-bjorn.topel@gmail.com> MIME-Version: 1.0 Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org From: Björn Töpel Update MAINTAINERS to correctly mirror the current AF_XDP socket file layout. Also, add the AF_XDP files of libbpf. rfc->v1: Sorted file entries. (Joe) Cc: Joe Perches Signed-off-by: Björn Töpel --- MAINTAINERS | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index b7844f6cfa4a..087e68b21f9f 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -18443,8 +18443,12 @@ R: Jonathan Lemon L: netdev@vger.kernel.org L: bpf@vger.kernel.org S: Maintained -F: kernel/bpf/xskmap.c +F: include/net/xdp_sock* +F: include/net/xsk_buffer_pool.h +F: include/uapi/linux/if_xdp.h F: net/xdp/ +F: samples/bpf/xdpsock* +F: tools/lib/bpf/xsk* XEN BLOCK SUBSYSTEM M: Konrad Rzeszutek Wilk