From patchwork Thu Jun 13 09:39:58 2019
X-Patchwork-Submitter: Toshiaki Makita
X-Patchwork-Id: 1115229
X-Patchwork-Delegate: bpf@iogearbox.net
From: Toshiaki Makita
To: Alexei Starovoitov, Daniel Borkmann, "David S. Miller", Jakub Kicinski,
 Jesper Dangaard Brouer, John Fastabend
Cc: Toshiaki Makita, netdev@vger.kernel.org, xdp-newbies@vger.kernel.org,
 bpf@vger.kernel.org, Toke Høiland-Jørgensen, Jason Wang
Subject: [PATCH v3 bpf-next 1/2] xdp: Add tracepoint for bulk XDP_TX
Date: Thu, 13 Jun 2019 18:39:58 +0900
Message-Id: <20190613093959.2796-2-toshiaki.makita1@gmail.com>
In-Reply-To: <20190613093959.2796-1-toshiaki.makita1@gmail.com>
References: <20190613093959.2796-1-toshiaki.makita1@gmail.com>

This tracepoint lets admins check what is happening on XDP_TX when bulk
XDP_TX is in use. Bulk XDP_TX is first introduced in veth in the next
commit.

v3:
- Add act field to be in line with other XDP tracepoints.

Signed-off-by: Toshiaki Makita
---
 include/trace/events/xdp.h | 29 +++++++++++++++++++++++++++++
 kernel/bpf/core.c          |  1 +
 2 files changed, 30 insertions(+)

diff --git a/include/trace/events/xdp.h b/include/trace/events/xdp.h
index e95cb86..01389b9 100644
--- a/include/trace/events/xdp.h
+++ b/include/trace/events/xdp.h
@@ -50,6 +50,35 @@
 		  __entry->ifindex)
 );

+TRACE_EVENT(xdp_bulk_tx,
+
+	TP_PROTO(const struct net_device *dev,
+		 int sent, int drops, int err),
+
+	TP_ARGS(dev, sent, drops, err),
+
+	TP_STRUCT__entry(
+		__field(int, ifindex)
+		__field(u32, act)
+		__field(int, drops)
+		__field(int, sent)
+		__field(int, err)
+	),
+
+	TP_fast_assign(
+		__entry->ifindex = dev->ifindex;
+		__entry->act = XDP_TX;
+		__entry->drops = drops;
+		__entry->sent = sent;
+		__entry->err = err;
+	),
+
+	TP_printk("ifindex=%d action=%s sent=%d drops=%d err=%d",
+		  __entry->ifindex,
+		  __print_symbolic(__entry->act, __XDP_ACT_SYM_TAB),
+		  __entry->sent, __entry->drops, __entry->err)
+);
+
 DECLARE_EVENT_CLASS(xdp_redirect_template,

 	TP_PROTO(const struct net_device *dev,
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 33fb292..3a3f4af 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -2106,3 +2106,4 @@ int __weak skb_copy_bits(const struct sk_buff *skb, int offset, void *to,
 #include

 EXPORT_TRACEPOINT_SYMBOL_GPL(xdp_exception);
+EXPORT_TRACEPOINT_SYMBOL_GPL(xdp_bulk_tx);

From patchwork Thu Jun 13 09:39:59 2019
X-Patchwork-Submitter: Toshiaki Makita
X-Patchwork-Id: 1115227
X-Patchwork-Delegate: bpf@iogearbox.net
From: Toshiaki Makita
To: Alexei Starovoitov, Daniel Borkmann, "David S. Miller", Jakub Kicinski,
 Jesper Dangaard Brouer, John Fastabend
Cc: Toshiaki Makita, netdev@vger.kernel.org, xdp-newbies@vger.kernel.org,
 bpf@vger.kernel.org, Toke Høiland-Jørgensen, Jason Wang
Subject: [PATCH v3 bpf-next 2/2] veth: Support bulk XDP_TX
Date: Thu, 13 Jun 2019 18:39:59 +0900
Message-Id: <20190613093959.2796-3-toshiaki.makita1@gmail.com>
In-Reply-To: <20190613093959.2796-1-toshiaki.makita1@gmail.com>
References: <20190613093959.2796-1-toshiaki.makita1@gmail.com>

XDP_TX is similar to XDP_REDIRECT in that it essentially redirects
packets to the device itself. XDP_REDIRECT has a bulk transmit mechanism
to avoid the heavy cost of indirect calls, but it also reduces lock
acquisition on destination devices that need locks, like veth and tun.

XDP_TX does not use indirect calls, but drivers which require locks can
benefit from bulk transmit for XDP_TX as well.

This patch introduces a bulk transmit mechanism in veth using a bulk
queue on the stack, and improves XDP_TX performance by about 9%.

Here are single-core/single-flow XDP_TX test results. CPU consumption
figures are taken from "perf report --no-child".

- Before:

  7.26 Mpps

  _raw_spin_lock  7.83%
  veth_xdp_xmit  12.23%

- After:

  7.94 Mpps

  _raw_spin_lock  1.08%
  veth_xdp_xmit   6.10%

v2:
- Use stack for bulk queue instead of a global variable.

Signed-off-by: Toshiaki Makita
---
 drivers/net/veth.c | 60 +++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 48 insertions(+), 12 deletions(-)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 52110e5..b363a84 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -38,6 +38,8 @@
 #define VETH_XDP_TX		BIT(0)
 #define VETH_XDP_REDIR		BIT(1)

+#define VETH_XDP_TX_BULK_SIZE	16
+
 struct veth_rq_stats {
 	u64			xdp_packets;
 	u64			xdp_bytes;
@@ -64,6 +66,11 @@ struct veth_priv {
 	unsigned int		requested_headroom;
 };

+struct veth_xdp_tx_bq {
+	struct xdp_frame *q[VETH_XDP_TX_BULK_SIZE];
+	unsigned int count;
+};
+
 /*
  * ethtool interface
  */
@@ -442,13 +449,30 @@ static int veth_xdp_xmit(struct net_device *dev, int n,
 	return ret;
 }

-static void veth_xdp_flush(struct net_device *dev)
+static void veth_xdp_flush_bq(struct net_device *dev, struct veth_xdp_tx_bq *bq)
+{
+	int sent, i, err = 0;
+
+	sent = veth_xdp_xmit(dev, bq->count, bq->q, 0);
+	if (sent < 0) {
+		err = sent;
+		sent = 0;
+		for (i = 0; i < bq->count; i++)
+			xdp_return_frame(bq->q[i]);
+	}
+	trace_xdp_bulk_tx(dev, sent, bq->count - sent, err);
+
+	bq->count = 0;
+}
+
+static void veth_xdp_flush(struct net_device *dev, struct veth_xdp_tx_bq *bq)
 {
 	struct veth_priv *rcv_priv, *priv = netdev_priv(dev);
 	struct net_device *rcv;
 	struct veth_rq *rq;

 	rcu_read_lock();
+	veth_xdp_flush_bq(dev, bq);
 	rcv = rcu_dereference(priv->peer);
 	if (unlikely(!rcv))
 		goto out;
@@ -464,19 +488,26 @@ static void veth_xdp_flush(struct net_device *dev)
 	rcu_read_unlock();
 }

-static int veth_xdp_tx(struct net_device *dev, struct xdp_buff *xdp)
+static int veth_xdp_tx(struct net_device *dev, struct xdp_buff *xdp,
+		       struct veth_xdp_tx_bq *bq)
 {
 	struct xdp_frame *frame = convert_to_xdp_frame(xdp);

 	if (unlikely(!frame))
 		return -EOVERFLOW;

-	return veth_xdp_xmit(dev, 1, &frame, 0);
+	if (unlikely(bq->count == VETH_XDP_TX_BULK_SIZE))
+		veth_xdp_flush_bq(dev, bq);
+
+	bq->q[bq->count++] = frame;
+
+	return 0;
 }

 static struct sk_buff *veth_xdp_rcv_one(struct veth_rq *rq,
 					struct xdp_frame *frame,
-					unsigned int *xdp_xmit)
+					unsigned int *xdp_xmit,
+					struct veth_xdp_tx_bq *bq)
 {
 	void *hard_start = frame->data - frame->headroom;
 	void *head = hard_start - sizeof(struct xdp_frame);
@@ -509,7 +540,7 @@ static struct sk_buff *veth_xdp_rcv_one(struct veth_rq *rq,
 			orig_frame = *frame;
 			xdp.data_hard_start = head;
 			xdp.rxq->mem = frame->mem;
-			if (unlikely(veth_xdp_tx(rq->dev, &xdp) < 0)) {
+			if (unlikely(veth_xdp_tx(rq->dev, &xdp, bq) < 0)) {
 				trace_xdp_exception(rq->dev, xdp_prog, act);
 				frame = &orig_frame;
 				goto err_xdp;
@@ -559,7 +590,8 @@ static struct sk_buff *veth_xdp_rcv_one(struct veth_rq *rq,
 }

 static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq, struct sk_buff *skb,
-					unsigned int *xdp_xmit)
+					unsigned int *xdp_xmit,
+					struct veth_xdp_tx_bq *bq)
 {
 	u32 pktlen, headroom, act, metalen;
 	void *orig_data, *orig_data_end;
@@ -635,7 +667,7 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq, struct sk_buff *skb,
 		get_page(virt_to_page(xdp.data));
 		consume_skb(skb);
 		xdp.rxq->mem = rq->xdp_mem;
-		if (unlikely(veth_xdp_tx(rq->dev, &xdp) < 0)) {
+		if (unlikely(veth_xdp_tx(rq->dev, &xdp, bq) < 0)) {
 			trace_xdp_exception(rq->dev, xdp_prog, act);
 			goto err_xdp;
 		}
@@ -690,7 +722,8 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq, struct sk_buff *skb,
 	return NULL;
 }

-static int veth_xdp_rcv(struct veth_rq *rq, int budget, unsigned int *xdp_xmit)
+static int veth_xdp_rcv(struct veth_rq *rq, int budget, unsigned int *xdp_xmit,
+			struct veth_xdp_tx_bq *bq)
 {
 	int i, done = 0, drops = 0, bytes = 0;

@@ -706,11 +739,11 @@ static int veth_xdp_rcv(struct veth_rq *rq, int budget, unsigned int *xdp_xmit)
 			struct xdp_frame *frame = veth_ptr_to_xdp(ptr);

 			bytes += frame->len;
-			skb = veth_xdp_rcv_one(rq, frame, &xdp_xmit_one);
+			skb = veth_xdp_rcv_one(rq, frame, &xdp_xmit_one, bq);
 		} else {
 			skb = ptr;
 			bytes += skb->len;
-			skb = veth_xdp_rcv_skb(rq, skb, &xdp_xmit_one);
+			skb = veth_xdp_rcv_skb(rq, skb, &xdp_xmit_one, bq);
 		}
 		*xdp_xmit |= xdp_xmit_one;

@@ -736,10 +769,13 @@ static int veth_poll(struct napi_struct *napi, int budget)
 	struct veth_rq *rq =
 		container_of(napi, struct veth_rq, xdp_napi);
 	unsigned int xdp_xmit = 0;
+	struct veth_xdp_tx_bq bq;
 	int done;

+	bq.count = 0;
+
 	xdp_set_return_frame_no_direct();
-	done = veth_xdp_rcv(rq, budget, &xdp_xmit);
+	done = veth_xdp_rcv(rq, budget, &xdp_xmit, &bq);

 	if (done < budget && napi_complete_done(napi, done)) {
 		/* Write rx_notify_masked before reading ptr_ring */
@@ -751,7 +787,7 @@ static int veth_poll(struct napi_struct *napi, int budget)
 	}

 	if (xdp_xmit & VETH_XDP_TX)
-		veth_xdp_flush(rq->dev);
+		veth_xdp_flush(rq->dev, &bq);
 	if (xdp_xmit & VETH_XDP_REDIR)
 		xdp_do_flush_map();
 	xdp_clear_return_frame_no_direct();
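
For readers who want to see why the on-stack bulk queue cuts lock acquisitions, here is a minimal userspace sketch of the same pattern. It is not kernel code and not part of the patch: fake_dev, fake_xmit, bq_flush, bq_enqueue and simulate_poll are hypothetical names, frames are plain ints, and the "lock" is only a counter bumped once per transmit call. Unlike veth_xdp_flush_bq(), the sketch skips the transmit call when the queue is empty.

```c
#include <assert.h>

#define BULK_SIZE 16	/* same value as VETH_XDP_TX_BULK_SIZE in the patch */

/* Illustrative stand-in for struct veth_xdp_tx_bq; frames are ints here. */
struct tx_bq {
	int q[BULK_SIZE];
	unsigned int count;
};

/* Hypothetical device that only counts lock round-trips and frames. */
struct fake_dev {
	unsigned long lock_acquisitions;
	unsigned long frames_sent;
};

/* Stand-in for a locked transmit: one lock round-trip per call, however
 * many frames are handed over in that call. */
static int fake_xmit(struct fake_dev *dev, unsigned int n, const int *frames)
{
	(void)frames;
	dev->lock_acquisitions++;
	dev->frames_sent += n;
	return (int)n;
}

/* Send everything queued so far and reset the queue. */
static void bq_flush(struct fake_dev *dev, struct tx_bq *bq)
{
	if (bq->count == 0)
		return;
	fake_xmit(dev, bq->count, bq->q);
	bq->count = 0;
}

/* Queue one frame, flushing first if the queue is already full. */
static void bq_enqueue(struct fake_dev *dev, struct tx_bq *bq, int frame)
{
	if (bq->count == BULK_SIZE)
		bq_flush(dev, bq);
	assert(bq->count < BULK_SIZE);
	bq->q[bq->count++] = frame;
}

/* One simulated poll: enqueue nframes, then do the final flush.
 * Returns how many times the lock was taken. */
static unsigned long simulate_poll(unsigned int nframes)
{
	struct fake_dev dev = { 0, 0 };
	struct tx_bq bq = { .count = 0 };	/* lives on the stack, as in v2 */
	unsigned int i;

	for (i = 0; i < nframes; i++)
		bq_enqueue(&dev, &bq, (int)i);
	bq_flush(&dev, &bq);
	assert(dev.frames_sent == nframes);
	return dev.lock_acquisitions;
}
```

Under these assumptions, simulate_poll(64) takes the lock 4 times, where a per-frame transmit would take it 64 times. That is the effect the _raw_spin_lock numbers in the commit message point at, although the real saving depends on how many XDP_TX frames a NAPI poll actually produces.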