From patchwork Thu Mar 21 10:14:18 2019
X-Patchwork-Submitter: Paolo Abeni
X-Patchwork-Id: 1059954
X-Patchwork-Delegate: davem@davemloft.net
From: Paolo Abeni
To: netdev@vger.kernel.org
Cc: "David S. Miller", Eric Dumazet, John Fastabend, Ivan Vecera
Subject: [PATCH net-next 1/2] net: sched: add empty status flag for NOLOCK qdisc
Date: Thu, 21 Mar 2019 11:14:18 +0100

The queue is marked not empty after acquiring the seqlock, and it is up
to the NOLOCK qdisc to clear that flag on dequeue. Since the empty
status lies on the same cache line as the seqlock, it is always cache
hot during updates.

This makes the empty flag update somewhat loose; given the lack of
synchronization between enqueue and dequeue, that is unavoidable.

Signed-off-by: Paolo Abeni
---
 include/net/sch_generic.h | 17 +++++++++++++++++
 net/sched/sch_generic.c   |  2 ++
 2 files changed, 19 insertions(+)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 31284c078d06..1dc0aeec5d39 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -113,6 +113,9 @@ struct Qdisc {
 	spinlock_t		busylock ____cacheline_aligned_in_smp;
 	spinlock_t		seqlock;
+
+	/* for NOLOCK qdisc, true if there are enqueued skbs */
+	bool			not_empty;
 	struct rcu_head		rcu;
 };
 
@@ -143,11 +146,25 @@ static inline bool qdisc_is_running(struct Qdisc *qdisc)
 	return (raw_read_seqcount(&qdisc->running) & 1) ? true : false;
 }
 
+static inline bool qdisc_is_empty(struct Qdisc *qdisc)
+{
+	if (qdisc->flags & TCQ_F_NOLOCK)
+		return !qdisc->not_empty;
+	return !qdisc->q.qlen;
+}
+
+/* for NOLOCK qdisc, the caller must hold the seqlock */
+static inline void qdisc_set_empty(struct Qdisc *qdisc)
+{
+	qdisc->not_empty = false;
+}
+
 static inline bool qdisc_run_begin(struct Qdisc *qdisc)
 {
 	if (qdisc->flags & TCQ_F_NOLOCK) {
 		if (!spin_trylock(&qdisc->seqlock))
 			return false;
+		qdisc->not_empty = true;
 	} else if (qdisc_is_running(qdisc)) {
 		return false;
 	}
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index a117d9260558..3b03c6b9be1e 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -671,6 +671,8 @@ static struct sk_buff *pfifo_fast_dequeue(struct Qdisc *qdisc)
 			qdisc_qstats_cpu_backlog_dec(qdisc, skb);
 			qdisc_bstats_cpu_update(qdisc, skb);
 			qdisc_qstats_atomic_qlen_dec(qdisc);
+		} else {
+			qdisc_set_empty(qdisc);
 		}
 
 	return skb;
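
To illustrate the lifecycle of the new flag outside of a kernel tree, here
is a minimal stand-alone C model. It is not part of the patch: the toy_*
names are invented for illustration, and only the flag handling mirrors
the helpers in the diff above.

#include <stdbool.h>
#include <stdio.h>

struct toy_qdisc {
	int qlen;		/* stand-in for the skb_array backlog */
	bool not_empty;		/* mirrors the new Qdisc::not_empty field */
};

/* mirrors the NOLOCK branch of qdisc_run_begin(): mark the qdisc not empty */
static void toy_run_begin(struct toy_qdisc *q)
{
	q->not_empty = true;
}

/* mirrors qdisc_set_empty(), called only by the dequeue path */
static void toy_set_empty(struct toy_qdisc *q)
{
	q->not_empty = false;
}

/* mirrors pfifo_fast_dequeue(): clear the flag when nothing is queued */
static bool toy_dequeue(struct toy_qdisc *q)
{
	if (q->qlen > 0) {
		q->qlen--;
		return true;
	}
	toy_set_empty(q);
	return false;
}

/* mirrors the NOLOCK branch of qdisc_is_empty() */
static bool toy_is_empty(const struct toy_qdisc *q)
{
	return !q->not_empty;
}

int main(void)
{
	struct toy_qdisc q = { .qlen = 0, .not_empty = false };

	toy_run_begin(&q);	/* xmit path starts a run: flag goes up */
	q.qlen = 2;		/* two packets sit in the backlog */
	while (toy_dequeue(&q))
		;		/* drain; the final, empty dequeue clears the flag */

	printf("empty after drain: %s\n", toy_is_empty(&q) ? "yes" : "no");
	return 0;
}

Because nothing synchronizes the flag with the enqueue side, a concurrent
reader of qdisc_is_empty() can observe a briefly stale value; that is the
looseness the commit message refers to.
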
From patchwork Thu Mar 21 10:14:19 2019
X-Patchwork-Submitter: Paolo Abeni
X-Patchwork-Id: 1059955
X-Patchwork-Delegate: davem@davemloft.net
From: Paolo Abeni
To: netdev@vger.kernel.org
Cc: "David S. Miller", Eric Dumazet, John Fastabend, Ivan Vecera
Subject: [PATCH net-next 2/2] net: dev: introduce support for sch BYPASS for lockless qdisc
Date: Thu, 21 Mar 2019 11:14:19 +0100

With commit c5ad119fb6c0 ("net: sched: pfifo_fast use skb_array")
pfifo_fast no longer benefits from the TCQ_F_CAN_BYPASS optimization.
Due to retpolines the cost of the enqueue()/dequeue() pair has become
relevant, and we observe a measurable regression in the uncontended
scenario when the packet rate is below line rate.

After commit 46b1c18f9deb ("net: sched: put back q.qlen into a single
location") we can check for an empty qdisc with a reasonably fast
operation even for nolock qdiscs.

This change extends TCQ_F_CAN_BYPASS support to nolock qdiscs. The new
chunk of code closely mirrors the existing one for traditional qdiscs,
leveraging the empty-status flag introduced by the previous patch.
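
As an illustration only (not part of the patch), the decision the new
branch makes can be modeled in stand-alone C; every name below is
invented, and the flag semantics are assumed to be those of the previous
patch.

#include <stdbool.h>
#include <stdio.h>

#define TOY_F_CAN_BYPASS	0x1	/* stand-in for TCQ_F_CAN_BYPASS */

struct toy_qdisc {
	unsigned int flags;
	bool not_empty;		/* set by run_begin, cleared by an empty dequeue */
	bool running;		/* models the seqlock being held */
};

static bool toy_run_begin(struct toy_qdisc *q)
{
	if (q->running)
		return false;	/* another CPU is already transmitting */
	q->running = true;
	q->not_empty = true;	/* patch 1: mark not empty while running */
	return true;
}

static void toy_run_end(struct toy_qdisc *q)
{
	q->running = false;
}

/* models the branch this patch adds to __dev_xmit_skb() */
static bool toy_xmit(struct toy_qdisc *q)
{
	if ((q->flags & TOY_F_CAN_BYPASS) && !q->not_empty &&
	    toy_run_begin(q)) {
		/* ... transmit the packet directly to the device here ... */
		q->not_empty = false;	/* the drain that follows finds nothing queued */
		toy_run_end(q);
		return true;		/* bypass path taken */
	}
	return false;			/* fall back to enqueue + qdisc_run */
}

int main(void)
{
	struct toy_qdisc q = { .flags = TOY_F_CAN_BYPASS };

	printf("first packet bypasses: %s\n", toy_xmit(&q) ? "yes" : "no");
	q.not_empty = true;	/* pretend another CPU has queued packets */
	printf("with a backlog, bypasses: %s\n", toy_xmit(&q) ? "yes" : "no");
	return 0;
}

The fast path is attempted only when the qdisc advertises bypass, looks
empty, and the run state can be grabbed; backlogged or contended qdiscs
keep using the ordinary enqueue path, which is why the contended numbers
below are unchanged.
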
Tested with pktgen in queue xmit mode, with pfifo_fast, a MQ device and
a MQ root qdisc:

threads    vanilla    patched
           kpps       kpps
1          2465       2889
2          4304       5188
4          7898       9589

Same as above, but with a single queue device:

threads    vanilla    patched
           kpps       kpps
1          2556       2827
2          2900       2900
4          5000       5000
8          4700       4700

No measurable change in the contended scenarios, and more than a 10%
improvement in the uncontended ones.

Signed-off-by: Paolo Abeni
Tested-by: Ivan Vecera
---
 net/core/dev.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/net/core/dev.c b/net/core/dev.c
index 2b67f2aa59dd..9a6d3617f03a 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3468,6 +3468,15 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
 		if (unlikely(test_bit(__QDISC_STATE_DEACTIVATED, &q->state))) {
 			__qdisc_drop(skb, &to_free);
 			rc = NET_XMIT_DROP;
+		} else if ((q->flags & TCQ_F_CAN_BYPASS) && !q->not_empty &&
+			   qdisc_run_begin(q)) {
+			qdisc_bstats_cpu_update(q, skb);
+
+			if (sch_direct_xmit(skb, q, dev, txq, NULL, true))
+				__qdisc_run(q);
+
+			qdisc_run_end(q);
+			rc = NET_XMIT_SUCCESS;
 		} else {
 			rc = q->enqueue(skb, q, &to_free) & NET_XMIT_MASK;
 			qdisc_run(q);