From patchwork Fri Mar 22 15:01:55 2019
X-Patchwork-Submitter: Paolo Abeni
X-Patchwork-Id: 1061293
X-Patchwork-Delegate: davem@davemloft.net
From: Paolo Abeni
To: netdev@vger.kernel.org
Cc: "David S. Miller", Eric Dumazet, John Fastabend, Ivan Vecera
Subject: [PATCH net-next v3 1/2] net: sched: add empty status flag for NOLOCK qdisc
Date: Fri, 22 Mar 2019 16:01:55 +0100
Message-Id: <5c30962d8213c8483aee810aa447028e56de963e.1553263445.git.pabeni@redhat.com>

The queue is marked not empty after acquiring the seqlock, and it's up
to the NOLOCK qdisc to clear such flag on dequeue. Since the empty
status lies on the same cache line as the seqlock, it is always
cache-hot during the updates.

This makes the empty flag update a bit loose. Given the lack of
synchronization between enqueue and dequeue, this is unavoidable.

v2 -> v3:
 - qdisc_is_empty() has a const argument (Eric)

v1 -> v2:
 - actually use an 'empty' flag instead of 'not_empty', as suggested
   by Eric

Signed-off-by: Paolo Abeni
Reviewed-by: Eric Dumazet
Reviewed-by: Ivan Vecera
---
 include/net/sch_generic.h | 11 +++++++++++
 net/sched/sch_generic.c   |  3 +++
 2 files changed, 14 insertions(+)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 31284c078d06..e227475e78ca 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -113,6 +113,9 @@ struct Qdisc {
 	spinlock_t		busylock ____cacheline_aligned_in_smp;
 	spinlock_t		seqlock;
+
+	/* for NOLOCK qdisc, true if there are no enqueued skbs */
+	bool			empty;
 	struct rcu_head		rcu;
 };
 
@@ -143,11 +146,19 @@ static inline bool qdisc_is_running(struct Qdisc *qdisc)
 	return (raw_read_seqcount(&qdisc->running) & 1) ? true : false;
 }
 
+static inline bool qdisc_is_empty(const struct Qdisc *qdisc)
+{
+	if (qdisc->flags & TCQ_F_NOLOCK)
+		return qdisc->empty;
+	return !qdisc->q.qlen;
+}
+
 static inline bool qdisc_run_begin(struct Qdisc *qdisc)
 {
 	if (qdisc->flags & TCQ_F_NOLOCK) {
 		if (!spin_trylock(&qdisc->seqlock))
 			return false;
+		qdisc->empty = false;
 	} else if (qdisc_is_running(qdisc)) {
 		return false;
 	}
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index a117d9260558..81356ef38d1d 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -671,6 +671,8 @@ static struct sk_buff *pfifo_fast_dequeue(struct Qdisc *qdisc)
 		qdisc_qstats_cpu_backlog_dec(qdisc, skb);
 		qdisc_bstats_cpu_update(qdisc, skb);
 		qdisc_qstats_atomic_qlen_dec(qdisc);
+	} else {
+		qdisc->empty = true;
 	}
 
 	return skb;
@@ -880,6 +882,7 @@ struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue,
 	sch->enqueue = ops->enqueue;
 	sch->dequeue = ops->dequeue;
 	sch->dev_queue = dev_queue;
+	sch->empty = true;
 	dev_hold(dev);
 	refcount_set(&sch->refcnt, 1);
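
As a side note, the pattern patch 1/2 relies on can be sketched outside the
kernel. The following minimal, self-contained user-space program is only an
illustration, not kernel code: every name (toy_qdisc, toy_run_begin,
toy_dequeue) is invented here, a pthread spinlock stands in for the qdisc
seqlock, and the demo is single-threaded, so the trylock always succeeds. It
shows the two halves of the scheme: the flag is cleared to "not empty" only
by whoever wins the trylock, and set back to "empty" only by a dequeue that
finds nothing, which is why the flag may lag behind the real queue state.

#include <stdbool.h>
#include <stdio.h>
#include <pthread.h>

/* Toy stand-in for struct Qdisc: only the fields this pattern needs. */
struct toy_qdisc {
	pthread_spinlock_t seqlock;	/* plays the role of qdisc->seqlock */
	bool empty;			/* plays the role of qdisc->empty   */
	int qlen;			/* enqueued packets (toy counter)   */
};

/* Like qdisc_run_begin() for NOLOCK qdiscs: the queue is marked not empty
 * only after the trylock succeeds, i.e. while the cache line is already hot
 * for the new owner. */
static bool toy_run_begin(struct toy_qdisc *q)
{
	if (pthread_spin_trylock(&q->seqlock))
		return false;	/* someone else is already running the qdisc */
	q->empty = false;
	return true;
}

static void toy_run_end(struct toy_qdisc *q)
{
	pthread_spin_unlock(&q->seqlock);
}

/* Like the pfifo_fast_dequeue() hunk: the flag goes back to "empty" only
 * when a dequeue attempt finds nothing. */
static int toy_dequeue(struct toy_qdisc *q)
{
	if (q->qlen > 0)
		return q->qlen--;	/* "skb" dequeued */
	q->empty = true;		/* queue drained: advertise it, loosely */
	return 0;
}

int main(void)
{
	struct toy_qdisc q = { .empty = true, .qlen = 3 };

	pthread_spin_init(&q.seqlock, PTHREAD_PROCESS_PRIVATE);
	if (toy_run_begin(&q)) {
		while (toy_dequeue(&q))
			;	/* drain */
		toy_run_end(&q);
	}
	printf("empty=%d qlen=%d\n", q.empty, q.qlen);
	pthread_spin_destroy(&q.seqlock);
	return 0;
}

On a typical Linux toolchain this should build with gcc -pthread and print
empty=1 qlen=0 after the drain.
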
From patchwork Fri Mar 22 15:01:56 2019
X-Patchwork-Submitter: Paolo Abeni
X-Patchwork-Id: 1061292
X-Patchwork-Delegate: davem@davemloft.net
From: Paolo Abeni
To: netdev@vger.kernel.org
Cc: "David S. Miller", Eric Dumazet, John Fastabend, Ivan Vecera
Subject: [PATCH net-next v3 2/2] net: dev: introduce support for sch BYPASS for lockless qdisc
Date: Fri, 22 Mar 2019 16:01:56 +0100
Message-Id: <309f53a89c097a186ec37c841bb5d986ecf0ce3e.1553263445.git.pabeni@redhat.com>

With commit c5ad119fb6c0 ("net: sched: pfifo_fast use skb_array")
pfifo_fast no longer benefits from the TCQ_F_CAN_BYPASS optimization.
Due to retpolines the cost of the enqueue()/dequeue() pair has become
relevant, and we observe a measurable regression in the uncontended
scenario when the packet rate is below line rate.

After commit 46b1c18f9deb ("net: sched: put back q.qlen into a single
location") we can check for an empty qdisc with a reasonably fast
operation even for nolock qdiscs.

This change extends TCQ_F_CAN_BYPASS support to nolock qdiscs. The new
chunk of code closely mirrors the existing one for traditional qdiscs,
leveraging a newly introduced helper to read the qdisc length
atomically.

Tested with pktgen in queue xmit mode, with pfifo_fast, an MQ device
and an MQ root qdisc:

threads		vanilla		patched
		kpps		kpps
1		2465		2889
2		4304		5188
4		7898		9589

Same as above, but with a single queue device:

threads		vanilla		patched
		kpps		kpps
1		2556		2827
2		2900		2900
4		5000		5000
8		4700		4700

No measurable changes in the contended scenarios, and more than 10%
improvement in the uncontended ones.

v1 -> v2:
 - rebased after flag name change

Signed-off-by: Paolo Abeni
Tested-by: Ivan Vecera
Reviewed-by: Eric Dumazet
Reviewed-by: Ivan Vecera
---
 net/core/dev.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/net/core/dev.c b/net/core/dev.c
index 2b67f2aa59dd..e6cf39a9d3c5 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3468,6 +3468,15 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
 		if (unlikely(test_bit(__QDISC_STATE_DEACTIVATED, &q->state))) {
 			__qdisc_drop(skb, &to_free);
 			rc = NET_XMIT_DROP;
+		} else if ((q->flags & TCQ_F_CAN_BYPASS) && q->empty &&
+			   qdisc_run_begin(q)) {
+			qdisc_bstats_cpu_update(q, skb);
+
+			if (sch_direct_xmit(skb, q, dev, txq, NULL, true))
+				__qdisc_run(q);
+
+			qdisc_run_end(q);
+			rc = NET_XMIT_SUCCESS;
 		} else {
 			rc = q->enqueue(skb, q, &to_free) & NET_XMIT_MASK;
 			qdisc_run(q);
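
Similarly, the shape of the new bypass branch in __dev_xmit_skb() can be
illustrated with a small stand-alone sketch. Again every name
(TOY_F_CAN_BYPASS, toy_xmit, toy_direct_xmit, ...) is invented and the
seqlock is reduced to a plain boolean, so this is only an approximation of
the control flow, not the kernel implementation: the packet skips the
enqueue()/dequeue() pair only when the qdisc advertises bypass, currently
looks empty, and the caller manages to become the runner.

#include <stdbool.h>
#include <stdio.h>

#define TOY_F_CAN_BYPASS 0x1

struct toy_qdisc {
	unsigned int flags;
	bool empty;	/* maintained as in patch 1/2 */
	bool running;	/* stands in for the seqlock/trylock */
	int qlen;
};

/* like qdisc_run_begin(): become the runner and mark the queue busy */
static bool toy_run_begin(struct toy_qdisc *q)
{
	if (q->running)
		return false;
	q->running = true;
	q->empty = false;
	return true;
}

static void toy_run_end(struct toy_qdisc *q)
{
	q->running = false;
}

static void toy_direct_xmit(int pkt)
{
	printf("pkt %d: transmitted directly, queue bypassed\n", pkt);
}

static void toy_enqueue(struct toy_qdisc *q, int pkt)
{
	q->qlen++;
	printf("pkt %d: enqueued, qlen=%d\n", pkt, q->qlen);
}

/* mirrors the shape of the new branch in __dev_xmit_skb() */
static void toy_xmit(struct toy_qdisc *q, int pkt)
{
	if ((q->flags & TOY_F_CAN_BYPASS) && q->empty && toy_run_begin(q)) {
		toy_direct_xmit(pkt);
		/* the real code also runs the qdisc here (__qdisc_run()),
		 * in case other packets slipped in meanwhile */
		toy_run_end(q);
	} else {
		toy_enqueue(q, pkt);
		/* ...followed by the usual qdisc_run() processing */
	}
}

int main(void)
{
	struct toy_qdisc q = { .flags = TOY_F_CAN_BYPASS, .empty = true };

	toy_xmit(&q, 1);	/* empty and uncontended: bypass */
	toy_xmit(&q, 2);	/* empty flag is now false: normal path */
	return 0;
}

The second toy_xmit() call falls back to the enqueue path because, without a
dequeue that finds the queue drained, the empty flag stays false; in the
kernel, the __qdisc_run() call after the direct transmission is what restores
it.
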