From patchwork Thu Dec 7 17:57:39 2017
X-Patchwork-Submitter: John Fastabend
X-Patchwork-Id: 845741
X-Patchwork-Delegate: davem@davemloft.net
Subject: [net-next PATCH 12/14] net: sched: add support for TCQ_F_NOLOCK
 subqueues to sch_mqprio
From: John Fastabend
To: willemdebruijn.kernel@gmail.com, daniel@iogearbox.net,
 eric.dumazet@gmail.com, davem@davemloft.net
Cc: netdev@vger.kernel.org, jiri@resnulli.us, xiyou.wangcong@gmail.com
Date: Thu, 07 Dec 2017 09:57:39 -0800
Message-ID: <20171207175739.5771.12336.stgit@john-Precision-Tower-5810>
In-Reply-To: <20171207173500.5771.41198.stgit@john-Precision-Tower-5810>
References: <20171207173500.5771.41198.stgit@john-Precision-Tower-5810>
User-Agent: StGit/0.17.1-dirty

The sch_mqprio qdisc creates a sub-qdisc per tx queue; these sub-qdiscs are
then called independently for enqueue and dequeue operations. However,
statistics are aggregated and pushed up to the "master" qdisc.

This patch adds support for any of the sub-qdiscs to be per-cpu statistics
qdiscs. To handle this case, add a check when calculating stats and
aggregate the per-cpu stats if needed.

Signed-off-by: John Fastabend
---
 net/sched/sch_mq.c     | 35 +++++++++++++++---------
 net/sched/sch_mqprio.c | 69 +++++++++++++++++++++++++++++++++---------------
 2 files changed, 69 insertions(+), 35 deletions(-)

diff --git a/net/sched/sch_mq.c b/net/sched/sch_mq.c
index bc59f05..8cbb5c8 100644
--- a/net/sched/sch_mq.c
+++ b/net/sched/sch_mq.c
@@ -98,33 +98,42 @@ static int mq_dump(struct Qdisc *sch, struct sk_buff *skb)
 	struct net_device *dev = qdisc_dev(sch);
 	struct Qdisc *qdisc;
 	unsigned int ntx;
+	__u32 qlen = 0;
 
 	sch->q.qlen = 0;
 	memset(&sch->bstats, 0, sizeof(sch->bstats));
 	memset(&sch->qstats, 0, sizeof(sch->qstats));
 
+	/* MQ supports lockless qdiscs. However, statistics accounting needs
+	 * to account for all, none, or a mix of locked and unlocked child
+	 * qdiscs. Percpu stats are added to counters in-band and locking
+	 * qdisc totals are added at end.
+	 */
 	for (ntx = 0; ntx < dev->num_tx_queues; ntx++) {
-		struct gnet_stats_basic_cpu __percpu *cpu_bstats = NULL;
-		struct gnet_stats_queue __percpu *cpu_qstats = NULL;
-		__u32 qlen = 0;
-
 		qdisc = netdev_get_tx_queue(dev, ntx)->qdisc_sleeping;
 		spin_lock_bh(qdisc_lock(qdisc));
 
 		if (qdisc_is_percpu_stats(qdisc)) {
-			cpu_bstats = qdisc->cpu_bstats;
-			cpu_qstats = qdisc->cpu_qstats;
+			qlen = qdisc_qlen_sum(qdisc);
+			__gnet_stats_copy_basic(NULL, &sch->bstats,
+						qdisc->cpu_bstats,
+						&qdisc->bstats);
+			__gnet_stats_copy_queue(&sch->qstats,
+						qdisc->cpu_qstats,
+						&qdisc->qstats, qlen);
+		} else {
+			sch->q.qlen += qdisc->q.qlen;
+			sch->bstats.bytes += qdisc->bstats.bytes;
+			sch->bstats.packets += qdisc->bstats.packets;
+			sch->qstats.backlog += qdisc->qstats.backlog;
+			sch->qstats.drops += qdisc->qstats.drops;
+			sch->qstats.requeues += qdisc->qstats.requeues;
+			sch->qstats.overlimits += qdisc->qstats.overlimits;
 		}
-		qlen = qdisc_qlen_sum(qdisc);
-
-		__gnet_stats_copy_basic(NULL, &sch->bstats,
-					cpu_bstats, &qdisc->bstats);
-		__gnet_stats_copy_queue(&sch->qstats,
-					cpu_qstats, &qdisc->qstats, qlen);
-
 		spin_unlock_bh(qdisc_lock(qdisc));
 	}
+
 	return 0;
 }
 
diff --git a/net/sched/sch_mqprio.c b/net/sched/sch_mqprio.c
index b85885a9..8622745 100644
--- a/net/sched/sch_mqprio.c
+++ b/net/sched/sch_mqprio.c
@@ -388,22 +388,40 @@ static int mqprio_dump(struct Qdisc *sch, struct sk_buff *skb)
 	struct nlattr *nla = (struct nlattr *)skb_tail_pointer(skb);
 	struct tc_mqprio_qopt opt = { 0 };
 	struct Qdisc *qdisc;
-	unsigned int i;
+	unsigned int ntx, tc;
 
 	sch->q.qlen = 0;
 	memset(&sch->bstats, 0, sizeof(sch->bstats));
 	memset(&sch->qstats, 0, sizeof(sch->qstats));
 
-	for (i = 0; i < dev->num_tx_queues; i++) {
-		qdisc = rtnl_dereference(netdev_get_tx_queue(dev, i)->qdisc);
+	/* MQ supports lockless qdiscs. However, statistics accounting needs
+	 * to account for all, none, or a mix of locked and unlocked child
+	 * qdiscs. Percpu stats are added to counters in-band and locking
+	 * qdisc totals are added at end.
+	 */
+	for (ntx = 0; ntx < dev->num_tx_queues; ntx++) {
+		qdisc = netdev_get_tx_queue(dev, ntx)->qdisc_sleeping;
 		spin_lock_bh(qdisc_lock(qdisc));
-		sch->q.qlen += qdisc->q.qlen;
-		sch->bstats.bytes += qdisc->bstats.bytes;
-		sch->bstats.packets += qdisc->bstats.packets;
-		sch->qstats.backlog += qdisc->qstats.backlog;
-		sch->qstats.drops += qdisc->qstats.drops;
-		sch->qstats.requeues += qdisc->qstats.requeues;
-		sch->qstats.overlimits += qdisc->qstats.overlimits;
+
+		if (qdisc_is_percpu_stats(qdisc)) {
+			__u32 qlen = qdisc_qlen_sum(qdisc);
+
+			__gnet_stats_copy_basic(NULL, &sch->bstats,
+						qdisc->cpu_bstats,
+						&qdisc->bstats);
+			__gnet_stats_copy_queue(&sch->qstats,
+						qdisc->cpu_qstats,
+						&qdisc->qstats, qlen);
+		} else {
+			sch->q.qlen += qdisc->q.qlen;
+			sch->bstats.bytes += qdisc->bstats.bytes;
+			sch->bstats.packets += qdisc->bstats.packets;
+			sch->qstats.backlog += qdisc->qstats.backlog;
+			sch->qstats.drops += qdisc->qstats.drops;
+			sch->qstats.requeues += qdisc->qstats.requeues;
+			sch->qstats.overlimits += qdisc->qstats.overlimits;
+		}
+
 		spin_unlock_bh(qdisc_lock(qdisc));
 	}
 
@@ -411,9 +429,9 @@ static int mqprio_dump(struct Qdisc *sch, struct sk_buff *skb)
 	memcpy(opt.prio_tc_map, dev->prio_tc_map, sizeof(opt.prio_tc_map));
 	opt.hw = priv->hw_offload;
 
-	for (i = 0; i < netdev_get_num_tc(dev); i++) {
-		opt.count[i] = dev->tc_to_txq[i].count;
-		opt.offset[i] = dev->tc_to_txq[i].offset;
+	for (tc = 0; tc < netdev_get_num_tc(dev); tc++) {
+		opt.count[tc] = dev->tc_to_txq[tc].count;
+		opt.offset[tc] = dev->tc_to_txq[tc].offset;
 	}
 
 	if (nla_put(skb, TCA_OPTIONS, NLA_ALIGN(sizeof(opt)), &opt))
@@ -495,7 +513,6 @@ static int mqprio_dump_class_stats(struct Qdisc *sch, unsigned long cl,
 	if (cl >= TC_H_MIN_PRIORITY) {
 		int i;
 		__u32 qlen = 0;
-		struct Qdisc *qdisc;
 		struct gnet_stats_queue qstats = {0};
 		struct gnet_stats_basic_packed bstats = {0};
 		struct net_device *dev = qdisc_dev(sch);
@@ -511,18 +528,26 @@ static int mqprio_dump_class_stats(struct Qdisc *sch, unsigned long cl,
 		for (i = tc.offset; i < tc.offset + tc.count; i++) {
 			struct netdev_queue *q = netdev_get_tx_queue(dev, i);
+			struct Qdisc *qdisc = rtnl_dereference(q->qdisc);
+			struct gnet_stats_basic_cpu __percpu *cpu_bstats = NULL;
+			struct gnet_stats_queue __percpu *cpu_qstats = NULL;
 
-			qdisc = rtnl_dereference(q->qdisc);
 			spin_lock_bh(qdisc_lock(qdisc));
-			qlen += qdisc->q.qlen;
-			bstats.bytes += qdisc->bstats.bytes;
-			bstats.packets += qdisc->bstats.packets;
-			qstats.backlog += qdisc->qstats.backlog;
-			qstats.drops += qdisc->qstats.drops;
-			qstats.requeues += qdisc->qstats.requeues;
-			qstats.overlimits += qdisc->qstats.overlimits;
+			if (qdisc_is_percpu_stats(qdisc)) {
+				cpu_bstats = qdisc->cpu_bstats;
+				cpu_qstats = qdisc->cpu_qstats;
+			}
+
+			qlen = qdisc_qlen_sum(qdisc);
+			__gnet_stats_copy_basic(NULL, &sch->bstats,
+						cpu_bstats, &qdisc->bstats);
+			__gnet_stats_copy_queue(&sch->qstats,
+						cpu_qstats,
+						&qdisc->qstats,
+						qlen);
 			spin_unlock_bh(qdisc_lock(qdisc));
 		}
+		/* Reclaim root sleeping lock before completing stats */
 		if (d->lock)
 			spin_lock_bh(d->lock);
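
The aggregation pattern used throughout the diff is simple: a child that
keeps per-cpu counters has those counters summed into the parent totals,
while a child with a single locked counter set is added directly. Below is
a minimal stand-alone C sketch of that same idea, for illustration only; it
is not kernel code and every name in it is made up.

/*
 * Illustrative sketch only -- not part of the patch above. It mimics a
 * parent summing stats from children that use either one locked counter
 * set or one counter set per CPU.
 */
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

#define NR_CPUS 4

struct stats {
	uint64_t bytes;
	uint64_t packets;
};

struct child {
	bool percpu;			/* does this child keep per-CPU stats? */
	struct stats locked;		/* used when percpu == false */
	struct stats cpu[NR_CPUS];	/* used when percpu == true */
};

/* Sum one child's counters into the parent, handling both layouts. */
static void aggregate(struct stats *parent, const struct child *c)
{
	if (c->percpu) {
		for (int cpu = 0; cpu < NR_CPUS; cpu++) {
			parent->bytes   += c->cpu[cpu].bytes;
			parent->packets += c->cpu[cpu].packets;
		}
	} else {
		parent->bytes   += c->locked.bytes;
		parent->packets += c->locked.packets;
	}
}

int main(void)
{
	struct child kids[2] = {
		{ .percpu = false, .locked = { .bytes = 1500, .packets = 1 } },
		{ .percpu = true,  .cpu = { { 100, 1 }, { 200, 2 }, { 0, 0 }, { 300, 3 } } },
	};
	struct stats total = { 0, 0 };

	for (int i = 0; i < 2; i++)
		aggregate(&total, &kids[i]);

	/* Prints bytes=2100 packets=7 for the sample data above. */
	printf("bytes=%llu packets=%llu\n",
	       (unsigned long long)total.bytes,
	       (unsigned long long)total.packets);
	return 0;
}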