From patchwork Thu Dec 7 17:58:19 2017
X-Patchwork-Submitter: John Fastabend
X-Patchwork-Id: 845743
X-Patchwork-Delegate: davem@davemloft.net
Subject: [net-next PATCH 14/14] net: sched: pfifo_fast use skb_array
From: John Fastabend
To: willemdebruijn.kernel@gmail.com, daniel@iogearbox.net,
 eric.dumazet@gmail.com, davem@davemloft.net
Cc: netdev@vger.kernel.org, jiri@resnulli.us, xiyou.wangcong@gmail.com
Date: Thu, 07 Dec 2017 09:58:19 -0800
Message-ID: <20171207175819.5771.31262.stgit@john-Precision-Tower-5810>
In-Reply-To: <20171207173500.5771.41198.stgit@john-Precision-Tower-5810>
References: <20171207173500.5771.41198.stgit@john-Precision-Tower-5810>
User-Agent: StGit/0.17.1-dirty

This converts the pfifo_fast qdisc to use the skb_array data structure
and sets the lockless qdisc bit. pfifo_fast is the first qdisc to
support the lockless bit that can also be a child of a qdisc requiring
locking, so add logic to clear the lock bit when the qdisc graft
operation attaches it under a parent that requires locking.

This also removes the bitmap logic used to pick the next band to
dequeue from and instead just checks each per-priority ring for
packets, from the top priority band down to the lowest (a small
standalone model of this band scan is included after the diff). This
might need to be a bit more clever but seems to work for now.

Signed-off-by: John Fastabend
---
 net/sched/sch_api.c     |    5 ++
 net/sched/sch_generic.c |  140 +++++++++++++++++++++++++++++------------------
 2 files changed, 92 insertions(+), 53 deletions(-)

diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 5cb64d2..81f61f5 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -955,6 +955,11 @@ static int qdisc_graft(struct net_device *dev, struct Qdisc *parent,
 	} else {
 		const struct Qdisc_class_ops *cops = parent->ops->cl_ops;
 
+		/* Only support running class lockless if parent is lockless */
+		if (new && (new->flags & TCQ_F_NOLOCK) &&
+		    parent && !(parent->flags & TCQ_F_NOLOCK))
+			new->flags &= ~TCQ_F_NOLOCK;
+
 		err = -EOPNOTSUPP;
 		if (cops && cops->graft) {
 			unsigned long cl = cops->find(parent, classid);
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 5ff93c2..ff6a5ac 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -26,6 +26,7 @@
 #include
 #include
 #include
+#include <linux/skb_array.h>
 #include
 #include
 #include
@@ -578,93 +579,93 @@ struct Qdisc_ops noqueue_qdisc_ops __read_mostly = {
 
 /*
  * Private data for a pfifo_fast scheduler containing:
- *	- queues for the three band
- *	- bitmap indicating which of the bands contain skbs
+ *	- rings for priority bands
  */
 struct pfifo_fast_priv {
-	u32 bitmap;
-	struct qdisc_skb_head q[PFIFO_FAST_BANDS];
+	struct skb_array q[PFIFO_FAST_BANDS];
 };
 
-/*
- * Convert a bitmap to the first band number where an skb is queued, where:
- *	bitmap=0 means there are no skbs on any band.
- *	bitmap=1 means there is an skb on band 0.
- *	bitmap=7 means there are skbs on all 3 bands, etc.
- */
-static const int bitmap2band[] = {-1, 0, 1, 0, 2, 0, 1, 0};
-
-static inline struct qdisc_skb_head *band2list(struct pfifo_fast_priv *priv,
-					       int band)
+static inline struct skb_array *band2list(struct pfifo_fast_priv *priv,
+					  int band)
 {
-	return priv->q + band;
+	return &priv->q[band];
 }
 
 static int pfifo_fast_enqueue(struct sk_buff *skb, struct Qdisc *qdisc,
 			      struct sk_buff **to_free)
 {
-	if (qdisc->q.qlen < qdisc_dev(qdisc)->tx_queue_len) {
-		int band = prio2band[skb->priority & TC_PRIO_MAX];
-		struct pfifo_fast_priv *priv = qdisc_priv(qdisc);
-		struct qdisc_skb_head *list = band2list(priv, band);
-
-		priv->bitmap |= (1 << band);
-		qdisc->q.qlen++;
-		return __qdisc_enqueue_tail(skb, qdisc, list);
-	}
+	int band = prio2band[skb->priority & TC_PRIO_MAX];
+	struct pfifo_fast_priv *priv = qdisc_priv(qdisc);
+	struct skb_array *q = band2list(priv, band);
+	int err;
 
-	return qdisc_drop(skb, qdisc, to_free);
+	err = skb_array_produce(q, skb);
+
+	if (unlikely(err))
+		return qdisc_drop_cpu(skb, qdisc, to_free);
+
+	qdisc_qstats_cpu_qlen_inc(qdisc);
+	qdisc_qstats_cpu_backlog_inc(qdisc, skb);
+	return NET_XMIT_SUCCESS;
 }
 
 static struct sk_buff *pfifo_fast_dequeue(struct Qdisc *qdisc)
 {
 	struct pfifo_fast_priv *priv = qdisc_priv(qdisc);
-	int band = bitmap2band[priv->bitmap];
-
-	if (likely(band >= 0)) {
-		struct qdisc_skb_head *qh = band2list(priv, band);
-		struct sk_buff *skb = __qdisc_dequeue_head(qh);
+	struct sk_buff *skb = NULL;
+	int band;
 
-		if (likely(skb != NULL)) {
-			qdisc_qstats_backlog_dec(qdisc, skb);
-			qdisc_bstats_update(qdisc, skb);
-		}
+	for (band = 0; band < PFIFO_FAST_BANDS && !skb; band++) {
+		struct skb_array *q = band2list(priv, band);
 
-		qdisc->q.qlen--;
-		if (qh->qlen == 0)
-			priv->bitmap &= ~(1 << band);
+		if (__skb_array_empty(q))
+			continue;
 
-		return skb;
+		skb = skb_array_consume_bh(q);
+	}
+	if (likely(skb)) {
+		qdisc_qstats_cpu_backlog_dec(qdisc, skb);
+		qdisc_bstats_cpu_update(qdisc, skb);
+		qdisc_qstats_cpu_qlen_dec(qdisc);
 	}
 
-	return NULL;
+	return skb;
 }
 
 static struct sk_buff *pfifo_fast_peek(struct Qdisc *qdisc)
 {
 	struct pfifo_fast_priv *priv = qdisc_priv(qdisc);
-	int band = bitmap2band[priv->bitmap];
+	struct sk_buff *skb = NULL;
+	int band;
 
-	if (band >= 0) {
-		struct qdisc_skb_head *qh = band2list(priv, band);
+	for (band = 0; band < PFIFO_FAST_BANDS && !skb; band++) {
+		struct skb_array *q = band2list(priv, band);
 
-		return qh->head;
+		skb = __skb_array_peek(q);
 	}
 
-	return NULL;
+	return skb;
 }
 
 static void pfifo_fast_reset(struct Qdisc *qdisc)
 {
-	int prio;
+	int i, band;
 	struct pfifo_fast_priv *priv = qdisc_priv(qdisc);
 
-	for (prio = 0; prio < PFIFO_FAST_BANDS; prio++)
-		__qdisc_reset_queue(band2list(priv, prio));
+	for (band = 0; band < PFIFO_FAST_BANDS; band++) {
+		struct skb_array *q = band2list(priv, band);
+		struct sk_buff *skb;
 
-	priv->bitmap = 0;
-	qdisc->qstats.backlog = 0;
-	qdisc->q.qlen = 0;
+		while ((skb = skb_array_consume_bh(q)) != NULL)
+			kfree_skb(skb);
+	}
+
+	for_each_possible_cpu(i) {
+		struct gnet_stats_queue *q = per_cpu_ptr(qdisc->cpu_qstats, i);
+
+		q->backlog = 0;
+		q->qlen = 0;
+	}
 }
 
 static int pfifo_fast_dump(struct Qdisc *qdisc, struct sk_buff *skb)
@@ -682,17 +683,48 @@ static int pfifo_fast_dump(struct Qdisc *qdisc, struct sk_buff *skb)
 
 static int pfifo_fast_init(struct Qdisc *qdisc, struct nlattr *opt)
 {
-	int prio;
+	unsigned int qlen = qdisc_dev(qdisc)->tx_queue_len;
 	struct pfifo_fast_priv *priv = qdisc_priv(qdisc);
+	int prio;
+
+	/* guard against zero length rings */
+	if (!qlen)
+		return -EINVAL;
 
-	for (prio = 0; prio < PFIFO_FAST_BANDS; prio++)
-		qdisc_skb_head_init(band2list(priv, prio));
+	for (prio = 0; prio < PFIFO_FAST_BANDS; prio++) {
+		struct skb_array *q = band2list(priv, prio);
+		int err;
+
+		err = skb_array_init(q, qlen, GFP_KERNEL);
+		if (err)
+			return -ENOMEM;
+	}
 
 	/* Can by-pass the queue discipline */
 	qdisc->flags |= TCQ_F_CAN_BYPASS;
 	return 0;
 }
 
+static void pfifo_fast_destroy(struct Qdisc *sch)
+{
+	struct pfifo_fast_priv *priv = qdisc_priv(sch);
+	int prio;
+
+	for (prio = 0; prio < PFIFO_FAST_BANDS; prio++) {
+		struct skb_array *q = band2list(priv, prio);
+
+		/* NULL ring is possible if destroy path is due to a failed
+		 * skb_array_init() in pfifo_fast_init() case.
+		 */
+		if (!q->ring.queue)
+			continue;
+		/* Destroy ring but no need to kfree_skb because a call to
+		 * pfifo_fast_reset() has already done that work.
+		 */
+		ptr_ring_cleanup(&q->ring, NULL);
+	}
+}
+
 struct Qdisc_ops pfifo_fast_ops __read_mostly = {
 	.id		=	"pfifo_fast",
 	.priv_size	=	sizeof(struct pfifo_fast_priv),
@@ -700,9 +732,11 @@ struct Qdisc_ops pfifo_fast_ops __read_mostly = {
 	.dequeue	=	pfifo_fast_dequeue,
 	.peek		=	pfifo_fast_peek,
 	.init		=	pfifo_fast_init,
+	.destroy	=	pfifo_fast_destroy,
 	.reset		=	pfifo_fast_reset,
 	.dump		=	pfifo_fast_dump,
 	.owner		=	THIS_MODULE,
+	.static_flags	=	TCQ_F_NOLOCK | TCQ_F_CPUSTATS,
 };
 EXPORT_SYMBOL(pfifo_fast_ops);
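
The band-scan order described in the commit message can be seen in isolation
with the small standalone userspace model below. It is not part of the patch:
plain bounded arrays stand in for skb_array, an int stands in for an skb, and
ring_produce()/ring_consume() are helpers invented for the sketch, not kernel
APIs. It builds with any C compiler and prints 100, 200, -1, showing that band
0 is always drained before band 2 regardless of arrival order, which is the
behavior the new pfifo_fast_dequeue() implements.

/*
 * Standalone model of the pfifo_fast band scan.  One FIFO ring per band,
 * scanned from band 0 (highest priority) to band 2 (lowest).
 */
#include <stdio.h>

#define NUM_BANDS 3	/* stands in for PFIFO_FAST_BANDS */
#define RING_SIZE 8

struct ring {
	int items[RING_SIZE];
	int head;	/* consume side */
	int tail;	/* produce side */
};

static struct ring bands[NUM_BANDS];

/* Enqueue one item; returns -1 when the ring is full (the real qdisc
 * drops the skb via qdisc_drop_cpu() in that case).
 */
static int ring_produce(struct ring *r, int item)
{
	int next = (r->tail + 1) % RING_SIZE;

	if (next == r->head)
		return -1;
	r->items[r->tail] = item;
	r->tail = next;
	return 0;
}

/* Dequeue one item; returns -1 when the ring is empty. */
static int ring_consume(struct ring *r, int *item)
{
	if (r->head == r->tail)
		return -1;
	*item = r->items[r->head];
	r->head = (r->head + 1) % RING_SIZE;
	return 0;
}

/* Mirrors the new dequeue loop: walk the bands in priority order and
 * return the first item found, or -1 if every band is empty.
 */
static int dequeue(void)
{
	int band, item;

	for (band = 0; band < NUM_BANDS; band++)
		if (!ring_consume(&bands[band], &item))
			return item;
	return -1;
}

int main(void)
{
	ring_produce(&bands[2], 200);	/* low-priority item arrives first */
	ring_produce(&bands[0], 100);	/* high-priority item arrives later */

	printf("%d\n", dequeue());	/* 100: band 0 drained first */
	printf("%d\n", dequeue());	/* 200 */
	printf("%d\n", dequeue());	/* -1: all bands empty */
	return 0;
}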