Patch Detail
get:
Show a patch.
patch:
Update a patch.
put:
Update a patch.
GET /api/patches/761/?format=api
{ "id": 761, "url": "http://patchwork.ozlabs.org/api/patches/761/?format=api", "web_url": "http://patchwork.ozlabs.org/project/netdev/patch/20080920234843.GA2531@ami.dom.local/", "project": { "id": 7, "url": "http://patchwork.ozlabs.org/api/projects/7/?format=api", "name": "Linux network development", "link_name": "netdev", "list_id": "netdev.vger.kernel.org", "list_email": "netdev@vger.kernel.org", "web_url": null, "scm_url": null, "webscm_url": null, "list_archive_url": "", "list_archive_url_format": "", "commit_url_format": "" }, "msgid": "<20080920234843.GA2531@ami.dom.local>", "list_archive_url": null, "date": "2008-09-20T23:48:43", "name": "[take,2] pkt_sched: Fix qdisc_watchdog() vs. dev_deactivate() race", "commit_ref": null, "pull_url": null, "state": "rejected", "archived": true, "hash": "2df9c30891a24b230453a2a20617b9487f346ec1", "submitter": { "id": 277, "url": "http://patchwork.ozlabs.org/api/people/277/?format=api", "name": "Jarek Poplawski", "email": "jarkao2@gmail.com" }, "delegate": { "id": 34, "url": "http://patchwork.ozlabs.org/api/users/34/?format=api", "username": "davem", "first_name": "David", "last_name": "Miller", "email": "davem@davemloft.net" }, "mbox": "http://patchwork.ozlabs.org/project/netdev/patch/20080920234843.GA2531@ami.dom.local/mbox/", "series": [], "comments": "http://patchwork.ozlabs.org/api/patches/761/comments/", "check": "pending", "checks": "http://patchwork.ozlabs.org/api/patches/761/checks/", "tags": {}, "related": [], "headers": { "Return-Path": "<netdev-owner@vger.kernel.org>", "X-Original-To": "patchwork-incoming@ozlabs.org", "Delivered-To": "patchwork-incoming@ozlabs.org", "Received": [ "from vger.kernel.org (vger.kernel.org [209.132.176.167])\n\tby ozlabs.org (Postfix) with ESMTP id 136BEDDE31\n\tfor <patchwork-incoming@ozlabs.org>;\n\tSun, 21 Sep 2008 09:46:02 +1000 (EST)", "(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1751665AbYITXp4 (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tSat, 20 Sep 2008 19:45:56 -0400", "(majordomo@vger.kernel.org) by vger.kernel.org id S1751675AbYITXp4\n\t(ORCPT <rfc822; netdev-outgoing>); Sat, 20 Sep 2008 19:45:56 -0400", "from mu-out-0910.google.com ([209.85.134.189]:56270 \"EHLO\n\tmu-out-0910.google.com\" rhost-flags-OK-OK-OK-OK) by vger.kernel.org\n\twith ESMTP id S1751539AbYITXpz (ORCPT\n\t<rfc822;netdev@vger.kernel.org>); Sat, 20 Sep 2008 19:45:55 -0400", "by mu-out-0910.google.com with SMTP id g7so832855muf.1\n\tfor <netdev@vger.kernel.org>; Sat, 20 Sep 2008 16:45:53 -0700 (PDT)", "by 10.103.243.7 with SMTP id v7mr1479683mur.24.1221954353670;\n\tSat, 20 Sep 2008 16:45:53 -0700 (PDT)", "from ami.dom.local ( [83.27.54.207])\n\tby mx.google.com with ESMTPS id\n\ts10sm6279378muh.12.2008.09.20.16.45.51\n\t(version=SSLv3 cipher=RC4-MD5);\n\tSat, 20 Sep 2008 16:45:52 -0700 (PDT)" ], "DKIM-Signature": "v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;\n\th=domainkey-signature:received:received:date:from:to:cc:subject\n\t:message-id:references:mime-version:content-type:content-disposition\n\t:in-reply-to:user-agent;\n\tbh=8qnvR7c3afAX5bAisd5CdNAwAxFBHdZjoLyJJpP6194=;\n\tb=trS72smMIlNwmtmOEc6XC8u5TNpfS9wftvByinMPkbYj1wDhGM9O65s123QFUq4vVG\n\tkjO/zSpTHps1SMiHN3NBkoqbE8L8lnC1czrYG9QVXWMx/rz+Lhwx8ixHtr6uCzBaV7jF\n\tqJRjRNvT2ohD5qh0FC4OEHst5pNv1KITEgWpA=", "DomainKey-Signature": "a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;\n\th=date:from:to:cc:subject:message-id:references:mime-version\n\t:content-type:content-disposition:in-reply-to:user-agent;\n\tb=CsqjESlFWrGEmEpJLCciVC5B2zw1MIVSbhGQWghHlx7/lf5+P2jpj7VhqZFrrdVBhY\n\tqfbzLpgwrnZjOl53kKQskfVlM7SV96CwPAzD+0jZq2qpQ5+6ipeCUMmct/YZWiUxJgnq\n\tyUrJCEGidPci0lk0iuadYWAxHt27oTwc8YnPY=", "Date": "Sun, 21 Sep 2008 01:48:43 +0200", "From": "Jarek Poplawski <jarkao2@gmail.com>", "To": "David Miller <davem@davemloft.net>", "Cc": "herbert@gondor.apana.org.au, netdev@vger.kernel.org,\n\tkaber@trash.net", "Subject": "Re: [PATCH take 2] pkt_sched: Fix qdisc_watchdog() vs.\n\tdev_deactivate() race", "Message-ID": "<20080920234843.GA2531@ami.dom.local>", "References": "<20080913205408.GA2545@ami.dom.local>\n\t<20080914061610.GA20571@gondor.apana.org.au>\n\t<20080914202715.GA2540@ami.dom.local>\n\t<20080920.002137.108837580.davem@davemloft.net>", "MIME-Version": "1.0", "Content-Type": "text/plain; charset=us-ascii", "Content-Disposition": "inline", "In-Reply-To": "<20080920.002137.108837580.davem@davemloft.net>", "User-Agent": "Mutt/1.5.18 (2008-05-17)", "Sender": "netdev-owner@vger.kernel.org", "Precedence": "bulk", "List-ID": "<netdev.vger.kernel.org>", "X-Mailing-List": "netdev@vger.kernel.org" }, "content": "On Sat, Sep 20, 2008 at 12:21:37AM -0700, David Miller wrote:\n...\n> Let's look at what actually matters for cpu utilization. These\n> __qdisc_run() things are invoked in two situations where we might\n> block on the hw queue being stopped:\n> \n> 1) When feeding packets into the qdisc in dev_queue_xmit().\n> \n> Guess what? We _know_ the queue this packet is going to\n> hit.\n> \n> The only new thing we can possible trigger and be interested\n> in at this specific point is if _this_ packet can be sent at\n> this time.\n> \n> And we can check that queue mapping after the qdisc_enqueue_root()\n> call, so that multiq aware qdiscs can have made their changes.\n> \n> 2) When waking up a queue. And here we should schedule the qdisc_run\n> _unconditionally_.\n> \n> If the queue was full, it is extremely likely that new packets\n> are bound for that device queue. There is no real savings to\n> be had by doing this peek/requeue/dequeue stuff.\n> \n> The cpu utilization savings exist for case #1 only, and we can\n> implement the bypass logic _perfectly_ as described above.\n> \n> For #2 there is nothing to check, just do it and see what comes\n> out of the qdisc.\n\nRight, unless __netif_schedule() wasn't done when waking up. I've\nthought about this because of another thread/patch around this\nproblem, and got misled by dev_requeue_skb() scheduling. Now, I think\nthis could be the main reason for this high load. Anyway, if we want\nto skip this check for #2 I think something like the patch below is\nneeded.\n\n> I would suggest adding an skb pointer argument to qdisc_run().\n> If it's NULL, unconditionally schedule __qdisc_run(). Else,\n> only schedule if the TX queue indicated by skb_queue_mapping()\n> is not stopped.\n> \n> dev_queue_xmit() will use the \"pass the skb\" case, but only if\n> qdisc_enqueue_root()'s return value doesn't indicate that there\n> is a potential drop. On potential drop, we'll pass NULL to\n> make sure we don't potentially reference a free'd SKB.\n> \n> The other case in net_tx_action() can always pass NULL to qdisc_run().\n\nI'm not convinced this #1 is useful for us: this could be an skb #1000\nin a queue; the tx status could change many times before this packet\nwould be #1; why worry? This adds additional checks on the fast path\nfor something which is unlikely even if this skb would be #1, but for\nany later skbs it's only a guess. IMHO, if we can't check for the next\nskb to be xmitted it's better to skip this test entirely (which seems\nto be safe with the patch below).\n\nJarek P.\n\n--------------->\npkt_sched: dev_requeue_skb: Don't schedule if a queue is stopped\n\nDoing __netif_schedule() while requeuing because of a stopped tx queue\nand skipping such a test in qdisc_run() can cause a requeuing loop with\nhigh cpu use until the queue is awaken.\n\nSigned-off-by: Jarek Poplawski <jarkao2@gmail.com>\n---\n\n net/sched/sch_generic.c | 23 +++++++++++++++--------\n 1 files changed, 15 insertions(+), 8 deletions(-)\n\n--\nTo unsubscribe from this list: send the line \"unsubscribe netdev\" in\nthe body of a message to majordomo@vger.kernel.org\nMore majordomo info at http://vger.kernel.org/majordomo-info.html", "diff": "diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c\nindex ec0a083..bae2eb8 100644\n--- a/net/sched/sch_generic.c\n+++ b/net/sched/sch_generic.c\n@@ -42,14 +42,17 @@ static inline int qdisc_qlen(struct Qdisc *q)\n \treturn q->q.qlen;\n }\n \n-static inline int dev_requeue_skb(struct sk_buff *skb, struct Qdisc *q)\n+static inline int dev_requeue_skb(struct sk_buff *skb, struct Qdisc *q,\n+\t\t\t\t bool stopped)\n {\n \tif (unlikely(skb->next))\n \t\tq->gso_skb = skb;\n \telse\n \t\tq->ops->requeue(skb, q);\n \n-\t__netif_schedule(q);\n+\tif (!stopped)\n+\t\t__netif_schedule(q);\n+\n \treturn 0;\n }\n \n@@ -89,7 +92,7 @@ static inline int handle_dev_cpu_collision(struct sk_buff *skb,\n \t\t * some time.\n \t\t */\n \t\t__get_cpu_var(netdev_rx_stat).cpu_collision++;\n-\t\tret = dev_requeue_skb(skb, q);\n+\t\tret = dev_requeue_skb(skb, q, false);\n \t}\n \n \treturn ret;\n@@ -121,6 +124,7 @@ static inline int qdisc_restart(struct Qdisc *q)\n \tstruct net_device *dev;\n \tspinlock_t *root_lock;\n \tstruct sk_buff *skb;\n+\tbool stopped;\n \n \t/* Dequeue packet */\n \tif (unlikely((skb = dequeue_skb(q)) == NULL))\n@@ -135,9 +139,13 @@ static inline int qdisc_restart(struct Qdisc *q)\n \ttxq = netdev_get_tx_queue(dev, skb_get_queue_mapping(skb));\n \n \tHARD_TX_LOCK(dev, txq, smp_processor_id());\n-\tif (!netif_tx_queue_stopped(txq) &&\n-\t !netif_tx_queue_frozen(txq))\n+\tif (!netif_tx_queue_stopped(txq) && !netif_tx_queue_frozen(txq)) {\n \t\tret = dev_hard_start_xmit(skb, dev, txq);\n+\t\tstopped = netif_tx_queue_stopped(txq) ||\n+\t\t\t netif_tx_queue_frozen(txq);\n+\t} else {\n+\t\tstopped = true;\n+\t}\n \tHARD_TX_UNLOCK(dev, txq);\n \n \tspin_lock(root_lock);\n@@ -159,12 +167,11 @@ static inline int qdisc_restart(struct Qdisc *q)\n \t\t\tprintk(KERN_WARNING \"BUG %s code %d qlen %d\\n\",\n \t\t\t dev->name, ret, q->q.qlen);\n \n-\t\tret = dev_requeue_skb(skb, q);\n+\t\tret = dev_requeue_skb(skb, q, stopped);\n \t\tbreak;\n \t}\n \n-\tif (ret && (netif_tx_queue_stopped(txq) ||\n-\t\t netif_tx_queue_frozen(txq)))\n+\tif (ret && stopped)\n \t\tret = 0;\n \n \treturn ret;\n", "prefixes": [ "take", "2" ] }