[{"id":1268,"web_url":"http://patchwork.ozlabs.org/comment/1268/","msgid":"<5f2db9d90809181811q74c3211fp9a099acb8e895fd4@mail.gmail.com>","list_archive_url":null,"date":"2008-09-19T01:11:45","subject":"Re: [RFC PATCH] sched: only dequeue if packet can be queued to\n\thardware queue.","submitter":{"id":252,"url":"http://patchwork.ozlabs.org/api/people/252/","name":"Alexander Duyck","email":"alexander.duyck@gmail.com"},"content":"On Thu, Sep 18, 2008 at 12:44 PM, Jarek Poplawski <jarkao2@gmail.com> wrote:\n> I think, these changes make sense only if they don't add more then give,\n> and two dequeues (and still no way to kill requeue) is IMHO too much.\n> (I mean the maintenance.) As far as I can see it's mainly for HFSC's\n> qdisc_peek_len(), anyway this looks like main issue to me.\n\nThe thing is this was just meant to be a proof of concept for the most\npart so I was doing a lot of cut and paste coding, and as a result the\nsize increased by a good amount.  I admit this could be cleaned up a\nlot I just wanted to verify some things.\n\nAlso my ultimate goal wasn't to kill requeue completely as you can't\ndo that since in the case of TSO/GSO you will end up with SKBs which\nrequire multiple transmit descriptors and therefore you will always\nneed an option to requeue.  The advantage with this approach is that\nyou don't incur the CPU cost, which is a significant savings when you\ncompare the requeue approach which was generating 13% cpu to the smart\ndequeue which was only using 3%.\n\n> Below a few small doubts. (I need to find some time for details yet.)\n> BTW, this patch needs a checkpatch run.\n\nI did run checkpatch on this.  Most of the errors are inherited from\nthe cut and paste and I didn't want to take the time to completely\nrewrite the core qdisc functionality.  Other than those errors I\nbelieve some were whitespace complaints as I was using tabs to the\nstart of my functions and then spaces to indent my function parameters\nin the case of having to wrap for long lines.\n\n> ---\n> diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h\n> index b786a5b..4082f39 100644\n> --- a/include/net/pkt_sched.h\n> +++ b/include/net/pkt_sched.h\n> @@ -90,10 +90,7 @@ extern void __qdisc_run(struct Qdisc *q);\n>\n>  static inline void qdisc_run(struct Qdisc *q)\n>  {\n> -       struct netdev_queue *txq = q->dev_queue;\n> -\n> -       if (!netif_tx_queue_stopped(txq) &&\n>\n> I think, there is no reason to do a full dequeue try each time instead\n> of this check, even if we save on requeuing now. We could try to save\n> the result of the last dequeue, e.g. a number or some mask of a few\n> tx_queues which prevented dequeuing, and check for the change of state\n> only. (Or alternatively, what I mentioned before: a flag set with the\n> full stop or freeze.)\n\nOnce again if you have a suggestion on approach feel free to modify\nthe patch and see how it works for you.  My only concern is that there\nare several qdiscs which won't give you the same packet twice and so\nyou don't know what is going to pop out until you go in and check.\n\n>\n> -           !test_and_set_bit(__QDISC_STATE_RUNNING, &q->state))\n> +       if (!test_and_set_bit(__QDISC_STATE_RUNNING, &q->state))\n>                __qdisc_run(q);\n>  }\n>\n> diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h\n> index e556962..4400a18 100644\n> --- a/include/net/sch_generic.h\n> +++ b/include/net/sch_generic.h\n> ...\n> +static inline struct sk_buff *__qdisc_smart_dequeue(struct Qdisc *sch,\n> +                                                   struct sk_buff_head *list)\n> +{\n> +       struct sk_buff *skb = skb_peek(list);\n>\n> Since success is much more likely here, __skb_dequeue() with\n> __skb_queue_head() on fail looks better to me.\n>\n> Of course, it's a matter of taste, but (if we really need these two\n> dequeues) maybe qdisc_dequeue_smart() would be more in line with\n> qdisc_dequeue_head()? (And similarly smart names below.)\n\nRight, but then we are getting back to the queue/requeue stuff.  If\nyou want feel free to make use of my patch to generate your own that\nuses that approach.  I just don't like changing things unless I\nabsolutely have to and all I did is essentially tear apart\n__skb_dequeue and place the bits inline with my testing for\nnetif_tx_subqueue_stopped.\n\n> +       struct netdev_queue *txq;\n> +\n> +       if (!skb)\n> +               return NULL;\n> +\n> +       txq = netdev_get_tx_queue(qdisc_dev(sch), skb_get_queue_mapping(skb));\n> +       if (netif_tx_queue_stopped(txq) || netif_tx_queue_frozen(txq)) {\n> +               sch->flags |= TCQ_F_STOPPED;\n> +               return NULL;\n> +       }\n> +       __skb_unlink(skb, list);\n> +       sch->qstats.backlog -= qdisc_pkt_len(skb);\n> +       sch->flags &= ~TCQ_F_STOPPED;\n>\n> This clearing is probably needed in ->reset() and in ->drop() too.\n> (Or above, after if (!skb))\n\nFor the most part I would agree with you, but for now I was only using\nthe flag as a part of the smart_dequeue process to flag the upper\nqueues so I didn't give it much thought.  It is yet another thing I\nprobably should have cleaned up but didn't get around to since this\nwas mostly proof of concept.\n\n> ...\n> diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c\n> index d14f020..4da1a85 100644\n> --- a/net/sched/sch_htb.c\n> +++ b/net/sched/sch_htb.c\n> ...\n> +static struct sk_buff *htb_smart_dequeue(struct Qdisc *sch)\n> +{\n> +       struct sk_buff *skb = NULL;\n> +       struct htb_sched *q = qdisc_priv(sch);\n> +       int level, stopped = false;\n> +       psched_time_t next_event;\n> +\n> +       /* try to dequeue direct packets as high prio (!) to minimize cpu work */\n> +       skb = skb_peek(&q->direct_queue);\n>\n> As above: __skb_dequeue()?\n\nActually I could probably replace most of the skb_peek stuff with\ncalls to __qdisc_smart_dequeue instead and then just check for the\nflag on fail.\n\n> Thanks,\n> Jarek P.\n\nI probably won't be able to contribute to this for at least the next\ntwo weeks since I am going to be out for vacation from Saturday until\nthe start of October.\n\nIn the meantime I also just threw a couple of patches out which may\nhelp anyone who is trying to test this stuff.  Turns out if you try to\ndo a netperf UDP_STREAM test on a multiqueue aware system with 2.6.27\nyou get horrible performance on the receive side.  The root cause\nappears to be that simple_tx_hash was doing a hash on fragmented\npackets and as a result placing packets in psuedo random queues which\ncaused issues with packet ordering.\n\nThanks,\n\nAlex","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Received":["from vger.kernel.org (vger.kernel.org [209.132.176.167])\n\tby ozlabs.org (Postfix) with ESMTP id 6759ADDF09\n\tfor <patchwork-incoming@ozlabs.org>;\n\tFri, 19 Sep 2008 11:11:57 +1000 (EST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1755720AbYISBLt (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tThu, 18 Sep 2008 21:11:49 -0400","(majordomo@vger.kernel.org) by vger.kernel.org id S1755581AbYISBLs\n\t(ORCPT <rfc822; netdev-outgoing>); Thu, 18 Sep 2008 21:11:48 -0400","from el-out-1112.google.com ([209.85.162.178]:59486 \"EHLO\n\tel-out-1112.google.com\" rhost-flags-OK-OK-OK-OK) by vger.kernel.org\n\twith ESMTP id S1755527AbYISBLs (ORCPT\n\t<rfc822;netdev@vger.kernel.org>); Thu, 18 Sep 2008 21:11:48 -0400","by el-out-1112.google.com with SMTP id z25so51967ele.1\n\tfor <netdev@vger.kernel.org>; Thu, 18 Sep 2008 18:11:45 -0700 (PDT)","by 10.150.97.19 with SMTP id u19mr2286517ybb.24.1221786705476;\n\tThu, 18 Sep 2008 18:11:45 -0700 (PDT)","by 10.150.158.5 with HTTP; Thu, 18 Sep 2008 18:11:45 -0700 (PDT)"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;\n\th=domainkey-signature:received:received:message-id:date:from:to\n\t:subject:cc:in-reply-to:mime-version:content-type\n\t:content-transfer-encoding:content-disposition:references;\n\tbh=UbBE77bSxwY7TdmkDR69BYA+YBAGdQR0ErFLGMoxp4Y=;\n\tb=S468XrQYUqu9Aw1sc80UzRtMEn4GHuJ26CgpxEe0x/+VWwJ0VMUsC/8E8jaFbaJs1Y\n\tDYZvhBLCRmNy0t1ph5mBig2XcNhXkTvXZJ7PgkWb58qOdvGVmwArZE3/5semkuWuAtJV\n\thzSpDSzt+E34ryn60c6sCYOjiMEZhj8GHqusM=","DomainKey-Signature":"a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;\n\th=message-id:date:from:to:subject:cc:in-reply-to:mime-version\n\t:content-type:content-transfer-encoding:content-disposition\n\t:references;\n\tb=tFFVBo9GynYLI1T6CWcXC1C0wLRKYF/k2/uGl2eCPvPBNtvbqtA2lyfvA2rf7/meMZ\n\teC/0rt8kQ49IFAi0zDxZ+ERP/BfhWeLStQp5viq7U4JdaNx+fiuckm/oCV3n1Epz/hbt\n\tJfLR6EeN5EEWZJwxdqKCxfTPkuWZw3L1/G9x8=","Message-ID":"<5f2db9d90809181811q74c3211fp9a099acb8e895fd4@mail.gmail.com>","Date":"Thu, 18 Sep 2008 18:11:45 -0700","From":"\"Alexander Duyck\" <alexander.duyck@gmail.com>","To":"\"Jarek Poplawski\" <jarkao2@gmail.com>","Subject":"Re: [RFC PATCH] sched: only dequeue if packet can be queued to\n\thardware queue.","Cc":"\"Alexander Duyck\" <alexander.h.duyck@intel.com>,\n\tnetdev@vger.kernel.org, herbert@gondor.apana.org.au,\n\tdavem@davemloft.net, kaber@trash.net","In-Reply-To":"<20080918194419.GA2982@ami.dom.local>","MIME-Version":"1.0","Content-Type":"text/plain; charset=ISO-8859-1","Content-Transfer-Encoding":"7bit","Content-Disposition":"inline","References":"<20080918063036.27934.91273.stgit@localhost.localdomain>\n\t<20080918194419.GA2982@ami.dom.local>","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1388,"web_url":"http://patchwork.ozlabs.org/comment/1388/","msgid":"<20080919151632.GB2646@ami.dom.local>","list_archive_url":null,"date":"2008-09-19T15:16:32","subject":"Re: [RFC PATCH] sched: only dequeue if packet can be queued to\n\thardware queue.","submitter":{"id":277,"url":"http://patchwork.ozlabs.org/api/people/277/","name":"Jarek Poplawski","email":"jarkao2@gmail.com"},"content":"On Thu, Sep 18, 2008 at 09:44:19PM +0200, Jarek Poplawski wrote:\n> Alexander Duyck wrote, On 09/18/2008 08:43 AM:\n> ...\n> > ---\n> > This patch changes the behavior of the sch->dequeue to only\n> > dequeue a packet if the queue it is bound for is not currently\n> > stopped.  This functionality is provided via a new op called\n> > smart_dequeue.\n> > \n> > Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>\n> > ---\n\nAlexander, I guess you've seen my last messages/patches in this thread\nand noticed that this __netif_schedule() problem is present both in\nthis RFC patch and sch_multiq: if skb isn't dequeued because of inner\ntx_queue stopped state check, there is missing __netif_schedule()\nbefore exit from qdisc_run().\n\nJarek P.","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Received":["from vger.kernel.org (vger.kernel.org [209.132.176.167])\n\tby ozlabs.org (Postfix) with ESMTP id 47F84DE171\n\tfor <patchwork-incoming@ozlabs.org>;\n\tSat, 20 Sep 2008 01:16:23 +1000 (EST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1752342AbYISPQS (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tFri, 19 Sep 2008 11:16:18 -0400","(majordomo@vger.kernel.org) by vger.kernel.org id S1752572AbYISPQS\n\t(ORCPT <rfc822; netdev-outgoing>); Fri, 19 Sep 2008 11:16:18 -0400","from gv-out-0910.google.com ([216.239.58.186]:57215 \"EHLO\n\tgv-out-0910.google.com\" rhost-flags-OK-OK-OK-OK) by vger.kernel.org\n\twith ESMTP id S1752271AbYISPQR (ORCPT\n\t<rfc822;netdev@vger.kernel.org>); Fri, 19 Sep 2008 11:16:17 -0400","by gv-out-0910.google.com with SMTP id e6so22508gvc.37\n\tfor <netdev@vger.kernel.org>; Fri, 19 Sep 2008 08:16:15 -0700 (PDT)","by 10.103.219.18 with SMTP id w18mr152967muq.121.1221837374686;\n\tFri, 19 Sep 2008 08:16:14 -0700 (PDT)","from ami.dom.local ( [83.27.43.118])\n\tby mx.google.com with ESMTPS id j9sm5572919mue.3.2008.09.19.08.16.10\n\t(version=SSLv3 cipher=RC4-MD5);\n\tFri, 19 Sep 2008 08:16:13 -0700 (PDT)"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;\n\th=domainkey-signature:received:received:date:from:to:cc:subject\n\t:message-id:references:mime-version:content-type:content-disposition\n\t:in-reply-to:user-agent;\n\tbh=Cm3XL0KopQazFyMTZueBLaRbTDQY82Usmvrglzp49hc=;\n\tb=Kn2bzWJL09BkAMVt76tRmmfVDVmNBnhM2HO/8HHuZzb6VlxKS5PfKiBMksVAqJJyIw\n\tAbboUR9eCOnUT+3HyhECBhvgI6lz9OiyK8hacX4yokKxuhmcPrsEXMB6yaC+9jELXkwn\n\tiVz7EYAtzMMbtDx+KpaWpljZgxJ2er3cEzPIY=","DomainKey-Signature":"a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;\n\th=date:from:to:cc:subject:message-id:references:mime-version\n\t:content-type:content-disposition:in-reply-to:user-agent;\n\tb=aGDBxbwYjRJPKj4qYZDWGJ0YnuWVV2V7+vXWVKTuuOTuC0NpPSaIgiqea7oY7WXxyX\n\t894QVrwwcyho5ui4rPnOqyop7yXi6QckSywJnpqYPRtOmLbpuw2NUh2zy2DAdHlJi3Ee\n\tKi7y0l6yq3nJffgnCpUeW2d+PprogHkxzXQBA=","Date":"Fri, 19 Sep 2008 17:16:32 +0200","From":"Jarek Poplawski <jarkao2@gmail.com>","To":"Alexander Duyck <alexander.h.duyck@intel.com>","Cc":"netdev@vger.kernel.org, herbert@gondor.apana.org.au,\n\tdavem@davemloft.net, kaber@trash.net","Subject":"Re: [RFC PATCH] sched: only dequeue if packet can be queued to\n\thardware queue.","Message-ID":"<20080919151632.GB2646@ami.dom.local>","References":"<20080918063036.27934.91273.stgit@localhost.localdomain>\n\t<20080918194419.GA2982@ami.dom.local>","MIME-Version":"1.0","Content-Type":"text/plain; charset=us-ascii","Content-Disposition":"inline","In-Reply-To":"<20080918194419.GA2982@ami.dom.local>","User-Agent":"Mutt/1.5.18 (2008-05-17)","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1400,"web_url":"http://patchwork.ozlabs.org/comment/1400/","msgid":"<80769D7B14936844A23C0C43D9FBCF0F1598E4A3@orsmsx501.amr.corp.intel.com>","list_archive_url":null,"date":"2008-09-19T16:26:29","subject":"RE: [RFC PATCH] sched: only dequeue if packet can be queued to\n\thardware queue.","submitter":{"id":251,"url":"http://patchwork.ozlabs.org/api/people/251/","name":"Duyck, Alexander H","email":"alexander.h.duyck@intel.com"},"content":"Jarek Poplawski wrote:\n> On Thu, Sep 18, 2008 at 09:44:19PM +0200, Jarek Poplawski wrote:\n>> Alexander Duyck wrote, On 09/18/2008 08:43 AM:\n>> ...\n>>> ---\n>>> This patch changes the behavior of the sch->dequeue to only\n>>> dequeue a packet if the queue it is bound for is not currently\n>>> stopped.  This functionality is provided via a new op called\n>>> smart_dequeue.\n>>>\n>>> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>\n>>> ---\n>\n> Alexander, I guess you've seen my last messages/patches in this thread\n> and noticed that this __netif_schedule() problem is present both in\n> this RFC patch and sch_multiq: if skb isn't dequeued because of inner\n> tx_queue stopped state check, there is missing __netif_schedule()\n> before exit from qdisc_run().\n>\n> Jarek P.\n\nActually most of that is handled in the fact that netif_tx_wake_queue\nwill call __netif_schedule when it decides to wake up a queue that has\nbeen stopped.  Putting it in skb_dequeue is unnecessary, and the way\nyou did it would cause issues because you should be scheduling the root\nqdisc, not the one currently doing the dequeue.\n\nMy bigger concern is the fact one queue is back to stopping all queues.\nBy using one global XOFF state, you create head-of-line blocking.  I can\nsee the merit in this approach but the problem is for multiqueue the\nqueues really should be independent.  What your change does is reduce the\nbenefit of having multiqueue by throwing in a new state which pretty much\nmatches what netif_stop/wake_queue used to do in the 2.6.26 version of tx\nmultiqueue.\n\nAlso changing dequeue_skb will likely cause additional issues for\nseveral qdiscs as it doesn't report anything up to parent queues, as a\nresult you will end up with qdiscs like prio acting more like multiq\nbecause they won't know if a queue is empty, stopped, or throttled.\nAlso I believe this will cause a panic on pfifo_fast and several other\nqdiscs because some check to see if there are packets in the queue and\nif so dequeue with the assumption that they will get a packet out.  You\nwill need to add checks for this to resolve this issue.\n\nThe one thing I liked about my approach was that after I was done you\ncould have prio as a parent of multiq leaves or multiq as the parent of\nprio leaves and all the hardware queues would receive the packets in the\nsame order as the result.\n\nThanks,\n\nAlex","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Received":["from vger.kernel.org (vger.kernel.org [209.132.176.167])\n\tby ozlabs.org (Postfix) with ESMTP id 274E1DE05D\n\tfor <patchwork-incoming@ozlabs.org>;\n\tSat, 20 Sep 2008 02:26:37 +1000 (EST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1753132AbYISQ0c (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tFri, 19 Sep 2008 12:26:32 -0400","(majordomo@vger.kernel.org) by vger.kernel.org id S1752617AbYISQ0c\n\t(ORCPT <rfc822; netdev-outgoing>); Fri, 19 Sep 2008 12:26:32 -0400","from mga11.intel.com ([192.55.52.93]:37766 \"EHLO mga11.intel.com\"\n\trhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP\n\tid S1752464AbYISQ0b convert rfc822-to-8bit (ORCPT\n\t<rfc822;netdev@vger.kernel.org>); Fri, 19 Sep 2008 12:26:31 -0400","from fmsmga002.fm.intel.com ([10.253.24.26])\n\tby fmsmga102.fm.intel.com with ESMTP; 19 Sep 2008 09:23:38 -0700","from azsmsx602.amr.corp.intel.com ([10.2.121.201])\n\tby fmsmga002.fm.intel.com with ESMTP; 19 Sep 2008 09:23:28 -0700","from orsmsx601.amr.corp.intel.com (10.22.226.213) by\n\tazsmsx602.amr.corp.intel.com (10.2.121.201) with Microsoft SMTP\n\tServer (TLS) id 8.1.291.1; Fri, 19 Sep 2008 09:26:30 -0700","from orsmsx501.amr.corp.intel.com ([10.22.226.209]) by\n\torsmsx601.amr.corp.intel.com ([10.22.226.213]) with mapi;\n\tFri, 19 Sep 2008 09:26:29 -0700"],"X-ExtLoop1":"1","X-IronPort-AV":"E=Sophos;i=\"4.32,431,1217833200\"; d=\"scan'208\";a=\"382236166\"","From":"\"Duyck, Alexander H\" <alexander.h.duyck@intel.com>","To":"Jarek Poplawski <jarkao2@gmail.com>","CC":"\"netdev@vger.kernel.org\" <netdev@vger.kernel.org>,\n\t\"herbert@gondor.apana.org.au\" <herbert@gondor.apana.org.au>,\n\t\"davem@davemloft.net\" <davem@davemloft.net>,\n\t\"kaber@trash.net\" <kaber@trash.net>","Date":"Fri, 19 Sep 2008 09:26:29 -0700","Subject":"RE: [RFC PATCH] sched: only dequeue if packet can be queued to\n\thardware queue.","Thread-Topic":"[RFC PATCH] sched: only dequeue if packet can be queued to\n\thardware queue.","Thread-Index":"Ackaarly8MF1Ypt6TvqACR5QCWAl7gABMzdQ","Message-ID":"<80769D7B14936844A23C0C43D9FBCF0F1598E4A3@orsmsx501.amr.corp.intel.com>","References":"<20080918063036.27934.91273.stgit@localhost.localdomain>\n\t<20080918194419.GA2982@ami.dom.local>\n\t<20080919151632.GB2646@ami.dom.local>","In-Reply-To":"<20080919151632.GB2646@ami.dom.local>","Accept-Language":"en-US","Content-Language":"en-US","X-MS-Has-Attach":"","X-MS-TNEF-Correlator":"","acceptlanguage":"en-US","Content-Type":"text/plain; charset=\"us-ascii\"","Content-Transfer-Encoding":"8BIT","MIME-Version":"1.0","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1402,"web_url":"http://patchwork.ozlabs.org/comment/1402/","msgid":"<20080919163738.GA2843@ami.dom.local>","list_archive_url":null,"date":"2008-09-19T16:37:38","subject":"Re: [RFC PATCH] sched: only dequeue if packet can be queued to\n\thardware queue.","submitter":{"id":277,"url":"http://patchwork.ozlabs.org/api/people/277/","name":"Jarek Poplawski","email":"jarkao2@gmail.com"},"content":"On Fri, Sep 19, 2008 at 05:16:32PM +0200, Jarek Poplawski wrote:\n...\n> Alexander, I guess you've seen my last messages/patches in this thread\n> and noticed that this __netif_schedule() problem is present both in\n> this RFC patch and sch_multiq: if skb isn't dequeued because of inner\n> tx_queue stopped state check, there is missing __netif_schedule()\n> before exit from qdisc_run().\n\nHmm... probably false alarm. It seems there have to be same wake_queue\nafter all.\n\nSorry,\nJarek P.","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Received":["from vger.kernel.org (vger.kernel.org [209.132.176.167])\n\tby ozlabs.org (Postfix) with ESMTP id 53416DDEE7\n\tfor <patchwork-incoming@ozlabs.org>;\n\tSat, 20 Sep 2008 02:36:49 +1000 (EST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1752941AbYISQgo (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tFri, 19 Sep 2008 12:36:44 -0400","(majordomo@vger.kernel.org) by vger.kernel.org id S1752851AbYISQgo\n\t(ORCPT <rfc822; netdev-outgoing>); Fri, 19 Sep 2008 12:36:44 -0400","from ey-out-2122.google.com ([74.125.78.26]:37705 \"EHLO\n\tey-out-2122.google.com\" rhost-flags-OK-OK-OK-OK) by vger.kernel.org\n\twith ESMTP id S1752617AbYISQgn (ORCPT\n\t<rfc822;netdev@vger.kernel.org>); Fri, 19 Sep 2008 12:36:43 -0400","by ey-out-2122.google.com with SMTP id 6so163948eyi.37\n\tfor <netdev@vger.kernel.org>; Fri, 19 Sep 2008 09:36:41 -0700 (PDT)","by 10.103.49.7 with SMTP id b7mr293198muk.3.1221842201309;\n\tFri, 19 Sep 2008 09:36:41 -0700 (PDT)","from ami.dom.local ( [83.27.43.118])\n\tby mx.google.com with ESMTPS id\n\ts10sm2615939muh.12.2008.09.19.09.36.38\n\t(version=SSLv3 cipher=RC4-MD5);\n\tFri, 19 Sep 2008 09:36:40 -0700 (PDT)"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;\n\th=domainkey-signature:received:received:date:from:to:cc:subject\n\t:message-id:references:mime-version:content-type:content-disposition\n\t:in-reply-to:user-agent;\n\tbh=sSEAEwtUAcLLyt4nYVXt4hsy/8TWxkk+zcafglxUMrE=;\n\tb=b8z9r2Po+/upLxsDXTOtY4ShWeNzdp0pJioRIII1kR/ZtYEAKB3+xTbFAMwCVjZ1uF\n\t4z68ULeosYhg+aQbL3KTtSwuo7Z+AG62Q4bLlymCAq9TQRjBP3l5tQdbr1pc+YzgoC5H\n\tS2m/lUJ7oSLQAVzyFTKhyDgmI7IkFqZvGpsZc=","DomainKey-Signature":"a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;\n\th=date:from:to:cc:subject:message-id:references:mime-version\n\t:content-type:content-disposition:in-reply-to:user-agent;\n\tb=ExrC7fRR8vr5zWcaeAczdiYCmvoI6DWuM9X1u78xjjJjqJGt4nB01qM7ae6zO9vR0f\n\t5nwMVOVmtXOOk5saQZyTBaGWkW1HnZcxNJGgVhb74nHkAG5txaFZif8EB8P8DHd5oF6V\n\tosaNk8aPRRQamPDSYXfoh7ZvnhVxdYkx60Oo8=","Date":"Fri, 19 Sep 2008 18:37:38 +0200","From":"Jarek Poplawski <jarkao2@gmail.com>","To":"Alexander Duyck <alexander.h.duyck@intel.com>","Cc":"netdev@vger.kernel.org, herbert@gondor.apana.org.au,\n\tdavem@davemloft.net, kaber@trash.net","Subject":"Re: [RFC PATCH] sched: only dequeue if packet can be queued to\n\thardware queue.","Message-ID":"<20080919163738.GA2843@ami.dom.local>","References":"<20080918063036.27934.91273.stgit@localhost.localdomain>\n\t<20080918194419.GA2982@ami.dom.local>\n\t<20080919151632.GB2646@ami.dom.local>","MIME-Version":"1.0","Content-Type":"text/plain; charset=us-ascii","Content-Disposition":"inline","In-Reply-To":"<20080919151632.GB2646@ami.dom.local>","User-Agent":"Mutt/1.5.18 (2008-05-17)","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1416,"web_url":"http://patchwork.ozlabs.org/comment/1416/","msgid":"<20080919173523.GA2888@ami.dom.local>","list_archive_url":null,"date":"2008-09-19T17:35:23","subject":"Re: [RFC PATCH] sched: only dequeue if packet can be queued to\n\thardware queue.","submitter":{"id":277,"url":"http://patchwork.ozlabs.org/api/people/277/","name":"Jarek Poplawski","email":"jarkao2@gmail.com"},"content":"On Fri, Sep 19, 2008 at 09:26:29AM -0700, Duyck, Alexander H wrote:\n...\n> Actually most of that is handled in the fact that netif_tx_wake_queue\n> will call __netif_schedule when it decides to wake up a queue that has\n> been stopped.  Putting it in skb_dequeue is unnecessary,\n\nYou are right, I've just noticed this too, and I'll withdraw this change.\n\n> and the way\n> you did it would cause issues because you should be scheduling the root\n> qdisc, not the one currently doing the dequeue.\n\nI think, this is the right way (but useless here).\n\n> My bigger concern is the fact one queue is back to stopping all queues.\n> By using one global XOFF state, you create head-of-line blocking.  I can\n> see the merit in this approach but the problem is for multiqueue the\n> queues really should be independent.  What your change does is reduce the\n> benefit of having multiqueue by throwing in a new state which pretty much\n> matches what netif_stop/wake_queue used to do in the 2.6.26 version of tx\n> multiqueue.\n\nDo you mean netif_tx_stop_all_queues() and then netif_tx_wake_queue()\nbefore netif_tx_wake_all_queues()? OK, I'll withdraw this patch too.\n\n> Also changing dequeue_skb will likely cause additional issues for\n> several qdiscs as it doesn't report anything up to parent queues, as a\n> result you will end up with qdiscs like prio acting more like multiq\n> because they won't know if a queue is empty, stopped, or throttled.\n> Also I believe this will cause a panic on pfifo_fast and several other\n> qdiscs because some check to see if there are packets in the queue and\n> if so dequeue with the assumption that they will get a packet out.  You\n> will need to add checks for this to resolve this issue.\n\nI really can't get your point. Don't you mean skb_dequeue()?\ndequeue_skb() is used only by qdisc_restart()...\n\n> The one thing I liked about my approach was that after I was done you\n> could have prio as a parent of multiq leaves or multiq as the parent of\n> prio leaves and all the hardware queues would receive the packets in the\n> same order as the result.\n\nI'm not against your approach, but I'd like to be sure these\ncomplications are really worth of it. Of course, if my proposal, the\nfirst take of 3 patches, doesn't work as you predict (and I doubt),\nthen we can forget about it.\n\nThanks,\nJarek P.","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Received":["from vger.kernel.org (vger.kernel.org [209.132.176.167])\n\tby ozlabs.org (Postfix) with ESMTP id 9AA52DDFAF\n\tfor <patchwork-incoming@ozlabs.org>;\n\tSat, 20 Sep 2008 03:34:14 +1000 (EST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1751991AbYISReK (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tFri, 19 Sep 2008 13:34:10 -0400","(majordomo@vger.kernel.org) by vger.kernel.org id S1751808AbYISReJ\n\t(ORCPT <rfc822; netdev-outgoing>); Fri, 19 Sep 2008 13:34:09 -0400","from nf-out-0910.google.com ([64.233.182.187]:34107 \"EHLO\n\tnf-out-0910.google.com\" rhost-flags-OK-OK-OK-OK) by vger.kernel.org\n\twith ESMTP id S1751389AbYISReG (ORCPT\n\t<rfc822;netdev@vger.kernel.org>); Fri, 19 Sep 2008 13:34:06 -0400","by nf-out-0910.google.com with SMTP id d3so210310nfc.21\n\tfor <netdev@vger.kernel.org>; Fri, 19 Sep 2008 10:34:02 -0700 (PDT)","by 10.103.22.16 with SMTP id z16mr326952mui.78.1221845642507;\n\tFri, 19 Sep 2008 10:34:02 -0700 (PDT)","from ami.dom.local ( [83.27.43.118])\n\tby mx.google.com with ESMTPS id j10sm2885530muh.1.2008.09.19.10.33.57\n\t(version=SSLv3 cipher=RC4-MD5);\n\tFri, 19 Sep 2008 10:34:01 -0700 (PDT)"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;\n\th=domainkey-signature:received:received:date:from:to:cc:subject\n\t:message-id:references:mime-version:content-type:content-disposition\n\t:in-reply-to:user-agent;\n\tbh=b9JLLQoQ8w0w1rXo/8++LH/MHlAyDIlX5ZGIsSyx3Ms=;\n\tb=dCfU3Gr7kjdvCWfnFVM7Dgpr/wRlkCtD5z4dtMbjXkgVMnGV3YrUHeooqcC6DSiE7s\n\tTGHy0NccyYqpzadvlXRSIjnTn1/XY1wQFr8kTxRNJQJ9YMz1g8pqzjVgxKGfUgltaeTI\n\tFTo1xkjH3jT3M0/uuC+Or7JA3Z8JQugdxG/Lk=","DomainKey-Signature":"a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;\n\th=date:from:to:cc:subject:message-id:references:mime-version\n\t:content-type:content-disposition:in-reply-to:user-agent;\n\tb=DdaXjPUhFAm8sYfVnzAln5qHmSorG7+gNEEKJaRGpEFptF+2m7z9q6E476mK3/oFg5\n\tTGyXx9z8MaCT9rSYIqhUrqREJkm3y+QcNIY2J9nUQArN7lzVqFS64BahDa5uSdvMUl8R\n\t/otKKFLPacOLJOVN/O9wP5RZUlzDVsjC/4VcM=","Date":"Fri, 19 Sep 2008 19:35:23 +0200","From":"Jarek Poplawski <jarkao2@gmail.com>","To":"\"Duyck, Alexander H\" <alexander.h.duyck@intel.com>","Cc":"\"netdev@vger.kernel.org\" <netdev@vger.kernel.org>,\n\t\"herbert@gondor.apana.org.au\" <herbert@gondor.apana.org.au>,\n\t\"davem@davemloft.net\" <davem@davemloft.net>,\n\t\"kaber@trash.net\" <kaber@trash.net>","Subject":"Re: [RFC PATCH] sched: only dequeue if packet can be queued to\n\thardware queue.","Message-ID":"<20080919173523.GA2888@ami.dom.local>","References":"<20080918063036.27934.91273.stgit@localhost.localdomain>\n\t<20080918194419.GA2982@ami.dom.local>\n\t<20080919151632.GB2646@ami.dom.local>\n\t<80769D7B14936844A23C0C43D9FBCF0F1598E4A3@orsmsx501.amr.corp.intel.com>","MIME-Version":"1.0","Content-Type":"text/plain; charset=us-ascii","Content-Disposition":"inline","In-Reply-To":"<80769D7B14936844A23C0C43D9FBCF0F1598E4A3@orsmsx501.amr.corp.intel.com>","User-Agent":"Mutt/1.5.18 (2008-05-17)","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1434,"web_url":"http://patchwork.ozlabs.org/comment/1434/","msgid":"<80769D7B14936844A23C0C43D9FBCF0F1598E7E1@orsmsx501.amr.corp.intel.com>","list_archive_url":null,"date":"2008-09-19T18:01:12","subject":"RE: [RFC PATCH] sched: only dequeue if packet can be queued to\n\thardware queue.","submitter":{"id":251,"url":"http://patchwork.ozlabs.org/api/people/251/","name":"Duyck, Alexander H","email":"alexander.h.duyck@intel.com"},"content":"Jarek Poplawski wrote:\n\n>> Also changing dequeue_skb will likely cause additional issues for\n>> several qdiscs as it doesn't report anything up to parent queues, as\n>> a result you will end up with qdiscs like prio acting more like\n>> multiq because they won't know if a queue is empty, stopped, or\n>> throttled.\n>> Also I believe this will cause a panic on pfifo_fast and several\n>> other qdiscs because some check to see if there are packets in the\n>> queue and if so dequeue with the assumption that they will get a\n>> packet out.  You will need to add checks for this to resolve this\n>> issue.\n>\n> I really can't get your point. Don't you mean skb_dequeue()?\n> dequeue_skb() is used only by qdisc_restart()...\nYou're right.  I misread it as skb_dequeue.  The problem is though you\nare still relying on q->requeue which last I knew was only being used\na few qdiscs.  In addition you will still be taking the cpu hit for the\ndequeue/requeue on several qdiscs which can't use q->requeue without\nviolating the way they were supposed to work.\n\n>> The one thing I liked about my approach was that after I was done you\n>> could have prio as a parent of multiq leaves or multiq as the parent\n>> of prio leaves and all the hardware queues would receive the packets\n>> in the same order as the result.\n>\n> I'm not against your approach, but I'd like to be sure these\n> complications are really worth of it. Of course, if my proposal, the\n> first take of 3 patches, doesn't work as you predict (and I doubt),\n> then we can forget about it.\nWell when you get some testing done let me know.  The main areas I am\nconcerned with are:\n\n1.  CPU utilization stays the same regardless of which queue used.\n2.  Maintain current qdisc behavior on a per hw queue basis.\n3.  Avoid head-of-line blocking where it applies\n        for example: prio band 0 not blocked by band 1, or 1 by 2, etc..\n                                or\n                         multiq not blocked on any band due to 1 band blocked\n\nAs long as all 3 criteria are met I would be happy with any solution\nprovided.\n\nThanks,\n\nAlex","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Received":["from vger.kernel.org (vger.kernel.org [209.132.176.167])\n\tby ozlabs.org (Postfix) with ESMTP id 4002BDDF79\n\tfor <patchwork-incoming@ozlabs.org>;\n\tSat, 20 Sep 2008 04:01:59 +1000 (EST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1754705AbYISSBt (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tFri, 19 Sep 2008 14:01:49 -0400","(majordomo@vger.kernel.org) by vger.kernel.org id S1754768AbYISSBs\n\t(ORCPT <rfc822; netdev-outgoing>); Fri, 19 Sep 2008 14:01:48 -0400","from mga02.intel.com ([134.134.136.20]:38578 \"EHLO mga02.intel.com\"\n\trhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP\n\tid S1754705AbYISSBr convert rfc822-to-8bit (ORCPT\n\t<rfc822;netdev@vger.kernel.org>); Fri, 19 Sep 2008 14:01:47 -0400","from orsmga002.jf.intel.com ([10.7.209.21])\n\tby orsmga101.jf.intel.com with ESMTP; 19 Sep 2008 10:56:35 -0700","from unknown (HELO azsmsx001.amr.corp.intel.com) ([10.2.167.98])\n\tby orsmga002.jf.intel.com with ESMTP; 19 Sep 2008 11:01:38 -0700","from orsmsx601.amr.corp.intel.com (10.22.226.213) by\n\tazsmsx001.amr.corp.intel.com (10.2.167.98) with Microsoft SMTP Server\n\t(TLS) id 8.1.291.1; Fri, 19 Sep 2008 11:01:46 -0700","from orsmsx501.amr.corp.intel.com ([10.22.226.209]) by\n\torsmsx601.amr.corp.intel.com ([10.22.226.213]) with mapi;\n\tFri, 19 Sep 2008 11:01:12 -0700"],"X-ExtLoop1":"1","X-IronPort-AV":"E=Sophos;i=\"4.32,431,1217833200\"; d=\"scan'208\";a=\"338285043\"","From":"\"Duyck, Alexander H\" <alexander.h.duyck@intel.com>","To":"Jarek Poplawski <jarkao2@gmail.com>","CC":"\"netdev@vger.kernel.org\" <netdev@vger.kernel.org>,\n\t\"herbert@gondor.apana.org.au\" <herbert@gondor.apana.org.au>,\n\t\"davem@davemloft.net\" <davem@davemloft.net>,\n\t\"kaber@trash.net\" <kaber@trash.net>","Date":"Fri, 19 Sep 2008 11:01:12 -0700","Subject":"RE: [RFC PATCH] sched: only dequeue if packet can be queued to\n\thardware queue.","Thread-Topic":"[RFC PATCH] sched: only dequeue if packet can be queued to\n\thardware queue.","Thread-Index":"AckaffVolFEcQaUSQX2LMogwn8E0ZAAAfnQQ","Message-ID":"<80769D7B14936844A23C0C43D9FBCF0F1598E7E1@orsmsx501.amr.corp.intel.com>","References":"<20080918063036.27934.91273.stgit@localhost.localdomain>\n\t<20080918194419.GA2982@ami.dom.local>\n\t<20080919151632.GB2646@ami.dom.local>\n\t<80769D7B14936844A23C0C43D9FBCF0F1598E4A3@orsmsx501.amr.corp.intel.com>\n\t<20080919173523.GA2888@ami.dom.local>","In-Reply-To":"<20080919173523.GA2888@ami.dom.local>","Accept-Language":"en-US","Content-Language":"en-US","X-MS-Has-Attach":"","X-MS-TNEF-Correlator":"","acceptlanguage":"en-US","Content-Type":"text/plain; charset=\"us-ascii\"","Content-Transfer-Encoding":"8BIT","MIME-Version":"1.0","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1457,"web_url":"http://patchwork.ozlabs.org/comment/1457/","msgid":"<20080919185150.GB2888@ami.dom.local>","list_archive_url":null,"date":"2008-09-19T18:51:50","subject":"Re: [RFC PATCH] sched: only dequeue if packet can be queued to\n\thardware queue.","submitter":{"id":277,"url":"http://patchwork.ozlabs.org/api/people/277/","name":"Jarek Poplawski","email":"jarkao2@gmail.com"},"content":"On Fri, Sep 19, 2008 at 11:01:12AM -0700, Duyck, Alexander H wrote:\n> Jarek Poplawski wrote:\n> \n> >> Also changing dequeue_skb will likely cause additional issues for\n> >> several qdiscs as it doesn't report anything up to parent queues, as\n> >> a result you will end up with qdiscs like prio acting more like\n> >> multiq because they won't know if a queue is empty, stopped, or\n> >> throttled.\n> >> Also I believe this will cause a panic on pfifo_fast and several\n> >> other qdiscs because some check to see if there are packets in the\n> >> queue and if so dequeue with the assumption that they will get a\n> >> packet out.  You will need to add checks for this to resolve this\n> >> issue.\n> >\n> > I really can't get your point. Don't you mean skb_dequeue()?\n> > dequeue_skb() is used only by qdisc_restart()...\n> You're right.  I misread it as skb_dequeue.  The problem is though you\n> are still relying on q->requeue which last I knew was only being used\n> a few qdiscs.  In addition you will still be taking the cpu hit for the\n> dequeue/requeue on several qdiscs which can't use q->requeue without\n> violating the way they were supposed to work.\n\nI'm not sure I understand your point. No qdisc will use any new\nq->requeue. All qdisc do dequeue, but on tx fail an skb isn't requeued\nback, but queued by qdisc_restart() in a private list (of root qdisc),\nand then tried before any new dequeuing. There will be no cpu hit,\nbecause each next try is only skb_peek() on this list, until tx queue\nof an skb at the head is working.\n\n> >> The one thing I liked about my approach was that after I was done you\n> >> could have prio as a parent of multiq leaves or multiq as the parent\n> >> of prio leaves and all the hardware queues would receive the packets\n> >> in the same order as the result.\n> >\n> > I'm not against your approach, but I'd like to be sure these\n> > complications are really worth of it. Of course, if my proposal, the\n> > first take of 3 patches, doesn't work as you predict (and I doubt),\n> > then we can forget about it.\n> Well when you get some testing done let me know.  The main areas I am\n> concerned with are:\n> \n> 1.  CPU utilization stays the same regardless of which queue used.\n> 2.  Maintain current qdisc behavior on a per hw queue basis.\n> 3.  Avoid head-of-line blocking where it applies\n>         for example: prio band 0 not blocked by band 1, or 1 by 2, etc..\n>                                 or\n>                          multiq not blocked on any band due to 1 band blocked\n> \n> As long as all 3 criteria are met I would be happy with any solution\n> provided.\n\nAlas this testing is both the weekest (I'm not going to do this), and\nthe strongest side of this solution, because it's mostly David's work\n(my patch could be actually skipped without much impact). So, I hope,\nyou don't suggest this could be not enough tested or otherwise not the\nbest?!\n\nAnyway, I don't insist on \"my\" solution. I simply think it's reasonable,\nand it's not for private reasons, because I didn't even find this out.\nI think, if it's so wrong you should have no problem to show this in\none little test. And maybe David changed his mind in the meantime and he\nwill choose your way even without this.\n\nThanks,\nJarek P.","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Received":["from vger.kernel.org (vger.kernel.org [209.132.176.167])\n\tby ozlabs.org (Postfix) with ESMTP id 4F383DE3C3\n\tfor <patchwork-incoming@ozlabs.org>;\n\tSat, 20 Sep 2008 04:50:02 +1000 (EST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1753961AbYISSt5 (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tFri, 19 Sep 2008 14:49:57 -0400","(majordomo@vger.kernel.org) by vger.kernel.org id S1753814AbYISSt5\n\t(ORCPT <rfc822; netdev-outgoing>); Fri, 19 Sep 2008 14:49:57 -0400","from ey-out-2122.google.com ([74.125.78.24]:14786 \"EHLO\n\tey-out-2122.google.com\" rhost-flags-OK-OK-OK-OK) by vger.kernel.org\n\twith ESMTP id S1753510AbYISStz (ORCPT\n\t<rfc822;netdev@vger.kernel.org>); Fri, 19 Sep 2008 14:49:55 -0400","by ey-out-2122.google.com with SMTP id 6so186331eyi.37\n\tfor <netdev@vger.kernel.org>; Fri, 19 Sep 2008 11:49:53 -0700 (PDT)","by 10.103.201.5 with SMTP id d5mr426330muq.18.1221850193116;\n\tFri, 19 Sep 2008 11:49:53 -0700 (PDT)","from ami.dom.local ( [83.27.43.118])\n\tby mx.google.com with ESMTPS id u9sm7449292muf.9.2008.09.19.11.49.50\n\t(version=SSLv3 cipher=RC4-MD5);\n\tFri, 19 Sep 2008 11:49:52 -0700 (PDT)"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;\n\th=domainkey-signature:received:received:date:from:to:cc:subject\n\t:message-id:references:mime-version:content-type:content-disposition\n\t:in-reply-to:user-agent;\n\tbh=GaIzqTvFYwBigH+cRgRCM54YTTCqCsP82k0+KtZp0EI=;\n\tb=oFMiCjBVj3YhFBJfb7FmIabTHWYHkVDS459vxrFLO5dTalJQnlBfYp8dRH/wx/f7jV\n\tCg93Jsjmc6S4k+lNkER77H+60r/Mq3UeNfW3RiXaXRbuOhzSRqfQRqaBvdWjLdf3YA9u\n\txsSJ4xtQsgtMBigT7x7f6A4n8HNAJmRV6XhyU=","DomainKey-Signature":"a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;\n\th=date:from:to:cc:subject:message-id:references:mime-version\n\t:content-type:content-disposition:in-reply-to:user-agent;\n\tb=D1ERBpLQT+FKT6EeBaewvmHVHERQnmvaomCcoSSnOOClVv46EV3/YZ0297gnMKTyAt\n\t/G+tm9ArxZazcKGRLAPFdkg5IX3dA7R0iwxFMhP3SHqmS1qfY7qu9bpiK1rYyYNM/kYD\n\t4swJBBW57kteD+48Tok54qQO2Xn9alGlHQVdI=","Date":"Fri, 19 Sep 2008 20:51:50 +0200","From":"Jarek Poplawski <jarkao2@gmail.com>","To":"\"Duyck, Alexander H\" <alexander.h.duyck@intel.com>","Cc":"\"netdev@vger.kernel.org\" <netdev@vger.kernel.org>,\n\t\"herbert@gondor.apana.org.au\" <herbert@gondor.apana.org.au>,\n\t\"davem@davemloft.net\" <davem@davemloft.net>,\n\t\"kaber@trash.net\" <kaber@trash.net>","Subject":"Re: [RFC PATCH] sched: only dequeue if packet can be queued to\n\thardware queue.","Message-ID":"<20080919185150.GB2888@ami.dom.local>","References":"<20080918063036.27934.91273.stgit@localhost.localdomain>\n\t<20080918194419.GA2982@ami.dom.local>\n\t<20080919151632.GB2646@ami.dom.local>\n\t<80769D7B14936844A23C0C43D9FBCF0F1598E4A3@orsmsx501.amr.corp.intel.com>\n\t<20080919173523.GA2888@ami.dom.local>\n\t<80769D7B14936844A23C0C43D9FBCF0F1598E7E1@orsmsx501.amr.corp.intel.com>","MIME-Version":"1.0","Content-Type":"text/plain; charset=us-ascii","Content-Disposition":"inline","In-Reply-To":"<80769D7B14936844A23C0C43D9FBCF0F1598E7E1@orsmsx501.amr.corp.intel.com>","User-Agent":"Mutt/1.5.18 (2008-05-17)","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1494,"web_url":"http://patchwork.ozlabs.org/comment/1494/","msgid":"<80769D7B14936844A23C0C43D9FBCF0F1598EDAC@orsmsx501.amr.corp.intel.com>","list_archive_url":null,"date":"2008-09-19T21:43:27","subject":"RE: [RFC PATCH] sched: only dequeue if packet can be queued to\n\thardware queue.","submitter":{"id":251,"url":"http://patchwork.ozlabs.org/api/people/251/","name":"Duyck, Alexander H","email":"alexander.h.duyck@intel.com"},"content":"Jarek Poplawski wrote:\n> On Fri, Sep 19, 2008 at 11:01:12AM -0700, Duyck, Alexander H wrote:\n> I'm not sure I understand your point. No qdisc will use any new\n> q->requeue. All qdisc do dequeue, but on tx fail an skb isn't requeued\n> back, but queued by qdisc_restart() in a private list (of root qdisc),\n> and then tried before any new dequeuing. There will be no cpu hit,\n> because each next try is only skb_peek() on this list, until tx queue\n> of an skb at the head is working.\n\nIt was me missing a piece.  I didn't look over the portion where Dave\nchanged requeue to always use the requeue list.  This breaks multiq\nand we are right back to the head-of-line blocking.  Also I think patch\n1 will break things since queueing a packet with skb->next already set\nwill cause issues since the queue uses prev and next last I knew.\n\n> Alas this testing is both the weekest (I'm not going to do this), and\n> the strongest side of this solution, because it's mostly David's work\n> (my patch could be actually skipped without much impact). So, I hope,\n> you don't suggest this could be not enough tested or otherwise not the\n> best?!\n>\n> Anyway, I don't insist on \"my\" solution. I simply think it's\n> reasonable, and it's not for private reasons, because I didn't even\n> find this out. I think, if it's so wrong you should have no problem\n> to show this in one little test. And maybe David changed his mind in\n> the meantime and he will choose your way even without this.\n\nI didn't mean to insist on this if it is how it came across.  My main\nconcern is that I just don't want to break what I just got working.\nMost of this requeue to q->requeue stuff will break multiq and\nintroduce head-of-line blocking.  When That happens it will fall back\ninto my lap to try to resolve it so that is why I want to avoid it in\nthe first place.\n\nAnyway I'm out for the next two weeks with at least the first week of\nthat being without email access.  I hope I provided some valuable\ninput so that this can head in the right direction.\n\nThanks,\n\nAlex","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Received":["from vger.kernel.org (vger.kernel.org [209.132.176.167])\n\tby ozlabs.org (Postfix) with ESMTP id ED394DE0B4\n\tfor <patchwork-incoming@ozlabs.org>;\n\tSat, 20 Sep 2008 07:43:37 +1000 (EST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1751306AbYISVnd (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tFri, 19 Sep 2008 17:43:33 -0400","(majordomo@vger.kernel.org) by vger.kernel.org id S1751615AbYISVnd\n\t(ORCPT <rfc822; netdev-outgoing>); Fri, 19 Sep 2008 17:43:33 -0400","from mga02.intel.com ([134.134.136.20]:30103 \"EHLO mga02.intel.com\"\n\trhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP\n\tid S1751023AbYISVnd convert rfc822-to-8bit (ORCPT\n\t<rfc822;netdev@vger.kernel.org>); Fri, 19 Sep 2008 17:43:33 -0400","from orsmga001.jf.intel.com ([10.7.209.18])\n\tby orsmga101.jf.intel.com with ESMTP; 19 Sep 2008 14:38:20 -0700","from unknown (HELO azsmsx601.amr.corp.intel.com) ([10.2.121.193])\n\tby orsmga001.jf.intel.com with ESMTP; 19 Sep 2008 14:42:41 -0700","from orsmsx601.amr.corp.intel.com (10.22.226.213) by\n\tazsmsx601.amr.corp.intel.com (10.2.121.193) with Microsoft SMTP\n\tServer (TLS) id 8.1.291.1; Fri, 19 Sep 2008 14:43:31 -0700","from orsmsx501.amr.corp.intel.com ([10.22.226.209]) by\n\torsmsx601.amr.corp.intel.com ([10.22.226.213]) with mapi;\n\tFri, 19 Sep 2008 14:43:28 -0700"],"X-ExtLoop1":"1","X-IronPort-AV":"E=Sophos;i=\"4.32,433,1217833200\"; d=\"scan'208\";a=\"441401581\"","From":"\"Duyck, Alexander H\" <alexander.h.duyck@intel.com>","To":"Jarek Poplawski <jarkao2@gmail.com>","CC":"\"netdev@vger.kernel.org\" <netdev@vger.kernel.org>,\n\t\"herbert@gondor.apana.org.au\" <herbert@gondor.apana.org.au>,\n\t\"davem@davemloft.net\" <davem@davemloft.net>,\n\t\"kaber@trash.net\" <kaber@trash.net>","Date":"Fri, 19 Sep 2008 14:43:27 -0700","Subject":"RE: [RFC PATCH] sched: only dequeue if packet can be queued to\n\thardware queue.","Thread-Topic":"[RFC PATCH] sched: only dequeue if packet can be queued to\n\thardware queue.","Thread-Index":"AckaiILLzRxP4GkNSdeRsG6NINjbfAAD8o5w","Message-ID":"<80769D7B14936844A23C0C43D9FBCF0F1598EDAC@orsmsx501.amr.corp.intel.com>","References":"<20080918063036.27934.91273.stgit@localhost.localdomain>\n\t<20080918194419.GA2982@ami.dom.local>\n\t<20080919151632.GB2646@ami.dom.local>\n\t<80769D7B14936844A23C0C43D9FBCF0F1598E4A3@orsmsx501.amr.corp.intel.com>\n\t<20080919173523.GA2888@ami.dom.local>\n\t<80769D7B14936844A23C0C43D9FBCF0F1598E7E1@orsmsx501.amr.corp.intel.com>\n\t<20080919185150.GB2888@ami.dom.local>","In-Reply-To":"<20080919185150.GB2888@ami.dom.local>","Accept-Language":"en-US","Content-Language":"en-US","X-MS-Has-Attach":"","X-MS-TNEF-Correlator":"","acceptlanguage":"en-US","Content-Type":"text/plain; charset=\"us-ascii\"","Content-Transfer-Encoding":"8BIT","MIME-Version":"1.0","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}}]