[{"id":1760598,"web_url":"http://patchwork.ozlabs.org/comment/1760598/","msgid":"<20170830.184759.686464284014904531.davem@davemloft.net>","list_archive_url":null,"date":"2017-08-31T01:47:59","subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","submitter":{"id":15,"url":"http://patchwork.ozlabs.org/api/people/15/","name":"David Miller","email":"davem@davemloft.net"},"content":"From: Florian Fainelli <f.fainelli@gmail.com>\nDate: Wed, 30 Aug 2017 17:49:29 -0700\n\n> This reverts commit 7ad813f208533cebfcc32d3d7474dc1677d1b09a (\"net: phy:\n> Correctly process PHY_HALTED in phy_stop_machine()\") because it is\n> creating the possibility for a NULL pointer dereference.\n> \n> David Daney provide the following call trace and diagram of events:\n> \n> When ndo_stop() is called we call:\n> \n>  phy_disconnect()\n>     +---> phy_stop_interrupts() implies: phydev->irq = PHY_POLL;\n>     +---> phy_stop_machine()\n>     |      +---> phy_state_machine()\n>     |              +----> queue_delayed_work(): Work queued.\n>     +--->phy_detach() implies: phydev->attached_dev = NULL;\n> \n> Now at a later time the queued work does:\n> \n>  phy_state_machine()\n>     +---->netif_carrier_off(phydev->attached_dev): Oh no! 
It is NULL:\n> \n>  CPU 12 Unable to handle kernel paging request at virtual address\n> 0000000000000048, epc == ffffffff80de37ec, ra == ffffffff80c7c\n> Oops[#1]:\n> CPU: 12 PID: 1502 Comm: kworker/12:1 Not tainted 4.9.43-Cavium-Octeon+ #1\n> Workqueue: events_power_efficient phy_state_machine\n> task: 80000004021ed100 task.stack: 8000000409d70000\n> $ 0   : 0000000000000000 ffffffff84720060 0000000000000048 0000000000000004\n> $ 4   : 0000000000000000 0000000000000001 0000000000000004 0000000000000000\n> $ 8   : 0000000000000000 0000000000000000 00000000ffff98f3 0000000000000000\n> $12   : 8000000409d73fe0 0000000000009c00 ffffffff846547c8 000000000000af3b\n> $16   : 80000004096bab68 80000004096babd0 0000000000000000 80000004096ba800\n> $20   : 0000000000000000 0000000000000000 ffffffff81090000 0000000000000008\n> $24   : 0000000000000061 ffffffff808637b0\n> $28   : 8000000409d70000 8000000409d73cf0 80000000271bd300 ffffffff80c7804c\n> Hi    : 000000000000002a\n> Lo    : 000000000000003f\n> epc   : ffffffff80de37ec netif_carrier_off+0xc/0x58\n> ra    : ffffffff80c7804c phy_state_machine+0x48c/0x4f8\n> Status: 14009ce3        KX SX UX KERNEL EXL IE\n> Cause : 00800008 (ExcCode 02)\n> BadVA : 0000000000000048\n> PrId  : 000d9501 (Cavium Octeon III)\n> Modules linked in:\n> Process kworker/12:1 (pid: 1502, threadinfo=8000000409d70000,\n> task=80000004021ed100, tls=0000000000000000)\n> Stack : 8000000409a54000 80000004096bab68 80000000271bd300 80000000271c1e00\n>         0000000000000000 ffffffff808a1708 8000000409a54000 80000000271bd300\n>         80000000271bd320 8000000409a54030 ffffffff80ff0f00 0000000000000001\n>         ffffffff81090000 ffffffff808a1ac0 8000000402182080 ffffffff84650000\n>         8000000402182080 ffffffff84650000 ffffffff80ff0000 8000000409a54000\n>         ffffffff808a1970 0000000000000000 80000004099e8000 8000000402099240\n>         0000000000000000 ffffffff808a8598 0000000000000000 8000000408eeeb00\n>         8000000409a54000 
00000000810a1d00 0000000000000000 8000000409d73de8\n>         8000000409d73de8 0000000000000088 000000000c009c00 8000000409d73e08\n>         8000000409d73e08 8000000402182080 ffffffff808a84d0 8000000402182080\n>         ...\n> Call Trace:\n> [<ffffffff80de37ec>] netif_carrier_off+0xc/0x58\n> [<ffffffff80c7804c>] phy_state_machine+0x48c/0x4f8\n> [<ffffffff808a1708>] process_one_work+0x158/0x368\n> [<ffffffff808a1ac0>] worker_thread+0x150/0x4c0\n> [<ffffffff808a8598>] kthread+0xc8/0xe0\n> [<ffffffff808617f0>] ret_from_kernel_thread+0x14/0x1c\n> \n> The original motivation for this change originated from Marc Gonzales\n> indicating that his network driver did not have its adjust_link callback\n> executing with phydev->link = 0 while he was expecting it.\n> \n> PHYLIB has never made any such guarantees ever because phy_stop() merely just\n> tells the workqueue to move into PHY_HALTED state which will happen\n> asynchronously.\n> \n> Reported-by: Geert Uytterhoeven <geert+renesas@glider.be>\n> Reported-by: David Daney <ddaney.cavm@gmail.com>\n> Fixes: 7ad813f20853 (\"net: phy: Correctly process PHY_HALTED in phy_stop_machine()\")\n> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>\n\nApplied and queued up for -stable, thanks Florian.","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":"ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xjQHm5yl1z9s7M\n\tfor <patchwork-incoming@ozlabs.org>;\n\tThu, 31 Aug 2017 11:48:04 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1751016AbdHaBsC (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tWed, 30 Aug 2017 21:48:02 
-0400","from shards.monkeyblade.net ([184.105.139.130]:39054 \"EHLO\n\tshards.monkeyblade.net\" rhost-flags-OK-OK-OK-OK) by vger.kernel.org\n\twith ESMTP id S1750814AbdHaBsC (ORCPT\n\t<rfc822;netdev@vger.kernel.org>); Wed, 30 Aug 2017 21:48:02 -0400","from localhost (74-93-104-98-Washington.hfc.comcastbusiness.net\n\t[74.93.104.98]) (using TLSv1 with cipher AES256-SHA (256/256 bits))\n\t(Client did not present a certificate)\n\t(Authenticated sender: davem-davemloft)\n\tby shards.monkeyblade.net (Postfix) with ESMTPSA id 69381133FDE6A;\n\tWed, 30 Aug 2017 18:48:01 -0700 (PDT)"],"Date":"Wed, 30 Aug 2017 18:47:59 -0700 (PDT)","Message-Id":"<20170830.184759.686464284014904531.davem@davemloft.net>","To":"f.fainelli@gmail.com","Cc":"netdev@vger.kernel.org, geert+renesas@glider.be,\n\tddaney.cavm@gmail.com, slash.tmp@free.fr,\n\tmarc_gonzales@sigmadesigns.com, andrew@lunn.ch","Subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","From":"David Miller <davem@davemloft.net>","In-Reply-To":"<1504140569-2063-1-git-send-email-f.fainelli@gmail.com>","References":"<1504140569-2063-1-git-send-email-f.fainelli@gmail.com>","X-Mailer":"Mew version 6.7 on Emacs 25.2 / Mule 6.0 (HANACHIRUSATO)","Mime-Version":"1.0","Content-Type":"Text/Plain; charset=us-ascii","Content-Transfer-Encoding":"7bit","X-Greylist":"Sender succeeded SMTP AUTH, not delayed by\n\tmilter-greylist-4.5.12 (shards.monkeyblade.net\n\t[149.20.54.216]); Wed, 30 Aug 2017 18:48:01 -0700 (PDT)","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1760915,"web_url":"http://patchwork.ozlabs.org/comment/1760915/","msgid":"<f4bb5ac8-dae8-c0af-7aa6-e546fc0783fa@sigmadesigns.com>","list_archive_url":null,"date":"2017-08-31T12:29:36","subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED 
in\n\tphy_stop_machine()\"","submitter":{"id":67482,"url":"http://patchwork.ozlabs.org/api/people/67482/","name":"Marc Gonzalez","email":"marc_gonzalez@sigmadesigns.com"},"content":"On 31/08/2017 02:49, Florian Fainelli wrote:\n\n> This reverts commit 7ad813f208533cebfcc32d3d7474dc1677d1b09a (\"net: phy:\n> Correctly process PHY_HALTED in phy_stop_machine()\") because it is\n> creating the possibility for a NULL pointer dereference.\n> \n> David Daney provide the following call trace and diagram of events:\n> \n> When ndo_stop() is called we call:\n> \n>  phy_disconnect()\n>     +---> phy_stop_interrupts() implies: phydev->irq = PHY_POLL;\n\nWhat does this mean?\n\nOn the contrary, phy_stop_interrupts() is only called when *not* polling.\n\n\tif (phydev->irq > 0)\n\t\tphy_stop_interrupts(phydev);\n\n>     +---> phy_stop_machine()\n>     |      +---> phy_state_machine()\n>     |              +----> queue_delayed_work(): Work queued.\n\nYou're referring to the fact that, at the end of phy_state_machine()\n(in polling mode) the code reschedules itself through:\n\n\tif (phydev->irq == PHY_POLL)\n\t\tqueue_delayed_work(system_power_efficient_wq, &phydev->state_queue, PHY_STATE_TIME * HZ);\n\n>     +--->phy_detach() implies: phydev->attached_dev = NULL;\n> \n> Now at a later time the queued work does:\n> \n>  phy_state_machine()\n>     +---->netif_carrier_off(phydev->attached_dev): Oh no! It is NULL:\n\nI tested a sequence of 500 link up / link down in polling mode,\nand saw no such issue. Race condition?\n\nFor what case in phy_state_machine() is netif_carrier_off()\nbeing called? 
Surely not PHY_HALTED?\n\n\n> The original motivation for this change originated from Marc Gonzales\n> indicating that his network driver did not have its adjust_link callback\n> executing with phydev->link = 0 while he was expecting it.\n\nI expect the core to call phy_adjust_link() for link changes.\nThis used to work back in 3.4 and was broken somewhere along\nthe way.\n\n> PHYLIB has never made any such guarantees ever because phy_stop() merely\n> just tells the workqueue to move into PHY_HALTED state which will happen\n> asynchronously.\n\nMy original proposal was to fix the issue in the driver.\nI'll try locating it in my archives.\n\nRegards.","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":"ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xjhXC0yfVz9sNr\n\tfor <patchwork-incoming@ozlabs.org>;\n\tThu, 31 Aug 2017 22:29:47 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1751916AbdHaM3n convert rfc822-to-8bit (ORCPT\n\t<rfc822;patchwork-incoming@ozlabs.org>);\n\tThu, 31 Aug 2017 08:29:43 -0400","from us-smtp-delivery-107.mimecast.com ([216.205.24.107]:46345\n\t\"EHLO us-smtp-delivery-107.mimecast.com\" rhost-flags-OK-OK-OK-OK)\n\tby vger.kernel.org with ESMTP id S1751715AbdHaM3m (ORCPT\n\t<rfc822;netdev@vger.kernel.org>); Thu, 31 Aug 2017 08:29:42 -0400","from CPH-EX1.SDESIGNS.COM\n\t(195-215-56-170-static.dk.customer.tdc.net [195.215.56.170]) (Using\n\tTLS) by us-smtp-1.mimecast.com with ESMTP id\n\tus-mta-181-8YuoZhLtORa52040MLjjkA-1; \n\tThu, 31 Aug 2017 08:29:40 -0400","from [172.27.0.114] (172.27.0.114) by CPH-EX1.sdesigns.com\n\t(192.168.10.36) 
with Microsoft SMTP Server (TLS) id 14.3.294.0;\n\tThu, 31 Aug 2017 14:29:37 +0200"],"Subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","To":"Florian Fainelli <f.fainelli@gmail.com>,\n\tDavid Daney <ddaney.cavm@gmail.com>","CC":"netdev <netdev@vger.kernel.org>,\n\tGeert Uytterhoeven <geert+renesas@glider.be>,\n\tDavid Miller <davem@davemloft.net>,\n\tAndrew Lunn <andrew@lunn.ch>, Mans Rullgard <mans@mansr.com>,\n\tMason <slash.tmp@free.fr>","References":"<1504140569-2063-1-git-send-email-f.fainelli@gmail.com>","From":"Marc Gonzalez <marc_gonzalez@sigmadesigns.com>","Message-ID":"<f4bb5ac8-dae8-c0af-7aa6-e546fc0783fa@sigmadesigns.com>","Date":"Thu, 31 Aug 2017 14:29:36 +0200","User-Agent":"Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101\n\tFirefox/52.0 SeaMonkey/2.49.1","MIME-Version":"1.0","In-Reply-To":"<1504140569-2063-1-git-send-email-f.fainelli@gmail.com>","X-Originating-IP":"[172.27.0.114]","X-MC-Unique":"8YuoZhLtORa52040MLjjkA-1","Content-Type":"text/plain; charset=UTF-8","Content-Transfer-Encoding":"8BIT","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1761049,"web_url":"http://patchwork.ozlabs.org/comment/1761049/","msgid":"<986d76a0-0972-c99d-b87b-ca6924ec03c0@sigmadesigns.com>","list_archive_url":null,"date":"2017-08-31T14:21:42","subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","submitter":{"id":67482,"url":"http://patchwork.ozlabs.org/api/people/67482/","name":"Marc Gonzalez","email":"marc_gonzalez@sigmadesigns.com"},"content":"On 31/08/2017 14:29, Marc Gonzalez wrote:\n\n> On 31/08/2017 02:49, Florian Fainelli wrote:\n> \n>> The original motivation for this change originated from Marc Gonzalez\n>> indicating that his network driver did not have its adjust_link callback\n>> executing with phydev->link = 0 while he was expecting it.\n> \n> I 
expect the core to call phy_adjust_link() for link changes.\n> This used to work back in 3.4 and was broken somewhere along\n> the way.\n> \n>> PHYLIB has never made any such guarantees ever because phy_stop() merely\n>> just tells the workqueue to move into PHY_HALTED state which will happen\n>> asynchronously.\n> \n> My original proposal was to fix the issue in the driver.\n> I'll try locating it in my archives.\n\nThe original proposal was:\n(I.e. basically a copy of phy_state_machine()'s PHY_HALTED case)\nIs this what I need to submit, now that the synchronous call to\nphy_state_machine() has been removed?\n\ndiff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c\nindex 607064a6d7a1..8b9a981c55c1 100644\n--- a/drivers/net/ethernet/aurora/nb8800.c\n+++ b/drivers/net/ethernet/aurora/nb8800.c\n@@ -1017,6 +1017,10 @@ static int nb8800_stop(struct net_device *dev)\n \n        phy_disconnect(phydev);\n \n+       phydev->link = 0;\n+       netif_carrier_off(dev);\n+       nb8800_link_reconfigure(dev);\n+\n        free_irq(dev->irq, dev);\n \n        nb8800_dma_free(dev);","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":"ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xjl1V72dyz9s83\n\tfor <patchwork-incoming@ozlabs.org>;\n\tFri,  1 Sep 2017 00:21:50 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1751511AbdHaOVt convert rfc822-to-8bit (ORCPT\n\t<rfc822;patchwork-incoming@ozlabs.org>);\n\tThu, 31 Aug 2017 10:21:49 -0400","from us-smtp-delivery-107.mimecast.com ([63.128.21.107]:59636 
\"EHLO\n\tus-smtp-delivery-107.mimecast.com\" rhost-flags-OK-OK-OK-OK)\n\tby vger.kernel.org with ESMTP id S1751205AbdHaOVs (ORCPT\n\t<rfc822;netdev@vger.kernel.org>); Thu, 31 Aug 2017 10:21:48 -0400","from CPH-EX1.SDESIGNS.COM\n\t(195-215-56-170-static.dk.customer.tdc.net [195.215.56.170]) (Using\n\tTLS) by us-smtp-1.mimecast.com with ESMTP id\n\tus-mta-150-e6XnXa5ONEa_wT1fDSspfw-1; \n\tThu, 31 Aug 2017 10:21:46 -0400","from [172.27.0.114] (172.27.0.114) by CPH-EX1.sdesigns.com\n\t(192.168.10.36) with Microsoft SMTP Server (TLS) id 14.3.294.0;\n\tThu, 31 Aug 2017 16:21:43 +0200"],"Subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","From":"Marc Gonzalez <marc_gonzalez@sigmadesigns.com>","To":"Florian Fainelli <f.fainelli@gmail.com>, Mans Rullgard <mans@mansr.com>","CC":"David Daney <ddaney.cavm@gmail.com>, netdev <netdev@vger.kernel.org>,\n\tGeert Uytterhoeven <geert+renesas@glider.be>,\n\tDavid Miller <davem@davemloft.net>,\n\tAndrew Lunn <andrew@lunn.ch>, Mason <slash.tmp@free.fr>","References":"<1504140569-2063-1-git-send-email-f.fainelli@gmail.com>\n\t<f4bb5ac8-dae8-c0af-7aa6-e546fc0783fa@sigmadesigns.com>","Message-ID":"<986d76a0-0972-c99d-b87b-ca6924ec03c0@sigmadesigns.com>","Date":"Thu, 31 Aug 2017 16:21:42 +0200","User-Agent":"Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101\n\tFirefox/52.0 SeaMonkey/2.49.1","MIME-Version":"1.0","In-Reply-To":"<f4bb5ac8-dae8-c0af-7aa6-e546fc0783fa@sigmadesigns.com>","X-Originating-IP":"[172.27.0.114]","X-MC-Unique":"e6XnXa5ONEa_wT1fDSspfw-1","Content-Type":"text/plain; charset=UTF-8","Content-Transfer-Encoding":"8BIT","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1761177,"web_url":"http://patchwork.ozlabs.org/comment/1761177/","msgid":"<e24693e8-d8ae-188a-2a38-c9a83fdc94e3@gmail.com>","list_archive_url":null,"date":"2017-08-31T16:36:19","subject":"Re: [PATCH net] 
Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","submitter":{"id":10133,"url":"http://patchwork.ozlabs.org/api/people/10133/","name":"David Daney","email":"ddaney.cavm@gmail.com"},"content":"On 08/31/2017 05:29 AM, Marc Gonzalez wrote:\n> On 31/08/2017 02:49, Florian Fainelli wrote:\n> \n>> This reverts commit 7ad813f208533cebfcc32d3d7474dc1677d1b09a (\"net: phy:\n>> Correctly process PHY_HALTED in phy_stop_machine()\") because it is\n>> creating the possibility for a NULL pointer dereference.\n>>\n>> David Daney provide the following call trace and diagram of events:\n>>\n>> When ndo_stop() is called we call:\n>>\n>>   phy_disconnect()\n>>      +---> phy_stop_interrupts() implies: phydev->irq = PHY_POLL;\n> \n> What does this mean?\n\nI meant that after the call to phy_stop_interrupts(), phydev->irq = \nPHY_POLL;\n\n\n> \n> On the contrary, phy_stop_interrupts() is only called when *not* polling.\n\nThat is the case I have.  We are using interrupts from the phy.\n\n\n> \n> \tif (phydev->irq > 0)\n> \t\tphy_stop_interrupts(phydev);\n> \n>>      +---> phy_stop_machine()\n>>      |      +---> phy_state_machine()\n>>      |              +----> queue_delayed_work(): Work queued.\n> \n> You're referring to the fact that, at the end of phy_state_machine()\n> (in polling mode) the code reschedules itself through:\n> \n> \tif (phydev->irq == PHY_POLL)\n> \t\tqueue_delayed_work(system_power_efficient_wq, &phydev->state_queue, PHY_STATE_TIME * HZ);\n\nExactly.  The call to phy_disconnect() ensures that there are no more \ninterrupts and also that phydev->irq = PHY_POLL\n\nThe call to cancel_delayed_work_sync() at the top of phy_stop_machine() \nwas meant to ensure that phy_state_machine() was never run again.  
No \ninterrupts + no queued work means that it should be safe to do...\n\n> \n>>      +--->phy_detach() implies: phydev->attached_dev = NULL;\n\nThe problem is that by calling phy_state_machine() again (which the \noffending patch added) we now have work scheduled that will try to \ndereference the pointer that was set to NULL as a result of the phy_detach()\n\n\n>>\n>> Now at a later time the queued work does:\n>>\n>>   phy_state_machine()\n>>      +---->netif_carrier_off(phydev->attached_dev): Oh no! It is NULL:\n> \n> I tested a sequence of 500 link up / link down in polling mode,\n> and saw no such issue. Race condition?\n> \n\nYou were lucky.\n\n> For what case in phy_state_machine() is netif_carrier_off()\n> being called? Surely not PHY_HALTED?\n> \n\nThe phy can be in a variety of states.  It is connected to something \noutside of the system that we don't control, so you cannot assume any \nparticular state.  We must have code that doesn't crash the system no \nmatter what state the phy is in.\n\nI suspect, but have not checked, that the phy is in PHY_RUNNING.  I \nthink that means that because this patch turned the state machine back \non, it will start transitioning through PHY_UP, PHY_AN, ... 
and \neventually get to the crash we see because phydev->attached_dev = NULL\n\n\n> \n>> The original motivation for this change originated from Marc Gonzales\n>> indicating that his network driver did not have its adjust_link callback\n>> executing with phydev->link = 0 while he was expecting it.\n> \n> I expect the core to call phy_adjust_link() for link changes.\n> This used to work back in 3.4 and was broken somewhere along\n> the way.\n> \n>> PHYLIB has never made any such guarantees ever because phy_stop() merely\n>> just tells the workqueue to move into PHY_HALTED state which will happen\n>> asynchronously.\n> \n> My original proposal was to fix the issue in the driver.\n> I'll try locating it in my archives.\n> \n> Regards.\n>","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":["ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (2048-bit key;\n\tunprotected) header.d=gmail.com header.i=@gmail.com\n\theader.b=\"EUVR7eVr\"; dkim-atps=neutral"],"Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xjp0p14t0z9s81\n\tfor <patchwork-incoming@ozlabs.org>;\n\tFri,  1 Sep 2017 02:36:26 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1751871AbdHaQgY (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tThu, 31 Aug 2017 12:36:24 -0400","from mail-yw0-f195.google.com ([209.85.161.195]:33466 \"EHLO\n\tmail-yw0-f195.google.com\" rhost-flags-OK-OK-OK-OK) by vger.kernel.org\n\twith ESMTP id S1751667AbdHaQgW (ORCPT\n\t<rfc822;netdev@vger.kernel.org>); Thu, 31 Aug 2017 12:36:22 -0400","by mail-yw0-f195.google.com with SMTP id w138so115189yww.0\n\tfor <netdev@vger.kernel.org>; Thu, 31 Aug 2017 09:36:22 
-0700 (PDT)","from ddl.caveonetworks.com\n\t(50-233-148-156-static.hfc.comcastbusiness.net. [50.233.148.156])\n\tby smtp.googlemail.com with ESMTPSA id\n\td13sm48104ywd.81.2017.08.31.09.36.20\n\t(version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);\n\tThu, 31 Aug 2017 09:36:21 -0700 (PDT)"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=gmail.com; s=20161025;\n\th=subject:to:cc:references:from:message-id:date:user-agent\n\t:mime-version:in-reply-to:content-language:content-transfer-encoding; \n\tbh=uqIM+i2dMbu7dTmniYneMfQodfCfz7V9uvUmSHsHQdQ=;\n\tb=EUVR7eVrEDhvc54gIbQ+cZ7jjP9ZoXpV/gv5sW5t41gOlTzzThz29CT2HHMzNFL0ns\n\tGZBSYJTKLXt6vjHJqrlZ9ZbaGc3qXDfymuN+DrSQCZTwXnN+Gc3zuLEbOHzlq8Vt1UCY\n\tBmyHsFiWkiht92wCG0nNtE2B2g+1kO99rJsgeus2nPsGm3QrBUxy3uBnMARsxRmriH4Y\n\t9V3L6afz03Ha5g0QpNeL9lWBBRLGcuzgW4aflE487d9F1QWpgY5aS/v/uKLPYpaCYs7/\n\tvxliT206I85DJvk8hJ+tDLmKEPRnqD3z8OQ2Vv1DNuoT4im10rv4ePtDvaTValNbw3co\n\taZRw==","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=1e100.net; s=20161025;\n\th=x-gm-message-state:subject:to:cc:references:from:message-id:date\n\t:user-agent:mime-version:in-reply-to:content-language\n\t:content-transfer-encoding;\n\tbh=uqIM+i2dMbu7dTmniYneMfQodfCfz7V9uvUmSHsHQdQ=;\n\tb=mhPo8iK7VsWQnTjBkwzuyH2VFdRJkYXCjJJ31TGf5VS1tDyydWT3zLk1ZyKg6E22RI\n\tv4FDObU1myhSHjirNCfntVZ/7xWHUdHbXwP2Fmy7Rji280BnT+7rw3Dh2iDUllIh6b1O\n\t/Y1vy9QQKvVibfupLUy14srqseS9y41vkcoX9mQWz51XK/iDQ7C2lNkEQ6v7GtI87Pp7\n\tX8D5c9CPsjHxFXZMfo1fyE4SCnDb/kVLnKckXeCWGxiJPM1aXTuFddgJKJPrMQtvb6f0\n\tx+hkYls0O6bWU1FMN+5s2S8YH9YZG5RAPTedsKkWpntcuSQFi6wA2VD+jEi2qH3Pu1Gw\n\tB8Kg==","X-Gm-Message-State":"AHYfb5g689vzc5vpdXypfvIS91aJsW6IeUsm7o/bHwJF7A45GWSwwCYT\n\tvx0vwWiFx7Aq4A==","X-Google-Smtp-Source":"ADKCNb6C67WuTRlR40dsMPvObdge//XGDLYlya+yeM8eP5fufPPha3VYHhs9EzAYAPCHRD70drZg8A==","X-Received":"by 10.129.55.7 with SMTP id e7mr5479530ywa.184.1504197382159;\n\tThu, 31 Aug 2017 09:36:22 -0700 (PDT)","Subject":"Re: [PATCH net] Revert \"net: phy: 
Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","To":"Marc Gonzalez <marc_gonzalez@sigmadesigns.com>,\n\tFlorian Fainelli <f.fainelli@gmail.com>","Cc":"netdev <netdev@vger.kernel.org>,\n\tGeert Uytterhoeven <geert+renesas@glider.be>,\n\tDavid Miller <davem@davemloft.net>,\n\tAndrew Lunn <andrew@lunn.ch>, Mans Rullgard <mans@mansr.com>,\n\tMason <slash.tmp@free.fr>","References":"<1504140569-2063-1-git-send-email-f.fainelli@gmail.com>\n\t<f4bb5ac8-dae8-c0af-7aa6-e546fc0783fa@sigmadesigns.com>","From":"David Daney <ddaney.cavm@gmail.com>","Message-ID":"<e24693e8-d8ae-188a-2a38-c9a83fdc94e3@gmail.com>","Date":"Thu, 31 Aug 2017 09:36:19 -0700","User-Agent":"Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101\n\tThunderbird/52.2.1","MIME-Version":"1.0","In-Reply-To":"<f4bb5ac8-dae8-c0af-7aa6-e546fc0783fa@sigmadesigns.com>","Content-Type":"text/plain; charset=utf-8; format=flowed","Content-Language":"en-US","Content-Transfer-Encoding":"7bit","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1761189,"web_url":"http://patchwork.ozlabs.org/comment/1761189/","msgid":"<931bf454-81ff-94dc-82e6-bc2b889bd43a@gmail.com>","list_archive_url":null,"date":"2017-08-31T16:57:58","subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","submitter":{"id":2800,"url":"http://patchwork.ozlabs.org/api/people/2800/","name":"Florian Fainelli","email":"f.fainelli@gmail.com"},"content":"On 08/31/2017 09:36 AM, David Daney wrote:\n> On 08/31/2017 05:29 AM, Marc Gonzalez wrote:\n>> On 31/08/2017 02:49, Florian Fainelli wrote:\n>>\n>>> This reverts commit 7ad813f208533cebfcc32d3d7474dc1677d1b09a (\"net: phy:\n>>> Correctly process PHY_HALTED in phy_stop_machine()\") because it is\n>>> creating the possibility for a NULL pointer dereference.\n>>>\n>>> David Daney provide the following call trace and diagram of events:\n>>>\n>>> When ndo_stop() is 
called we call:\n>>>\n>>>   phy_disconnect()\n>>>      +---> phy_stop_interrupts() implies: phydev->irq = PHY_POLL;\n>>\n>> What does this mean?\n> \n> I meant that after the call to phy_stop_interrupts(), phydev->irq =\n> PHY_POLL;\n> \n> \n>>\n>> On the contrary, phy_stop_interrupts() is only called when *not* polling.\n> \n> That is the case I have.  We are using interrupts from the phy.\n> \n> \n>>\n>>     if (phydev->irq > 0)\n>>         phy_stop_interrupts(phydev);\n>>\n>>>      +---> phy_stop_machine()\n>>>      |      +---> phy_state_machine()\n>>>      |              +----> queue_delayed_work(): Work queued.\n>>\n>> You're referring to the fact that, at the end of phy_state_machine()\n>> (in polling mode) the code reschedules itself through:\n>>\n>>     if (phydev->irq == PHY_POLL)\n>>         queue_delayed_work(system_power_efficient_wq,\n>> &phydev->state_queue, PHY_STATE_TIME * HZ);\n> \n> Exactly.  The call to phy_disconnect() ensures that there are no more\n> interrupts and also that phydev->irq = PHY_POLL\n> \n> The call to cancel_delayed_work_sync() at the top of phy_stop_machine()\n> was meant to ensure that phy_state_machine() was never run again.  No\n> interrupts + no queued work means that it should be save to do...\n> \n>>\n>>>      +--->phy_detach() implies: phydev->attached_dev = NULL;\n> \n> The problem is that by calling phy_state_machine() again (which the\n> offending patch added) we now have work scheduled that will try to\n> dereference the pointer that was set to NULL as a result of the\n> phy_detach()\n\nAnd the race is between phy_detach() setting phydev->attached_dev = NULL\nand phy_state_machine() running in PHY_HALTED state and calling\nnetif_carrier_off().\n\n> \n> \n>>>\n>>> Now at a later time the queued work does:\n>>>\n>>>   phy_state_machine()\n>>>      +---->netif_carrier_off(phydev->attached_dev): Oh no! It is NULL:\n>>\n>> I tested a sequence of 500 link up / link down in polling mode,\n>> and saw no such issue. 
Race condition?\n>>\n> \n> You were lucky.\n\nI too tested this a number of times on a 2 core and 4 core system, but\nthe race is there, both of us just were lucky enough we did not see any\ncrash. I suspect the race is easier to reproduce on a (at least 12 core)\nsystem with possibly a higher clock speed.\n\n> \n>> For what case in phy_state_machine() is netif_carrier_off()\n>> being called? Surely not PHY_HALTED?\n>>\n> \n> The phy can be in a variety of states.  It is connected to something\n> outside of the system that we don't control, so you cannot assume any\n> particular state.  We must have code that doesn't crash the system no\n> matter what state the phy is in.\n> \n> I suspect, but have not checked, that the phy is in PHY_RUNNING.  I\n> think that means that because this patch turned the state machine back\n> on, it will start transitioning through PHY_UP, PHY_AN, ... and\n> eventually get to the crash we see because phydev->attached_dev = NULL\n\nI actually think the PHY remains in PHY_HALTED but just re-schedules\nitself and keeps being in PHY_HALTED again until a call to phy_resume or\nphy_start() moves it back to another state. 
This is largely inefficient,\nand we should look into using the patch I posted yesterday which would\nprevent a re-schedule when moved to PHY_HALTED:\n\ndiff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c\nindex d0626bf5c540..78168e19bd5d 100644\n--- a/drivers/net/phy/phy.c\n+++ b/drivers/net/phy/phy.c\n@@ -1234,7 +1234,7 @@ void phy_state_machine(struct work_struct *work)\n         * PHY, if PHY_IGNORE_INTERRUPT is set, then we will be moving\n         * between states from phy_mac_interrupt()\n         */\n-       if (phydev->irq == PHY_POLL)\n+       if (phydev->irq == PHY_POLL && phydev->state != PHY_HALTED)\n                queue_delayed_work(system_power_efficient_wq,\n&phydev->state_queue,\n                                   PHY_STATE_TIME * HZ);\n }","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":["ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (2048-bit key;\n\tunprotected) header.d=gmail.com header.i=@gmail.com\n\theader.b=\"cpWmf8mZ\"; dkim-atps=neutral"],"Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xjpTv69Rgz9sD5\n\tfor <patchwork-incoming@ozlabs.org>;\n\tFri,  1 Sep 2017 02:58:11 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1751927AbdHaQ6J (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tThu, 31 Aug 2017 12:58:09 -0400","from mail-wm0-f66.google.com ([74.125.82.66]:37217 \"EHLO\n\tmail-wm0-f66.google.com\" rhost-flags-OK-OK-OK-OK) by vger.kernel.org\n\twith ESMTP id S1751834AbdHaQ6I (ORCPT\n\t<rfc822;netdev@vger.kernel.org>); Thu, 31 Aug 2017 12:58:08 -0400","by mail-wm0-f66.google.com with SMTP id x189so191178wmg.4\n\tfor <netdev@vger.kernel.org>; 
Thu, 31 Aug 2017 09:58:07 -0700 (PDT)","from [10.112.156.244] ([192.19.255.250])\n\tby smtp.googlemail.com with ESMTPSA id\n\te74sm525429wmg.39.2017.08.31.09.58.03\n\t(version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);\n\tThu, 31 Aug 2017 09:58:05 -0700 (PDT)"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=gmail.com; s=20161025;\n\th=subject:to:cc:references:from:message-id:date:user-agent\n\t:mime-version:in-reply-to:content-language:content-transfer-encoding; \n\tbh=SxgBhNGFn2sO/4QLTt6QLw+a6odU4bSTzfJumFohpv4=;\n\tb=cpWmf8mZrqE5Ub477kqmSIDiX4Ow3yP6PZMEyWmUYSIDWZIjDew6SkBHUcG8M14L+k\n\tBBgZuuLtCqh18VCUdJ7dKjNEw0hI93Z/76ANJupB0Qx6L2XCe3MM4CjHNwSJB8Ub+Mau\n\tz1SUSttps6GJadzYPUS+yozhVzcNW0zUbYQ044sDj0KU0cwDw6sQHibKEnkrWK7LxCNW\n\t5Rq/EtA2+YW/DgCd5GIFtczSHO+/qA1NZJmDE8A9CVDfkdE1lnARPg1D2Gk6dgYHw8Vb\n\tiS+ss23pBFGC9Cv1MrxX5d8zQ+y7d2M+16iAA3Vltg+G+DA20bs95G1LLw0zh3OQN1Iq\n\tS0gw==","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=1e100.net; s=20161025;\n\th=x-gm-message-state:subject:to:cc:references:from:message-id:date\n\t:user-agent:mime-version:in-reply-to:content-language\n\t:content-transfer-encoding;\n\tbh=SxgBhNGFn2sO/4QLTt6QLw+a6odU4bSTzfJumFohpv4=;\n\tb=RJr9BEQK3tnv1I6RNCC8ZTmyQUdWskMb3p7rz6Xmd71BxqJUB3XH0n74etUFVK/iMu\n\tH5ykukhjuIRY2xm0e/91erw9SXgcQazwPrRQex1T2w9k6G0jNDiWliiX2PW3lpSW9MHo\n\t8/0d+owjkazZnMGwLWDQplfTKmXxFsJBDPPdEswlLNuJeyp/ZqPvkHYyUPSXK7BB2qrh\n\tMlwI8rnRmamfA8DI2brzWqKEr7GdqCffXTRQni4F3WpCkCDOP9OUJuCnqNtUBd29rwZc\n\txD3miAgltIaot48Dyp0RFM3r/GCqiWShIDMgVWaJZ2DRVE1GrPIktqlemjwsSxjWJdm4\n\tUQ9Q==","X-Gm-Message-State":"AHYfb5g8dx4YRY+YqEf//JxOslpucVyyUMvxsrui49ee/ZkVWAMMMzuR\n\t3ySP6LXF/UrDYw==","X-Google-Smtp-Source":"ADKCNb6uT1Dtakxc5lxiN6BDDmv2NI9qZTQAt2YgGGzhui3UwYkom9eu3wniO80bmSa1rHCtf3poPg==","X-Received":"by 10.28.63.204 with SMTP id m195mr1051595wma.175.1504198686652; \n\tThu, 31 Aug 2017 09:58:06 -0700 (PDT)","Subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED 
in\n\tphy_stop_machine()\"","To":"David Daney <ddaney.cavm@gmail.com>,\n\tMarc Gonzalez <marc_gonzalez@sigmadesigns.com>","Cc":"netdev <netdev@vger.kernel.org>,\n\tGeert Uytterhoeven <geert+renesas@glider.be>,\n\tDavid Miller <davem@davemloft.net>,\n\tAndrew Lunn <andrew@lunn.ch>, Mans Rullgard <mans@mansr.com>,\n\tMason <slash.tmp@free.fr>","References":"<1504140569-2063-1-git-send-email-f.fainelli@gmail.com>\n\t<f4bb5ac8-dae8-c0af-7aa6-e546fc0783fa@sigmadesigns.com>\n\t<e24693e8-d8ae-188a-2a38-c9a83fdc94e3@gmail.com>","From":"Florian Fainelli <f.fainelli@gmail.com>","Message-ID":"<931bf454-81ff-94dc-82e6-bc2b889bd43a@gmail.com>","Date":"Thu, 31 Aug 2017 09:57:58 -0700","User-Agent":"Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101\n\tThunderbird/52.2.1","MIME-Version":"1.0","In-Reply-To":"<e24693e8-d8ae-188a-2a38-c9a83fdc94e3@gmail.com>","Content-Type":"text/plain; charset=utf-8","Content-Language":"en-US","Content-Transfer-Encoding":"8bit","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1761198,"web_url":"http://patchwork.ozlabs.org/comment/1761198/","msgid":"<a75691d9-c22a-9b89-2cce-604315062739@gmail.com>","list_archive_url":null,"date":"2017-08-31T17:03:21","subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","submitter":{"id":2800,"url":"http://patchwork.ozlabs.org/api/people/2800/","name":"Florian Fainelli","email":"f.fainelli@gmail.com"},"content":"On 08/31/2017 05:29 AM, Marc Gonzalez wrote:\n> On 31/08/2017 02:49, Florian Fainelli wrote:\n> \n>> This reverts commit 7ad813f208533cebfcc32d3d7474dc1677d1b09a (\"net: phy:\n>> Correctly process PHY_HALTED in phy_stop_machine()\") because it is\n>> creating the possibility for a NULL pointer dereference.\n>>\n>> David Daney provide the following call trace and diagram of events:\n>>\n>> When ndo_stop() is called we call:\n>>\n>>  phy_disconnect()\n>>     
+---> phy_stop_interrupts() implies: phydev->irq = PHY_POLL;\n> \n> What does this mean?\n> \n> On the contrary, phy_stop_interrupts() is only called when *not* polling.\n> \n> \tif (phydev->irq > 0)\n> \t\tphy_stop_interrupts(phydev);\n> \n>>     +---> phy_stop_machine()\n>>     |      +---> phy_state_machine()\n>>     |              +----> queue_delayed_work(): Work queued.\n> \n> You're referring to the fact that, at the end of phy_state_machine()\n> (in polling mode) the code reschedules itself through:\n> \n> \tif (phydev->irq == PHY_POLL)\n> \t\tqueue_delayed_work(system_power_efficient_wq, &phydev->state_queue, PHY_STATE_TIME * HZ);\n> \n>>     +--->phy_detach() implies: phydev->attached_dev = NULL;\n>>\n>> Now at a later time the queued work does:\n>>\n>>  phy_state_machine()\n>>     +---->netif_carrier_off(phydev->attached_dev): Oh no! It is NULL:\n> \n> I tested a sequence of 500 link up / link down in polling mode,\n> and saw no such issue. Race condition?\n> \n> For what case in phy_state_machine() is netif_carrier_off()\n> being called? 
Surely not PHY_HALTED?\n> \n> \n>> The original motivation for this change originated from Marc Gonzales\n>> indicating that his network driver did not have its adjust_link callback\n>> executing with phydev->link = 0 while he was expecting it.\n> \n> I expect the core to call phy_adjust_link() for link changes.\n> This used to work back in 3.4 and was broken somewhere along\n> the way.\n\nIf that was working correctly in 3.4 surely we can look at the diff and\nfigure out what changed, even maybe find the offending commit, can you\ndo that?\n\n> \n>> PHYLIB has never made any such guarantees ever because phy_stop() merely\n>> just tells the workqueue to move into PHY_HALTED state which will happen\n>> asynchronously.\n> \n> My original proposal was to fix the issue in the driver.\n> I'll try locating it in my archives.\n\nYes I remember you telling that, by the way I don't think you ever\nprovided a clear explanation why this is absolutely necessary for your\ndriver though?","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":["ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (2048-bit key;\n\tunprotected) header.d=gmail.com header.i=@gmail.com\n\theader.b=\"i3V1cMzV\"; dkim-atps=neutral"],"Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xjpc40xZTz9s81\n\tfor <patchwork-incoming@ozlabs.org>;\n\tFri,  1 Sep 2017 03:03:32 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1751243AbdHaRD3 (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tThu, 31 Aug 2017 13:03:29 -0400","from mail-qk0-f194.google.com ([209.85.220.194]:34038 \"EHLO\n\tmail-qk0-f194.google.com\" rhost-flags-OK-OK-OK-OK) by 
vger.kernel.org\n\twith ESMTP id S1750815AbdHaRD2 (ORCPT\n\t<rfc822;netdev@vger.kernel.org>); Thu, 31 Aug 2017 13:03:28 -0400","by mail-qk0-f194.google.com with SMTP id a77so179203qkb.1\n\tfor <netdev@vger.kernel.org>; Thu, 31 Aug 2017 10:03:27 -0700 (PDT)","from [10.112.156.244] ([192.19.255.250])\n\tby smtp.googlemail.com with ESMTPSA id\n\tm83sm5711476qki.26.2017.08.31.10.03.23\n\t(version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);\n\tThu, 31 Aug 2017 10:03:25 -0700 (PDT)"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=gmail.com; s=20161025;\n\th=subject:to:cc:references:from:message-id:date:user-agent\n\t:mime-version:in-reply-to:content-language:content-transfer-encoding; \n\tbh=NimRx6b/XrlJjv7qO2X0jHK+qDkYA1Vwc3AitVbCdFY=;\n\tb=i3V1cMzVZL66Np+8iyhycdadA8QM/uEmQXrjMoq9f0tPVCZdI/BjUq0NQ49EwCWRJJ\n\trECSbfyuh8NJnsY2aM6rDE9Kn0tyl+4IyWruSuUPyxTacQJIFzgMvM4I3WiKO+BXUVg8\n\tsMchkujAtXTcj0bfoTp+YhLTxK4YTfXroB1D30FHwaWK84js+6WVGISWu6tZ91WiEACS\n\tVo2iD03YEpeVoWxGRZYwvSs3kwbkAzTuz4MAcZJUVXs9K4EJAJtoVm1uowNY2Tqf+230\n\toNNb35qkuZK2k2NiOVSYm+PN3X7yu7xZCCrgEZAX4p9XGRHGG/CbgweAkENFe3VJnQE/\n\tFFnA==","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=1e100.net; 
s=20161025;\n\th=x-gm-message-state:subject:to:cc:references:from:message-id:date\n\t:user-agent:mime-version:in-reply-to:content-language\n\t:content-transfer-encoding;\n\tbh=NimRx6b/XrlJjv7qO2X0jHK+qDkYA1Vwc3AitVbCdFY=;\n\tb=X8p3N6W8xQiFwOXgq+Bi8G03yNwkHI8pwjuJ0cep9Bu3pbeb7HbNxNxGMcfkrXGrpe\n\tLhUqoe7+KGMEiEe5lkULAArjwdZWkBuqWX+nupoJznlmwOFaDuTrdBNSygb86idf8eul\n\tl+rhAao3vc2iPHdUl5QfMeIIrvmhrJZbxN9R4JN4u1spW4Jzuupn+GDGE89/CgT8ACwm\n\tJeJGcpb7jO5EzZ/jqBiJFHUqvElGp/veowpxdVbcWSn4t6F7AJkahUrJC6tfN0SIf817\n\tonvS3BzyTlDMVMgmvSZ1OcxX6cg8/3GXzPV00sR0WsCRhF+ExHs+TBQTH2MYpGVlOERn\n\tTZmA==","X-Gm-Message-State":"AHYfb5h7/Ja5c9Fcvx06E0lRMxmqPhaTrQwq/hUCEUEorhmXjpIq/CL9\n\tkAn5+B2avdcGdA==","X-Google-Smtp-Source":"ADKCNb5OnUVvHHmf2wHJCSHjpnFlMoCvEk7BYPSLezEyL6ecOzcrXUpdpqa1GV+/WH/9IJgGfAeCKQ==","X-Received":"by 10.55.118.132 with SMTP id r126mr4750771qkc.169.1504199006843;\n\tThu, 31 Aug 2017 10:03:26 -0700 (PDT)","Subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","To":"Marc Gonzalez <marc_gonzalez@sigmadesigns.com>,\n\tDavid Daney <ddaney.cavm@gmail.com>","Cc":"netdev <netdev@vger.kernel.org>,\n\tGeert Uytterhoeven <geert+renesas@glider.be>,\n\tDavid Miller <davem@davemloft.net>,\n\tAndrew Lunn <andrew@lunn.ch>, Mans Rullgard <mans@mansr.com>,\n\tMason <slash.tmp@free.fr>","References":"<1504140569-2063-1-git-send-email-f.fainelli@gmail.com>\n\t<f4bb5ac8-dae8-c0af-7aa6-e546fc0783fa@sigmadesigns.com>","From":"Florian Fainelli <f.fainelli@gmail.com>","Message-ID":"<a75691d9-c22a-9b89-2cce-604315062739@gmail.com>","Date":"Thu, 31 Aug 2017 10:03:21 -0700","User-Agent":"Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101\n\tThunderbird/52.2.1","MIME-Version":"1.0","In-Reply-To":"<f4bb5ac8-dae8-c0af-7aa6-e546fc0783fa@sigmadesigns.com>","Content-Type":"text/plain; 
charset=utf-8","Content-Language":"en-US","Content-Transfer-Encoding":"7bit","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1761212,"web_url":"http://patchwork.ozlabs.org/comment/1761212/","msgid":"<e5063ab9-643b-15eb-290d-bdfba3b5fd28@free.fr>","list_archive_url":null,"date":"2017-08-31T17:35:05","subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","submitter":{"id":66150,"url":"http://patchwork.ozlabs.org/api/people/66150/","name":"Mason","email":"slash.tmp@free.fr"},"content":"On 31/08/2017 18:36, David Daney wrote:\n> On 08/31/2017 05:29 AM, Marc Gonzalez wrote:\n>> On 31/08/2017 02:49, Florian Fainelli wrote:\n>>\n>>> This reverts commit 7ad813f208533cebfcc32d3d7474dc1677d1b09a (\"net: phy:\n>>> Correctly process PHY_HALTED in phy_stop_machine()\") because it is\n>>> creating the possibility for a NULL pointer dereference.\n>>>\n>>> David Daney provide the following call trace and diagram of events:\n>>>\n>>> When ndo_stop() is called we call:\n>>>\n>>>   phy_disconnect()\n>>>      +---> phy_stop_interrupts() implies: phydev->irq = PHY_POLL;\n>>\n>> What does this mean?\n> \n> I meant that after the call to phy_stop_interrupts(), phydev->irq = \n> PHY_POLL;\n\nI must be missing something.\n\nhttp://elixir.free-electrons.com/linux/latest/source/drivers/net/phy/phy.c#L868\n\nphy_stop_interrupts() doesn't change phydev->irq right?\n\nOnly phy_start_interrupts() sets phydev->irq to\nPHY_POLL if it cannot set up interrupt mode.\n\nRegards.","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":"ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","Received":["from 
vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xjqJj0GYFz9sPm\n\tfor <patchwork-incoming@ozlabs.org>;\n\tFri,  1 Sep 2017 03:35:17 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1752049AbdHaRfO (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tThu, 31 Aug 2017 13:35:14 -0400","from smtp2-g21.free.fr ([212.27.42.2]:31717 \"EHLO\n\tsmtp2-g21.free.fr\"\n\trhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP\n\tid S1751622AbdHaRfM (ORCPT <rfc822;netdev@vger.kernel.org>);\n\tThu, 31 Aug 2017 13:35:12 -0400","from [192.168.0.66] (unknown [88.191.210.51])\n\tby smtp2-g21.free.fr (Postfix) with ESMTP id C87662003D9;\n\tThu, 31 Aug 2017 19:35:09 +0200 (CEST)"],"Subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","To":"David Daney <ddaney.cavm@gmail.com>,\n\tFlorian Fainelli <f.fainelli@gmail.com>","Cc":"Marc Gonzalez <marc_gonzalez@sigmadesigns.com>,\n\tnetdev <netdev@vger.kernel.org>,\n\tGeert Uytterhoeven <geert+renesas@glider.be>,\n\tDavid Miller <davem@davemloft.net>,\n\tAndrew Lunn <andrew@lunn.ch>, Mans Rullgard <mans@mansr.com>","References":"<1504140569-2063-1-git-send-email-f.fainelli@gmail.com>\n\t<f4bb5ac8-dae8-c0af-7aa6-e546fc0783fa@sigmadesigns.com>\n\t<e24693e8-d8ae-188a-2a38-c9a83fdc94e3@gmail.com>","From":"Mason <slash.tmp@free.fr>","Message-ID":"<e5063ab9-643b-15eb-290d-bdfba3b5fd28@free.fr>","Date":"Thu, 31 Aug 2017 19:35:05 +0200","User-Agent":"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:52.0) Gecko/20100101\n\tFirefox/52.0 SeaMonkey/2.49.1","MIME-Version":"1.0","In-Reply-To":"<e24693e8-d8ae-188a-2a38-c9a83fdc94e3@gmail.com>","Content-Type":"text/plain; 
charset=ISO-8859-15","Content-Transfer-Encoding":"7bit","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1761224,"web_url":"http://patchwork.ozlabs.org/comment/1761224/","msgid":"<d6a6b552-95a7-8353-54c8-fa804f9366a1@free.fr>","list_archive_url":null,"date":"2017-08-31T17:49:47","subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","submitter":{"id":66150,"url":"http://patchwork.ozlabs.org/api/people/66150/","name":"Mason","email":"slash.tmp@free.fr"},"content":"On 31/08/2017 18:57, Florian Fainelli wrote:\n> And the race is between phy_detach() setting phydev->attached_dev = NULL\n> and phy_state_machine() running in PHY_HALTED state and calling\n> netif_carrier_off().\n\nI must be missing something.\n(Since a thread cannot race against itself.)\n\nphy_disconnect calls phy_stop_machine which\n1) stops the work queue from running in a separate thread\n2) calls phy_state_machine *synchronously*\n     which runs the PHY_HALTED case with everything well-defined\nend of phy_stop_machine\n\nphy_disconnect only then calls phy_detach()\nwhich makes future calls of phy_state_machine perilous.\n\nThis all happens in the same thread, so I'm not yet\nseeing where the race happens?\n\nRegards.","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":"ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xjqdd3nVvz9s7c\n\tfor <patchwork-incoming@ozlabs.org>;\n\tFri,  1 Sep 2017 03:49:57 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid 
S1751901AbdHaRtz (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tThu, 31 Aug 2017 13:49:55 -0400","from smtp2-g21.free.fr ([212.27.42.2]:62201 \"EHLO\n\tsmtp2-g21.free.fr\"\n\trhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP\n\tid S1751811AbdHaRty (ORCPT <rfc822;netdev@vger.kernel.org>);\n\tThu, 31 Aug 2017 13:49:54 -0400","from [192.168.0.66] (unknown [88.191.210.51])\n\tby smtp2-g21.free.fr (Postfix) with ESMTP id 791922003D9;\n\tThu, 31 Aug 2017 19:49:52 +0200 (CEST)"],"Subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","To":"Florian Fainelli <f.fainelli@gmail.com>,\n\tDavid Daney <ddaney.cavm@gmail.com>","Cc":"Marc Gonzalez <marc_gonzalez@sigmadesigns.com>,\n\tnetdev <netdev@vger.kernel.org>,\n\tGeert Uytterhoeven <geert+renesas@glider.be>,\n\tDavid Miller <davem@davemloft.net>,\n\tAndrew Lunn <andrew@lunn.ch>, Mans Rullgard <mans@mansr.com>","References":"<1504140569-2063-1-git-send-email-f.fainelli@gmail.com>\n\t<f4bb5ac8-dae8-c0af-7aa6-e546fc0783fa@sigmadesigns.com>\n\t<e24693e8-d8ae-188a-2a38-c9a83fdc94e3@gmail.com>\n\t<931bf454-81ff-94dc-82e6-bc2b889bd43a@gmail.com>","From":"Mason <slash.tmp@free.fr>","Message-ID":"<d6a6b552-95a7-8353-54c8-fa804f9366a1@free.fr>","Date":"Thu, 31 Aug 2017 19:49:47 +0200","User-Agent":"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:52.0) Gecko/20100101\n\tFirefox/52.0 SeaMonkey/2.49.1","MIME-Version":"1.0","In-Reply-To":"<931bf454-81ff-94dc-82e6-bc2b889bd43a@gmail.com>","Content-Type":"text/plain; charset=ISO-8859-15","Content-Transfer-Encoding":"7bit","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1761229,"web_url":"http://patchwork.ozlabs.org/comment/1761229/","msgid":"<f74f1aad-3990-ae54-316f-751c3b15de41@gmail.com>","list_archive_url":null,"date":"2017-08-31T17:53:37","subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED 
in\n\tphy_stop_machine()\"","submitter":{"id":2800,"url":"http://patchwork.ozlabs.org/api/people/2800/","name":"Florian Fainelli","email":"f.fainelli@gmail.com"},"content":"On 08/31/2017 10:49 AM, Mason wrote:\n> On 31/08/2017 18:57, Florian Fainelli wrote:\n>> And the race is between phy_detach() setting phydev->attached_dev = NULL\n>> and phy_state_machine() running in PHY_HALTED state and calling\n>> netif_carrier_off().\n> \n> I must be missing something.\n> (Since a thread cannot race against itself.)\n> \n> phy_disconnect calls phy_stop_machine which\n> 1) stops the work queue from running in a separate thread\n> 2) calls phy_state_machine *synchronously*\n>      which runs the PHY_HALTED case with everything well-defined\n> end of phy_stop_machine\n> \n> phy_disconnect only then calls phy_detach()\n> which makes future calls of phy_state_machine perilous.\n> \n> This all happens in the same thread, so I'm not yet\n> seeing where the race happens?\n\nThe race is as described in David's earlier email, so let's recap:\n\nThread 1\t\t\tThread 2\nphy_disconnect()\nphy_stop_interrupts()\nphy_stop_machine()\nphy_state_machine()\n -> queue_delayed_work()\nphy_detach()\n\t\t\t\tphy_state_machine()\n\t\t\t\t-> netif_carrier_off()\n\nIf phy_detach() finishes earlier than the workqueue had a chance to be\nscheduled and process PHY_HALTED again, then we trigger the NULL pointer\nde-reference.\n\nworkqueues are not tasklets, the CPU scheduling them gets no guarantee\nthey will run on the same CPU.","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":["ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (2048-bit key;\n\tunprotected) header.d=gmail.com 
header.i=@gmail.com\n\theader.b=\"ZCbYE16H\"; dkim-atps=neutral"],"Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xjqk63Sscz9s83\n\tfor <patchwork-incoming@ozlabs.org>;\n\tFri,  1 Sep 2017 03:53:50 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1751127AbdHaRxs (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tThu, 31 Aug 2017 13:53:48 -0400","from mail-qt0-f196.google.com ([209.85.216.196]:33463 \"EHLO\n\tmail-qt0-f196.google.com\" rhost-flags-OK-OK-OK-OK) by vger.kernel.org\n\twith ESMTP id S1751022AbdHaRxr (ORCPT\n\t<rfc822;netdev@vger.kernel.org>); Thu, 31 Aug 2017 13:53:47 -0400","by mail-qt0-f196.google.com with SMTP id h15so96943qta.0\n\tfor <netdev@vger.kernel.org>; Thu, 31 Aug 2017 10:53:47 -0700 (PDT)","from [10.112.156.244] ([192.19.255.250])\n\tby smtp.googlemail.com with ESMTPSA id\n\tq203sm5800348qke.27.2017.08.31.10.53.39\n\t(version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);\n\tThu, 31 Aug 2017 10:53:45 -0700 (PDT)"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=gmail.com; s=20161025;\n\th=subject:to:cc:references:from:message-id:date:user-agent\n\t:mime-version:in-reply-to:content-language:content-transfer-encoding; \n\tbh=EwHws2bM+5wpkA+H/G220hOGWO+f0e+BwZmeLqp3/5A=;\n\tb=ZCbYE16HP+SpHFBqdfNI9i7pTZTsJ/5DpGVscey9oZp/xiYUn1/Vb15UglMff8xVAp\n\tRxJC9YnU+hcGKNClHfacaWQglq08Rw8xVqYXuQ8k28nioNzH332V7pJDcwHi+lTt9cvH\n\tfIE8jMCrbIjLrPBis6nxapDiaNuCKKykcrLolTPzyS0hZ03HrfAGRE6ZV/H/NdfOv0C3\n\tgQm/CPcNUFVJ6QbCqsKSFodBuWzG9RNh8rTS//XIxfXtkyeza1UjFnz+KzF4JlP2UVZE\n\tAWXYq/RsBJYgQBO2RKhB8gppmr+a2UX61zEMMD3J6Hwxl8VS4axMrrO8bNv4qz16WMaJ\n\tvylA==","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=1e100.net; 
s=20161025;\n\th=x-gm-message-state:subject:to:cc:references:from:message-id:date\n\t:user-agent:mime-version:in-reply-to:content-language\n\t:content-transfer-encoding;\n\tbh=EwHws2bM+5wpkA+H/G220hOGWO+f0e+BwZmeLqp3/5A=;\n\tb=Td513spZktljhn8kId5BD/h2F93vPIukobuRjC4AvrDrEe+qfrC0NB/U5CHAQexVfc\n\thBaKj5koCt3ee/mukRSyDBOuPa3rvprRLjZCFJsM/SCdoNURC/3hWFkNG+TYMcvtCyo/\n\tvMd2Pn5+0R3qogjly1PfpD6/5r94TGGKPHT/xgdQBe5LJQySdUy6gU76zc/YgHj+PsjB\n\t48rybfZYsvWzNMAWu/kfbVYJStrcPzZxpjesFNw0+mtTwOIDVPDX8R+kS1gZ7X9KqH2m\n\tbSsT9CLF0ci59rIT8mXGO4oe545iBPKUhooG778yma/419+0mJEvYKGBnt58J1wy/2j/\n\tS91Q==","X-Gm-Message-State":"AHPjjUhbAjNmaFmjMOeBcxxRn6LmNZtczqfoQxM70pGE8Ba2gHUsNs00\n\t7u/VxFaX8hoA5g==","X-Google-Smtp-Source":"ADKCNb5elPS11j1QSfquDxCJQ8OAjpvV4hssHBS1F2gIUp5/k/ye3NQTiKZKiErgxkHieotmk6E3gQ==","X-Received":"by 10.200.51.29 with SMTP id t29mr8062263qta.245.1504202026576; \n\tThu, 31 Aug 2017 10:53:46 -0700 (PDT)","Subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","To":"Mason <slash.tmp@free.fr>, David Daney <ddaney.cavm@gmail.com>","Cc":"Marc Gonzalez <marc_gonzalez@sigmadesigns.com>,\n\tnetdev <netdev@vger.kernel.org>,\n\tGeert Uytterhoeven <geert+renesas@glider.be>,\n\tDavid Miller <davem@davemloft.net>,\n\tAndrew Lunn <andrew@lunn.ch>, Mans Rullgard <mans@mansr.com>","References":"<1504140569-2063-1-git-send-email-f.fainelli@gmail.com>\n\t<f4bb5ac8-dae8-c0af-7aa6-e546fc0783fa@sigmadesigns.com>\n\t<e24693e8-d8ae-188a-2a38-c9a83fdc94e3@gmail.com>\n\t<931bf454-81ff-94dc-82e6-bc2b889bd43a@gmail.com>\n\t<d6a6b552-95a7-8353-54c8-fa804f9366a1@free.fr>","From":"Florian Fainelli <f.fainelli@gmail.com>","Message-ID":"<f74f1aad-3990-ae54-316f-751c3b15de41@gmail.com>","Date":"Thu, 31 Aug 2017 10:53:37 -0700","User-Agent":"Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101\n\tThunderbird/52.2.1","MIME-Version":"1.0","In-Reply-To":"<d6a6b552-95a7-8353-54c8-fa804f9366a1@free.fr>","Content-Type":"text/plain; 
charset=utf-8","Content-Language":"en-US","Content-Transfer-Encoding":"7bit","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1761243,"web_url":"http://patchwork.ozlabs.org/comment/1761243/","msgid":"<ebee6e5d-5bc1-1c5b-b31d-6d50618d6074@free.fr>","list_archive_url":null,"date":"2017-08-31T18:12:42","subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","submitter":{"id":66150,"url":"http://patchwork.ozlabs.org/api/people/66150/","name":"Mason","email":"slash.tmp@free.fr"},"content":"On 31/08/2017 19:53, Florian Fainelli wrote:\n> On 08/31/2017 10:49 AM, Mason wrote:\n>> On 31/08/2017 18:57, Florian Fainelli wrote:\n>>> And the race is between phy_detach() setting phydev->attached_dev = NULL\n>>> and phy_state_machine() running in PHY_HALTED state and calling\n>>> netif_carrier_off().\n>>\n>> I must be missing something.\n>> (Since a thread cannot race against itself.)\n>>\n>> phy_disconnect calls phy_stop_machine which\n>> 1) stops the work queue from running in a separate thread\n>> 2) calls phy_state_machine *synchronously*\n>>      which runs the PHY_HALTED case with everything well-defined\n>> end of phy_stop_machine\n>>\n>> phy_disconnect only then calls phy_detach()\n>> which makes future calls of phy_state_machine perilous.\n>>\n>> This all happens in the same thread, so I'm not yet\n>> seeing where the race happens?\n> \n> The race is as described in David's earlier email, so let's recap:\n> \n> Thread 1\t\t\tThread 2\n> phy_disconnect()\n> phy_stop_interrupts()\n> phy_stop_machine()\n> phy_state_machine()\n>  -> queue_delayed_work()\n> phy_detach()\n> \t\t\t\tphy_state_machine()\n> \t\t\t\t-> netif_carrier_off()\n> \n> If phy_detach() finishes earlier than the workqueue had a chance to be\n> scheduled and process PHY_HALTED again, then we trigger the NULL pointer\n> de-reference.\n> \n> workqueues are not 
tasklets, the CPU scheduling them gets no guarantee\n> they will run on the same CPU.\n\nSomething does not add up.\n\nThe synchronous call to phy_state_machine() does:\n\n\tcase PHY_HALTED:\n\t\tif (phydev->link) {\n\t\t\tphydev->link = 0;\n\t\t\tnetif_carrier_off(phydev->attached_dev);\n\t\t\tphy_adjust_link(phydev);\n\t\t\tdo_suspend = true;\n\t\t}\n\nthen sets phydev->link = 0; therefore subsequent calls to\nphy_state_machine() will be no-op.\n\nAlso, queue_delayed_work() is only called in polling mode.\nDavid stated that he's using interrupt mode.\n\nRegards.","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":"ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xjr844ZPNz9s7c\n\tfor <patchwork-incoming@ozlabs.org>;\n\tFri,  1 Sep 2017 04:12:52 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1751310AbdHaSMu (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tThu, 31 Aug 2017 14:12:50 -0400","from smtp2-g21.free.fr ([212.27.42.2]:12476 \"EHLO\n\tsmtp2-g21.free.fr\"\n\trhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP\n\tid S1751103AbdHaSMt (ORCPT <rfc822;netdev@vger.kernel.org>);\n\tThu, 31 Aug 2017 14:12:49 -0400","from [192.168.0.66] (unknown [88.191.210.51])\n\tby smtp2-g21.free.fr (Postfix) with ESMTP id 0849F2003DD;\n\tThu, 31 Aug 2017 20:12:47 +0200 (CEST)"],"Subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","To":"Florian Fainelli <f.fainelli@gmail.com>,\n\tDavid Daney <ddaney.cavm@gmail.com>","Cc":"Marc Gonzalez <marc_gonzalez@sigmadesigns.com>,\n\tnetdev <netdev@vger.kernel.org>,\n\tGeert 
Uytterhoeven <geert+renesas@glider.be>,\n\tDavid Miller <davem@davemloft.net>,\n\tAndrew Lunn <andrew@lunn.ch>, Mans Rullgard <mans@mansr.com>","References":"<1504140569-2063-1-git-send-email-f.fainelli@gmail.com>\n\t<f4bb5ac8-dae8-c0af-7aa6-e546fc0783fa@sigmadesigns.com>\n\t<e24693e8-d8ae-188a-2a38-c9a83fdc94e3@gmail.com>\n\t<931bf454-81ff-94dc-82e6-bc2b889bd43a@gmail.com>\n\t<d6a6b552-95a7-8353-54c8-fa804f9366a1@free.fr>\n\t<f74f1aad-3990-ae54-316f-751c3b15de41@gmail.com>","From":"Mason <slash.tmp@free.fr>","Message-ID":"<ebee6e5d-5bc1-1c5b-b31d-6d50618d6074@free.fr>","Date":"Thu, 31 Aug 2017 20:12:42 +0200","User-Agent":"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:52.0) Gecko/20100101\n\tFirefox/52.0 SeaMonkey/2.49.1","MIME-Version":"1.0","In-Reply-To":"<f74f1aad-3990-ae54-316f-751c3b15de41@gmail.com>","Content-Type":"text/plain; charset=ISO-8859-15","Content-Transfer-Encoding":"7bit","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1761249,"web_url":"http://patchwork.ozlabs.org/comment/1761249/","msgid":"<4ea8b432-4968-1616-eff9-48a2689dd3ce@gmail.com>","list_archive_url":null,"date":"2017-08-31T18:29:12","subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","submitter":{"id":2800,"url":"http://patchwork.ozlabs.org/api/people/2800/","name":"Florian Fainelli","email":"f.fainelli@gmail.com"},"content":"On 08/31/2017 11:12 AM, Mason wrote:\n> On 31/08/2017 19:53, Florian Fainelli wrote:\n>> On 08/31/2017 10:49 AM, Mason wrote:\n>>> On 31/08/2017 18:57, Florian Fainelli wrote:\n>>>> And the race is between phy_detach() setting phydev->attached_dev = NULL\n>>>> and phy_state_machine() running in PHY_HALTED state and calling\n>>>> netif_carrier_off().\n>>>\n>>> I must be missing something.\n>>> (Since a thread cannot race against itself.)\n>>>\n>>> phy_disconnect calls phy_stop_machine which\n>>> 1) stops the work queue 
from running in a separate thread\n>>> 2) calls phy_state_machine *synchronously*\n>>>      which runs the PHY_HALTED case with everything well-defined\n>>> end of phy_stop_machine\n>>>\n>>> phy_disconnect only then calls phy_detach()\n>>> which makes future calls of phy_state_machine perilous.\n>>>\n>>> This all happens in the same thread, so I'm not yet\n>>> seeing where the race happens?\n>>\n>> The race is as described in David's earlier email, so let's recap:\n>>\n>> Thread 1\t\t\tThread 2\n>> phy_disconnect()\n>> phy_stop_interrupts()\n>> phy_stop_machine()\n>> phy_state_machine()\n>>  -> queue_delayed_work()\n>> phy_detach()\n>> \t\t\t\tphy_state_machine()\n>> \t\t\t\t-> netif_carrier_off()\n>>\n>> If phy_detach() finishes earlier than the workqueue had a chance to be\n>> scheduled and process PHY_HALTED again, then we trigger the NULL pointer\n>> de-reference.\n>>\n>> workqueues are not tasklets, the CPU scheduling them gets no guarantee\n>> they will run on the same CPU.\n> \n> Something does not add up.\n> \n> The synchronous call to phy_state_machine() does:\n> \n> \tcase PHY_HALTED:\n> \t\tif (phydev->link) {\n> \t\t\tphydev->link = 0;\n> \t\t\tnetif_carrier_off(phydev->attached_dev);\n> \t\t\tphy_adjust_link(phydev);\n> \t\t\tdo_suspend = true;\n> \t\t}\n> \n> then sets phydev->link = 0; therefore subsequent calls to\n> phy_state_machine() will be no-op.\n\nActually you are right, once phydev->link is set to 0 these would become\nno-ops. Still scratching my head as to what happens for David then...\n\n> \n> Also, queue_delayed_work() is only called in polling mode.\n> David stated that he's using interrupt mode.\n\nRight that's confusing too now. 
David can you check if your tree has:\n\n49d52e8108a21749dc2114b924c907db43358984 (\"net: phy: handle state\ncorrectly in phy_stop_machine\")","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":["ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (2048-bit key;\n\tunprotected) header.d=gmail.com header.i=@gmail.com\n\theader.b=\"WHJvNY6x\"; dkim-atps=neutral"],"Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xjrW76Qcnz9s7c\n\tfor <patchwork-incoming@ozlabs.org>;\n\tFri,  1 Sep 2017 04:29:23 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1751241AbdHaS3U (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tThu, 31 Aug 2017 14:29:20 -0400","from mail-qt0-f196.google.com ([209.85.216.196]:35888 \"EHLO\n\tmail-qt0-f196.google.com\" rhost-flags-OK-OK-OK-OK) by vger.kernel.org\n\twith ESMTP id S1750895AbdHaS3T (ORCPT\n\t<rfc822;netdev@vger.kernel.org>); Thu, 31 Aug 2017 14:29:19 -0400","by mail-qt0-f196.google.com with SMTP id e2so329924qta.3\n\tfor <netdev@vger.kernel.org>; Thu, 31 Aug 2017 11:29:19 -0700 (PDT)","from [10.112.156.244] ([192.19.255.250])\n\tby smtp.googlemail.com with ESMTPSA id\n\tj185sm5698781qkf.62.2017.08.31.11.29.15\n\t(version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);\n\tThu, 31 Aug 2017 11:29:17 -0700 (PDT)"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=gmail.com; s=20161025;\n\th=subject:to:cc:references:from:message-id:date:user-agent\n\t:mime-version:in-reply-to:content-language:content-transfer-encoding; 
\n\tbh=JQF7MxW6qNUzh2FgIbRQ/bGgTQ0BFroZy798DYvhsiA=;\n\tb=WHJvNY6xj9WIzTFkU+DAsI1vHloBD34t7/APO8dd+KgGDodwmISPvsZvPLF/5NnN+j\n\tIz2tw5+r6zRLqA6/NpCwpb3X5f0G+1+RdATUeXbB/V0yXWR9iaJvcQXsykB5tvIUiNvX\n\tUiIl9jIeFgb2joqUWcCUr078DKPtQPKnC2LrXAz3vzI4HMXp5aVdU8qYV4QNByq3/yWQ\n\tDmGRq4lrEiXuh+uQnL+96/zCvUbUkK9YCenpTBVA2RrAZqlsUVqhGXaugIdWh5iEFtsP\n\tkrNvljN9xVxe1iOd+FWcJK0dZ1TO1UmhwlSj1I9RnEjScKkiwjFHM5x+4odvMhF+1XLe\n\tlmlw==","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=1e100.net; s=20161025;\n\th=x-gm-message-state:subject:to:cc:references:from:message-id:date\n\t:user-agent:mime-version:in-reply-to:content-language\n\t:content-transfer-encoding;\n\tbh=JQF7MxW6qNUzh2FgIbRQ/bGgTQ0BFroZy798DYvhsiA=;\n\tb=ok69MYMXdLAlzzi/eJ7LCgVyEy904St2kBOg/CWsPTgPvmgxysw1UmwbRzF9dPrwsn\n\t2SronKtMIHMy9tndKapnYgezTqzy4m7qOPWpmvk3Dj/vbOu2kL9RNjW9P9qgh2I1FIgN\n\tNnbwig3o+IUJ9Q49cC//miUNimaQ5WeiHZwWB4veFE99zpOs3trqQIqHg3QjArwR79sc\n\tKeaa2P/B3co5E9leKMcIf5vysiWtrO2TamvdyaJPdy/JOwZfL3c31pG/mu1oUHdkK+06\n\ttdOjJZwu9nauDsHHgJEEeqKe304tbbLMMbjeEC+80utJSnvmX1ZAjFH3+Sy4Dr+Arp93\n\tNwow==","X-Gm-Message-State":"AHYfb5hEMIYS7KJpVbbA+TW1E41mHrapZRJUCndm3eGG/Ntxjens87ck\n\tjfPb5k8KuzAwuA==","X-Google-Smtp-Source":"ADKCNb4Kr+bW/UojoI8EjsFJ49RcxKHl3PQoQ1aAwINK2NfnY69afi+FG7Q4k07jf29Qm1hQbhMvXw==","X-Received":"by 10.200.55.5 with SMTP id o5mr8956700qtb.228.1504204158892;\n\tThu, 31 Aug 2017 11:29:18 -0700 (PDT)","Subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","To":"Mason <slash.tmp@free.fr>, David Daney <ddaney.cavm@gmail.com>","Cc":"Marc Gonzalez <marc_gonzalez@sigmadesigns.com>,\n\tnetdev <netdev@vger.kernel.org>,\n\tGeert Uytterhoeven <geert+renesas@glider.be>,\n\tDavid Miller <davem@davemloft.net>,\n\tAndrew Lunn <andrew@lunn.ch>, Mans Rullgard 
<mans@mansr.com>","References":"<1504140569-2063-1-git-send-email-f.fainelli@gmail.com>\n\t<f4bb5ac8-dae8-c0af-7aa6-e546fc0783fa@sigmadesigns.com>\n\t<e24693e8-d8ae-188a-2a38-c9a83fdc94e3@gmail.com>\n\t<931bf454-81ff-94dc-82e6-bc2b889bd43a@gmail.com>\n\t<d6a6b552-95a7-8353-54c8-fa804f9366a1@free.fr>\n\t<f74f1aad-3990-ae54-316f-751c3b15de41@gmail.com>\n\t<ebee6e5d-5bc1-1c5b-b31d-6d50618d6074@free.fr>","From":"Florian Fainelli <f.fainelli@gmail.com>","Message-ID":"<4ea8b432-4968-1616-eff9-48a2689dd3ce@gmail.com>","Date":"Thu, 31 Aug 2017 11:29:12 -0700","User-Agent":"Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101\n\tThunderbird/52.2.1","MIME-Version":"1.0","In-Reply-To":"<ebee6e5d-5bc1-1c5b-b31d-6d50618d6074@free.fr>","Content-Type":"text/plain; charset=utf-8","Content-Language":"en-US","Content-Transfer-Encoding":"8bit","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1761278,"web_url":"http://patchwork.ozlabs.org/comment/1761278/","msgid":"<d54d0555-c448-741c-eb63-5a773bed5a30@free.fr>","list_archive_url":null,"date":"2017-08-31T19:09:28","subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","submitter":{"id":66150,"url":"http://patchwork.ozlabs.org/api/people/66150/","name":"Mason","email":"slash.tmp@free.fr"},"content":"On 31/08/2017 19:03, Florian Fainelli wrote:\n\n> On 08/31/2017 05:29 AM, Marc Gonzalez wrote:\n>\n>> On 31/08/2017 02:49, Florian Fainelli wrote:\n>>\n>>> The original motivation for this change originated from Marc Gonzalez\n>>> indicating that his network driver did not have its adjust_link callback\n>>> executing with phydev->link = 0 while he was expecting it.\n>>\n>> I expect the core to call phy_adjust_link() for link changes.\n>> This used to work back in 3.4 and was broken somewhere along\n>> the way.\n> \n> If that was working correctly in 3.4 surely we can look at the diff and\n> 
figure out what changed, even maybe find the offending commit, can you\n> do that?\n\nBisecting would be a huge pain because my platform was\nnot upstream until v4.4\n\nYou mentioned the guarantees made by PHYLIB.\nWhen is the adjust_link callback guaranteed to be called?\n\n>>> PHYLIB has never made any such guarantees ever because phy_stop() merely\n>>> just tells the workqueue to move into PHY_HALTED state which will happen\n>>> asynchronously.\n>>\n>> My original proposal was to fix the issue in the driver.\n>> I'll try locating it in my archives.\n> \n> Yes I remember you telling that, by the way I don't think you ever\n> provided a clear explanation why this is absolutely necessary for your\n> driver though?\n\n1) nb8800_link_reconfigure() calls phy_print_status()\nwhich prints the \"Link down\" and \"Link up\" messages\nto the console. With the patch reverted, nothing is\nprinted when the link goes down, and the result is\nrandom when the link comes up. Sometimes, we get\ndown + up, sometimes just up.\n\n2) nb8800_link_reconfigure() does some HW init when\nthe link state changes. 
If we miss some notifications,\nwe might not perform some HW init, and stuff breaks.\n\nRegards.","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":"ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xjsPZ3nCBz9s3T\n\tfor <patchwork-incoming@ozlabs.org>;\n\tFri,  1 Sep 2017 05:09:38 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1751351AbdHaTJf (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tThu, 31 Aug 2017 15:09:35 -0400","from smtp2-g21.free.fr ([212.27.42.2]:25392 \"EHLO\n\tsmtp2-g21.free.fr\"\n\trhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP\n\tid S1750968AbdHaTJe (ORCPT <rfc822;netdev@vger.kernel.org>);\n\tThu, 31 Aug 2017 15:09:34 -0400","from [192.168.0.66] (unknown [88.191.210.51])\n\tby smtp2-g21.free.fr (Postfix) with ESMTP id 18D44200379;\n\tThu, 31 Aug 2017 21:09:33 +0200 (CEST)"],"Subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","To":"Florian Fainelli <f.fainelli@gmail.com>","Cc":"Marc Gonzalez <marc_gonzalez@sigmadesigns.com>,\n\tDavid Daney <ddaney.cavm@gmail.com>, netdev <netdev@vger.kernel.org>,\n\tGeert Uytterhoeven <geert+renesas@glider.be>,\n\tDavid Miller <davem@davemloft.net>,\n\tAndrew Lunn <andrew@lunn.ch>, Mans Rullgard <mans@mansr.com>","References":"<1504140569-2063-1-git-send-email-f.fainelli@gmail.com>\n\t<f4bb5ac8-dae8-c0af-7aa6-e546fc0783fa@sigmadesigns.com>\n\t<a75691d9-c22a-9b89-2cce-604315062739@gmail.com>","From":"Mason <slash.tmp@free.fr>","Message-ID":"<d54d0555-c448-741c-eb63-5a773bed5a30@free.fr>","Date":"Thu, 31 Aug 2017 21:09:28 
+0200","User-Agent":"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:52.0) Gecko/20100101\n\tFirefox/52.0 SeaMonkey/2.49.1","MIME-Version":"1.0","In-Reply-To":"<a75691d9-c22a-9b89-2cce-604315062739@gmail.com>","Content-Type":"text/plain; charset=ISO-8859-15","Content-Transfer-Encoding":"7bit","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1761282,"web_url":"http://patchwork.ozlabs.org/comment/1761282/","msgid":"<7b1c1dc9-b6e3-a1bd-2e36-474946741a79@gmail.com>","list_archive_url":null,"date":"2017-08-31T19:18:50","subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","submitter":{"id":2800,"url":"http://patchwork.ozlabs.org/api/people/2800/","name":"Florian Fainelli","email":"f.fainelli@gmail.com"},"content":"On 08/31/2017 12:09 PM, Mason wrote:\n> On 31/08/2017 19:03, Florian Fainelli wrote:\n> \n>> On 08/31/2017 05:29 AM, Marc Gonzalez wrote:\n>>\n>>> On 31/08/2017 02:49, Florian Fainelli wrote:\n>>>\n>>>> The original motivation for this change originated from Marc Gonzalez\n>>>> indicating that his network driver did not have its adjust_link callback\n>>>> executing with phydev->link = 0 while he was expecting it.\n>>>\n>>> I expect the core to call phy_adjust_link() for link changes.\n>>> This used to work back in 3.4 and was broken somewhere along\n>>> the way.\n>>\n>> If that was working correctly in 3.4 surely we can look at the diff and\n>> figure out what changed, even maybe find the offending commit, can you\n>> do that?\n> \n> Bisecting would a be a huge pain because my platform was\n> not upstream until v4.4\n\nThen just diff the file and try to pinpoint which commit may have\nchanged that?\n\n> \n> You mentioned the guarantees made by PHYLIB.\n> When is the adjust_link callback guaranteed to be called?\n\nAs long as the state machine is running after a call to phy_start()\nadjust_link will be called if there a 
change in link and/or link\nsettings. Once you call phy_stop() no such guarantees are made.\n\n> \n>>>> PHYLIB has never made any such guarantees ever because phy_stop() merely\n>>>> just tells the workqueue to move into PHY_HALTED state which will happen\n>>>> asynchronously.\n>>>\n>>> My original proposal was to fix the issue in the driver.\n>>> I'll try locating it in my archives.\n>>\n>> Yes I remember you telling that, by the way I don't think you ever\n>> provided a clear explanation why this is absolutely necessary for your\n>> driver though?\n> \n> 1) nb8800_link_reconfigure() calls phy_print_status()\n> which prints the \"Link down\" and \"Link up\" messages\n> to the console. With the patch reverted, nothing is\n> printed when the link goes down, and the result is\n> random when the link comes up. Sometimes, we get\n> down + up, sometimes just up.\n\nNothing printed when you bring down the network interface as a result of\nnot signaling the link down, there is a small nuance here.\n\nSeeing a random message upon bringing the interface back up suggests you\nmay not be re-initialization your old vs. new link state book keeping\nvariables and state transitions are not properly detected and therefore\nnot printed. In fact, I don't see where priv->link is ever set to say -1\nto force the comparison between phydev->link != priv->link to be true,\noversight?\n\n> \n> 2) nb8800_link_reconfigure() does some HW init when\n> the link state changes. If we miss some notifications,\n> we might not perform some HW init, and stuff breaks.\n\nCare to be more specific? What specific HW init is required during link\nnotification that if not done breaks the HW? 
There is both\nnb8800_mac_config() and nb8800_pause_config() that are both called in\nthe adjust_link callback both could presumably be deferred until the\nlink is detected, so why do you need it during ndo_stop() absolutely?","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":["ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (2048-bit key;\n\tunprotected) header.d=gmail.com header.i=@gmail.com\n\theader.b=\"fXGJS8WA\"; dkim-atps=neutral"],"Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xjscT2qFHz9s8J\n\tfor <patchwork-incoming@ozlabs.org>;\n\tFri,  1 Sep 2017 05:19:05 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1751404AbdHaTTC (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tThu, 31 Aug 2017 15:19:02 -0400","from mail-qk0-f177.google.com ([209.85.220.177]:36157 \"EHLO\n\tmail-qk0-f177.google.com\" rhost-flags-OK-OK-OK-OK) by vger.kernel.org\n\twith ESMTP id S1750968AbdHaTTB (ORCPT\n\t<rfc822;netdev@vger.kernel.org>); Thu, 31 Aug 2017 15:19:01 -0400","by mail-qk0-f177.google.com with SMTP id o63so2544677qkb.3\n\tfor <netdev@vger.kernel.org>; Thu, 31 Aug 2017 12:19:01 -0700 (PDT)","from [10.112.156.244] ([192.19.255.250])\n\tby smtp.googlemail.com with ESMTPSA id\n\tp90sm316327qtd.70.2017.08.31.12.18.53\n\t(version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);\n\tThu, 31 Aug 2017 12:18:59 -0700 (PDT)"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=gmail.com; s=20161025;\n\th=subject:to:cc:references:from:message-id:date:user-agent\n\t:mime-version:in-reply-to:content-language:content-transfer-encoding; 
\n\tbh=rxZCkUFNR1SbreW/zCmZ3P2Efk7QYCnkdABHY4cCcsw=;\n\tb=fXGJS8WAbXF5wgGmUmYFkAAnUeTv6Cje8BqHGaW6qwpNVV+pAFC5+YvsGqFyCYDP62\n\tNjiucRBFCE7vTAQkYx87+9qgRTMkqS2HiqV7sGNke0a2Or4f4qUUXTaX5e/d4GKKQXH+\n\tXDF7mBmTj0sA8MCw8QNHaZ0TFds9XMAK9/esWSzgRJsg8duiIMviRYQsGko42VxwiOtG\n\tMoabKzFWvEU6eskv5RSsa/9W03asrdL1+ldsAvLGbXU7mJOfySaca+4MJGYtRgqpy9uJ\n\tAXL8QntFODf8C8+kGf38wskjLDcixr3leEztc26rWpbtEnv6jg/bqxK7MTeshcxD6b0b\n\tT3hQ==","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=1e100.net; s=20161025;\n\th=x-gm-message-state:subject:to:cc:references:from:message-id:date\n\t:user-agent:mime-version:in-reply-to:content-language\n\t:content-transfer-encoding;\n\tbh=rxZCkUFNR1SbreW/zCmZ3P2Efk7QYCnkdABHY4cCcsw=;\n\tb=hlREmdKrQ6s88xVDoTp1q3UQGu1liSu7HVbQSzePsLvj1KmEzvqk+OHv/krfgE58wC\n\tw4UQGLWJ9O0DFN4qzFpPHdDupNHkksClJdZ5sXHZFE2DngO03lLp0yN4tNJWl/icYfdE\n\tfinkVnowjaRB7xnaO1GxZK/DgO1N4ikH7dhh3sSA1sDSRS4w7s5s4+TItZiQ4A5zE3X9\n\tcz+6Uzy1e3m4tKngqJtzLGoItS7DZs47lz5VpswvsuTmpePMHDI/IasTIXmXHnzeEjgP\n\tlzpvjBfl08KkrajbCOyz7LdXRSIw5n7mwB3YBzGV/1k+gR2prwVhVB+mIT7rL+WuFd9I\n\tbuVQ==","X-Gm-Message-State":"AHYfb5icjWTO3qB+HyOS2Gc87WOUrgEs77q3FAUr+RPaQP4tF4dWudsL\n\t6Fc5ovKiTeA4Gw==","X-Google-Smtp-Source":"ADKCNb7Qs4ITynAkKKybuHpo+FLzOvKO5NjCP4Wb0NOT9lkZWinh1ny1FYVpnVfoK2uUTmXFa3iyFQ==","X-Received":"by 10.55.104.214 with SMTP id d205mr4956339qkc.37.1504207140998; \n\tThu, 31 Aug 2017 12:19:00 -0700 (PDT)","Subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","To":"Mason <slash.tmp@free.fr>","Cc":"Marc Gonzalez <marc_gonzalez@sigmadesigns.com>,\n\tDavid Daney <ddaney.cavm@gmail.com>, netdev <netdev@vger.kernel.org>,\n\tGeert Uytterhoeven <geert+renesas@glider.be>,\n\tDavid Miller <davem@davemloft.net>,\n\tAndrew Lunn <andrew@lunn.ch>, Mans Rullgard 
<mans@mansr.com>","References":"<1504140569-2063-1-git-send-email-f.fainelli@gmail.com>\n\t<f4bb5ac8-dae8-c0af-7aa6-e546fc0783fa@sigmadesigns.com>\n\t<a75691d9-c22a-9b89-2cce-604315062739@gmail.com>\n\t<d54d0555-c448-741c-eb63-5a773bed5a30@free.fr>","From":"Florian Fainelli <f.fainelli@gmail.com>","Message-ID":"<7b1c1dc9-b6e3-a1bd-2e36-474946741a79@gmail.com>","Date":"Thu, 31 Aug 2017 12:18:50 -0700","User-Agent":"Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101\n\tThunderbird/52.2.1","MIME-Version":"1.0","In-Reply-To":"<d54d0555-c448-741c-eb63-5a773bed5a30@free.fr>","Content-Type":"text/plain; charset=utf-8","Content-Language":"en-US","Content-Transfer-Encoding":"8bit","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1764155,"web_url":"http://patchwork.ozlabs.org/comment/1764155/","msgid":"<6721135d-8c3f-57a0-f423-9d18cd6e0947@free.fr>","list_archive_url":null,"date":"2017-09-06T14:33:54","subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","submitter":{"id":66150,"url":"http://patchwork.ozlabs.org/api/people/66150/","name":"Mason","email":"slash.tmp@free.fr"},"content":"On 31/08/2017 20:29, Florian Fainelli wrote:\n> On 08/31/2017 11:12 AM, Mason wrote:\n>> On 31/08/2017 19:53, Florian Fainelli wrote:\n>>> On 08/31/2017 10:49 AM, Mason wrote:\n>>>> On 31/08/2017 18:57, Florian Fainelli wrote:\n>>>>> And the race is between phy_detach() setting phydev->attached_dev = NULL\n>>>>> and phy_state_machine() running in PHY_HALTED state and calling\n>>>>> netif_carrier_off().\n>>>>\n>>>> I must be missing something.\n>>>> (Since a thread cannot race against itself.)\n>>>>\n>>>> phy_disconnect calls phy_stop_machine which\n>>>> 1) stops the work queue from running in a separate thread\n>>>> 2) calls phy_state_machine *synchronously*\n>>>>       which runs the PHY_HALTED case with everything well-defined\n>>>> end of 
phy_stop_machine\n>>>>\n>>>> phy_disconnect only then calls phy_detach()\n>>>> which makes future calls of phy_state_machine perilous.\n>>>>\n>>>> This all happens in the same thread, so I'm not yet\n>>>> seeing where the race happens?\n>>>\n>>> The race is as described in David's earlier email, so let's recap:\n>>>\n>>> Thread 1\t\t\tThread 2\n>>> phy_disconnect()\n>>> phy_stop_interrupts()\n>>> phy_stop_machine()\n>>> phy_state_machine()\n>>>   -> queue_delayed_work()\n>>> phy_detach()\n>>> \t\t\t\tphy_state_machine()\n>>> \t\t\t\t-> netif_carrier_off()\n>>>\n>>> If phy_detach() finishes earlier than the workqueue had a chance to be\n>>> scheduled and process PHY_HALTED again, then we trigger the NULL pointer\n>>> de-reference.\n>>>\n>>> workqueues are not tasklets, the CPU scheduling them gets no guarantee\n>>> they will run on the same CPU.\n>>\n>> Something does not add up.\n>>\n>> The synchronous call to phy_state_machine() does:\n>>\n>> \tcase PHY_HALTED:\n>> \t\tif (phydev->link) {\n>> \t\t\tphydev->link = 0;\n>> \t\t\tnetif_carrier_off(phydev->attached_dev);\n>> \t\t\tphy_adjust_link(phydev);\n>> \t\t\tdo_suspend = true;\n>> \t\t}\n>>\n>> then sets phydev->link = 0; therefore subsequent calls to\n>> phy_state_machin() will be no-op.\n> \n> Actually you are right, once phydev->link is set to 0 these would become\n> no-ops. Still scratching my head as to what happens for David then...\n> \n>>\n>> Also, queue_delayed_work() is only called in polling mode.\n>> David stated that he's using interrupt mode.\n> \n> Right that's confusing too now. David can you check if you tree has:\n> \n> 49d52e8108a21749dc2114b924c907db43358984 (\"net: phy: handle state\n> correctly in phy_stop_machine\")\n\nHello David,\n\nA week ago, you wrote about my patch:\n\"This is broken.  Please revert.\"\n\nI assume you tested the revert locally, and that reverting did make\nthe crash disappear. 
Is that correct?\n\nThe reason I ask is because the analysis you provided contains some\nflaws, as noted above. But, if reverting my patch did fix your issue,\nthen perhaps understanding *why* is unimportant.\n\nI'm a bit baffled that it took less than 90 minutes for your request\nto be approved, and the patch reverted in all branches, before I even\nhad a chance to comment.\n\nRegards.","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":"ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xnR195v0Gz9t4t\n\tfor <patchwork-incoming@ozlabs.org>;\n\tThu,  7 Sep 2017 00:34:21 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S932555AbdIFOeS (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tWed, 6 Sep 2017 10:34:18 -0400","from smtp5-g21.free.fr ([212.27.42.5]:8613 \"EHLO smtp5-g21.free.fr\"\n\trhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP\n\tid S932345AbdIFOeQ (ORCPT <rfc822;netdev@vger.kernel.org>);\n\tWed, 6 Sep 2017 10:34:16 -0400","from [172.27.0.114] (unknown [92.154.11.170])\n\t(Authenticated sender: slash.tmp)\n\tby smtp5-g21.free.fr (Postfix) with ESMTPSA id BA2E35FFB4;\n\tWed,  6 Sep 2017 16:33:54 +0200 (CEST)"],"Subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","To":"Florian Fainelli <f.fainelli@gmail.com>,\n\tDavid Daney <ddaney.cavm@gmail.com>","Cc":"Marc Gonzalez <marc_gonzalez@sigmadesigns.com>,\n\tnetdev <netdev@vger.kernel.org>,\n\tGeert Uytterhoeven <geert+renesas@glider.be>,\n\tDavid Miller <davem@davemloft.net>,\n\tAndrew Lunn <andrew@lunn.ch>, Mans Rullgard <mans@mansr.com>,\n\tThibaud 
Cornic <thibaud_cornic@sigmadesigns.com>","References":"<1504140569-2063-1-git-send-email-f.fainelli@gmail.com>\n\t<f4bb5ac8-dae8-c0af-7aa6-e546fc0783fa@sigmadesigns.com>\n\t<e24693e8-d8ae-188a-2a38-c9a83fdc94e3@gmail.com>\n\t<931bf454-81ff-94dc-82e6-bc2b889bd43a@gmail.com>\n\t<d6a6b552-95a7-8353-54c8-fa804f9366a1@free.fr>\n\t<f74f1aad-3990-ae54-316f-751c3b15de41@gmail.com>\n\t<ebee6e5d-5bc1-1c5b-b31d-6d50618d6074@free.fr>\n\t<4ea8b432-4968-1616-eff9-48a2689dd3ce@gmail.com>","From":"Mason <slash.tmp@free.fr>","Message-ID":"<6721135d-8c3f-57a0-f423-9d18cd6e0947@free.fr>","Date":"Wed, 6 Sep 2017 16:33:54 +0200","User-Agent":"Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101\n\tFirefox/52.0 SeaMonkey/2.49.1","MIME-Version":"1.0","In-Reply-To":"<4ea8b432-4968-1616-eff9-48a2689dd3ce@gmail.com>","Content-Type":"text/plain; charset=UTF-8; format=flowed","Content-Transfer-Encoding":"7bit","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1764171,"web_url":"http://patchwork.ozlabs.org/comment/1764171/","msgid":"<927413e9-4f1f-963c-2d3a-5a88de2eac9e@free.fr>","list_archive_url":null,"date":"2017-09-06T14:55:20","subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","submitter":{"id":66150,"url":"http://patchwork.ozlabs.org/api/people/66150/","name":"Mason","email":"slash.tmp@free.fr"},"content":"On 31/08/2017 21:18, Florian Fainelli wrote:\n\n> On 08/31/2017 12:09 PM, Mason wrote:\n> \n>> 1) nb8800_link_reconfigure() calls phy_print_status()\n>> which prints the \"Link down\" and \"Link up\" messages\n>> to the console. With the patch reverted, nothing is\n>> printed when the link goes down, and the result is\n>> random when the link comes up. 
Sometimes, we get\n>> down + up, sometimes just up.\n> \n> Nothing printed when you bring down the network interface as a result of\n> not signaling the link down, there is a small nuance here.\n\nLet me first focus on the \"Link down\" message.\n\nDo you agree that such a message should be printed when the\nlink goes down, not when the link comes up?\n\nPerhaps the issue is that the 2 following cases need to be\nhandled differently:\nA) operator sets link down on the command-line\nB) asynchronous event makes link go down (peer is dead, cable is cut, etc)\n\nIn B) the PHY state machine keeps on running, and eventually\ncalls adjust_link()\n\nIn A) the driver calls phy_stop() and phy_disconnect() and\ntherefore adjust_link() will not be called?\n\nRegards.","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":"ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xnRTs0Pgbz9s7C\n\tfor <patchwork-incoming@ozlabs.org>;\n\tThu,  7 Sep 2017 00:55:45 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S932727AbdIFOzm (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tWed, 6 Sep 2017 10:55:42 -0400","from smtp5-g21.free.fr ([212.27.42.5]:19542 \"EHLO\n\tsmtp5-g21.free.fr\"\n\trhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP\n\tid S932682AbdIFOzl (ORCPT <rfc822;netdev@vger.kernel.org>);\n\tWed, 6 Sep 2017 10:55:41 -0400","from [172.27.0.114] (unknown [92.154.11.170])\n\t(Authenticated sender: slash.tmp)\n\tby smtp5-g21.free.fr (Postfix) with ESMTPSA id 65B685FFD7;\n\tWed,  6 Sep 2017 16:55:20 +0200 (CEST)"],"Subject":"Re: [PATCH net] Revert \"net: phy: Correctly process 
PHY_HALTED in\n\tphy_stop_machine()\"","To":"Florian Fainelli <f.fainelli@gmail.com>","Cc":"Marc Gonzalez <marc_gonzalez@sigmadesigns.com>,\n\tDavid Daney <ddaney.cavm@gmail.com>, netdev <netdev@vger.kernel.org>,\n\tGeert Uytterhoeven <geert+renesas@glider.be>,\n\tDavid Miller <davem@davemloft.net>,\n\tAndrew Lunn <andrew@lunn.ch>, Mans Rullgard <mans@mansr.com>,\n\tThibaud Cornic <thibaud_cornic@sigmadesigns.com>","References":"<1504140569-2063-1-git-send-email-f.fainelli@gmail.com>\n\t<f4bb5ac8-dae8-c0af-7aa6-e546fc0783fa@sigmadesigns.com>\n\t<a75691d9-c22a-9b89-2cce-604315062739@gmail.com>\n\t<d54d0555-c448-741c-eb63-5a773bed5a30@free.fr>\n\t<7b1c1dc9-b6e3-a1bd-2e36-474946741a79@gmail.com>","From":"Mason <slash.tmp@free.fr>","Message-ID":"<927413e9-4f1f-963c-2d3a-5a88de2eac9e@free.fr>","Date":"Wed, 6 Sep 2017 16:55:20 +0200","User-Agent":"Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101\n\tFirefox/52.0 SeaMonkey/2.49.1","MIME-Version":"1.0","In-Reply-To":"<7b1c1dc9-b6e3-a1bd-2e36-474946741a79@gmail.com>","Content-Type":"text/plain; charset=UTF-8; format=flowed","Content-Transfer-Encoding":"7bit","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1764215,"web_url":"http://patchwork.ozlabs.org/comment/1764215/","msgid":"<730292be-affa-c19d-75ab-edba367788e8@free.fr>","list_archive_url":null,"date":"2017-09-06T15:51:29","subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","submitter":{"id":66150,"url":"http://patchwork.ozlabs.org/api/people/66150/","name":"Mason","email":"slash.tmp@free.fr"},"content":"On 31/08/2017 21:18, Florian Fainelli wrote:\n\n> On 08/31/2017 12:09 PM, Mason wrote:\n>\n>> On 31/08/2017 19:03, Florian Fainelli wrote:\n>>\n>>> On 08/31/2017 05:29 AM, Marc Gonzalez wrote:\n>>>\n>>>> On 31/08/2017 02:49, Florian Fainelli wrote:\n>>>>\n>>>>> The original motivation for this change originated from 
Marc Gonzalez\n>>>>> indicating that his network driver did not have its adjust_link callback\n>>>>> executing with phydev->link = 0 while he was expecting it.\n>>>>\n>>>> I expect the core to call phy_adjust_link() for link changes.\n>>>> This used to work back in 3.4 and was broken somewhere along\n>>>> the way.\n>>>\n>>> If that was working correctly in 3.4 surely we can look at the diff and\n>>> figure out what changed, even maybe find the offending commit, can you\n>>> do that?\n>>\n>> Bisecting would a be a huge pain because my platform was\n>> not upstream until v4.4\n> \n> Then just diff the file and try to pinpoint which commit may have\n> changed that?\n\nRunning 'ip link set eth0 down' on the command-line.\n\nIn v3.4 => adjust_link() callback is called\nIn v4.5 => adjust_link() callback is NOT called\n\n$ git log --oneline --no-merges v3.4..v4.5 drivers/net/phy/phy.c | wc -l\n59\n\nI'm not sure what \"just diff the file\" entails.\nI can't move 3.4 up, nor move 4.5 down.\nI'm not even sure the problem comes from drivers/net/phy/phy.c\nto be honest.\n\nRegards.","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":"ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xnSkk0YYWz9sRY\n\tfor <patchwork-incoming@ozlabs.org>;\n\tThu,  7 Sep 2017 01:51:58 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1755074AbdIFPvz (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tWed, 6 Sep 2017 11:51:55 -0400","from smtp5-g21.free.fr ([212.27.42.5]:17140 \"EHLO\n\tsmtp5-g21.free.fr\"\n\trhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP\n\tid S1753845AbdIFPvw (ORCPT 
<rfc822;netdev@vger.kernel.org>);\n\tWed, 6 Sep 2017 11:51:52 -0400","from [172.27.0.114] (unknown [92.154.11.170])\n\t(Authenticated sender: slash.tmp)\n\tby smtp5-g21.free.fr (Postfix) with ESMTPSA id DA5FE5FF9E;\n\tWed,  6 Sep 2017 17:51:29 +0200 (CEST)"],"Subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","To":"Florian Fainelli <f.fainelli@gmail.com>, Andrew Lunn <andrew@lunn.ch>","Cc":"Marc Gonzalez <marc_gonzalez@sigmadesigns.com>,\n\tDavid Daney <ddaney.cavm@gmail.com>, netdev <netdev@vger.kernel.org>,\n\tGeert Uytterhoeven <geert+renesas@glider.be>,\n\tDavid Miller <davem@davemloft.net>, Mans Rullgard <mans@mansr.com>,\n\tThibaud Cornic <thibaud_cornic@sigmadesigns.com>","References":"<1504140569-2063-1-git-send-email-f.fainelli@gmail.com>\n\t<f4bb5ac8-dae8-c0af-7aa6-e546fc0783fa@sigmadesigns.com>\n\t<a75691d9-c22a-9b89-2cce-604315062739@gmail.com>\n\t<d54d0555-c448-741c-eb63-5a773bed5a30@free.fr>\n\t<7b1c1dc9-b6e3-a1bd-2e36-474946741a79@gmail.com>","From":"Mason <slash.tmp@free.fr>","Message-ID":"<730292be-affa-c19d-75ab-edba367788e8@free.fr>","Date":"Wed, 6 Sep 2017 17:51:29 +0200","User-Agent":"Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101\n\tFirefox/52.0 SeaMonkey/2.49.1","MIME-Version":"1.0","In-Reply-To":"<7b1c1dc9-b6e3-a1bd-2e36-474946741a79@gmail.com>","Content-Type":"text/plain; charset=UTF-8; format=flowed","Content-Transfer-Encoding":"7bit","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1764289,"web_url":"http://patchwork.ozlabs.org/comment/1764289/","msgid":"<17ff342a-2123-275a-eac8-4aec27ae48d1@caviumnetworks.com>","list_archive_url":null,"date":"2017-09-06T17:53:40","subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","submitter":{"id":721,"url":"http://patchwork.ozlabs.org/api/people/721/","name":"David 
Daney","email":"ddaney@caviumnetworks.com"},"content":"On 09/06/2017 07:33 AM, Mason wrote:\n> On 31/08/2017 20:29, Florian Fainelli wrote:\n>> On 08/31/2017 11:12 AM, Mason wrote:\n>>> On 31/08/2017 19:53, Florian Fainelli wrote:\n>>>> On 08/31/2017 10:49 AM, Mason wrote:\n>>>>> On 31/08/2017 18:57, Florian Fainelli wrote:\n>>>>>> And the race is between phy_detach() setting phydev->attached_dev \n>>>>>> = NULL\n>>>>>> and phy_state_machine() running in PHY_HALTED state and calling\n>>>>>> netif_carrier_off().\n>>>>>\n>>>>> I must be missing something.\n>>>>> (Since a thread cannot race against itself.)\n>>>>>\n>>>>> phy_disconnect calls phy_stop_machine which\n>>>>> 1) stops the work queue from running in a separate thread\n>>>>> 2) calls phy_state_machine *synchronously*\n>>>>>       which runs the PHY_HALTED case with everything well-defined\n>>>>> end of phy_stop_machine\n>>>>>\n>>>>> phy_disconnect only then calls phy_detach()\n>>>>> which makes future calls of phy_state_machine perilous.\n>>>>>\n>>>>> This all happens in the same thread, so I'm not yet\n>>>>> seeing where the race happens?\n>>>>\n>>>> The race is as described in David's earlier email, so let's recap:\n>>>>\n>>>> Thread 1            Thread 2\n>>>> phy_disconnect()\n>>>> phy_stop_interrupts()\n>>>> phy_stop_machine()\n>>>> phy_state_machine()\n>>>>   -> queue_delayed_work()\n>>>> phy_detach()\n>>>>                 phy_state_machine()\n>>>>                 -> netif_carrier_off()\n>>>>\n>>>> If phy_detach() finishes earlier than the workqueue had a chance to be\n>>>> scheduled and process PHY_HALTED again, then we trigger the NULL \n>>>> pointer\n>>>> de-reference.\n>>>>\n>>>> workqueues are not tasklets, the CPU scheduling them gets no guarantee\n>>>> they will run on the same CPU.\n>>>\n>>> Something does not add up.\n>>>\n>>> The synchronous call to phy_state_machine() does:\n>>>\n>>>     case PHY_HALTED:\n>>>         if (phydev->link) {\n>>>             phydev->link = 0;\n>>>             
netif_carrier_off(phydev->attached_dev);\n>>>             phy_adjust_link(phydev);\n>>>             do_suspend = true;\n>>>         }\n>>>\n>>> then sets phydev->link = 0; therefore subsequent calls to\n>>> phy_state_machin() will be no-op.\n>>\n>> Actually you are right, once phydev->link is set to 0 these would become\n>> no-ops. Still scratching my head as to what happens for David then...\n>>\n>>>\n>>> Also, queue_delayed_work() is only called in polling mode.\n>>> David stated that he's using interrupt mode.\n>>\n>> Right that's confusing too now. David can you check if you tree has:\n>>\n>> 49d52e8108a21749dc2114b924c907db43358984 (\"net: phy: handle state\n>> correctly in phy_stop_machine\")\n> \n> Hello David,\n> \n> A week ago, you wrote about my patch:\n> \"This is broken.  Please revert.\"\n> \n> I assume you tested the revert locally, and that reverting did make\n> the crash disappear. Is that correct?\n> \n\nYes, I always test things before making this type of assertion.\n\n\n> The reason I ask is because the analysis you provided contains some\n> flaws, as noted above. 
But, if reverting my patch did fix your issue,\n> then perhaps understanding *why* is unimportant.\n\nI didn't want to take the time to generate calling sequence traces to \nverify each step of my analysis, but I believe the overall concept is \nessentially correct.\n\nOnce the polling work is canceled and we set a bunch of essential \npointers to NULL, you cannot go blindly restarting the polling.\n\n> \n> I'm a bit baffled that it took less than 90 minutes for your request\n> to be approved, and the patch reverted in all branches, before I even\n> had a chance to comment.\n> \n\no The last chance for patches to v4.13 was fast approaching.\n\no There were multiple reports of failures caused by the patch.\n\no The patch was clearly stand-alone.\n\nThe kernel maintainers are a model of efficiency, there was no reason to \ndelay.","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":["ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (1024-bit key;\n\tunprotected) header.d=CAVIUMNETWORKS.onmicrosoft.com\n\theader.i=@CAVIUMNETWORKS.onmicrosoft.com header.b=\"ABJUgNRv\"; \n\tdkim-atps=neutral","spf=none (sender IP is )\n\tsmtp.mailfrom=David.Daney@cavium.com; "],"Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xnWRR0YPLz9t2d\n\tfor <patchwork-incoming@ozlabs.org>;\n\tThu,  7 Sep 2017 03:53:54 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1751978AbdIFRxw (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tWed, 6 Sep 2017 13:53:52 -0400","from mail-bn3nam01on0087.outbound.protection.outlook.com\n\t([104.47.33.87]:60230\n\t\"EHLO 
NAM01-BN3-obe.outbound.protection.outlook.com\"\n\trhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP\n\tid S1751907AbdIFRxr (ORCPT <rfc822;netdev@vger.kernel.org>);\n\tWed, 6 Sep 2017 13:53:47 -0400","from ddl.caveonetworks.com (50.233.148.156) by\n\tMWHPR07MB3504.namprd07.prod.outlook.com (10.164.192.31) with\n\tMicrosoft SMTP Server (version=TLS1_2,\n\tcipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P256) id\n\t15.20.13.10; Wed, 6 Sep 2017 17:53:43 +0000"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=CAVIUMNETWORKS.onmicrosoft.com; s=selector1-cavium-com;\n\th=From:Date:Subject:Message-ID:Content-Type:MIME-Version;\n\tbh=Cs7ymeCJQTUJwODwNTtzTetkHGA3rmcYjZO9yLCjz5A=;\n\tb=ABJUgNRvYWXt+1MIDLpeQBOEpIr7V/bv+q1bDYnbY442bI+5hU1AQO876M3Vo4tyejaVcoFsJ+pYzLBorndsFFbBcZ7BAWZIU1SR26PqiYhpidmJ37EtiqUssaqn3jgYQy9PVoFB1U62z36s3z1B8EwlYrsreQMF00FH8qEZeBc=","Subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","To":"Mason <slash.tmp@free.fr>, Florian Fainelli <f.fainelli@gmail.com>","Cc":"Marc Gonzalez <marc_gonzalez@sigmadesigns.com>,\n\tnetdev <netdev@vger.kernel.org>,\n\tGeert Uytterhoeven <geert+renesas@glider.be>,\n\tDavid Miller <davem@davemloft.net>,\n\tAndrew Lunn <andrew@lunn.ch>, Mans Rullgard <mans@mansr.com>,\n\tThibaud Cornic <thibaud_cornic@sigmadesigns.com>","References":"<1504140569-2063-1-git-send-email-f.fainelli@gmail.com>\n\t<f4bb5ac8-dae8-c0af-7aa6-e546fc0783fa@sigmadesigns.com>\n\t<e24693e8-d8ae-188a-2a38-c9a83fdc94e3@gmail.com>\n\t<931bf454-81ff-94dc-82e6-bc2b889bd43a@gmail.com>\n\t<d6a6b552-95a7-8353-54c8-fa804f9366a1@free.fr>\n\t<f74f1aad-3990-ae54-316f-751c3b15de41@gmail.com>\n\t<ebee6e5d-5bc1-1c5b-b31d-6d50618d6074@free.fr>\n\t<4ea8b432-4968-1616-eff9-48a2689dd3ce@gmail.com>\n\t<6721135d-8c3f-57a0-f423-9d18cd6e0947@free.fr>","From":"David Daney <ddaney@caviumnetworks.com>","Message-ID":"<17ff342a-2123-275a-eac8-4aec27ae48d1@caviumnetworks.com>","Date":"Wed, 6 Sep 2017 10:53:40 
-0700","User-Agent":"Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101\n\tThunderbird/52.2.1","MIME-Version":"1.0","In-Reply-To":"<6721135d-8c3f-57a0-f423-9d18cd6e0947@free.fr>","Content-Type":"text/plain; charset=utf-8; format=flowed","Content-Language":"en-US","Content-Transfer-Encoding":"7bit","X-Originating-IP":"[50.233.148.156]","X-ClientProxiedBy":"BY2PR07CA0028.namprd07.prod.outlook.com (10.166.107.23) To\n\tMWHPR07MB3504.namprd07.prod.outlook.com (10.164.192.31)","X-MS-PublicTrafficType":"Email","X-MS-Office365-Filtering-Correlation-Id":"c93183cd-5c53-4bb1-68fc-08d4f550374e","X-Microsoft-Antispam":"UriScan:; BCL:0; PCL:0;\n\tRULEID:(300000500095)(300135000095)(300000501095)(300135300095)(300000502095)(300135100095)(22001)(2017030254152)(300000503095)(300135400095)(2017052603199)(201703131423075)(201703031133081)(201702281549075)(300000504095)(300135200095)(300000505095)(300135600095)(300000506095)(300135500095);\n\tSRVR:MWHPR07MB3504; ","X-Microsoft-Exchange-Diagnostics":["1; MWHPR07MB3504;\n\t3:fEo97xmexfNJ7OrW1/h6VmlHM6QsGMy5Kp3o+l6ARzBvTYIGbpfDVG+iW27beYDLNdmJRtePr0CbPRFaG+PaFKA6VOdQe0M7ydk2ZYZ3ueeMNMWEB+vW3p8szL//AiTrZzAdxiWDsnQ90/sqvOdsi6OvE1teFfSfGMaZIOLuxwpK2Kv7mVkPbphWg2cVZk3+O+BVACGzXZBaPLweHHHdAo7FxlVomH+YSFUwVTyvuTmw1XWZtqtcbjTFvMxkEzfD;\n\t25:vWSjytmScyji94eJT3lIVOGtE4Pi4TM1msPq8/YSIxRvpg2PHSEQTEdT/zIiToUfLFa+QFSN2hKhqUJK9sr1wQorlGKzTtwMMA1NWe2YNdhTPqUKWDMS+RWP6KrahTsRjYtZL4oU8PuqO8LKL8AST87Hvjh8/cIVsqp5NiHZKPk3ahoV6N/2iR6Dtzrha/I7S4byo84IDOBs26EYZkZs2NGLozt8Nih3iPciplN/zw7D91+nAU8goXJepcF9K+qS1ZbMlb/16U8ruNg66PtibDHAkqsLwnKdfIEgj2LVGY4KXrY625L8nbFBf5qc2g6hQG9hK4wp5VJPIoOT1kL0+g==;\n\t31:tP1r3i1ncb4HEThlN53LHbomYOMHXN7w9f8gOli70JGB37IRRN1/BbKUPiEZuOF+XMXNC5QH9OmGMuipIyA59eVheG89pX48+OKUWEb0RMefkqNV4Ek4LjyBFv86AxZEMPfbBwUGkaJ1aYfsOZQzMhwgq+cXHDbmoF0dZxzkAUU/ySZlSyEfv+dM/VOzkwWe+N/DhPXNGdos/EiVI7OB7k1H/Ubfd1OYOcLxxpVCtdM=","1; 
MWHPR07MB3504;\n\t20:OM6xXiuIuieq5QR8WThu9e+IJU4I6HID7lMGccrGzrmV8JmpF4gbmQJINVfG7G3AXckfwBDZvCf3hU+yFaJWyuK48/wwB0jcbtPgxHWNaR2DDzlKYPgdqQOCsHAa2hDK8IgeaxVeIq/tBWzA8gOI+yVHY7vTTyG5G51nIDxHicOLisxIcUrkUfNEenPOybtPgNw0AEghO8Gbd+8CsDLBaCtyHhFd5XeIIZEVgQYqhKx8Ac9qFHgI7gxp7s2vjTM4RiWXCY507uaxJGquACgHCXarrGF5yHkw8yRkxZkKvpheu4KTapbm1KcA3SnATpXaX0BAY4hnDKWpQ8OQ2UZRcqjqjA9Xj0wnbrhwuhw+pMVXewDjhKS1j/F+nHc+XlpjuEWUB1U6eLantXEyU91sugxErQ/Zj/eKc8NgV+H93yqV31ZiYBs48rJeTMlnHJIWtLXceu6gL3ICgqY5Z7aD8bQfaEgkHuyH8FgSt8rrJK1PQ/XvK2JjZ9uc4uZtL/asqqi8+I42XLilW/qVYHugWjZmvj7NuaELSHJC7mWoBzE2K6JEK3bMAQwykxrCMvMJ50xlaFrNiT3hwLNP+OkrryYJVZh8+oEUm84grPEgT4I=;\n\t4:Ss90rGeA2TEuzSwcNwXhBYW4CLxJxg2NNTwhULXqK0sNtm6d9z+H0ixw8Bzubb4pWQgWxImjVWaI8WilPo3RCWb5biPRV3/bdcLhcLMIOaVANdsLaH4Qs+d6OXzkqWEPCqLpt/kCf9mJrs2jH/bD7ebMSyL52W+gJ5Z3I0r2gDiBzu36fy7e3pZ1H+s9VEmvgIrKySS4YrRvU0N/bagXL0HxuI8gDHX6iubEopuc4wkM0xgP/qp4pXFsn/fdeDEG1DiNv2wJOO1l92+BHE3gaZfUnp8cFtzdx74DfFBC1GA=","=?utf-8?q?1=3BMWHPR07MB3504=3B23=3AWMdA?=\n\t=?utf-8?q?b8xA8/tzd7hl8lY5PGDm3hLiMz0heqZ7r5N9OOxmnJdQixjP6Tqrdwkl?=\n\t=?utf-8?q?E/JjXjEX+n+R1LHaf1GHFPpjXttqV5pEe90KBcK/zDvoeXh887fx3g+X?=\n\t=?utf-8?q?4UP4EWmMYpaHuQQmRqoTbJlZI/US53C5fTEzaeFqposcq5iq0+br+CiW?=\n\t=?utf-8?q?NElVqxNsy8oc5xnhKfQH2TZPCSLEkOVUigTJEljFMjt9hiz5k8b9e4o7?=\n\t=?utf-8?q?Xy5im0YBw0BUPzF7JGvOV448P0FhsmIdmtVzaOtyqJVc8x2thuq/XsM7?=\n\t=?utf-8?q?9kNmyHtPS1t00CTgqae0w9sc4JP+0opvmnnXkHund0MGIVE5GuVlc9t3?=\n\t=?utf-8?q?9wROgTJ44jfXeu/OdGDAujTRaqrB4ylAeHh6F0+GX9X0ltvDwGDI9QKm?=\n\t=?utf-8?q?rpXCVFmJhOtqNg9paPDx92ZgRcJZobAM4KML07i1Nd0aWLyjUrIagEHH?=\n\t=?utf-8?q?xMhVouNNHAV2jJEgzHeZuL65zdoAx2V3UJ2vJDf4W2uMbn71oBRpf+QK?=\n\t=?utf-8?q?B8yLOtATwJU4Gv+ByDAXz0Pcf543xSvWEg8aNgiskJJY3SmWeQgYY0A2?=\n\t=?utf-8?q?yA08Gu4DtFfRJF/3pY0JY//7T7r6jjWmCS4YzEI+moBNlVzVo2WCysMi?=\n\t=?utf-8?q?f8jRTSKTraE+F04ZEdFASpdKyDirKm/hyJwGTk+pFnWTUv8qdl+L7TKB?=\n\t=?utf-8?q?xzxShRsd8wtfhvs4jP+oxeCZTDddOYwmBjDZJ5Ye8MNs5vx7ueqKTpY6?=\n\t=?utf-8?q?ZnVid10b0yQaiZSm5QeZxGT0r6x2yqpHH+0Flb94kwzvVC8Duuf25sFU?=
\n\t=?utf-8?q?/bKqDF2ER9w1rU8CsO9AmqzCtt2i5efUlmy1qH0JyoDvbnaGepi6JdjH?=\n\t=?utf-8?q?X/WwisDQarKHb2otKA2cCprd3YnB6DRZDlRoC9ISmWtnUyTAqxavK2GM?=\n\t=?utf-8?q?q0/f/pqS9Xij/jflMRBt+HMdJS+4d/5QXXxFl5vHQpS5FDHwG51sd2LP?=\n\t=?utf-8?q?P640jdZ6DoEl3qrYhMu8GhcgIMDpKGCClFhgRbM8mVk4JFMx4rPMkFM6?=\n\t=?utf-8?q?BtqxwdRNnAkiuGkTLsuO4iTJ+rAt0nYmFXo+dZdFBGPrezcdc92JlU2E?=\n\t=?utf-8?q?SYK4hrbhhq7YEcv24sLOoPlYVRW5a5YCYa2lxT6DiJcT9TAuDdQoLm4q?=\n\t=?utf-8?q?C9GlqgeZRoc/oX8wNcrPkyH22OtmNZM6MzCRuqnwN6nhPm9G9GpbRrxO?=\n\t=?utf-8?q?K42LRspGibMqlhmDCOFqpZL/WyTyoLfKMP6lvjP68d2rW481UZZRKZrk?=\n\t=?utf-8?q?NdfGTKzLPj2of3lUx/aMxPXTxt2ZuuWBUOuAw9nJEdOtkqKKMwYBbBD7?=\n\t=?utf-8?q?HPqm+3u+hWbqdFsTpwrSMs1vwma4NX2HxhyuF6QfdEG+W/g+ZHto4xNf?=\n\t=?utf-8?q?RzopjcM25zO+OluvIm1v2j7sX3UJN8MsW93MmPKe9SkzEThJw2eLx7dw?=\n\t=?utf-8?q?WifB1lE+c40BQL6JrMdspzarv3ept5SghwG1vaF5/k3Zrec=3D?=","1; MWHPR07MB3504;\n\t6:XVonI9Ag0NQ61NanQuYcA8lTFOLBTN2/czuj9bUK8Aspi1I4UUUCao5ZkbSKryqEQOuTMStv+GR3VX6La2yCiZrHOzsDcpWTAj/5mEtjtdj3ZvwztnTcCDfKtJlTRUH9TSi4gBv9WhNwofW8qPn379rDbJmYkP2dDzF/2KeO1wPGG+hbBgZgDtAz4QLS5ung2HVJtKrNE/Pj9bvbJpyGsEE9y257pn2wwacCagVS6xDe9e5cS3iM4530Y/3GgA45SAbthj/M3buSenQKfy4KRqTVZv+GiCkTzCs0OI4ZosUJI6bk+/XUgjKlzlZ0jLdvaEaFQU5VKlpdYsKMkA7ySg==;\n\t5:+Nzzu4VYdbHLbjD233hISbn2xILSd3aA7I7YzOeyfjTgcdcrvC3MVev7GnrgL+4nFI3uZQ+Akj1IXmhJvStV5L9aS0fGygz90gPaTv2OHp1riUDKtTGfuMiFsPIesqI2EJ2pwfBn4RTnPVIzb0CiWw==;\n\t24:84qmw07eucaP8xbqPHk9RuzEpl46M4rCIhyPcABcaRUWMZaIqba3TAq9faQCTJ5TMxUHbLZWFMvU0b+DHG29MhguIAgmJyrm3Z18UNNZC7k=;\n\t7:oT0OKUcLGMHXsXvgVYYMgJboINsIGOekFzw4PJyMJZ8ZDD7I36AppzD0vYB0PocpHYcaqz4CCUXeZ/Qk44J8MIntlV0j2GrIJDX7RDsnZt60B0T6ZxXSb4Ov3Qfe2tqcduvPo+NxxIrrJsofzefOkPMntEUSrLtFx2KxmH1IxRknmCgCqWhRZ35uilL/AYrcx9BluRTNY0vsqLtLnv0N7HBLPLylUx8GkLX/iuigl5U="],"X-MS-TrafficTypeDiagnostic":"MWHPR07MB3504:","X-Exchange-Antispam-Report-Test":"UriScan:(20558992708506);","X-Microsoft-Antispam-PRVS":"<MWHPR07MB3504332059C6533266D7821A97970@MWHPR07MB3504.namprd07.prod.outlook.com>","X-Exchange-Antispam-Report-CFA-
Test":"BCL:0; PCL:0;\n\tRULEID:(100000700101)(100105000095)(100000701101)(100105300095)(100000702101)(100105100095)(6040450)(2401047)(8121501046)(5005006)(10201501046)(93006095)(100000703101)(100105400095)(3002001)(6041248)(20161123564025)(20161123555025)(20161123562025)(201703131423075)(201702281528075)(201703061421075)(201703061406153)(20161123558100)(20161123560025)(6072148)(201708071742011)(100000704101)(100105200095)(100000705101)(100105500095);\n\tSRVR:MWHPR07MB3504; BCL:0; PCL:0;\n\tRULEID:(100000800101)(100110000095)(100000801101)(100110300095)(100000802101)(100110100095)(100000803101)(100110400095)(100000804101)(100110200095)(100000805101)(100110500095);\n\tSRVR:MWHPR07MB3504; ","X-Forefront-PRVS":"0422860ED4","X-Forefront-Antispam-Report":"SFV:NSPM;\n\tSFS:(10009020)(6009001)(189002)(24454002)(377454003)(199003)(31696002)(5660300001)(50466002)(69596002)(68736007)(6506006)(76176999)(50986999)(229853002)(6486002)(54356999)(97736004)(42186005)(47776003)(53416004)(2906002)(42882006)(33646002)(106356001)(4326008)(2950100002)(230700001)(5890100001)(101416001)(53546010)(105586002)(36756003)(66066001)(93886005)(7736002)(305945005)(65806001)(65956001)(64126003)(6246003)(54906002)(53936002)(39060400002)(83506001)(25786009)(23676002)(65826007)(8936002)(6116002)(6512007)(8676002)(31686004)(3846002)(81156014)(4001350100001)(72206003)(81166006)(4000630100001)(189998001)(478600001);\n\tDIR:OUT; SFP:1101; SCL:1; SRVR:MWHPR07MB3504;\n\tH:ddl.caveonetworks.com; FPR:; SPF:None; PTR:InfoNoRecords;\n\tMX:1; A:1; LANG:en; ","Received-SPF":"None (protection.outlook.com: cavium.com does not designate\n\tpermitted sender hosts)","SpamDiagnosticOutput":"1:99","SpamDiagnosticMetadata":"NSPM","X-OriginatorOrg":"caviumnetworks.com","X-MS-Exchange-CrossTenant-OriginalArrivalTime":"06 Sep 2017 
17:53:43.3906\n\t(UTC)","X-MS-Exchange-CrossTenant-FromEntityHeader":"Hosted","X-MS-Exchange-CrossTenant-Id":"711e4ccf-2e9b-4bcf-a551-4094005b6194","X-MS-Exchange-Transport-CrossTenantHeadersStamped":"MWHPR07MB3504","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1764293,"web_url":"http://patchwork.ozlabs.org/comment/1764293/","msgid":"<ff070239-28b7-d41b-8abe-c9f810561372@caviumnetworks.com>","list_archive_url":null,"date":"2017-09-06T18:00:53","subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","submitter":{"id":721,"url":"http://patchwork.ozlabs.org/api/people/721/","name":"David Daney","email":"ddaney@caviumnetworks.com"},"content":"On 08/31/2017 11:29 AM, Florian Fainelli wrote:\n> On 08/31/2017 11:12 AM, Mason wrote:\n>> On 31/08/2017 19:53, Florian Fainelli wrote:\n>>> On 08/31/2017 10:49 AM, Mason wrote:\n>>>> On 31/08/2017 18:57, Florian Fainelli wrote:\n>>>>> And the race is between phy_detach() setting phydev->attached_dev = NULL\n>>>>> and phy_state_machine() running in PHY_HALTED state and calling\n>>>>> netif_carrier_off().\n>>>>\n>>>> I must be missing something.\n>>>> (Since a thread cannot race against itself.)\n>>>>\n>>>> phy_disconnect calls phy_stop_machine which\n>>>> 1) stops the work queue from running in a separate thread\n>>>> 2) calls phy_state_machine *synchronously*\n>>>>       which runs the PHY_HALTED case with everything well-defined\n>>>> end of phy_stop_machine\n>>>>\n>>>> phy_disconnect only then calls phy_detach()\n>>>> which makes future calls of phy_state_machine perilous.\n>>>>\n>>>> This all happens in the same thread, so I'm not yet\n>>>> seeing where the race happens?\n>>>\n>>> The race is as described in David's earlier email, so let's recap:\n>>>\n>>> Thread 1\t\t\tThread 2\n>>> phy_disconnect()\n>>> phy_stop_interrupts()\n>>> phy_stop_machine()\n>>> phy_state_machine()\n>>>  
 -> queue_delayed_work()\n>>> phy_detach()\n>>> \t\t\t\tphy_state_machine()\n>>> \t\t\t\t-> netif_carrier_off()\n>>>\n>>> If phy_detach() finishes earlier than the workqueue had a chance to be\n>>> scheduled and process PHY_HALTED again, then we trigger the NULL pointer\n>>> de-reference.\n>>>\n>>> workqueues are not tasklets, the CPU scheduling them gets no guarantee\n>>> they will run on the same CPU.\n>>\n>> Something does not add up.\n>>\n>> The synchronous call to phy_state_machine() does:\n>>\n>> \tcase PHY_HALTED:\n>> \t\tif (phydev->link) {\n>> \t\t\tphydev->link = 0;\n>> \t\t\tnetif_carrier_off(phydev->attached_dev);\n>> \t\t\tphy_adjust_link(phydev);\n>> \t\t\tdo_suspend = true;\n>> \t\t}\n>>\n>> then sets phydev->link = 0; therefore subsequent calls to\n>> phy_state_machin() will be no-op.\n> \n> Actually you are right, once phydev->link is set to 0 these would become\n> no-ops. Still scratching my head as to what happens for David then...\n> \n>>\n>> Also, queue_delayed_work() is only called in polling mode.\n>> David stated that he's using interrupt mode.\n\nDid you see what I wrote?\n\nphy_disconnect() calls phy_stop_interrupts() which puts it into polling \nmode.  So the polling work gets queued unconditionally.\n\n\n\n> \n> Right that's confusing too now. 
David can you check if you tree has:\n> \n> 49d52e8108a21749dc2114b924c907db43358984 (\"net: phy: handle state\n> correctly in phy_stop_machine\")\n> \n\nYes, I am using the 4.9 stable branch, and that commit was also present.\n\nDavid.","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":["ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (1024-bit key;\n\tunprotected) header.d=CAVIUMNETWORKS.onmicrosoft.com\n\theader.i=@CAVIUMNETWORKS.onmicrosoft.com header.b=\"FRkbB8ay\"; \n\tdkim-atps=neutral","spf=none (sender IP is )\n\tsmtp.mailfrom=David.Daney@cavium.com; "],"Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xnWbg0TkTz9sBW\n\tfor <patchwork-incoming@ozlabs.org>;\n\tThu,  7 Sep 2017 04:01:03 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1751543AbdIFSBB (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tWed, 6 Sep 2017 14:01:01 -0400","from mail-by2nam01on0080.outbound.protection.outlook.com\n\t([104.47.34.80]:39901\n\t\"EHLO NAM01-BY2-obe.outbound.protection.outlook.com\"\n\trhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP\n\tid S1750836AbdIFSA7 (ORCPT <rfc822;netdev@vger.kernel.org>);\n\tWed, 6 Sep 2017 14:00:59 -0400","from ddl.caveonetworks.com (50.233.148.156) by\n\tMWHPR07MB3503.namprd07.prod.outlook.com (10.164.192.30) with\n\tMicrosoft SMTP Server (version=TLS1_2,\n\tcipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P256) id\n\t15.20.13.10; Wed, 6 Sep 2017 18:00:56 +0000"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=CAVIUMNETWORKS.onmicrosoft.com; 
s=selector1-cavium-com;\n\th=From:Date:Subject:Message-ID:Content-Type:MIME-Version;\n\tbh=ObG4RYTn4PVCeLbnWy240ZT1OXlO6g89ZpfyaRnQXeo=;\n\tb=FRkbB8ayNWBqkB9God6sy4ZP5deBFIMLqT5rnAdbWEd3Na/wzLUkZ0WwYIDYp/m/0JlH4Sm0QzbAj7vM7gMO6ds/nq2s8JaNLb8lh3x+hJL4jM7Ddqc2+JHpW8dGKFn9HM4WxCDy37KtslWZH3UOYhTGG0IMjO5HJJGy5/kEhm8=","Subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","To":"Florian Fainelli <f.fainelli@gmail.com>, Mason <slash.tmp@free.fr>","Cc":"Marc Gonzalez <marc_gonzalez@sigmadesigns.com>,\n\tnetdev <netdev@vger.kernel.org>,\n\tGeert Uytterhoeven <geert+renesas@glider.be>,\n\tDavid Miller <davem@davemloft.net>,\n\tAndrew Lunn <andrew@lunn.ch>, Mans Rullgard <mans@mansr.com>","References":"<1504140569-2063-1-git-send-email-f.fainelli@gmail.com>\n\t<f4bb5ac8-dae8-c0af-7aa6-e546fc0783fa@sigmadesigns.com>\n\t<e24693e8-d8ae-188a-2a38-c9a83fdc94e3@gmail.com>\n\t<931bf454-81ff-94dc-82e6-bc2b889bd43a@gmail.com>\n\t<d6a6b552-95a7-8353-54c8-fa804f9366a1@free.fr>\n\t<f74f1aad-3990-ae54-316f-751c3b15de41@gmail.com>\n\t<ebee6e5d-5bc1-1c5b-b31d-6d50618d6074@free.fr>\n\t<4ea8b432-4968-1616-eff9-48a2689dd3ce@gmail.com>","From":"David Daney <ddaney@caviumnetworks.com>","Message-ID":"<ff070239-28b7-d41b-8abe-c9f810561372@caviumnetworks.com>","Date":"Wed, 6 Sep 2017 11:00:53 -0700","User-Agent":"Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101\n\tThunderbird/52.2.1","MIME-Version":"1.0","In-Reply-To":"<4ea8b432-4968-1616-eff9-48a2689dd3ce@gmail.com>","Content-Type":"text/plain; charset=utf-8; format=flowed","Content-Language":"en-US","Content-Transfer-Encoding":"7bit","X-Originating-IP":"[50.233.148.156]","X-ClientProxiedBy":"BY2PR07CA0091.namprd07.prod.outlook.com (10.166.107.44) To\n\tMWHPR07MB3503.namprd07.prod.outlook.com (10.164.192.30)","X-MS-PublicTrafficType":"Email","X-MS-Office365-Filtering-Correlation-Id":"a4cd2b15-a15a-4f5c-8348-08d4f551396f","X-Microsoft-Antispam":"UriScan:; BCL:0; 
PCL:0;\n\tRULEID:(300000500095)(300135000095)(300000501095)(300135300095)(22001)(300000502095)(300135100095)(2017030254152)(300000503095)(300135400095)(2017052603199)(201703131423075)(201703031133081)(201702281549075)(300000504095)(300135200095)(300000505095)(300135600095)(300000506095)(300135500095);\n\tSRVR:MWHPR07MB3503; ","X-Microsoft-Exchange-Diagnostics":["1; MWHPR07MB3503;\n\t3:C5xJnKYTkV03YSwtGdTppL/IjvtDvFVsDsdLbeuZIQpGkOFObdssdgXqhtiNYbfqrPZZ438CP+no36CDWGQFJbbrBA623z+JRO2nYvNFuApHRJ8FojHJ9/PmsZnROgaq7QPtrsxmClHp2W9kJA4H0Xy2Grx77gIQnz6jbnLcRUYXmRyn+8XS8XtVmrbUPI2C163X/1G5nhm/OE4zRiRi+sE9F0G8xKitp9CzzM9HhmBU6KevwsXMY8/Mr7qZ9U+K;\n\t25:ttOAThdEPcvtXkftvfSAof5HaBpsJwBi5/reSqOFIng+eKvAMaXD9Izho+lk5i6Xga+HUHgupFeYZ54pLflptGIuNibz35k+BJV9xhdyBk6rp/l32mp+v13JpeqOqEAqjf3KTjkcWiftSI3D0Mf0SCWQJ7hcmeVeJjrnSk4OV81Ha0wwO+xMFTjpq0zGHP5UCnfMmjZ60mRDpIQIlaFT6WCk+GoybV5T6p/msBBnbsGpihZa9+KC2SK37jVa3OKmvRxz7Kc6yI8d9xtR8Bna/Om70bBpibH1vXb5NzTGTlM+h03kDUMVRWquK/ZgsHIS/Ji5f3Vwv0jvd1ZjVftE5g==;\n\t31:x/fa3S3O0vbbeKiXl3XICb8Nq/Od83FvE9r1YSXa7u1otCknhch1I4GkTzZ5JQjDuEy8lji/4/gr67qLOD7AiTjr91DyPOvNnYvbNh7CEgDyL2cgDQTje3jAMufPlr+ReSsOTq1uc4Lgt4QKKXQc0PYD68ZNT5CM7f1+Ijy6Bl5X34NlfG9AMyF9BjMwr11Q3LsgZnXqPvRFClMAFya//14sl19GQzaE8cWiXMc3xlU=","1; 
MWHPR07MB3503;\n\t20:NBCWI0pBdeqGmzcN43Iy+dyy/rEg1BArW1DalqApD7f3F32lHOt7O7iv91xGf8RfrSB2mXVYeJ/mgVqFtypU+y/BsRfQtC9OsT51V+EMNYFGtLIcgvrBPtHCcAVlt72HIj9PHHOvLC931hRUBiZuZDpRPaT+H/p/S7gINfwqGQA+VjTJk6BvKV9NOouqjZvc9wRyPgEoEBX20X6xj/BN6AM/IXMR11k4oVg6SIJoBu2i/S1Xg4gjGdFfigoY7FcdeBiKfLVMPa0osnIuQmgD7e+nsZ6Dg2H6WKjUetwwEY6oaX6PQ+jSZrGRnRWZrcWz1q56aoJHX27QncnKHoShfmn27jkZeaNtcbr6Df3OAr+eoNaZmGHrCwIwpIhjMyXeBjeC4lm9y6hkZs4ETz860LLoXcKonVjzg8x4Rbi06JFw9pAiJtu9U7HKUbGOFclmNweMQHKvHh/wCTTVEOV0EOrBvMNQ9DDnHK50tOC+SwGnIxOAGBxU82G/xnskRwXqQwj1B3IS9PDGmYwdF9r3jFQgX8xilNTjeKSSqgc4+qzm7dXT+aBEirik/9iUgfrKXD6IYLzaw6ddwPoq8mVAyTOLpa5c1eLC1gO4eAVsN8g=;\n\t4:+qRDDOy5GAG4DkVj1YTbwjCTR+uPX5Dhj2tsbxmUlor/gmMuy/dPlA96ZNW3ecBW3wke+5kD/7tCjBu0TzEUi9xmTk16Ru2KTA0j9w4QSa2e3RIWPBzoIpxwZokNG5SwrBJ3JcaPr68UJE0aBypFYnZy7REjTls/rnDWhM8XqXqI4rDORYKxSVCzmNkbsflsCgZz1HEPiOJjnVU1B5WGxouGr+wksytU2Pve9vFbyAfepq8TGwt5w0ufEFWMfuCT","=?utf-8?q?1=3BMWHPR07MB3503=3B23=3AINKD?=\n\t=?utf-8?q?tGpUW6YNHhw5IL4k7n9hfRvoThBXdt5PpmQkanpEJau41KdWAcb7zfXv?=\n\t=?utf-8?q?kXhWgWRFRf0pnuxjwstkXk7ku6eFEUKpjaiEbxUKuZI1F9zR1sn24Srs?=\n\t=?utf-8?q?Z9J9N+NLAtHXP3ysmORvnOoGX5vEvR0ryoYVarhlx+rVWs5c4pLAYWhF?=\n\t=?utf-8?q?Gq015zd+KA4mO4gHVn5VWiNlBXxU6GM73CE2p/BjpcLZM10Je2Mmlsca?=\n\t=?utf-8?q?MSGO6eFUNmB3GWPYSuKDTT6fVt3XhWs+hTpZdHeiLmu/UwUaoZJttuub?=\n\t=?utf-8?q?LzPKHstsaiRMLk5BgTvgTHbLt+nO7ETvEdm9a2CZxLpRuLr4FUNK4c9o?=\n\t=?utf-8?q?N4YlB2qL78bf2YHNVLkGg7NKeOK1LAxP0J4v5GZQJfgyug8W7Xcu112L?=\n\t=?utf-8?q?WnkR3rZEPaC7d4bO6xVyYPLX8E80JWnc9VtbpQfrWYGbYfLXJzuPWYdB?=\n\t=?utf-8?q?6sRJHdKKhDKYk9ciQ0w0FTflQio7AJ3n5KcdFyjvypte52f6h8NNG76G?=\n\t=?utf-8?q?yH7zr1Tvq0HwEbyoAE+9NfywnguA7hRC/jQ2LmxJ286TgYG9I0ELbWyT?=\n\t=?utf-8?q?ENT5s2Mh1o0BbnVmBd72NIcac09oLA8bG43oHcXACmFKcfIP3C+vVV3y?=\n\t=?utf-8?q?P6AH6ASnKbFH27BIB/6dObBA+nNVTminYvFF8pvlyUQaCqm4YqyKUGJt?=\n\t=?utf-8?q?14oEQfhmu3q9JJUaGjLHb2lMf/ZJWkDBxV69gpkU4YESVbUVrh9t7plE?=\n\t=?utf-8?q?iBhNV8z9mr3Nx25vmDxc1YR/Opk8pdDiLBWunobc+hGDEL7o4gz3+WG+?=\n\t=?utf-8?q?pByEI39DBc8AYhzxYb1H/qBvid4Zks
H8H2NlH6MZIzLqrIwITPyVQlJ1?=\n\t=?utf-8?q?0a9Fkhx8H4YRI6HnEjRKcmuJ2RhnpEdJam9pp/FIP/7+3r3Q951W3S+0?=\n\t=?utf-8?q?Js2cJyof0taQHE1TNfV2mAzyQpHFYbgiVmy65qE+wU22U7FSoVDqcOCj?=\n\t=?utf-8?q?SQ3q7XAD9vRRWon/MrbwF0d2JxxTXJlwGP5fGnBvO49v8vDx1ktvDrsk?=\n\t=?utf-8?q?+g5yVXij1w1cNWGuFEng1VOpXc1lTAtazhkUPHziMjnrSj14i1dm/GWz?=\n\t=?utf-8?q?Ul66e6M09mfdyYebkboH+MHOnoRppyUuQXwvgIZqPMer+sBxshUqahBS?=\n\t=?utf-8?q?tspSmYGlRyIBu9BnhexG4gvAGBF0cgVZKgcf6NMwqRiGRdQFDWrFwwwe?=\n\t=?utf-8?q?SdMF8XEhhKJse8IKEbKt9ru+9IbWW1qtd9K7yGYK+xhqsHMvkkrKEoY4?=\n\t=?utf-8?q?eJDyOKxd6y9gDfDatrbmUXZ5a1omsxZllDwD70Goa2P6UGSLLfoyXno9?=\n\t=?utf-8?q?d7HaasYXa8+EoRxaVg7aNIKLQQhzHO5MdnsTCDEEEbZfochAL04ocPiB?=\n\t=?utf-8?q?gHxrcWqx4qUACBdfeF/X6URDq6daAjmi71YEKZcEC2ncfx3gjE+8iEt4?=\n\t=?utf-8?q?l2JgjHUVVlLcj3nPtbULzyF5tZSLJeu5dAFcqPX5rPKUUR1rg2cNxE96?=\n\t=?utf-8?q?NhWumXwprLV0?=","1; MWHPR07MB3503;\n\t6:I1gXNz1tbxPvg7uifJraW/EFz0KEjm6AWkvuSblFGejrzZFnCSgkRV8Efr5/tWOaPUk3VS0EnVf8deICcAO+LMGauJpeD1v56dIV7s8aeHt3W97AkwTQDVTMp4mTg52gE9dICWZkBSwz70H5P/vqVAZJriwxJncNxbKEhpxiKpOX+gqUBO0t8MI2D9hqdnwGLBlC2/yRhK0b2OuDHuxSsIGXlswBEfZzobfEkkoaJZNpCEUti7zsAJN1SL4ROxaO9z5piR8dSfpqP03AvZOmplCcvLR4ft/J6KAQxLyWa8VNKBTkFFG5AhyZzgE1gbx2ZHmyuMGimDVHIBGoSrI2JQ==;\n\t5:qDSk2ZzMHIapi95N98WCah3TRwXbTwCPP4az4sUUMHV9M7/HEADqO7uz4jmN3tvsdZDxq87LDVp0XUzAmFj+nYu/6OtGcKgRT9/BY1l6hrKFelQ5EffTVBu7gUZlKLtYZnPiigAu5xzGUVKd5dwAfw==;\n\t24:+JnZFCPnyUy+aRCTb0AqsNhSThUA6Yh4OubZ8o5rzGjFDlOHT+jvmuFdpU1ocfEDRLwo/SBu6mmuJ9xrDNen4QlCx+il4fgO8lzXLfRHDcA=;\n\t7:F5Jmn0jxT1uzGWoKXYLUCBWolpY0YFWfTSXC2XuyyYOLP1xMT3l/dQOwcfOasZYDMicuVIlesy5/Ulw3fGxU86pdH6SRcMYak0DpYKy4ab5IOV2ftcP6u4EnRJZ6ocXolcwrMlp/ZTO8ghMRbiPaSeoVQawcbC3iL8e8DsHJovQLAlBnLHH8eWc663BfaGjov0nK+Yc0Xu2uB/Be76KgmulnRMuiTW7IZPjxMS5J1pU="],"X-MS-TrafficTypeDiagnostic":"MWHPR07MB3503:","X-Exchange-Antispam-Report-Test":"UriScan:;","X-Microsoft-Antispam-PRVS":"<MWHPR07MB3503E8791FF2F987142B828097970@MWHPR07MB3503.namprd07.prod.outlook.com>","X-Exchange-Antispam-Report-CFA-Test":"BCL:0; 
PCL:0;\n\tRULEID:(100000700101)(100105000095)(100000701101)(100105300095)(100000702101)(100105100095)(6040450)(2401047)(5005006)(8121501046)(3002001)(10201501046)(100000703101)(100105400095)(93006095)(6041248)(20161123562025)(201703131423075)(201702281528075)(201703061421075)(201703061406153)(20161123564025)(20161123555025)(20161123558100)(20161123560025)(6072148)(201708071742011)(100000704101)(100105200095)(100000705101)(100105500095);\n\tSRVR:MWHPR07MB3503; BCL:0; PCL:0;\n\tRULEID:(100000800101)(100110000095)(100000801101)(100110300095)(100000802101)(100110100095)(100000803101)(100110400095)(100000804101)(100110200095)(100000805101)(100110500095);\n\tSRVR:MWHPR07MB3503; ","X-Forefront-PRVS":"0422860ED4","X-Forefront-Antispam-Report":"SFV:NSPM;\n\tSFS:(10009020)(6009001)(24454002)(189002)(377454003)(199003)(6486002)(6506006)(39060400002)(2950100002)(42882006)(72206003)(229853002)(36756003)(23676002)(101416001)(6246003)(305945005)(8936002)(31686004)(105586002)(97736004)(50986999)(33646002)(5890100001)(189998001)(478600001)(76176999)(7736002)(6666003)(54356999)(4326008)(4001350100001)(68736007)(81166006)(83506001)(230700001)(6116002)(4000630100001)(42186005)(53936002)(53416004)(6512007)(3846002)(65956001)(47776003)(81156014)(8676002)(54906002)(65826007)(50466002)(93886005)(65806001)(25786009)(66066001)(69596002)(31696002)(53546010)(64126003)(106356001)(5660300001)(2906002);\n\tDIR:OUT; SFP:1101; SCL:1; SRVR:MWHPR07MB3503;\n\tH:ddl.caveonetworks.com; FPR:; SPF:None; PTR:InfoNoRecords;\n\tMX:1; A:1; LANG:en; ","Received-SPF":"None (protection.outlook.com: cavium.com does not designate\n\tpermitted sender hosts)","SpamDiagnosticOutput":"1:99","SpamDiagnosticMetadata":"NSPM","X-OriginatorOrg":"caviumnetworks.com","X-MS-Exchange-CrossTenant-OriginalArrivalTime":"06 Sep 2017 
18:00:56.1316\n\t(UTC)","X-MS-Exchange-CrossTenant-FromEntityHeader":"Hosted","X-MS-Exchange-CrossTenant-Id":"711e4ccf-2e9b-4bcf-a551-4094005b6194","X-MS-Exchange-Transport-CrossTenantHeadersStamped":"MWHPR07MB3503","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1764311,"web_url":"http://patchwork.ozlabs.org/comment/1764311/","msgid":"<572f49fd-f623-f064-a551-e243c57cef7f@gmail.com>","list_archive_url":null,"date":"2017-09-06T18:59:19","subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","submitter":{"id":2800,"url":"http://patchwork.ozlabs.org/api/people/2800/","name":"Florian Fainelli","email":"f.fainelli@gmail.com"},"content":"On 09/06/2017 11:00 AM, David Daney wrote:\n> On 08/31/2017 11:29 AM, Florian Fainelli wrote:\n>> On 08/31/2017 11:12 AM, Mason wrote:\n>>> On 31/08/2017 19:53, Florian Fainelli wrote:\n>>>> On 08/31/2017 10:49 AM, Mason wrote:\n>>>>> On 31/08/2017 18:57, Florian Fainelli wrote:\n>>>>>> And the race is between phy_detach() setting phydev->attached_dev\n>>>>>> = NULL\n>>>>>> and phy_state_machine() running in PHY_HALTED state and calling\n>>>>>> netif_carrier_off().\n>>>>>\n>>>>> I must be missing something.\n>>>>> (Since a thread cannot race against itself.)\n>>>>>\n>>>>> phy_disconnect calls phy_stop_machine which\n>>>>> 1) stops the work queue from running in a separate thread\n>>>>> 2) calls phy_state_machine *synchronously*\n>>>>>       which runs the PHY_HALTED case with everything well-defined\n>>>>> end of phy_stop_machine\n>>>>>\n>>>>> phy_disconnect only then calls phy_detach()\n>>>>> which makes future calls of phy_state_machine perilous.\n>>>>>\n>>>>> This all happens in the same thread, so I'm not yet\n>>>>> seeing where the race happens?\n>>>>\n>>>> The race is as described in David's earlier email, so let's recap:\n>>>>\n>>>> Thread 1            Thread 2\n>>>> 
phy_disconnect()\n>>>> phy_stop_interrupts()\n>>>> phy_stop_machine()\n>>>> phy_state_machine()\n>>>>   -> queue_delayed_work()\n>>>> phy_detach()\n>>>>                 phy_state_machine()\n>>>>                 -> netif_carrier_off()\n>>>>\n>>>> If phy_detach() finishes earlier than the workqueue had a chance to be\n>>>> scheduled and process PHY_HALTED again, then we trigger the NULL\n>>>> pointer\n>>>> de-reference.\n>>>>\n>>>> workqueues are not tasklets, the CPU scheduling them gets no guarantee\n>>>> they will run on the same CPU.\n>>>\n>>> Something does not add up.\n>>>\n>>> The synchronous call to phy_state_machine() does:\n>>>\n>>>     case PHY_HALTED:\n>>>         if (phydev->link) {\n>>>             phydev->link = 0;\n>>>             netif_carrier_off(phydev->attached_dev);\n>>>             phy_adjust_link(phydev);\n>>>             do_suspend = true;\n>>>         }\n>>>\n>>> then sets phydev->link = 0; therefore subsequent calls to\n>>> phy_state_machin() will be no-op.\n>>\n>> Actually you are right, once phydev->link is set to 0 these would become\n>> no-ops. Still scratching my head as to what happens for David then...\n>>\n>>>\n>>> Also, queue_delayed_work() is only called in polling mode.\n>>> David stated that he's using interrupt mode.\n> \n> Did you see what I wrote?\n\nStill not following, see below.\n\n> \n> phy_disconnect() calls phy_stop_interrupts() which puts it into polling\n> mode.  So the polling work gets queued unconditionally.\n\nWhat part of phy_stop_interrupts() is responsible for changing\nphydev->irq to PHY_POLL? 
free_irq() cannot touch phydev->irq otherwise\nsubsequent request_irq() calls won't work anymore.\nphy_disable_interrupts() only calls back into the PHY driver to\nacknowledge and clear interrupts.\n\nIf we were using a PHY with PHY_POLL, as Marc said, the first\nsynchronous call to phy_state_machine() would have acted on PHY_HALTED\nand even if we incorrectly keep re-scheduling the state machine from\nPHY_HALTED to PHY_HALTED the second time around nothing can happen.\n\nWhat are we missing here?\n\n> \n> \n> \n>>\n>> Right that's confusing too now. David can you check if you tree has:\n>>\n>> 49d52e8108a21749dc2114b924c907db43358984 (\"net: phy: handle state\n>> correctly in phy_stop_machine\")\n>>\n> \n> Yes, I am using the 4.9 stable branch, and that commit was also present.\n\nThanks for checking that.","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":["ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (2048-bit key;\n\tunprotected) header.d=gmail.com header.i=@gmail.com\n\theader.b=\"q5rjyxtM\"; dkim-atps=neutral"],"Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xnXvH3fVGz9t2W\n\tfor <patchwork-incoming@ozlabs.org>;\n\tThu,  7 Sep 2017 04:59:39 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1752146AbdIFS7e (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tWed, 6 Sep 2017 14:59:34 -0400","from mail-wm0-f67.google.com ([74.125.82.67]:38889 \"EHLO\n\tmail-wm0-f67.google.com\" rhost-flags-OK-OK-OK-OK) by vger.kernel.org\n\twith ESMTP id S1752058AbdIFS7d (ORCPT\n\t<rfc822;netdev@vger.kernel.org>); Wed, 6 Sep 2017 14:59:33 -0400","by mail-wm0-f67.google.com with SMTP id 
u26so5934562wma.5\n\tfor <netdev@vger.kernel.org>; Wed, 06 Sep 2017 11:59:32 -0700 (PDT)","from [10.112.156.244] ([192.19.255.250])\n\tby smtp.googlemail.com with ESMTPSA id\n\ty94sm420992wrc.41.2017.09.06.11.59.26\n\t(version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);\n\tWed, 06 Sep 2017 11:59:30 -0700 (PDT)"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=gmail.com; s=20161025;\n\th=subject:to:cc:references:from:message-id:date:user-agent\n\t:mime-version:in-reply-to:content-language:content-transfer-encoding; \n\tbh=agvOc2ig5XE/zTOBZtUVvRFTPEdaN4jrby8F2pdjeMo=;\n\tb=q5rjyxtMSJhD0HavOKKvNxErpfdLR91mGEd/Z4Gqc7u8NRncZXJvFYu2paYukDJmsi\n\t48Y8PwZ0Xy2/lVQPwe1UikjnZnkeCWX4wRJ3Jqru+7GN/SmZ/5CMh2c078IvrL0/GylF\n\tO83RUBeADlEGdx99cdbsX2HlBXmX2kmUW2v6gWQyikde8Jxlc7NFlcIvkNRrgrFluj7M\n\tfaYfl4tbv94thLNXRRjPM8P2UhdtIc915Znnuf2kV8k2L0H/4+h9hadR7A4r77dhkxjD\n\tDikNmNyeVytr9qqjZ7eu6/ood9p2111/6f5yt46YsgBWHvPJwr1junZJKgSeJoIckJWO\n\tHXoQ==","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=1e100.net; s=20161025;\n\th=x-gm-message-state:subject:to:cc:references:from:message-id:date\n\t:user-agent:mime-version:in-reply-to:content-language\n\t:content-transfer-encoding;\n\tbh=agvOc2ig5XE/zTOBZtUVvRFTPEdaN4jrby8F2pdjeMo=;\n\tb=exUYpUP2r5QkpurZfdpezupgQKC/eGZhIV+lCFBtgLfyiUvyd0Kj96lvQhfxfyh+t0\n\to23wGpZ2mHmkUbjrBHjwAJsQaSrfQla7sD/9kgMOV9PQjc1RPmTJwz3AV7+sxmB5t9Pv\n\tFa+pfPMZbCVwb7icHCVFmytKWSZzhvl06tIifI5e23gdh0rr2uzicV4q+76aAQ6ISTJL\n\tArLWOTlKVkpfLVKwn88DHoHWtXLGf722XSjtiWDmjHJkCizDHt3eqm0ThLpGdt2ijm6F\n\tJ6t0Icxkxm4vCYd8pnf7ZPgJPNQMBc8YcfMhZfmryZgAsxzZ5f7qvcRDL1LjbevgT/dV\n\tOneQ==","X-Gm-Message-State":"AHPjjUgJ1Z5egkUpz+gP6Ikaw4OlnxfnJZ0apDQhdzL5vr/RgAs1/sic\n\tIocwHiJG1OydzA==","X-Google-Smtp-Source":"ADKCNb5mdbnXs4fWFLPZ2sP1yeQIn2AwYoThj0Gow8t7m2Q1cm8ynQkSuFkt8x9QWxg9dbRUUjyRYg==","X-Received":"by 10.28.236.67 with SMTP id k64mr561876wmh.146.1504724371877;\n\tWed, 06 Sep 2017 11:59:31 -0700 (PDT)","Subject":"Re: [PATCH net] 
Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","To":"David Daney <ddaney@caviumnetworks.com>, Mason <slash.tmp@free.fr>","Cc":"Marc Gonzalez <marc_gonzalez@sigmadesigns.com>,\n\tnetdev <netdev@vger.kernel.org>,\n\tGeert Uytterhoeven <geert+renesas@glider.be>,\n\tDavid Miller <davem@davemloft.net>,\n\tAndrew Lunn <andrew@lunn.ch>, Mans Rullgard <mans@mansr.com>","References":"<1504140569-2063-1-git-send-email-f.fainelli@gmail.com>\n\t<f4bb5ac8-dae8-c0af-7aa6-e546fc0783fa@sigmadesigns.com>\n\t<e24693e8-d8ae-188a-2a38-c9a83fdc94e3@gmail.com>\n\t<931bf454-81ff-94dc-82e6-bc2b889bd43a@gmail.com>\n\t<d6a6b552-95a7-8353-54c8-fa804f9366a1@free.fr>\n\t<f74f1aad-3990-ae54-316f-751c3b15de41@gmail.com>\n\t<ebee6e5d-5bc1-1c5b-b31d-6d50618d6074@free.fr>\n\t<4ea8b432-4968-1616-eff9-48a2689dd3ce@gmail.com>\n\t<ff070239-28b7-d41b-8abe-c9f810561372@caviumnetworks.com>","From":"Florian Fainelli <f.fainelli@gmail.com>","Message-ID":"<572f49fd-f623-f064-a551-e243c57cef7f@gmail.com>","Date":"Wed, 6 Sep 2017 11:59:19 -0700","User-Agent":"Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101\n\tThunderbird/52.2.1","MIME-Version":"1.0","In-Reply-To":"<ff070239-28b7-d41b-8abe-c9f810561372@caviumnetworks.com>","Content-Type":"text/plain; charset=utf-8","Content-Language":"en-US","Content-Transfer-Encoding":"8bit","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1764319,"web_url":"http://patchwork.ozlabs.org/comment/1764319/","msgid":"<89164218-82a8-ae12-4b1a-c2d0d8a04624@free.fr>","list_archive_url":null,"date":"2017-09-06T19:14:27","subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","submitter":{"id":66150,"url":"http://patchwork.ozlabs.org/api/people/66150/","name":"Mason","email":"slash.tmp@free.fr"},"content":"On 06/09/2017 20:00, David Daney wrote:\n> On 08/31/2017 11:29 AM, Florian Fainelli wrote:\n>> On 
08/31/2017 11:12 AM, Mason wrote:\n>>> On 31/08/2017 19:53, Florian Fainelli wrote:\n>>>> On 08/31/2017 10:49 AM, Mason wrote:\n>>>>> On 31/08/2017 18:57, Florian Fainelli wrote:\n>>>>>> And the race is between phy_detach() setting phydev->attached_dev = NULL\n>>>>>> and phy_state_machine() running in PHY_HALTED state and calling\n>>>>>> netif_carrier_off().\n>>>>>\n>>>>> I must be missing something.\n>>>>> (Since a thread cannot race against itself.)\n>>>>>\n>>>>> phy_disconnect calls phy_stop_machine which\n>>>>> 1) stops the work queue from running in a separate thread\n>>>>> 2) calls phy_state_machine *synchronously*\n>>>>>       which runs the PHY_HALTED case with everything well-defined\n>>>>> end of phy_stop_machine\n>>>>>\n>>>>> phy_disconnect only then calls phy_detach()\n>>>>> which makes future calls of phy_state_machine perilous.\n>>>>>\n>>>>> This all happens in the same thread, so I'm not yet\n>>>>> seeing where the race happens?\n>>>>\n>>>> The race is as described in David's earlier email, so let's recap:\n>>>>\n>>>> Thread 1\t\t\tThread 2\n>>>> phy_disconnect()\n>>>> phy_stop_interrupts()\n>>>> phy_stop_machine()\n>>>> phy_state_machine()\n>>>>   -> queue_delayed_work()\n>>>> phy_detach()\n>>>> \t\t\t\tphy_state_machine()\n>>>> \t\t\t\t-> netif_carrier_off()\n>>>>\n>>>> If phy_detach() finishes earlier than the workqueue had a chance to be\n>>>> scheduled and process PHY_HALTED again, then we trigger the NULL pointer\n>>>> de-reference.\n>>>>\n>>>> workqueues are not tasklets, the CPU scheduling them gets no guarantee\n>>>> they will run on the same CPU.\n>>>\n>>> Something does not add up.\n>>>\n>>> The synchronous call to phy_state_machine() does:\n>>>\n>>> \tcase PHY_HALTED:\n>>> \t\tif (phydev->link) {\n>>> \t\t\tphydev->link = 0;\n>>> \t\t\tnetif_carrier_off(phydev->attached_dev);\n>>> \t\t\tphy_adjust_link(phydev);\n>>> \t\t\tdo_suspend = true;\n>>> \t\t}\n>>>\n>>> then sets phydev->link = 0; therefore subsequent calls to\n>>> 
phy_state_machin() will be no-op.\n>>\n>> Actually you are right, once phydev->link is set to 0 these would become\n>> no-ops. Still scratching my head as to what happens for David then...\n>>\n>>>\n>>> Also, queue_delayed_work() is only called in polling mode.\n>>> David stated that he's using interrupt mode.\n> \n> Did you see what I wrote?\n> \n> phy_disconnect() calls phy_stop_interrupts() which puts it into polling \n> mode.  So the polling work gets queued unconditionally.\n\nI did address that remark in\nhttps://www.mail-archive.com/netdev@vger.kernel.org/msg186336.html\n\nint phy_stop_interrupts(struct phy_device *phydev)\n{\n\tint err = phy_disable_interrupts(phydev);\n\n\tif (err)\n\t\tphy_error(phydev);\n\n\tfree_irq(phydev->irq, phydev);\n\n\t/* If work indeed has been cancelled, disable_irq() will have\n\t * been left unbalanced from phy_interrupt() and enable_irq()\n\t * has to be called so that other devices on the line work.\n\t */\n\twhile (atomic_dec_return(&phydev->irq_disable) >= 0)\n\t\tenable_irq(phydev->irq);\n\n\treturn err;\n}\n\nWhich part of this function changes phydev->irq to PHY_POLL?\n\nPerhaps phydev->drv->config_intr?\n\nWhat PHY are you using?\n\nRegards.","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":"ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xnYMQ6rLTz9s3T\n\tfor <patchwork-incoming@ozlabs.org>;\n\tThu,  7 Sep 2017 05:20:23 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1752108AbdIFTOe (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tWed, 6 Sep 2017 15:14:34 -0400","from smtp2-g21.free.fr ([212.27.42.2]:19994 
\"EHLO\n\tsmtp2-g21.free.fr\"\n\trhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP\n\tid S1751941AbdIFTOd (ORCPT <rfc822;netdev@vger.kernel.org>);\n\tWed, 6 Sep 2017 15:14:33 -0400","from [192.168.0.66] (unknown [88.191.210.51])\n\tby smtp2-g21.free.fr (Postfix) with ESMTP id F137A2003EB;\n\tWed,  6 Sep 2017 21:14:30 +0200 (CEST)"],"Subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","To":"David Daney <ddaney@caviumnetworks.com>,\n\tFlorian Fainelli <f.fainelli@gmail.com>","Cc":"Marc Gonzalez <marc_gonzalez@sigmadesigns.com>,\n\tnetdev <netdev@vger.kernel.org>,\n\tGeert Uytterhoeven <geert+renesas@glider.be>,\n\tDavid Miller <davem@davemloft.net>,\n\tAndrew Lunn <andrew@lunn.ch>, Mans Rullgard <mans@mansr.com>","References":"<1504140569-2063-1-git-send-email-f.fainelli@gmail.com>\n\t<f4bb5ac8-dae8-c0af-7aa6-e546fc0783fa@sigmadesigns.com>\n\t<e24693e8-d8ae-188a-2a38-c9a83fdc94e3@gmail.com>\n\t<931bf454-81ff-94dc-82e6-bc2b889bd43a@gmail.com>\n\t<d6a6b552-95a7-8353-54c8-fa804f9366a1@free.fr>\n\t<f74f1aad-3990-ae54-316f-751c3b15de41@gmail.com>\n\t<ebee6e5d-5bc1-1c5b-b31d-6d50618d6074@free.fr>\n\t<4ea8b432-4968-1616-eff9-48a2689dd3ce@gmail.com>\n\t<ff070239-28b7-d41b-8abe-c9f810561372@caviumnetworks.com>","From":"Mason <slash.tmp@free.fr>","Message-ID":"<89164218-82a8-ae12-4b1a-c2d0d8a04624@free.fr>","Date":"Wed, 6 Sep 2017 21:14:27 +0200","User-Agent":"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:52.0) Gecko/20100101\n\tFirefox/52.0 SeaMonkey/2.49.1","MIME-Version":"1.0","In-Reply-To":"<ff070239-28b7-d41b-8abe-c9f810561372@caviumnetworks.com>","Content-Type":"text/plain; 
charset=ISO-8859-15","Content-Transfer-Encoding":"7bit","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1764321,"web_url":"http://patchwork.ozlabs.org/comment/1764321/","msgid":"<2d12dbb5-17c9-a0a5-1345-6aba81eca7b5@gmail.com>","list_archive_url":null,"date":"2017-09-06T19:28:19","subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","submitter":{"id":2800,"url":"http://patchwork.ozlabs.org/api/people/2800/","name":"Florian Fainelli","email":"f.fainelli@gmail.com"},"content":"On 09/06/2017 07:55 AM, Mason wrote:\n> On 31/08/2017 21:18, Florian Fainelli wrote:\n> \n>> On 08/31/2017 12:09 PM, Mason wrote:\n>>\n>>> 1) nb8800_link_reconfigure() calls phy_print_status()\n>>> which prints the \"Link down\" and \"Link up\" messages\n>>> to the console. With the patch reverted, nothing is\n>>> printed when the link goes down, and the result is\n>>> random when the link comes up. Sometimes, we get\n>>> down + up, sometimes just up.\n>>\n>> Nothing printed when you bring down the network interface as a result of\n>> not signaling the link down, there is a small nuance here.\n> \n> Let me first focus on the \"Link down\" message.\n> \n> Do you agree that such a message should be printed when the\n> link goes down, not when the link comes up?\n\nThe question is not so much about printing the message as about a)\nhaving the adjust_link callback be called and b) having this particular\ncallback correctly determine if a \"change\" has occurred, but let's focus\non the notification too. Printing the message is a consequence of these\ntwo conditions and that's what matters.\n\nUnfortunately there is no hard and fast answer; it's clearly a\nphilosophical problem here.\n\nThe link is not physically down, the cable is still plugged so\ngenerating a link down event is not necessarily correct. 
It would be\nconvenient for network manager programs, just like it is for network\ndrivers to treat it as such because that allows them to act as if the\ncable was unplugged, which may be a good way to perform a number of\nactions including but not limited to: entering a low power state,\nre-initializing parts of the Ethernet MAC that need it (DMA, etc.,\netc.). That does not appear to be an absolute requirement for most, if\nnot all drivers since it changed after 3.4 and no one batted an eye\nabout it.\n\nUpon bringing the interface back up again, same thing, if the cable was\nnot disconnected should we just generate a link UP event, and if we do\nthat, are we going to confuse any network manager application?\nGenerating a link transition DOWN -> UP is certainly helpful for any\nnetwork application in that they do not need to keep any state just like\nit clearly indicates a change was detected.\n\n> \n> Perhaps the issue is that the 2 following cases need to be\n> handled differently:\n> A) operator sets link down on the command-line\n\nThis is already handled differently since when you administratively\nbring down an interface you call ndo_stop() which will be doing a\nphy_stop() + phy_disconnect() which results in stopping the PHY state\nmachine and disconnecting from the PHY.\n\n> B) asynchronous event makes link go down (peer is dead, cable is cut, etc)\n> \n> In B) the PHY state machine keeps on running, and eventually\n> calls adjust_link()\n\nCorrect.\n\n> \n> In A) the driver calls phy_stop() and phy_disconnect() and\n> therefore adjust_link() will not be called?\n\nThat is the current behavior (after the revert) and we can always change\nit if deemed necessary. The problem is that this broke for two people (one still\nbeing discussed as of right now), so at this point I am very wary of\nmaking any changes without more testing. 
I really need to get one of\nthese PHY interrupts wired to one of my boards or create a software\nmodel of such a configuration before accepting new changes in that area.\n\nThank you","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":["ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (2048-bit key;\n\tunprotected) header.d=gmail.com header.i=@gmail.com\n\theader.b=\"SQVl6h34\"; dkim-atps=neutral"],"Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xnYXZ6SsHz9s3w\n\tfor <patchwork-incoming@ozlabs.org>;\n\tThu,  7 Sep 2017 05:28:30 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1752432AbdIFT21 (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tWed, 6 Sep 2017 15:28:27 -0400","from mail-wm0-f47.google.com ([74.125.82.47]:35392 \"EHLO\n\tmail-wm0-f47.google.com\" rhost-flags-OK-OK-OK-OK) by vger.kernel.org\n\twith ESMTP id S1751985AbdIFT20 (ORCPT\n\t<rfc822;netdev@vger.kernel.org>); Wed, 6 Sep 2017 15:28:26 -0400","by mail-wm0-f47.google.com with SMTP id f199so12798892wme.0\n\tfor <netdev@vger.kernel.org>; Wed, 06 Sep 2017 12:28:25 -0700 (PDT)","from [10.112.156.244] ([192.19.255.250])\n\tby smtp.googlemail.com with ESMTPSA id\n\tg37sm1127873wra.6.2017.09.06.12.28.21\n\t(version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);\n\tWed, 06 Sep 2017 12:28:23 -0700 (PDT)"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=gmail.com; s=20161025;\n\th=from:subject:to:cc:references:message-id:date:user-agent\n\t:mime-version:in-reply-to:content-language:content-transfer-encoding; 
\n\tbh=JMrBIYaFBTDj03x/Spe87pDzg+yJugmIwfiJY6Hntqw=;\n\tb=SQVl6h34LfXlX6TspjDQCezfVQrUHig1JMl3tzi8VltqqN5N5lyZ4Opd7H7fqGX+yX\n\tiNZyteB4QOfZ9UdAJWShYjcb1ghWpxuecqgNQUHhdaDEtc0pwk3IGD3p9S2piqj/umAv\n\tUbS00a4Zrp7Uj/Axvl7MWVpxAxp3hbK0N8wvwnC6E/NPtwpAOO+i49gE3eSJG1G8kYrx\n\t51a7g2Gd5l3TGZ6JQTakF1dktCHI+xKDyt7T/1Ny2e210wea/9CBMyxXKWkgjnVrv7SD\n\tYACmQvqM170k3NKuy3eNMI7m8vTnZRfuGOOvXjVH7BHtoG6+bgjlqfNTWB1IE1FXATCg\n\tgGHg==","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=1e100.net; s=20161025;\n\th=x-gm-message-state:from:subject:to:cc:references:message-id:date\n\t:user-agent:mime-version:in-reply-to:content-language\n\t:content-transfer-encoding;\n\tbh=JMrBIYaFBTDj03x/Spe87pDzg+yJugmIwfiJY6Hntqw=;\n\tb=uoA6GYlo3aJg++QavAywT/qJKjaRW2pwYoF5e83BlkVREz79L9uHggdgk+vo5jWrQe\n\tWuiGYfOUHbxGT65qdeBeuLz62kxl0HswbWGbUnsoMXM+pX0NJBnUA5Lqf6S2mscfITFg\n\t2Crw9Fj9yf7tIDUrrMYKVYpg/Xf2g38jxwLBN30ynUOo1OnLyO6etubzDIbNT4pg9ppU\n\trJ9Y8/BjrSCELrMhbreI+epBLfZ4Kp2ucO/nwdmM8SXyoDfMAkaLjGjQkH3QGylPOM5R\n\tEFe9AK6tZheZGaRbs2tIknZFKQg5913rW08VaxPGYXaUdXWIMbruLDXNYaoaupdrmFo4\n\tRI2w==","X-Gm-Message-State":"AHPjjUgaNkYFm6yRwG7AmZJI3yRIGyRSZGGtOhs4BBQAEm7N+BU009PZ\n\tlA06b3bsD6r3lw==","X-Google-Smtp-Source":"ADKCNb47LTcpsUAdiQEOUGS9C79c4QWkTHhRPL+6e8E8wJlyJEUgBKt5/mF//NozazwL5+lUSWHfyg==","X-Received":"by 10.28.195.132 with SMTP id t126mr558725wmf.0.1504726104708;\n\tWed, 06 Sep 2017 12:28:24 -0700 (PDT)","From":"Florian Fainelli <f.fainelli@gmail.com>","Subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","To":"Mason <slash.tmp@free.fr>","Cc":"Marc Gonzalez <marc_gonzalez@sigmadesigns.com>,\n\tDavid Daney <ddaney.cavm@gmail.com>, netdev <netdev@vger.kernel.org>,\n\tGeert Uytterhoeven <geert+renesas@glider.be>,\n\tDavid Miller <davem@davemloft.net>,\n\tAndrew Lunn <andrew@lunn.ch>, Mans Rullgard <mans@mansr.com>,\n\tThibaud Cornic 
<thibaud_cornic@sigmadesigns.com>","References":"<1504140569-2063-1-git-send-email-f.fainelli@gmail.com>\n\t<f4bb5ac8-dae8-c0af-7aa6-e546fc0783fa@sigmadesigns.com>\n\t<a75691d9-c22a-9b89-2cce-604315062739@gmail.com>\n\t<d54d0555-c448-741c-eb63-5a773bed5a30@free.fr>\n\t<7b1c1dc9-b6e3-a1bd-2e36-474946741a79@gmail.com>\n\t<927413e9-4f1f-963c-2d3a-5a88de2eac9e@free.fr>","Message-ID":"<2d12dbb5-17c9-a0a5-1345-6aba81eca7b5@gmail.com>","Date":"Wed, 6 Sep 2017 12:28:19 -0700","User-Agent":"Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101\n\tThunderbird/52.2.1","MIME-Version":"1.0","In-Reply-To":"<927413e9-4f1f-963c-2d3a-5a88de2eac9e@free.fr>","Content-Type":"text/plain; charset=utf-8","Content-Language":"en-US","Content-Transfer-Encoding":"8bit","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1764327,"web_url":"http://patchwork.ozlabs.org/comment/1764327/","msgid":"<49d6a93e-1966-5104-603d-aad2d9a9e8e9@gmail.com>","list_archive_url":null,"date":"2017-09-06T19:42:54","subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","submitter":{"id":2800,"url":"http://patchwork.ozlabs.org/api/people/2800/","name":"Florian Fainelli","email":"f.fainelli@gmail.com"},"content":"On 09/06/2017 08:51 AM, Mason wrote:\n> On 31/08/2017 21:18, Florian Fainelli wrote:\n> \n>> On 08/31/2017 12:09 PM, Mason wrote:\n>>\n>>> On 31/08/2017 19:03, Florian Fainelli wrote:\n>>>\n>>>> On 08/31/2017 05:29 AM, Marc Gonzalez wrote:\n>>>>\n>>>>> On 31/08/2017 02:49, Florian Fainelli wrote:\n>>>>>\n>>>>>> The original motivation for this change originated from Marc Gonzalez\n>>>>>> indicating that his network driver did not have its adjust_link\n>>>>>> callback\n>>>>>> executing with phydev->link = 0 while he was expecting it.\n>>>>>\n>>>>> I expect the core to call phy_adjust_link() for link changes.\n>>>>> This used to work back in 3.4 and was broken 
somewhere along\n>>>>> the way.\n>>>>\n>>>> If that was working correctly in 3.4 surely we can look at the diff and\n>>>> figure out what changed, even maybe find the offending commit, can you\n>>>> do that?\n>>>\n>>> Bisecting would a be a huge pain because my platform was\n>>> not upstream until v4.4\n>>\n>> Then just diff the file and try to pinpoint which commit may have\n>> changed that?\n> \n> Running 'ip link set eth0 down' on the command-line.\n> \n> In v3.4 => adjust_link() callback is called\n> In v4.5 => adjust_link() callback is NOT called\n> \n> $ git log --oneline --no-merges v3.4..v4.5 drivers/net/phy/phy.c | wc -l\n> 59\n> \n> I'm not sure what \"just diff the file\" entails.\n\ngit log -p --no-merges v3.4..v4.5 drivers/net/phy/{phy,phy_device}.c and\nsee what seems remotely relevant to what you are observing.\n\n> I can't move 3.4 up, nor move 4.5 down.\n\nYou can always copy the PHYLIB files at any given commit back into an\nolder tree, or vice versa because it is largely self-contained with\nlittle to no dependencies on other headers/files/facilities etc. This is\nnot convenient I agree, but it's a poor man's way of determining what\nchanged within PHYLIB that results in what you are seeing.\n\nAFAICT you could use QEMU with the versatile board that has smsc911x as\nan Ethernet adapter which is PHYLIB compliant which may be used to\npinpoint which commit started changing this behavior. 
It's long, it's painful.\n\n> I'm not even sure the problem comes from drivers/net/phy/phy.c\n> to be honest.\n\nIf that's the case then I am not sure what else we can do.","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":["ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (2048-bit key;\n\tunprotected) header.d=gmail.com header.i=@gmail.com\n\theader.b=\"HL9V0kPN\"; dkim-atps=neutral"],"Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xnYsT64dTz9s8J\n\tfor <patchwork-incoming@ozlabs.org>;\n\tThu,  7 Sep 2017 05:43:09 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1752167AbdIFTnH (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tWed, 6 Sep 2017 15:43:07 -0400","from mail-wr0-f172.google.com ([209.85.128.172]:35048 \"EHLO\n\tmail-wr0-f172.google.com\" rhost-flags-OK-OK-OK-OK) by vger.kernel.org\n\twith ESMTP id S1751441AbdIFTnG (ORCPT\n\t<rfc822;netdev@vger.kernel.org>); Wed, 6 Sep 2017 15:43:06 -0400","by mail-wr0-f172.google.com with SMTP id y15so15648980wrc.2\n\tfor <netdev@vger.kernel.org>; Wed, 06 Sep 2017 12:43:05 -0700 (PDT)","from [10.112.156.244] ([192.19.255.250])\n\tby smtp.googlemail.com with ESMTPSA id\n\tt68sm866495wme.15.2017.09.06.12.42.56\n\t(version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);\n\tWed, 06 Sep 2017 12:43:04 -0700 (PDT)"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=gmail.com; s=20161025;\n\th=subject:to:cc:references:from:message-id:date:user-agent\n\t:mime-version:in-reply-to:content-language:content-transfer-encoding; 
\n\tbh=4M1hYmui721G84i2MpleGeBhwKFXqJzMdc8q+ZXZOk8=;\n\tb=HL9V0kPN8xpXU60F5sTjTYIcmmf3jteyvhuxmtNOlWeOeUw4Umn2jcY+DXx7UEIGKh\n\tsFztMUJXkQpPctGGGNL0Mlg1Qs83QrcTvI6HEGQb9EJVMG7a0kwhvAKvgCcDUS8n5Jym\n\tJapcx+ni1qb08wEuJJNqDX1LPWCYQXSOTtxWr9Q3LOgfm99g8m05/JsqkGQngNZ5M7Ir\n\t29PqWXP9BDGiKE8bq4PDVwDcdkKpQoq5VwhPRNe5+Fyumw8Xm52gXpDI8JWsv3pnaunE\n\tUxatVIb8tFbHxiYCOo3P5pgJ8FpS9auIRp+d7aSPc2/mo1N+xJZFOlilppkq7Mj4a8JE\n\tB9VA==","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=1e100.net; s=20161025;\n\th=x-gm-message-state:subject:to:cc:references:from:message-id:date\n\t:user-agent:mime-version:in-reply-to:content-language\n\t:content-transfer-encoding;\n\tbh=4M1hYmui721G84i2MpleGeBhwKFXqJzMdc8q+ZXZOk8=;\n\tb=SJzwo6ZfBoh8wOJcBJJmfuAMogJ9+eJDGT05UHRiFhzW/UZhk/1ZQUllXy1IunJ/QQ\n\t4fwhoIcvc2sITQeq431d0+KPp42Rh6izcClPhr0Mpy8ep+W6uRB4NYTn0Kda2HvfhF4+\n\tZG6kL/3wjg+BM1nJDUfaNev8ywtWSz4dY3Bzgs/ElBCCBvi50s7wL3/ZbfKXEaHp2anC\n\t2hVu0oalj2XXeDDykWrK++kBm4M32ef4T1u/MYMkRSHOHHQXRxvr9lXNp7YHCdl2+wJt\n\tGxyzGaaT7z8jfxwN+vpc/g3jR4UXq2j2J/9CDCYh4WqUChUgVPaGd/+QU00pkL6+ELD4\n\tCqGg==","X-Gm-Message-State":"AHPjjUiX/kQ1c+JwW2yjp5TT2xxCgDYJ3qO9d+bEO+TgACIbZPyp3wEB\n\tgXpTvzLcrt5J0g==","X-Google-Smtp-Source":"ADKCNb4BeyHKVx5qxrWNKLlrK//1cqaFl88iA0WvuyS1p3XaACtBButu8msknE1FS7TqlOrM8fHkdA==","X-Received":"by 10.223.142.73 with SMTP id n67mr151271wrb.278.1504726984892; \n\tWed, 06 Sep 2017 12:43:04 -0700 (PDT)","Subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","To":"Mason <slash.tmp@free.fr>, Andrew Lunn <andrew@lunn.ch>","Cc":"Marc Gonzalez <marc_gonzalez@sigmadesigns.com>,\n\tDavid Daney <ddaney.cavm@gmail.com>, netdev <netdev@vger.kernel.org>,\n\tGeert Uytterhoeven <geert+renesas@glider.be>,\n\tDavid Miller <davem@davemloft.net>, Mans Rullgard <mans@mansr.com>,\n\tThibaud Cornic 
<thibaud_cornic@sigmadesigns.com>","References":"<1504140569-2063-1-git-send-email-f.fainelli@gmail.com>\n\t<f4bb5ac8-dae8-c0af-7aa6-e546fc0783fa@sigmadesigns.com>\n\t<a75691d9-c22a-9b89-2cce-604315062739@gmail.com>\n\t<d54d0555-c448-741c-eb63-5a773bed5a30@free.fr>\n\t<7b1c1dc9-b6e3-a1bd-2e36-474946741a79@gmail.com>\n\t<730292be-affa-c19d-75ab-edba367788e8@free.fr>","From":"Florian Fainelli <f.fainelli@gmail.com>","Message-ID":"<49d6a93e-1966-5104-603d-aad2d9a9e8e9@gmail.com>","Date":"Wed, 6 Sep 2017 12:42:54 -0700","User-Agent":"Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101\n\tThunderbird/52.2.1","MIME-Version":"1.0","In-Reply-To":"<730292be-affa-c19d-75ab-edba367788e8@free.fr>","Content-Type":"text/plain; charset=utf-8","Content-Language":"en-US","Content-Transfer-Encoding":"8bit","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1764365,"web_url":"http://patchwork.ozlabs.org/comment/1764365/","msgid":"<6890a27f-e87e-62c1-a676-e5ddf968adb6@caviumnetworks.com>","list_archive_url":null,"date":"2017-09-06T20:49:41","subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","submitter":{"id":721,"url":"http://patchwork.ozlabs.org/api/people/721/","name":"David Daney","email":"ddaney@caviumnetworks.com"},"content":"On 09/06/2017 11:59 AM, Florian Fainelli wrote:\n> On 09/06/2017 11:00 AM, David Daney wrote:\n>> On 08/31/2017 11:29 AM, Florian Fainelli wrote:\n>>> On 08/31/2017 11:12 AM, Mason wrote:\n>>>> On 31/08/2017 19:53, Florian Fainelli wrote:\n>>>>> On 08/31/2017 10:49 AM, Mason wrote:\n>>>>>> On 31/08/2017 18:57, Florian Fainelli wrote:\n>>>>>>> And the race is between phy_detach() setting phydev->attached_dev\n>>>>>>> = NULL\n>>>>>>> and phy_state_machine() running in PHY_HALTED state and calling\n>>>>>>> netif_carrier_off().\n>>>>>>\n>>>>>> I must be missing something.\n>>>>>> (Since a thread cannot race 
against itself.)\n>>>>>>\n>>>>>> phy_disconnect calls phy_stop_machine which\n>>>>>> 1) stops the work queue from running in a separate thread\n>>>>>> 2) calls phy_state_machine *synchronously*\n>>>>>>        which runs the PHY_HALTED case with everything well-defined\n>>>>>> end of phy_stop_machine\n>>>>>>\n>>>>>> phy_disconnect only then calls phy_detach()\n>>>>>> which makes future calls of phy_state_machine perilous.\n>>>>>>\n>>>>>> This all happens in the same thread, so I'm not yet\n>>>>>> seeing where the race happens?\n>>>>>\n>>>>> The race is as described in David's earlier email, so let's recap:\n>>>>>\n>>>>> Thread 1            Thread 2\n>>>>> phy_disconnect()\n>>>>> phy_stop_interrupts()\n>>>>> phy_stop_machine()\n>>>>> phy_state_machine()\n>>>>>    -> queue_delayed_work()\n>>>>> phy_detach()\n>>>>>                  phy_state_machine()\n>>>>>                  -> netif_carrier_off()\n>>>>>\n>>>>> If phy_detach() finishes earlier than the workqueue had a chance to be\n>>>>> scheduled and process PHY_HALTED again, then we trigger the NULL\n>>>>> pointer\n>>>>> de-reference.\n>>>>>\n>>>>> workqueues are not tasklets, the CPU scheduling them gets no guarantee\n>>>>> they will run on the same CPU.\n>>>>\n>>>> Something does not add up.\n>>>>\n>>>> The synchronous call to phy_state_machine() does:\n>>>>\n>>>>      case PHY_HALTED:\n>>>>          if (phydev->link) {\n>>>>              phydev->link = 0;\n>>>>              netif_carrier_off(phydev->attached_dev);\n>>>>              phy_adjust_link(phydev);\n>>>>              do_suspend = true;\n>>>>          }\n>>>>\n>>>> then sets phydev->link = 0; therefore subsequent calls to\n>>>> phy_state_machin() will be no-op.\n>>>\n>>> Actually you are right, once phydev->link is set to 0 these would become\n>>> no-ops. 
Still scratching my head as to what happens for David then...\n>>>\n>>>>\n>>>> Also, queue_delayed_work() is only called in polling mode.\n>>>> David stated that he's using interrupt mode.\n>>\n>> Did you see what I wrote?\n> \n> Still not following, see below.\n> \n>>\n>> phy_disconnect() calls phy_stop_interrupts() which puts it into polling\n>> mode.  So the polling work gets queued unconditionally.\n> \n> What part of phy_stop_interrupts() is responsible for changing\n> phydev->irq to PHY_POLL? free_irq() cannot touch phydev->irq otherwise\n> subsequent request_irq() calls won't work anymore.\n> phy_disable_interrupts() only calls back into the PHY driver to\n> acknowledge and clear interrupts.\n> \n> If we were using a PHY with PHY_POLL, as Marc said, the first\n> synchronous call to phy_state_machine() would have acted on PHY_HALTED\n> and even if we incorrectly keep re-scheduling the state machine from\n> PHY_HALTED to PHY_HALTED the second time around nothing can happen.\n> \n> What are we missing here?\n> \n\nOK, I am now as confused as you guys are.  
I will go back and get an \nftrace log out of the failure.\n\nDavid.","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":["ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (1024-bit key;\n\tunprotected) header.d=CAVIUMNETWORKS.onmicrosoft.com\n\theader.i=@CAVIUMNETWORKS.onmicrosoft.com header.b=\"RE6gcA7M\"; \n\tdkim-atps=neutral","spf=none (sender IP is )\n\tsmtp.mailfrom=David.Daney@cavium.com; "],"Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xnbLS442sz9sRY\n\tfor <patchwork-incoming@ozlabs.org>;\n\tThu,  7 Sep 2017 06:49:52 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1752616AbdIFUtu (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tWed, 6 Sep 2017 16:49:50 -0400","from mail-by2nam01on0083.outbound.protection.outlook.com\n\t([104.47.34.83]:64736\n\t\"EHLO NAM01-BY2-obe.outbound.protection.outlook.com\"\n\trhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP\n\tid S1752058AbdIFUtt (ORCPT <rfc822;netdev@vger.kernel.org>);\n\tWed, 6 Sep 2017 16:49:49 -0400","from ddl.caveonetworks.com (50.233.148.156) by\n\tDM5PR07MB3497.namprd07.prod.outlook.com (10.164.153.28) with\n\tMicrosoft SMTP Server (version=TLS1_2,\n\tcipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P256) id\n\t15.20.13.10; Wed, 6 Sep 2017 20:49:45 +0000"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=CAVIUMNETWORKS.onmicrosoft.com; 
s=selector1-cavium-com;\n\th=From:Date:Subject:Message-ID:Content-Type:MIME-Version;\n\tbh=VAQ0DluvysEWVx8WFM0QMaHnUZAKpBRaU7d5QQw0r30=;\n\tb=RE6gcA7Mm+Mipe1Q38wi5/c1FDeUYCVGyQR8DPYFf7kICQXgCXWITofyYWN8rk/Khlet6NAVxr6zZsrm/sPGnOz7seT5T8RWd3UMcXgVWNcRnJpbVKYo54TV9cEA3sr9oTuETP8Yki7Xzk3ppqA2j/JtXwe/fcrrRyi4+u4SEn8=","Subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","To":"Florian Fainelli <f.fainelli@gmail.com>, Mason <slash.tmp@free.fr>","Cc":"Marc Gonzalez <marc_gonzalez@sigmadesigns.com>,\n\tnetdev <netdev@vger.kernel.org>,\n\tGeert Uytterhoeven <geert+renesas@glider.be>,\n\tDavid Miller <davem@davemloft.net>,\n\tAndrew Lunn <andrew@lunn.ch>, Mans Rullgard <mans@mansr.com>","References":"<1504140569-2063-1-git-send-email-f.fainelli@gmail.com>\n\t<f4bb5ac8-dae8-c0af-7aa6-e546fc0783fa@sigmadesigns.com>\n\t<e24693e8-d8ae-188a-2a38-c9a83fdc94e3@gmail.com>\n\t<931bf454-81ff-94dc-82e6-bc2b889bd43a@gmail.com>\n\t<d6a6b552-95a7-8353-54c8-fa804f9366a1@free.fr>\n\t<f74f1aad-3990-ae54-316f-751c3b15de41@gmail.com>\n\t<ebee6e5d-5bc1-1c5b-b31d-6d50618d6074@free.fr>\n\t<4ea8b432-4968-1616-eff9-48a2689dd3ce@gmail.com>\n\t<ff070239-28b7-d41b-8abe-c9f810561372@caviumnetworks.com>\n\t<572f49fd-f623-f064-a551-e243c57cef7f@gmail.com>","From":"David Daney <ddaney@caviumnetworks.com>","Message-ID":"<6890a27f-e87e-62c1-a676-e5ddf968adb6@caviumnetworks.com>","Date":"Wed, 6 Sep 2017 13:49:41 -0700","User-Agent":"Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101\n\tThunderbird/52.2.1","MIME-Version":"1.0","In-Reply-To":"<572f49fd-f623-f064-a551-e243c57cef7f@gmail.com>","Content-Type":"text/plain; charset=utf-8; format=flowed","Content-Language":"en-US","Content-Transfer-Encoding":"7bit","X-Originating-IP":"[50.233.148.156]","X-ClientProxiedBy":"BY2PR07CA0092.namprd07.prod.outlook.com (10.166.107.45) To\n\tDM5PR07MB3497.namprd07.prod.outlook.com 
(10.164.153.28)","X-MS-PublicTrafficType":"Email","X-MS-Office365-Filtering-Correlation-Id":"8de3e766-c048-42bb-ba0a-08d4f568cf0c","X-Microsoft-Antispam":"UriScan:; BCL:0; PCL:0;\n\tRULEID:(300000500095)(300135000095)(300000501095)(300135300095)(22001)(300000502095)(300135100095)(2017030254152)(300000503095)(300135400095)(2017052603199)(201703131423075)(201703031133081)(201702281549075)(300000504095)(300135200095)(300000505095)(300135600095)(300000506095)(300135500095);\n\tSRVR:DM5PR07MB3497; ","X-Microsoft-Exchange-Diagnostics":["1; DM5PR07MB3497;\n\t3:amXXS7/ysQMEk6ZGbxayE9Yoq7bC1IpR4zoDrZWbOzZKFo+z5XXrTXZfTemZicXMPRkzQZN7XyBe3TYeDVfri0PmUlJIaeGvZBF2ZlXbq4zfGjg4L7fgOQD6fTpKAOR0HEZ+QkJBA55sFhj8Enmf37mgQL265MPAhi60hXOQPa99yZBOEQxc7G61rRR6STUmxAiGon5n6lcCZuKO4H+orX2KxLlsKppP+EYFp+H2CQCAKI543yU1GK5VE5U1R4V+;\n\t25:6kFGJV8gaIngnhQ1g9Ll6fm8xcI2Zm0geETR9l+JAdIqpYk33sE3ziOApf666J+k8RY4sa9Rv9JaoDobs1F9/AqTPXR6x9sxtZdWHZkrCV+TpDmP3wm2Up46xVgk19kv+uQkZetwLr53ifjqjcNbh3p/zbMp4RAL9z3H0URtTIMoYLGN3rmyJm4elhniCGfX4qdNJ5rCHhD1NC5Ly5Ry7DSQrLSwqlBGgjrewimrXILfTuI5UZwj2ymHJ03Py1XKiu9mous4eJgk81UXBe+qOz97S0au+E91b/AYJMamBuJjAUilL/fQgnaRzpvRIV5Ds47/zYa31o8t/UfghFjJgg==;\n\t31:Upus+WyWaBoeQsc3q3KPVvUy5b05wyTZ776gRqSBpkwO08OhR9GKCWVJflMsDUFXjd7su2pha2LASxn/PHZaVrSSJuedt90r7FpZcih000ur/TomrRCUm4Lk+SOAVmcd4RQQVyaJaoslwGote57y2vY15npphE5d/RWMLzu/RoYaFqf+xiDlI1GcCJ0XuiWg1b+qOfdzexsGUOWM+QB5q4jdszJMFNQCNaFdCp3iJ/o=","1; 
DM5PR07MB3497;\n\t20:Deh53AUBM2VVBjdFswbSIarP1Jn6FhOuioTRTwODOUQkissScTSnd1N3bsjELMOBSdmWxFUCWWsxT9djPOYME4sOTRpOTtlxwlaKzzSuQt9Cl0NHxWaUAMYNbbG+nFbp+vEo4sR0KNXJDVdZ1NYcx03yR4wylcr7rbJ6ZBr1lPMZ7ueP6DUucNUFocok/UaVSzLFbUCgkpCeAH5XDRxfE5Q8p6pH9BowYxm/50mdKZf546qxKJ0F2TGV56xXOx43ijQtCA+SKOcroE/NM2tQSlRQ7RaKE+w38D1Np7AtZnH0/Fs2ulOgXlB3P9rlfGRYH4n7mcq8wKXKjNZK13cr3L53P67nyr3j/qOJBjLCzDO12Agt7ClC9O9azPQXQ/dXs3NpqXfiPJ1Z+kLF34y784rzQjj3bGZiIqM59pBhD4kroX8RYNv4oGm0C5Fy9qIrBGyhvYsRPU9pvY03sfUxGs0+xSe0IgZF5bf2K69KiLIMnhbucOamW+uPT8Uwyjvdt/egQzDt/oEiC7unL5DlNhOf6PxRHbqLQ9E0ukrKEuZHZBMRVbFIeKINejo9j3jZ9PtbgFcE/8vIfd8Up7Jk//7/GQ9gIbRAkMFgK1d0jEQ=;\n\t4:Oq0QdpIfTXmCAUotvNOZUl7csE69cz+rfy5RCPc1K7OvfuM658iosCvlmBRQTo1l828e6tvxypES1uD7Nv9eMGjvI/46diO4JsonkoUbnYly1CnprUk5vBJBlKSi8iUmWm8zMz/ZuRrGzFtO0gfA5zkd/WUflIA7wgSeKivwyaeXrsX/+7uPZLuGpgaYsaPpjRGjGoq6SsyB06LOTVx/Mqh6p5HSUq53SwlAbD9nG6YWPVHYMONF/f4PLkZ7yinh","=?utf-8?q?1=3BDM5PR07MB3497=3B23=3Al/IW?=\n\t=?utf-8?q?yy3o5LHK97/G/YOlWl4SZkzCjj1hbeIQVeVxhQAaRt8u4C7JwAzr01xT?=\n\t=?utf-8?q?aUtUScGssO/mTGDSeGFUsUklZNVvtzVc5p5lAuYCtJh43LCqgNczSELJ?=\n\t=?utf-8?q?jQWDmjWGpWc4GGN7V4q6NGMtSh9OVBX+sU54J86cZzI4EL/MsPA25h+f?=\n\t=?utf-8?q?I4h3FkBUo6xumoTBNfkIZyHJZ4ms30bXgRuhoE/vHJ1tFR22Va2ipnB8?=\n\t=?utf-8?q?dUgCZiqfjLhTeoNIdHZdHMZYiUIct8AP+oxVqdByA6z4ZiX2IYbeTD0X?=\n\t=?utf-8?q?9gwtVu4FvroX7tWV6sGzkOMmbLCQU3x6PnCPMHh3j9FRlwVJkYMc8PCn?=\n\t=?utf-8?q?K4LJAOTDGNT1+eF6Ch0RawSRHL8gMHhUgFD0WOyBTF2T6KPvqoJ3SmSJ?=\n\t=?utf-8?q?+reRSdQxk2gpvmdLnqovlvN4pGB8F7JgSvO3HtfxIasrynzNHySw3pNr?=\n\t=?utf-8?q?sFZi9FcDhh9d8retFdvYXtahduv3CRSpY4N4Op+2jeBitMGl+N1GP1Bp?=\n\t=?utf-8?q?io0GVTWeBtxNdT43LRIMXJ6Po/410oStz7BrxJ7n2GF5XECEkTJmKwBG?=\n\t=?utf-8?q?Se/28qr9PLl53hDSeJgw3L+YHSPuUWkprX6MAvl9n388K0O7v2piHMOs?=\n\t=?utf-8?q?53Tv/pzyJHW09RN3IxMPjnov1jdpO7zGnNSqDMeMi0mIw1kUFQuBcjzE?=\n\t=?utf-8?q?ykgLFQCvlb4R6KHgUsfKMXcBU0OTv3imoBnvjm4RWUXmZqDpotnWBIb3?=\n\t=?utf-8?q?Oxij1UZot6SSaBYHjpS1afImWiK2LX55pSHOzqavpZw8m0mmbwqDkRpC?=\n\t=?utf-8?q?zWb0p2LBvtsXw4OEPeEn+dFCpLp6Fb
gp3zz6Zz1VSOeLxZsdRYXnIucb?=\n\t=?utf-8?q?+izJVr03hEdbpyqiHWRp7MTxdXVl3gTGgRQ2VQQLU4xuhZ6AJRzvZ1S6?=\n\t=?utf-8?q?v3ysYbu6PvNzzCNFf84hQD2R7q58OV3N9uN+0RjvOpu6f8RNwpBySjcL?=\n\t=?utf-8?q?vGMCTMa9R55+dlHZopR0167eHwPJuFgCKQbw17h9jDGADB9+JiAaEK6I?=\n\t=?utf-8?q?bTJNyWT0LH/67OqOF1fcIowiOxMj5NVZ66n9ghgtvZLbXIXe55lIKqJk?=\n\t=?utf-8?q?yG9JXlPu7bubXuq9E/wkswuOwcytUkt7SWGdGL2igLp9mVSqc6jUBsU1?=\n\t=?utf-8?q?moZ/qf+qjRqHZbllqVciRyrzWAUirQi9Ps+xPLaaUX2jIFOOTiUs32Yk?=\n\t=?utf-8?q?/2jITyTAspMqPKFwK5k9cJTKTIDnpDNo6DUA6r/yRWbh5hh0fhnONJia?=\n\t=?utf-8?q?55t/iCozIW0NrWmq4z2mftFPZJ4COMF42c5mbdqIgMQWuCChIQ1hiOXB?=\n\t=?utf-8?q?BaCT5ZmXwY5tUOwWRXgNV1Z/kbc2TeZdx+e1d/qFlhnlJrCppF5lpiDz?=\n\t=?utf-8?q?BrBT04NuObhyn8fLAD9srj9qVveJ6hPMaHTwUlJw8WOv9KDw8SmhMZ/V?=\n\t=?utf-8?q?UJeoB1HfpCqRuXztoX9trls0EigebCLvsWXPj09MlvW2ueJLXtCecimw?=\n\t=?utf-8?q?zmcdql+vTeNU?=","1; DM5PR07MB3497;\n\t6:GGTk0izlQAJii/a6tnXvzW/euQCRFVe1LV8JDT2cMIhRFsPGZ2ytnHKqNdXeBL590UWhz06qEBunrdYH6ahrjNaF0kkABPjvVJpcAhdGC10hEnJJCWBaZX8GwDx6AQCodEwFa+IfzCjvCo3yykAu6PYyANeq3zSgOx390wGlpRwko67o0fV8ZseeDAWW6/IDOzJWDhGdawk/CobeQRP8V1yN2leFUwIAjwokIwI7ZeRzW0m5+Qd3fFqJaG7iDeMi3aj07XdL2XcAbLf3RydY0orLvPNxLp2E78a+SBoKMSWsMNPlnU642GItZ3wieAvNm+TWAuZhgw893L49twnnng==;\n\t5:Ni1/ebfMhpSiOkxgXGO0tVrThb9mlH6iAvPcor2A+dLvz/Nxb8ImTnzhBZ0U7giXilpXQkeBgJknc7IYq+k696JPzk/8aXi6N4ZBoZOp9Cybdoa2RPQ5nZ2omukjYSzm2w3bqEoovjOuNkSk/r6PmA==;\n\t24:4t8S1r+3iDFf1I/XiR2WzVHeHr2K5JzdSpnayQKnYaIHf0xg+xtebn3Nss+DdDlXmlj3vLrkjx7n4VcV4fVX/52OcXzMLAWvVm71fjQBoHE=;\n\t7:RrVFuc6hvro8evaB1jJMOdCWC5rqyDYUtM76e1QNfP23Lrk6PAZm3emXJu3eBDXlF5OYsVoWgpvHtfCCrxR5aUcVJDBlPlplKBmUGokYyihxHZUtLQ9fCSv+vBfrlmr4KFgy4v+aUITmItRtQSi8h7Lj8kHTsG3jQNFBZcfFtfDoP6wlqF2hB81YJ+XZpn9qh745nySm3WSS02XfGVriPN4lkll1X7MBgl5R0yTy5H4="],"X-MS-TrafficTypeDiagnostic":"DM5PR07MB3497:","X-Exchange-Antispam-Report-Test":"UriScan:;","X-Microsoft-Antispam-PRVS":"<DM5PR07MB3497598CD1A9490AB474CD6C97970@DM5PR07MB3497.namprd07.prod.outlook.com>","X-Exchange-Antispam-Report-CFA-Test":"BCL:0; 
PCL:0;\n\tRULEID:(100000700101)(100105000095)(100000701101)(100105300095)(100000702101)(100105100095)(6040450)(2401047)(8121501046)(5005006)(93006095)(3002001)(10201501046)(100000703101)(100105400095)(6041248)(20161123562025)(20161123564025)(20161123555025)(20161123558100)(20161123560025)(201703131423075)(201702281528075)(201703061421075)(201703061406153)(6072148)(201708071742011)(100000704101)(100105200095)(100000705101)(100105500095);\n\tSRVR:DM5PR07MB3497; BCL:0; PCL:0;\n\tRULEID:(100000800101)(100110000095)(100000801101)(100110300095)(100000802101)(100110100095)(100000803101)(100110400095)(100000804101)(100110200095)(100000805101)(100110500095);\n\tSRVR:DM5PR07MB3497; ","X-Forefront-PRVS":"0422860ED4","X-Forefront-Antispam-Report":"SFV:NSPM;\n\tSFS:(10009020)(6009001)(189002)(377454003)(24454002)(199003)(97736004)(81156014)(106356001)(6246003)(3846002)(4326008)(105586002)(230700001)(42186005)(6666003)(6116002)(23676002)(31686004)(68736007)(6486002)(8936002)(6506006)(305945005)(50986999)(25786009)(2950100002)(36756003)(76176999)(54356999)(72206003)(229853002)(81166006)(2906002)(478600001)(93886005)(8676002)(65826007)(6512007)(54906002)(101416001)(7736002)(50466002)(4001350100001)(69596002)(83506001)(53546010)(42882006)(5890100001)(66066001)(39060400002)(53936002)(31696002)(189998001)(33646002)(64126003)(65806001)(53416004)(47776003)(5660300001)(4000630100001)(65956001);\n\tDIR:OUT; SFP:1101; SCL:1; SRVR:DM5PR07MB3497;\n\tH:ddl.caveonetworks.com; FPR:; SPF:None; PTR:InfoNoRecords;\n\tMX:1; A:1; LANG:en; ","Received-SPF":"None (protection.outlook.com: cavium.com does not designate\n\tpermitted sender hosts)","SpamDiagnosticOutput":"1:99","SpamDiagnosticMetadata":"NSPM","X-OriginatorOrg":"caviumnetworks.com","X-MS-Exchange-CrossTenant-OriginalArrivalTime":"06 Sep 2017 
20:49:45.4891\n\t(UTC)","X-MS-Exchange-CrossTenant-FromEntityHeader":"Hosted","X-MS-Exchange-CrossTenant-Id":"711e4ccf-2e9b-4bcf-a551-4094005b6194","X-MS-Exchange-Transport-CrossTenantHeadersStamped":"DM5PR07MB3497","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1764422,"web_url":"http://patchwork.ozlabs.org/comment/1764422/","msgid":"<4a65e53c-f13b-9cc3-bffa-f2f2aae423b9@gmail.com>","list_archive_url":null,"date":"2017-09-06T22:51:36","subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","submitter":{"id":10133,"url":"http://patchwork.ozlabs.org/api/people/10133/","name":"David Daney","email":"ddaney.cavm@gmail.com"},"content":"On 09/06/2017 01:49 PM, David Daney wrote:\n> On 09/06/2017 11:59 AM, Florian Fainelli wrote:\n>> On 09/06/2017 11:00 AM, David Daney wrote:\n>>> On 08/31/2017 11:29 AM, Florian Fainelli wrote:\n>>>> On 08/31/2017 11:12 AM, Mason wrote:\n>>>>> On 31/08/2017 19:53, Florian Fainelli wrote:\n>>>>>> On 08/31/2017 10:49 AM, Mason wrote:\n>>>>>>> On 31/08/2017 18:57, Florian Fainelli wrote:\n>>>>>>>> And the race is between phy_detach() setting phydev->attached_dev\n>>>>>>>> = NULL\n>>>>>>>> and phy_state_machine() running in PHY_HALTED state and calling\n>>>>>>>> netif_carrier_off().\n>>>>>>>\n>>>>>>> I must be missing something.\n>>>>>>> (Since a thread cannot race against itself.)\n>>>>>>>\n>>>>>>> phy_disconnect calls phy_stop_machine which\n>>>>>>> 1) stops the work queue from running in a separate thread\n>>>>>>> 2) calls phy_state_machine *synchronously*\n>>>>>>>        which runs the PHY_HALTED case with everything well-defined\n>>>>>>> end of phy_stop_machine\n>>>>>>>\n>>>>>>> phy_disconnect only then calls phy_detach()\n>>>>>>> which makes future calls of phy_state_machine perilous.\n>>>>>>>\n>>>>>>> This all happens in the same thread, so I'm not yet\n>>>>>>> seeing where the race 
happens?\n>>>>>>\n>>>>>> The race is as described in David's earlier email, so let's recap:\n>>>>>>\n>>>>>> Thread 1            Thread 2\n>>>>>> phy_disconnect()\n>>>>>> phy_stop_interrupts()\n>>>>>> phy_stop_machine()\n>>>>>> phy_state_machine()\n>>>>>>    -> queue_delayed_work()\n>>>>>> phy_detach()\n>>>>>>                  phy_state_machine()\n>>>>>>                  -> netif_carrier_off()\n>>>>>>\n>>>>>> If phy_detach() finishes earlier than the workqueue had a chance \n>>>>>> to be\n>>>>>> scheduled and process PHY_HALTED again, then we trigger the NULL\n>>>>>> pointer\n>>>>>> de-reference.\n>>>>>>\n>>>>>> workqueues are not tasklets, the CPU scheduling them gets no \n>>>>>> guarantee\n>>>>>> they will run on the same CPU.\n>>>>>\n>>>>> Something does not add up.\n>>>>>\n>>>>> The synchronous call to phy_state_machine() does:\n>>>>>\n>>>>>      case PHY_HALTED:\n>>>>>          if (phydev->link) {\n>>>>>              phydev->link = 0;\n>>>>>              netif_carrier_off(phydev->attached_dev);\n>>>>>              phy_adjust_link(phydev);\n>>>>>              do_suspend = true;\n>>>>>          }\n>>>>>\n>>>>> then sets phydev->link = 0; therefore subsequent calls to\n>>>>> phy_state_machin() will be no-op.\n>>>>\n>>>> Actually you are right, once phydev->link is set to 0 these would \n>>>> become\n>>>> no-ops. Still scratching my head as to what happens for David then...\n>>>>\n>>>>>\n>>>>> Also, queue_delayed_work() is only called in polling mode.\n>>>>> David stated that he's using interrupt mode.\n>>>\n>>> Did you see what I wrote?\n>>\n>> Still not following, see below.\n>>\n>>>\n>>> phy_disconnect() calls phy_stop_interrupts() which puts it into polling\n>>> mode.  So the polling work gets queued unconditionally.\n>>\n>> What part of phy_stop_interrupts() is responsible for changing\n>> phydev->irq to PHY_POLL? 
free_irq() cannot touch phydev->irq otherwise\n>> subsequent request_irq() calls won't work anymore.\n>> phy_disable_interrupts() only calls back into the PHY driver to\n>> acknowledge and clear interrupts.\n>>\n>> If we were using a PHY with PHY_POLL, as Marc said, the first\n>> synchronous call to phy_state_machine() would have acted on PHY_HALTED\n>> and even if we incorrectly keep re-scheduling the state machine from\n>> PHY_HALTED to PHY_HALTED the second time around nothing can happen.\n>>\n>> What are we missing here?\n>>\n> \n> OK, I am now as confused as you guys are.  I will go back and get an \n> ftrace log out of the failure.\n> \nOK, let's forget about the PHY_HALTED discussion.\n\n\nConsider instead the case of a Marvell phy with no interrupts connected \non a v4.9.43 kernel, single CPU:\n\n\n   0)               |                 phy_disconnect() {\n   0)               |                   phy_stop_machine() {\n   0)               |                     cancel_delayed_work_sync() {\n   0) + 23.986 us   |                     } /* cancel_delayed_work_sync */\n   0)               |                     phy_state_machine() {\n   0)               |                       phy_start_aneg_priv() {\n   0)               |                         marvell_config_aneg() {\n   0) ! 240.538 us  |                         } /* marvell_config_aneg */\n   0) ! 244.971 us  |                       } /* phy_start_aneg_priv */\n   0)               |                       queue_delayed_work_on() {\n   0) + 18.016 us   |                       } /* queue_delayed_work_on */\n   0) ! 268.184 us  |                     } /* phy_state_machine */\n   0) ! 
297.394 us  |                   } /* phy_stop_machine */\n   0)               |                   phy_detach() {\n   0)               |                     phy_suspend() {\n   0)               |                       phy_ethtool_get_wol() {\n   0)   0.677 us    |                       } /* phy_ethtool_get_wol */\n   0)               |                       genphy_suspend() {\n   0) + 71.250 us   |                       } /* genphy_suspend */\n   0) + 74.197 us   |                     } /* phy_suspend */\n   0) + 80.302 us   |                   } /* phy_detach */\n   0) ! 380.072 us  |                 } /* phy_disconnect */\n.\n.\n.\n   0)               |  process_one_work() {\n   0)               |    find_worker_executing_work() {\n   0)   0.688 us    |    } /* find_worker_executing_work */\n   0)               |    set_work_pool_and_clear_pending() {\n   0)   0.734 us    |    } /* set_work_pool_and_clear_pending */\n   0)               |    phy_state_machine() {\n   0)               |      genphy_read_status() {\n   0) ! 205.721 us  |      } /* genphy_read_status */\n   0)               |      netif_carrier_off() {\n   0)               |        do_page_fault() {\n\n\nThe do_page_fault() at the end indicates the NULL pointer dereference.\n\nThat added call to phy_state_machine() turns the polling back on \nunconditionally for a phy that should be disconnected.  
How is that correct?\n\nDavid.","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":["ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (2048-bit key;\n\tunprotected) header.d=gmail.com header.i=@gmail.com\n\theader.b=\"maYrwgoc\"; dkim-atps=neutral"],"Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xnf335dKfz9s1h\n\tfor <patchwork-incoming@ozlabs.org>;\n\tThu,  7 Sep 2017 08:51:43 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1754792AbdIFWvl (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tWed, 6 Sep 2017 18:51:41 -0400","from mail-yw0-f195.google.com ([209.85.161.195]:36560 \"EHLO\n\tmail-yw0-f195.google.com\" rhost-flags-OK-OK-OK-OK) by vger.kernel.org\n\twith ESMTP id S1751399AbdIFWvj (ORCPT\n\t<rfc822;netdev@vger.kernel.org>); Wed, 6 Sep 2017 18:51:39 -0400","by mail-yw0-f195.google.com with SMTP id p77so4041828ywp.3\n\tfor <netdev@vger.kernel.org>; Wed, 06 Sep 2017 15:51:39 -0700 (PDT)","from ddl.caveonetworks.com\n\t(50-233-148-156-static.hfc.comcastbusiness.net. 
[50.233.148.156])\n\tby smtp.googlemail.com with ESMTPSA id\n\tb185sm369128ywh.110.2017.09.06.15.51.37\n\t(version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);\n\tWed, 06 Sep 2017 15:51:38 -0700 (PDT)"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=gmail.com; s=20161025;\n\th=subject:to:cc:references:from:message-id:date:user-agent\n\t:mime-version:in-reply-to:content-language:content-transfer-encoding; \n\tbh=/oWvrhmb5JNZy5wQnsapXvnKHH+PrWsuXzZuQlPonY8=;\n\tb=maYrwgoctRCivdfrEJ+5ebJdng7PnXwNLqzsXlnyAuMlRFUiwifzXvW+xprgx9Px0n\n\tuZCCW0X8Qgey6cOC1ai1ekWM8+oLJT+7HqAA0yv9MKeztEI+/OmC8Iiw8g4TwGS6gvQA\n\teT3+jm0NsvSKULJFzDKiDo5maU7hW4L5ORqifHOJ0cJmRr5v673qVEk9DZfyA0yS/9lR\n\tnZuwpmr775q6XeizCU52FQaVGoPJZzYNnFawRMN+iX9kgPE/Gcr6W3KXp3K4Lr6J4wlE\n\t+0vmgfP5Z4klxbNNfpDgh/hhuGYlWwp1AuFV/eJ9qgA0VRfmOHaH35DvJOzb0ZTQ8JbJ\n\tzynA==","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=1e100.net; s=20161025;\n\th=x-gm-message-state:subject:to:cc:references:from:message-id:date\n\t:user-agent:mime-version:in-reply-to:content-language\n\t:content-transfer-encoding;\n\tbh=/oWvrhmb5JNZy5wQnsapXvnKHH+PrWsuXzZuQlPonY8=;\n\tb=b50uoIRABz915Hi3GtoXZ6MZ/uSIOx6hPX1lqnx83dokmrAhJSNMv4GKLdolJGqnDr\n\tOBJxyPwG5uZ7jLgl1QhIYwDobsYr6iDTMjeigFrXr7yxre6DB/E7R968bgtudSJr5WE1\n\ttVK3PthLQpPA4vQJmWZUrKG962vmyyYcMiBH4qTyqpT12guZFbqi0ctkDfFMrt0Quj6k\n\tNfRQqx2dmOT8s9IYDV4K5UzVwlwYGEgXJ3bnKIXE0QgSJ2drxOZ+VyMAkJdLLtQ5zp3V\n\t4N2WVj/L81VknHraXaO5rRXrGzDbXMVxi1bgw52wMBBxV1jYbrRyabd8UdFnHpdw0Out\n\tmQBQ==","X-Gm-Message-State":"AHPjjUjdStQMjDvsv5noWOy66XSS0XhHaJsr5/yTSM8NOgGF3yzdA/oD\n\tsDAhLkfY2AR9ng==","X-Google-Smtp-Source":"ADKCNb7rtvyN7ZE9XOtjFN2xib6r8iALrKYfpHDaguITcETSgiRO9TDO42KsjQdfSTccB3fNhrk1VQ==","X-Received":"by 10.37.186.75 with SMTP id z11mr583971ybj.91.1504738299278;\n\tWed, 06 Sep 2017 15:51:39 -0700 (PDT)","Subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","To":"David Daney 
<ddaney@caviumnetworks.com>,\n\tFlorian Fainelli <f.fainelli@gmail.com>, Mason <slash.tmp@free.fr>","Cc":"Marc Gonzalez <marc_gonzalez@sigmadesigns.com>,\n\tnetdev <netdev@vger.kernel.org>,\n\tGeert Uytterhoeven <geert+renesas@glider.be>,\n\tDavid Miller <davem@davemloft.net>,\n\tAndrew Lunn <andrew@lunn.ch>, Mans Rullgard <mans@mansr.com>","References":"<1504140569-2063-1-git-send-email-f.fainelli@gmail.com>\n\t<f4bb5ac8-dae8-c0af-7aa6-e546fc0783fa@sigmadesigns.com>\n\t<e24693e8-d8ae-188a-2a38-c9a83fdc94e3@gmail.com>\n\t<931bf454-81ff-94dc-82e6-bc2b889bd43a@gmail.com>\n\t<d6a6b552-95a7-8353-54c8-fa804f9366a1@free.fr>\n\t<f74f1aad-3990-ae54-316f-751c3b15de41@gmail.com>\n\t<ebee6e5d-5bc1-1c5b-b31d-6d50618d6074@free.fr>\n\t<4ea8b432-4968-1616-eff9-48a2689dd3ce@gmail.com>\n\t<ff070239-28b7-d41b-8abe-c9f810561372@caviumnetworks.com>\n\t<572f49fd-f623-f064-a551-e243c57cef7f@gmail.com>\n\t<6890a27f-e87e-62c1-a676-e5ddf968adb6@caviumnetworks.com>","From":"David Daney <ddaney.cavm@gmail.com>","Message-ID":"<4a65e53c-f13b-9cc3-bffa-f2f2aae423b9@gmail.com>","Date":"Wed, 6 Sep 2017 15:51:36 -0700","User-Agent":"Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101\n\tThunderbird/52.2.1","MIME-Version":"1.0","In-Reply-To":"<6890a27f-e87e-62c1-a676-e5ddf968adb6@caviumnetworks.com>","Content-Type":"text/plain; charset=utf-8; format=flowed","Content-Language":"en-US","Content-Transfer-Encoding":"7bit","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1764428,"web_url":"http://patchwork.ozlabs.org/comment/1764428/","msgid":"<a4b70bf9-ec48-314d-b63d-e44f7cbb4bab@gmail.com>","list_archive_url":null,"date":"2017-09-06T23:14:22","subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","submitter":{"id":2800,"url":"http://patchwork.ozlabs.org/api/people/2800/","name":"Florian Fainelli","email":"f.fainelli@gmail.com"},"content":"On 09/06/2017 
03:51 PM, David Daney wrote:\n> On 09/06/2017 01:49 PM, David Daney wrote:\n>> On 09/06/2017 11:59 AM, Florian Fainelli wrote:\n>>> On 09/06/2017 11:00 AM, David Daney wrote:\n>>>> On 08/31/2017 11:29 AM, Florian Fainelli wrote:\n>>>>> On 08/31/2017 11:12 AM, Mason wrote:\n>>>>>> On 31/08/2017 19:53, Florian Fainelli wrote:\n>>>>>>> On 08/31/2017 10:49 AM, Mason wrote:\n>>>>>>>> On 31/08/2017 18:57, Florian Fainelli wrote:\n>>>>>>>>> And the race is between phy_detach() setting phydev->attached_dev\n>>>>>>>>> = NULL\n>>>>>>>>> and phy_state_machine() running in PHY_HALTED state and calling\n>>>>>>>>> netif_carrier_off().\n>>>>>>>>\n>>>>>>>> I must be missing something.\n>>>>>>>> (Since a thread cannot race against itself.)\n>>>>>>>>\n>>>>>>>> phy_disconnect calls phy_stop_machine which\n>>>>>>>> 1) stops the work queue from running in a separate thread\n>>>>>>>> 2) calls phy_state_machine *synchronously*\n>>>>>>>>        which runs the PHY_HALTED case with everything well-defined\n>>>>>>>> end of phy_stop_machine\n>>>>>>>>\n>>>>>>>> phy_disconnect only then calls phy_detach()\n>>>>>>>> which makes future calls of phy_state_machine perilous.\n>>>>>>>>\n>>>>>>>> This all happens in the same thread, so I'm not yet\n>>>>>>>> seeing where the race happens?\n>>>>>>>\n>>>>>>> The race is as described in David's earlier email, so let's recap:\n>>>>>>>\n>>>>>>> Thread 1            Thread 2\n>>>>>>> phy_disconnect()\n>>>>>>> phy_stop_interrupts()\n>>>>>>> phy_stop_machine()\n>>>>>>> phy_state_machine()\n>>>>>>>    -> queue_delayed_work()\n>>>>>>> phy_detach()\n>>>>>>>                  phy_state_machine()\n>>>>>>>                  -> netif_carrier_off()\n>>>>>>>\n>>>>>>> If phy_detach() finishes earlier than the workqueue had a chance\n>>>>>>> to be\n>>>>>>> scheduled and process PHY_HALTED again, then we trigger the NULL\n>>>>>>> pointer\n>>>>>>> de-reference.\n>>>>>>>\n>>>>>>> workqueues are not tasklets, the CPU scheduling them gets no\n>>>>>>> guarantee\n>>>>>>> they will 
run on the same CPU.\n>>>>>>\n>>>>>> Something does not add up.\n>>>>>>\n>>>>>> The synchronous call to phy_state_machine() does:\n>>>>>>\n>>>>>>      case PHY_HALTED:\n>>>>>>          if (phydev->link) {\n>>>>>>              phydev->link = 0;\n>>>>>>              netif_carrier_off(phydev->attached_dev);\n>>>>>>              phy_adjust_link(phydev);\n>>>>>>              do_suspend = true;\n>>>>>>          }\n>>>>>>\n>>>>>> then sets phydev->link = 0; therefore subsequent calls to\n>>>>>> phy_state_machin() will be no-op.\n>>>>>\n>>>>> Actually you are right, once phydev->link is set to 0 these would\n>>>>> become\n>>>>> no-ops. Still scratching my head as to what happens for David then...\n>>>>>\n>>>>>>\n>>>>>> Also, queue_delayed_work() is only called in polling mode.\n>>>>>> David stated that he's using interrupt mode.\n>>>>\n>>>> Did you see what I wrote?\n>>>\n>>> Still not following, see below.\n>>>\n>>>>\n>>>> phy_disconnect() calls phy_stop_interrupts() which puts it into polling\n>>>> mode.  So the polling work gets queued unconditionally.\n>>>\n>>> What part of phy_stop_interrupts() is responsible for changing\n>>> phydev->irq to PHY_POLL? free_irq() cannot touch phydev->irq otherwise\n>>> subsequent request_irq() calls won't work anymore.\n>>> phy_disable_interrupts() only calls back into the PHY driver to\n>>> acknowledge and clear interrupts.\n>>>\n>>> If we were using a PHY with PHY_POLL, as Marc said, the first\n>>> synchronous call to phy_state_machine() would have acted on PHY_HALTED\n>>> and even if we incorrectly keep re-scheduling the state machine from\n>>> PHY_HALTED to PHY_HALTED the second time around nothing can happen.\n>>>\n>>> What are we missing here?\n>>>\n>>\n>> OK, I am now as confused as you guys are.  
I will go back and get an\n>> ftrace log out of the failure.\n>>\n> OK, let's forget about the PHY_HALTED discussion.\n> \n> \n> Consider instead the case of a Marvell phy with no interrupts connected\n> on a v4.9.43 kernel, single CPU:\n> \n> \n>   0)               |                 phy_disconnect() {\n>   0)               |                   phy_stop_machine() {\n>   0)               |                     cancel_delayed_work_sync() {\n>   0) + 23.986 us   |                     } /* cancel_delayed_work_sync */\n>   0)               |                     phy_state_machine() {\n>   0)               |                       phy_start_aneg_priv() {\n\nThanks for providing the trace, I think I have an idea of what's going\non, see below.\n\n>   0)               |                         marvell_config_aneg() {\n>   0) ! 240.538 us  |                         } /* marvell_config_aneg */\n>   0) ! 244.971 us  |                       } /* phy_start_aneg_priv */\n>   0)               |                       queue_delayed_work_on() {\n>   0) + 18.016 us   |                       } /* queue_delayed_work_on */\n>   0) ! 268.184 us  |                     } /* phy_state_machine */\n>   0) ! 297.394 us  |                   } /* phy_stop_machine */\n>   0)               |                   phy_detach() {\n>   0)               |                     phy_suspend() {\n>   0)               |                       phy_ethtool_get_wol() {\n>   0)   0.677 us    |                       } /* phy_ethtool_get_wol */\n>   0)               |                       genphy_suspend() {\n>   0) + 71.250 us   |                       } /* genphy_suspend */\n>   0) + 74.197 us   |                     } /* phy_suspend */\n>   0) + 80.302 us   |                   } /* phy_detach */\n>   0) ! 
380.072 us  |                 } /* phy_disconnect */\n> .\n> .\n> .\n>   0)               |  process_one_work() {\n>   0)               |    find_worker_executing_work() {\n>   0)   0.688 us    |    } /* find_worker_executing_work */\n>   0)               |    set_work_pool_and_clear_pending() {\n>   0)   0.734 us    |    } /* set_work_pool_and_clear_pending */\n>   0)               |    phy_state_machine() {\n>   0)               |      genphy_read_status() {\n>   0) ! 205.721 us  |      } /* genphy_read_status */\n>   0)               |      netif_carrier_off() {\n>   0)               |        do_page_fault() {\n> \n> \n> The do_page_fault() at the end indicates the NULL pointer dereference.\n> \n> That added call to phy_state_machine() turns the polling back on\n> unconditionally for a phy that should be disconnected.  How is that\n> correct?\n\nIt is not fundamentally correct and I don't think there was any\nobjection to that to begin with. In fact there is a bug/inefficiency\nhere in that if we have entered the PHY state machine with PHY_HALTED we\nshould not re-schedule it period, only applicable to PHY_POLL cases\n*and* properly calling phy_stop() followed by phy_disconnect().\n\nWhat I now think is happening in your case is the following:\n\nphy_stop() was not called, so nothing does set phydev->state to\nPHY_HALTED in the first place so we have:\n\nphy_disconnect()\n-> phy_stop_machine()\n\t-> cancel_delayed_work_sync() OK\n\t\tphydev->state is probably RUNNING so we have:\n\t\t-> phydev->state = PHY_UP\n\tphy_state_machine() is called synchronously\n\t-> PHY_UP -> needs_aneg = true\n\t-> phy_restart_aneg()\n\t-> queue_delayed_work_sync()\n-> phydev->adjust_link = NULL\n-> phy_deatch() -> boom\n\nCan you confirm whether the driver you are using does call phy_stop()\nprior to phy_disconnect()? 
If that is the case then this whole theory\nfalls apart, if not, then this needs fixing in both the driver and PHYLIB.\n\nThanks","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":["ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (2048-bit key;\n\tunprotected) header.d=gmail.com header.i=@gmail.com\n\theader.b=\"nOeoPeBP\"; dkim-atps=neutral"],"Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xnfYT11pcz9t2W\n\tfor <patchwork-incoming@ozlabs.org>;\n\tThu,  7 Sep 2017 09:14:37 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1754031AbdIFXOe (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tWed, 6 Sep 2017 19:14:34 -0400","from mail-wr0-f194.google.com ([209.85.128.194]:38001 \"EHLO\n\tmail-wr0-f194.google.com\" rhost-flags-OK-OK-OK-OK) by vger.kernel.org\n\twith ESMTP id S1752666AbdIFXOd (ORCPT\n\t<rfc822;netdev@vger.kernel.org>); Wed, 6 Sep 2017 19:14:33 -0400","by mail-wr0-f194.google.com with SMTP id p37so1545418wrb.5\n\tfor <netdev@vger.kernel.org>; Wed, 06 Sep 2017 16:14:32 -0700 (PDT)","from [10.112.156.244] ([192.19.255.250])\n\tby smtp.googlemail.com with ESMTPSA id\n\tw82sm4737635wme.5.2017.09.06.16.14.26\n\t(version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);\n\tWed, 06 Sep 2017 16:14:30 -0700 (PDT)"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=gmail.com; s=20161025;\n\th=subject:to:cc:references:from:message-id:date:user-agent\n\t:mime-version:in-reply-to:content-language:content-transfer-encoding; 
\n\tbh=YTErNyYwzu2SsVnbXg4SIVKBwzi49uY7n+gO8k3PQjM=;\n\tb=nOeoPeBPSxCriye5/2ySYYs0sqEAl9PO4YRBTwM8nYQx5SXNT0wIgWZ1j1b3c+watQ\n\tWS/uhJd0b3pHwj/cPcMFeUn+Uy0QayyqJqckXkicgYHidSRmdFcc3p5itu5fAgiVn+Dx\n\tdpdhfZoVwwpkK328Mi5tVQbTA5CZeoS2MPXOXPwjeLJEOEmOtdJ5PPEcQZ9BeeDhXYZI\n\ttxbGls7v92FYttx4QVAU2PEYtw/3iD6+0pzSuv3fZLZaB5+xSzXAOrPT4iylX/8CrHmf\n\tdlTS7uUTnNno2DNDF6Q8RLFYsPjLJGdMr6DEziClhB4XElErEqkMjmglPZbi4LiE/9zZ\n\tMG4w==","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=1e100.net; s=20161025;\n\th=x-gm-message-state:subject:to:cc:references:from:message-id:date\n\t:user-agent:mime-version:in-reply-to:content-language\n\t:content-transfer-encoding;\n\tbh=YTErNyYwzu2SsVnbXg4SIVKBwzi49uY7n+gO8k3PQjM=;\n\tb=EWrMZGzLkYIWck/HUZezS3/w6JYhRbK+Ki7WfOwYY5zW7j3NObl3o6200UHfxBB4MI\n\t4QITSGqnd8CKgkEzNiKbXemMoA6h+fcaJ+9DO3MZAPw5N//WWPH4z19Ny8xwWDVqU+2L\n\tFsQhZZbx8MgVNxEemKrq43m4agfQYnZHFij/0Xwr01icxiY6vq0M5+I5XeqhmdIhNNpI\n\tLYRmhul2++AHo4TbOY0mLHDRqGOUfqxexd27X6I0jwFTHaLevVCf//aL5BO+cQVvaT7d\n\tZ5s9gtq2ypfrgCIOXSxB0cIM+i+2BzoursIEujG3Yph/m8jXL8jFckdX0OIPLWFxAqor\n\tuHsA==","X-Gm-Message-State":"AHPjjUjUxc5Pem36shPqqH7+gDLo4p9j4iXH0YBtZjjl/WVjYxY7MWYr\n\tLit6eXjJDOqsPA==","X-Google-Smtp-Source":"ADKCNb7YDyJnfxOmXTEVc3X1nYYkZdRk1dtXLe7D+26HRlSzhhpa6Uzvy88QYL6fWxOplaVPw0/2HQ==","X-Received":"by 10.223.195.129 with SMTP id p1mr412901wrf.293.1504739671559; \n\tWed, 06 Sep 2017 16:14:31 -0700 (PDT)","Subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","To":"David Daney <ddaney.cavm@gmail.com>,\n\tDavid Daney <ddaney@caviumnetworks.com>, Mason <slash.tmp@free.fr>","Cc":"Marc Gonzalez <marc_gonzalez@sigmadesigns.com>,\n\tnetdev <netdev@vger.kernel.org>,\n\tGeert Uytterhoeven <geert+renesas@glider.be>,\n\tDavid Miller <davem@davemloft.net>,\n\tAndrew Lunn <andrew@lunn.ch>, Mans Rullgard 
<mans@mansr.com>","References":"<1504140569-2063-1-git-send-email-f.fainelli@gmail.com>\n\t<f4bb5ac8-dae8-c0af-7aa6-e546fc0783fa@sigmadesigns.com>\n\t<e24693e8-d8ae-188a-2a38-c9a83fdc94e3@gmail.com>\n\t<931bf454-81ff-94dc-82e6-bc2b889bd43a@gmail.com>\n\t<d6a6b552-95a7-8353-54c8-fa804f9366a1@free.fr>\n\t<f74f1aad-3990-ae54-316f-751c3b15de41@gmail.com>\n\t<ebee6e5d-5bc1-1c5b-b31d-6d50618d6074@free.fr>\n\t<4ea8b432-4968-1616-eff9-48a2689dd3ce@gmail.com>\n\t<ff070239-28b7-d41b-8abe-c9f810561372@caviumnetworks.com>\n\t<572f49fd-f623-f064-a551-e243c57cef7f@gmail.com>\n\t<6890a27f-e87e-62c1-a676-e5ddf968adb6@caviumnetworks.com>\n\t<4a65e53c-f13b-9cc3-bffa-f2f2aae423b9@gmail.com>","From":"Florian Fainelli <f.fainelli@gmail.com>","Message-ID":"<a4b70bf9-ec48-314d-b63d-e44f7cbb4bab@gmail.com>","Date":"Wed, 6 Sep 2017 16:14:22 -0700","User-Agent":"Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101\n\tThunderbird/52.2.1","MIME-Version":"1.0","In-Reply-To":"<4a65e53c-f13b-9cc3-bffa-f2f2aae423b9@gmail.com>","Content-Type":"text/plain; charset=utf-8","Content-Language":"en-US","Content-Transfer-Encoding":"8bit","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1764440,"web_url":"http://patchwork.ozlabs.org/comment/1764440/","msgid":"<64800ff2-201b-eb26-304e-1c4c6e0a6d5e@caviumnetworks.com>","list_archive_url":null,"date":"2017-09-07T00:10:02","subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","submitter":{"id":721,"url":"http://patchwork.ozlabs.org/api/people/721/","name":"David Daney","email":"ddaney@caviumnetworks.com"},"content":"On 09/06/2017 04:14 PM, Florian Fainelli wrote:\n> On 09/06/2017 03:51 PM, David Daney wrote:\n[...]\n>>\n>> Consider instead the case of a Marvell phy with no interrupts connected\n>> on a v4.9.43 kernel, single CPU:\n>>\n>>\n>>    0)               |                 phy_disconnect() {\n>>    0)      
         |                   phy_stop_machine() {\n>>    0)               |                     cancel_delayed_work_sync() {\n>>    0) + 23.986 us   |                     } /* cancel_delayed_work_sync */\n>>    0)               |                     phy_state_machine() {\n>>    0)               |                       phy_start_aneg_priv() {\n> \n> Thanks for providing the trace, I think I have an idea of what's going\n> on, see below.\n> \n>>    0)               |                         marvell_config_aneg() {\n>>    0) ! 240.538 us  |                         } /* marvell_config_aneg */\n>>    0) ! 244.971 us  |                       } /* phy_start_aneg_priv */\n>>    0)               |                       queue_delayed_work_on() {\n>>    0) + 18.016 us   |                       } /* queue_delayed_work_on */\n>>    0) ! 268.184 us  |                     } /* phy_state_machine */\n>>    0) ! 297.394 us  |                   } /* phy_stop_machine */\n>>    0)               |                   phy_detach() {\n>>    0)               |                     phy_suspend() {\n>>    0)               |                       phy_ethtool_get_wol() {\n>>    0)   0.677 us    |                       } /* phy_ethtool_get_wol */\n>>    0)               |                       genphy_suspend() {\n>>    0) + 71.250 us   |                       } /* genphy_suspend */\n>>    0) + 74.197 us   |                     } /* phy_suspend */\n>>    0) + 80.302 us   |                   } /* phy_detach */\n>>    0) ! 
380.072 us  |                 } /* phy_disconnect */\n>> .\n>> .\n>> .\n>>    0)               |  process_one_work() {\n>>    0)               |    find_worker_executing_work() {\n>>    0)   0.688 us    |    } /* find_worker_executing_work */\n>>    0)               |    set_work_pool_and_clear_pending() {\n>>    0)   0.734 us    |    } /* set_work_pool_and_clear_pending */\n>>    0)               |    phy_state_machine() {\n>>    0)               |      genphy_read_status() {\n>>    0) ! 205.721 us  |      } /* genphy_read_status */\n>>    0)               |      netif_carrier_off() {\n>>    0)               |        do_page_fault() {\n>>\n>>\n>> The do_page_fault() at the end indicates the NULL pointer dereference.\n>>\n>> That added call to phy_state_machine() turns the polling back on\n>> unconditionally for a phy that should be disconnected.  How is that\n>> correct?\n> \n> It is not fundamentally correct and I don't think there was any\n> objection to that to begin with. In fact there is a bug/inefficiency\n> here in that if we have entered the PHY state machine with PHY_HALTED we\n> should not re-schedule it period, only applicable to PHY_POLL cases\n> *and* properly calling phy_stop() followed by phy_disconnect().\n> \n> What I now think is happening in your case is the following:\n> \n> phy_stop() was not called, so nothing does set phydev->state to\n> PHY_HALTED in the first place so we have:\n> \n> phy_disconnect()\n> -> phy_stop_machine()\n> \t-> cancel_delayed_work_sync() OK\n> \t\tphydev->state is probably RUNNING so we have:\n> \t\t-> phydev->state = PHY_UP\n> \tphy_state_machine() is called synchronously\n> \t-> PHY_UP -> needs_aneg = true\n> \t-> phy_restart_aneg()\n> \t-> queue_delayed_work_sync()\n> -> phydev->adjust_link = NULL\n> -> phy_deatch() -> boom\n> \n> Can you confirm whether the driver you are using does call phy_stop()\n> prior to phy_disconnect()? 
\n\nThere is no call to phy_stop().\n\nI can add this to the ethernet drivers, but I wonder if it should be \ncalled by the core code when doing phy_disconnect(), if it was not \nalready stopped.\n\n> If that is the case then this whole theory\n> falls apart, if not, then this needs fixing in both the driver and PHYLIB.\n> \n> Thanks\n>","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":["ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (1024-bit key;\n\tunprotected) header.d=CAVIUMNETWORKS.onmicrosoft.com\n\theader.i=@CAVIUMNETWORKS.onmicrosoft.com header.b=\"NCHbuJLP\"; \n\tdkim-atps=neutral","spf=none (sender IP is )\n\tsmtp.mailfrom=David.Daney@cavium.com; "],"Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xngns4f3rz9s4q\n\tfor <patchwork-incoming@ozlabs.org>;\n\tThu,  7 Sep 2017 10:10:25 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1751836AbdIGAKK (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tWed, 6 Sep 2017 20:10:10 -0400","from mail-sn1nam01on0042.outbound.protection.outlook.com\n\t([104.47.32.42]:21719\n\t\"EHLO NAM01-SN1-obe.outbound.protection.outlook.com\"\n\trhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP\n\tid S1751267AbdIGAKJ (ORCPT <rfc822;netdev@vger.kernel.org>);\n\tWed, 6 Sep 2017 20:10:09 -0400","from ddl.caveonetworks.com (50.233.148.156) by\n\tCY4PR07MB3495.namprd07.prod.outlook.com (10.171.252.152) with\n\tMicrosoft SMTP Server (version=TLS1_2,\n\tcipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P256) id\n\t15.20.35.12; Thu, 7 Sep 2017 00:10:05 +0000"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=CAVIUMNETWORKS.onmicrosoft.com; 
s=selector1-cavium-com;\n\th=From:Date:Subject:Message-ID:Content-Type:MIME-Version;\n\tbh=3DfPOM7zePAgoQl4XKdUMnQZKpp9EppUhlyFnlrjRCw=;\n\tb=NCHbuJLP/QsIfT2T482lRZ3zjPu+/beTsivYpqq3BOZvVVa6dgTmIACgCYwt0SJO/fDjSGXX8K3B+JlNRu+2aeWMx7CP7sx+twrIUXVB+GhSwMY8AJxdFTZFv4S73aDK94QZTGLpAkz2H0gCVmCAr7rZEPDX+/UryU7DcFYMOGE=","Subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","To":"Florian Fainelli <f.fainelli@gmail.com>,\n\tDavid Daney <ddaney.cavm@gmail.com>, Mason <slash.tmp@free.fr>","Cc":"Marc Gonzalez <marc_gonzalez@sigmadesigns.com>,\n\tnetdev <netdev@vger.kernel.org>,\n\tGeert Uytterhoeven <geert+renesas@glider.be>,\n\tDavid Miller <davem@davemloft.net>,\n\tAndrew Lunn <andrew@lunn.ch>, Mans Rullgard <mans@mansr.com>","References":"<1504140569-2063-1-git-send-email-f.fainelli@gmail.com>\n\t<f4bb5ac8-dae8-c0af-7aa6-e546fc0783fa@sigmadesigns.com>\n\t<e24693e8-d8ae-188a-2a38-c9a83fdc94e3@gmail.com>\n\t<931bf454-81ff-94dc-82e6-bc2b889bd43a@gmail.com>\n\t<d6a6b552-95a7-8353-54c8-fa804f9366a1@free.fr>\n\t<f74f1aad-3990-ae54-316f-751c3b15de41@gmail.com>\n\t<ebee6e5d-5bc1-1c5b-b31d-6d50618d6074@free.fr>\n\t<4ea8b432-4968-1616-eff9-48a2689dd3ce@gmail.com>\n\t<ff070239-28b7-d41b-8abe-c9f810561372@caviumnetworks.com>\n\t<572f49fd-f623-f064-a551-e243c57cef7f@gmail.com>\n\t<6890a27f-e87e-62c1-a676-e5ddf968adb6@caviumnetworks.com>\n\t<4a65e53c-f13b-9cc3-bffa-f2f2aae423b9@gmail.com>\n\t<a4b70bf9-ec48-314d-b63d-e44f7cbb4bab@gmail.com>","From":"David Daney <ddaney@caviumnetworks.com>","Message-ID":"<64800ff2-201b-eb26-304e-1c4c6e0a6d5e@caviumnetworks.com>","Date":"Wed, 6 Sep 2017 17:10:02 -0700","User-Agent":"Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101\n\tThunderbird/52.2.1","MIME-Version":"1.0","In-Reply-To":"<a4b70bf9-ec48-314d-b63d-e44f7cbb4bab@gmail.com>","Content-Type":"text/plain; charset=utf-8; 
format=flowed","Content-Language":"en-US","Content-Transfer-Encoding":"7bit","X-Originating-IP":"[50.233.148.156]","X-ClientProxiedBy":"SN1PR0701CA0043.namprd07.prod.outlook.com (10.163.126.11)\n\tTo CY4PR07MB3495.namprd07.prod.outlook.com\n\t(10.171.252.152)","X-MS-PublicTrafficType":"Email","X-MS-Office365-Filtering-Correlation-Id":"b14d0708-3b33-4e14-0bc4-08d4f584cb6f","X-Microsoft-Antispam":"UriScan:; BCL:0; PCL:0;\n\tRULEID:(300000500095)(300135000095)(300000501095)(300135300095)(22001)(300000502095)(300135100095)(2017030254152)(300000503095)(300135400095)(2017052603199)(201703131423075)(201703031133081)(201702281549075)(300000504095)(300135200095)(300000505095)(300135600095)(300000506095)(300135500095);\n\tSRVR:CY4PR07MB3495; ","X-Microsoft-Exchange-Diagnostics":["1; CY4PR07MB3495;\n\t3:R5PnrZvN+Hr4N4AoIwTpYvexcO7HG+jSxiXIOSDoaxlLanNT6u5DNDDpIIv7CBFcP1G4WH+IFpEiaTDlqoUMzQxhqwgwhTACJbNxYQgSzZMpBrDXU1BX7xHp9Osc9HN6fgcN9p+bvqXst4wYUNcGknFXyixqTtM5iHmAfWtlfGfTDWFyL04HovHQuzMlL0iyTFrBTZoE9y89GPqYktHgOpvwOH5MAJQKgW+9WQ8W0yeBKUD2mFwHXltHApfcsM2e;\n\t25:CjJsdwPPmyjVHQbcHSX/4/U5846tOS9q7pBknXivGm5eg1XdvoYn3tnrFTn8pJthzzJjSa/4D+2SUzxErabRIjek46bhBbsS0yE9HeRRiVBilnaVWqPr+fG7R5quJCvL6Qe7//RvNDABrguRQHMK1BtAX3is3jqwGQy82Vmf+vIBfYcNrF9hP0P3o9wIuvCi1GSGwQe1yoTQJicjmpOMY0svLk6MrBbyvY1iSvfNfuXhKXzfltaevW2KHAj7MveAzqBiYPUPWVs1vB3VR7RedNFOopm75y2HF3R5/v8Zj6P62uSQ5zBDMuULGTBSuJIKvCog7Si7CfEN+oyM+5uHuQ==;\n\t31:Bwx2xSMzuT2c8psumlA8qbceSLxWfZlS2KoAKt4KRqGJdlV7EtVE4V5yA+zVAqR/AmeQP+aO3PRDzoqu7SxyzOrxMPx9rCX+iiHzVWkt8rYnS+/cUJRCklBCxDClIQ/iHG+qhgztX7SUQuoGFey+MwCGrun90mESbWTsn7UDEyDCqVqcp7dqbo1MMCDdsPxvxnUx45FgwP7WZBq+i9nMCb9Xl7Ed7ZYH35KEtG8t4D0=","1; 
CY4PR07MB3495;\n\t20:lYvlgf5MVk5yoz3WrqnH5uDdt29MPY/VbrgniYHkGi1QLPOXkVp7be80NAggJbov/hM6xqWMedGkkhG/lJ3KfkMyiOqUSY73z5qOM4CdoOdz/vEJ+BM0z6obNNuJw74T6GKJO+PApWvbIGYcuiCBsDJ4tzNOS/WoDVQpCxuYqprktOGzbWkMI3z7I5ydRSToQlz/hVVLf9YGjBcZn6gMZnEoyriPTXLT7z0zOdkADK2gVtk3fFZNbj2aiPlFfDxD0yVFS6aWdJIK6Wwd2lCtF4baQ5GPU9wE4wwM8pwQnd7ka6SHS7nPJe1bAICVCf5/jksKln+hEio2a7jeF7R1ALSgsi+DqdQK3XbynHOcA8W+h06Qyu7C6sTL88fGaq7FzAiw/1SBDouQTzaOze322vH1XnVDUry8tb1dKHtUFIAhQukg1NNY49rkU3pUnMmNZlejEl9LVQ1mCpIoFJwAh3LxcwPwUupfy8kJyJ3O9+NzeHTmQ3H41hdmYUhOPcbSJuEjkocg6ooqE08/HV4xtKkp2EZr1MeEhrTtXBbXfySnu8zVEzxA7tHWt+76zEblZnLaMTJqPk/GTWPzPeRR7j4F/r/ziRytSYIcT902Pp8=;\n\t4:jPV+ZVq9kIU4JRhWfIDumLzhpNZBeUfNl9JrwNAjwHKad3O04R+mU/eXTTibrvsx9juZxe0iP58V44mHkYmW/BAJtmGZJ5RqtKnwaOi1i4ZWHVmYIX+M5Etxslh1DcLCSst1UchNP0tAI2cTmymOUOLSpT8bAasz6A5kosKCqjg5w2UJIBfzOI7831BfEa3gmEvHZH2zt6nj0tABFA3+LUKHfkXp0ua/Ixj/X9zB+KaZy3vlWFeXKLzTiaqdNyJu","=?utf-8?q?1=3BCY4PR07MB3495=3B23=3AHfcQ?=\n\t=?utf-8?q?fbQ4V/3sEqNMTHZTzi/iT2pb3QEHKbIF+hLPko3yp4FUOHDcZicf0/8a?=\n\t=?utf-8?q?PZQI93BV6JvRIfeoZURC99tLHncejZ3p0LlJPWoC5BHfLFVMBQMhmPrT?=\n\t=?utf-8?q?4uf9vDX7ieyW2Fgl1ijPnlciFTjDfqxO7V+W9TdlupESIprThJsdXSnX?=\n\t=?utf-8?q?ImKSvVgjpK28puDN3hewYjcAPUXbxKum/1EZD2A2VW8ab+2uILRevsJa?=\n\t=?utf-8?q?JYxt9WYItp+9iX6RBRbKOMi1mOPiuRw9YRkdf8bZpPO5yl72imxCTNpU?=\n\t=?utf-8?q?PpCxzvCW6DvbzxgEZyTLdUUrusrCbitJdTjEmum8UCVoioXj49+hdLwq?=\n\t=?utf-8?q?xl8yuTJvUBiuqMzAg6rrItgPI9TOz5whZ9e4BML8vcCTjKQwmvIg0arM?=\n\t=?utf-8?q?AWC4L7B7Ung2DzZRtqN7JJE/TJkyxxwdDg1dgm0Z/ckmtkugUDB8CXG6?=\n\t=?utf-8?q?w1hosjr4Sw34ucOH9p0+4g1PFLomaMzKl5BB+kdkrdIO+pPkgnCVU3po?=\n\t=?utf-8?q?AJwPKKUShnWKb6Bid/Bv+0lD3/4b4JCi4NjQKU/3fpXHrdx7ypdSE8UB?=\n\t=?utf-8?q?OUyOWTmOWve8yzT8tmIiYbXRY73uph3Na6YA/vZjNqCckGZyHaMvMHR5?=\n\t=?utf-8?q?nGez4bGnJ35DVGDABRFSIofpSe6jKCT2cnya8jJ6u1z/dXgYbtvWLdOx?=\n\t=?utf-8?q?GPmTwuC/Pf/Me4oCJlxX+BeZyg/5wVCgcisA02vWV1Z/uiwHMXY7NSkI?=\n\t=?utf-8?q?rAZI/EvOqwr0j0pqWzCNBoSLn94mRqT+9PpbJfFmjrMuEn3VCAeqE8YS?=\n\t=?utf-8?q?js71VuLYwX9DLqjPFhH6E8INzg3McM
itzdttghZ6XZZgMb2llUoqlzNK?=\n\t=?utf-8?q?ThTB2UiQyTbxYFxESjU0babC1iFVXAFOy3/+JpjzRD7HxOdQw81ruB1o?=\n\t=?utf-8?q?oNlFKIRYXwNnYICZkBV9yg0G/RdybBr/6lM8C2VkZlDgyDncbTmuXG60?=\n\t=?utf-8?q?QBwDcuL1GE/JEN8VZ5jBDvxRMGyWb8PkW87SSru3Ng3GDl2qArRgQltK?=\n\t=?utf-8?q?IJg2/Zxgu6Bog4g4smA4r4UudKtN3EOKc8BTgAJwYnWLktg79sOQ0dXl?=\n\t=?utf-8?q?SDwwYZR+hfBbi/Oo1ahuJS3Xkzt90zI8uc3I7PXlhTGOcvFc7/81FBwA?=\n\t=?utf-8?q?R3iXv0mXP/qJrVZXVbooOTgq6ua/k84laJtylSDisWoOeYSMOem/6ORb?=\n\t=?utf-8?q?mGfidVgmOf1zP51HnTh+gFuiLPK58AlA1gt6biCpHlD4Pt5A0CRlEkfO?=\n\t=?utf-8?q?7FGCpaEa17QZiDQ5FWVJfqOb6M7R08GnpvALZudLS83B/FKO78dsEkVX?=\n\t=?utf-8?q?6/gJ9aCL8gc7dIWEix0So2p6Fgwjq2sXwwneMFdjDsy9Ex8ByR+odNrv?=\n\t=?utf-8?q?q0K7Kc4Zr8dpJlGsKWThvVKh2F5a1oGEcNr96uNyTeknzzIPrfIxp8Do?=\n\t=?utf-8?q?Mg8+aMuwX3bt0RLswSGJTygW8dSdDag6iIt7ksH5FQYARoQ=3D?=","1; CY4PR07MB3495;\n\t6:Bf8NjVDepcrnz/6d0g8Nx95fqwpWMDC0IhDjtykJCH7WfSpUIiJFOOXHVy53SxTZ6PCU3ZJsXCXa8P7d3Y8zU/yiOa8oYFfVCVbqBTfpYpg1bE6CdzYQ6nuWdMGh7r6Gc0oG+of1XovNa30ztYv8MLF/T0Q+0OXYGHNeHXMqVoKfGw5EYQevFmrFPMA0sTRO558rBJ2kDFSfq80SYwR6uIIu+i87dt26zC5Y3cluFDBPhzFPgHuJXkTNc/vYxrI0udPHzFP/RsxhUfL8MNxIhpRUQy88TFGttNJ6t6AHSOfKvTBPfq5gHqFUX3yUx5CJ7KDgCJtgrMqIDBI0cpu6mg==;\n\t5:rMgkJmrVjvRXFuBESthSv105lHSQMmu6mGoCPJ4cafDZ7h3akwPGiWkrxEvMGlyhuIsSixnoqZ67hgd0r19l6jJrZ6ybZiu2IzCz43eSEJ1YtfvCZLGUWH1iKwjmxdeFYoMnjtsspc50NAmUbIWAvQ==;\n\t24:JjjaGYEXkKCs+MEk0WHfLJTh+CJ108NUpy2Tu8+EPFQFuKDgOiAWIO1TWu8lw3jigMjgfOnT0RUfJCuOUXU4X1dMtrMIYn6D+kDptr/2WfQ=;\n\t7:vsvOT2XnNjI4fIs4BtIGpyW5P4W0fWLQ2g9k6hv0Ajm89c3uQxbOk5eWX9u0woVlNfZ/q4gH3QjqhEafvOp9y0Hdg8fbr5zryCAkAQnRuy2xvkuhyrfiXg8ySrLHo73Yhg38rCQhHmTm8iheZIvou6XBWcDreYULGOrIHogeSEGoe0BcCqciVeklZMyehGQndeA/bxtqKiDgAEoLHgBeeXJAkxZMoReXMSdy6b/hfAM="],"X-MS-TrafficTypeDiagnostic":"CY4PR07MB3495:","X-Exchange-Antispam-Report-Test":"UriScan:;","X-Microsoft-Antispam-PRVS":"<CY4PR07MB3495416ADB59258472564E7197940@CY4PR07MB3495.namprd07.prod.outlook.com>","X-Exchange-Antispam-Report-CFA-Test":"BCL:0; 
PCL:0;\n\tRULEID:(100000700101)(100105000095)(100000701101)(100105300095)(100000702101)(100105100095)(6040450)(2401047)(5005006)(8121501046)(100000703101)(100105400095)(10201501046)(93006095)(3002001)(6041248)(20161123558100)(20161123562025)(201703131423075)(201702281528075)(201703061421075)(201703061406153)(20161123555025)(20161123564025)(20161123560025)(6072148)(201708071742011)(100000704101)(100105200095)(100000705101)(100105500095);\n\tSRVR:CY4PR07MB3495; BCL:0; PCL:0;\n\tRULEID:(100000800101)(100110000095)(100000801101)(100110300095)(100000802101)(100110100095)(100000803101)(100110400095)(100000804101)(100110200095)(100000805101)(100110500095);\n\tSRVR:CY4PR07MB3495; ","X-Forefront-PRVS":"04238CD941","X-Forefront-Antispam-Report":"SFV:NSPM;\n\tSFS:(10009020)(6009001)(199003)(24454002)(189002)(377454003)(54094003)(39060400002)(106356001)(54906002)(47776003)(33646002)(105586002)(81156014)(8676002)(8936002)(93886005)(83506001)(229853002)(5660300001)(69596002)(81166006)(25786009)(189998001)(3846002)(64126003)(6116002)(6246003)(6506006)(6512007)(36756003)(6486002)(65826007)(31696002)(53936002)(478600001)(72206003)(4001350100001)(2906002)(42186005)(42882006)(4326008)(31686004)(50466002)(2950100002)(4000630100001)(53416004)(23676002)(6666003)(53546010)(230700001)(76176999)(65956001)(54356999)(7736002)(50986999)(101416001)(97736004)(305945005)(65806001)(66066001)(68736007);\n\tDIR:OUT; SFP:1101; SCL:1; SRVR:CY4PR07MB3495;\n\tH:ddl.caveonetworks.com; FPR:; SPF:None; PTR:InfoNoRecords;\n\tA:1; MX:1; LANG:en; ","Received-SPF":"None (protection.outlook.com: cavium.com does not designate\n\tpermitted sender hosts)","SpamDiagnosticOutput":"1:99","SpamDiagnosticMetadata":"NSPM","X-OriginatorOrg":"caviumnetworks.com","X-MS-Exchange-CrossTenant-OriginalArrivalTime":"07 Sep 2017 
00:10:05.2373\n\t(UTC)","X-MS-Exchange-CrossTenant-FromEntityHeader":"Hosted","X-MS-Exchange-CrossTenant-Id":"711e4ccf-2e9b-4bcf-a551-4094005b6194","X-MS-Exchange-Transport-CrossTenantHeadersStamped":"CY4PR07MB3495","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1764460,"web_url":"http://patchwork.ozlabs.org/comment/1764460/","msgid":"<167efcdf-51bb-e460-caaf-6819d9026053@gmail.com>","list_archive_url":null,"date":"2017-09-07T01:41:06","subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","submitter":{"id":2800,"url":"http://patchwork.ozlabs.org/api/people/2800/","name":"Florian Fainelli","email":"f.fainelli@gmail.com"},"content":"On 09/06/2017 05:10 PM, David Daney wrote:\n> On 09/06/2017 04:14 PM, Florian Fainelli wrote:\n>> On 09/06/2017 03:51 PM, David Daney wrote:\n> [...]\n>>>\n>>> Consider instead the case of a Marvell phy with no interrupts connected\n>>> on a v4.9.43 kernel, single CPU:\n>>>\n>>>\n>>>    0)               |                 phy_disconnect() {\n>>>    0)               |                   phy_stop_machine() {\n>>>    0)               |                     cancel_delayed_work_sync() {\n>>>    0) + 23.986 us   |                     } /*\n>>> cancel_delayed_work_sync */\n>>>    0)               |                     phy_state_machine() {\n>>>    0)               |                       phy_start_aneg_priv() {\n>>\n>> Thanks for providing the trace, I think I have an idea of what's going\n>> on, see below.\n>>\n>>>    0)               |                         marvell_config_aneg() {\n>>>    0) ! 240.538 us  |                         } /*\n>>> marvell_config_aneg */\n>>>    0) ! 
244.971 us  |                       } /* phy_start_aneg_priv */\n>>>    0)               |                       queue_delayed_work_on() {\n>>>    0) + 18.016 us   |                       } /*\n>>> queue_delayed_work_on */\n>>>    0) ! 268.184 us  |                     } /* phy_state_machine */\n>>>    0) ! 297.394 us  |                   } /* phy_stop_machine */\n>>>    0)               |                   phy_detach() {\n>>>    0)               |                     phy_suspend() {\n>>>    0)               |                       phy_ethtool_get_wol() {\n>>>    0)   0.677 us    |                       } /* phy_ethtool_get_wol */\n>>>    0)               |                       genphy_suspend() {\n>>>    0) + 71.250 us   |                       } /* genphy_suspend */\n>>>    0) + 74.197 us   |                     } /* phy_suspend */\n>>>    0) + 80.302 us   |                   } /* phy_detach */\n>>>    0) ! 380.072 us  |                 } /* phy_disconnect */\n>>> .\n>>> .\n>>> .\n>>>    0)               |  process_one_work() {\n>>>    0)               |    find_worker_executing_work() {\n>>>    0)   0.688 us    |    } /* find_worker_executing_work */\n>>>    0)               |    set_work_pool_and_clear_pending() {\n>>>    0)   0.734 us    |    } /* set_work_pool_and_clear_pending */\n>>>    0)               |    phy_state_machine() {\n>>>    0)               |      genphy_read_status() {\n>>>    0) ! 205.721 us  |      } /* genphy_read_status */\n>>>    0)               |      netif_carrier_off() {\n>>>    0)               |        do_page_fault() {\n>>>\n>>>\n>>> The do_page_fault() at the end indicates the NULL pointer dereference.\n>>>\n>>> That added call to phy_state_machine() turns the polling back on\n>>> unconditionally for a phy that should be disconnected.  How is that\n>>> correct?\n>>\n>> It is not fundamentally correct and I don't think there was any\n>> objection to that to begin with. 
In fact there is a bug/inefficiency\n>> here in that if we have entered the PHY state machine with PHY_HALTED we\n>> should not re-schedule it period, only applicable to PHY_POLL cases\n>> *and* properly calling phy_stop() followed by phy_disconnect().\n>>\n>> What I now think is happening in your case is the following:\n>>\n>> phy_stop() was not called, so nothing does set phydev->state to\n>> PHY_HALTED in the first place so we have:\n>>\n>> phy_disconnect()\n>> -> phy_stop_machine()\n>>     -> cancel_delayed_work_sync() OK\n>>         phydev->state is probably RUNNING so we have:\n>>         -> phydev->state = PHY_UP\n>>     phy_state_machine() is called synchronously\n>>     -> PHY_UP -> needs_aneg = true\n>>     -> phy_restart_aneg()\n>>     -> queue_delayed_work()\n>> -> phydev->adjust_link = NULL\n>> -> phy_detach() -> boom\n>>\n>> Can you confirm whether the driver you are using does call phy_stop()\n>> prior to phy_disconnect()? \n> \n> There is no call to phy_stop().\n\nOK, this all makes sense now.\n\n> \n> I can add this to the ethernet drivers, but I wonder if it should be\n> called by the core code when doing phy_disconnect(), if it was not\n> already stopped.\n\nFixing the driver should be reasonably quick and easy and can be done\nindependently from fixing PHYLIB, but I agree that PHYLIB should be\nsafeguarded against such a case.\n\nOf course, now that I have looked at the code again, there is really a ton\nof unnecessary workqueue scheduling going on. Similarly to phy_stop()\nmaking us go from PHY_HALTED to PHY_HALTED, phy_start_machine() does the\nsame thing with PHY_READY -> PHY_READY. I suppose that back when this was\ndone, the assumption was that not much time would be spent between a call\nto phy_connect()/phy_start_machine() and phy_start(), or respectively\nbetween phy_stop() and a following phy_disconnect(). Oh well.\n\nNow that the revert is in 4.13 we can work on a solution that satisfies\neverybody on this 
thread.\n\nThanks!","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":["ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (2048-bit key;\n\tunprotected) header.d=gmail.com header.i=@gmail.com\n\theader.b=\"iZphs4tV\"; dkim-atps=neutral"],"Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xnjpf0cP2z9ryk\n\tfor <patchwork-incoming@ozlabs.org>;\n\tThu,  7 Sep 2017 11:41:14 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1752968AbdIGBlL (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tWed, 6 Sep 2017 21:41:11 -0400","from mail-oi0-f65.google.com ([209.85.218.65]:38819 \"EHLO\n\tmail-oi0-f65.google.com\" rhost-flags-OK-OK-OK-OK) by vger.kernel.org\n\twith ESMTP id S1751287AbdIGBlK (ORCPT\n\t<rfc822;netdev@vger.kernel.org>); Wed, 6 Sep 2017 21:41:10 -0400","by mail-oi0-f65.google.com with SMTP id d10so4131577oih.5\n\tfor <netdev@vger.kernel.org>; Wed, 06 Sep 2017 18:41:10 -0700 (PDT)","from ?IPv6:2001:470:d:73f:4c67:8db8:cf7a:9670?\n\t([2001:470:d:73f:4c67:8db8:cf7a:9670])\n\tby smtp.gmail.com with ESMTPSA id\n\tq83sm1367049oif.4.2017.09.06.18.41.07\n\t(version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);\n\tWed, 06 Sep 2017 18:41:09 -0700 (PDT)"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=gmail.com; s=20161025;\n\th=subject:to:cc:references:from:message-id:date:user-agent\n\t:mime-version:in-reply-to:content-language:content-transfer-encoding; 
\n\tbh=fLYLhSZPGH80yTNvHUsieUj2uLXJpdKhncKWTrswJbU=;\n\tb=iZphs4tV6iwWBDO53kYIqUFReCnjzFT6heNaZCXbpE4md77ewDVIrOJpDrY+WhiAt8\n\tCp9uZXReemc8MuHRPeNHHC9b1Xxd2M1sklBxA1vr0rcf5AOlmnfYFnnO2frImY0rRHQF\n\t72nuWopmsoMykRninigcvREAW+UyDqwqNqEeU/DFhEo3J+JNwnL9KSNvwcflIQ1xtaLZ\n\ts0OcUjzTivVfZkubYOWdruSzZi0v2bgLmcv1DdP7NFk23Sh7gJRVEU16wiu7Hr3vDZPl\n\tDABJArYpkev2WELArMe0NRt/RbX3Ct2dB5QDIjj0waElOEfXajv+HHut7PSvgk+edPRT\n\tpaYw==","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=1e100.net; s=20161025;\n\th=x-gm-message-state:subject:to:cc:references:from:message-id:date\n\t:user-agent:mime-version:in-reply-to:content-language\n\t:content-transfer-encoding;\n\tbh=fLYLhSZPGH80yTNvHUsieUj2uLXJpdKhncKWTrswJbU=;\n\tb=nbxRAFbxVgTgeCtQWJ6xRR0GITe8r7hW3eg0k/AKAcdzQmtpUy/cdoVW0f57yZBITp\n\tqtCNuGz0gMSEtOnj6oz8EUP/J9XydffVp3dCZtTBsJsZISiSm0D91Lk5eSx/o1hE0zpE\n\tIVJ6vxSaAW4ZSqe5erCha4ThUJf0CQhc1ywJTNBj9fzW5pXO+UI/W1E0oPh7KX+h0h+V\n\tLt9LjFg+dB4bzdcume7o4eT+kM6sGJ1jw5xpt+qQ0MJycdtH1GFfLYEs3hghjFLfCcEE\n\tIwHAwjHsLvGpMuhndVCOejAAe0GZlCg/BybOaZ1LKiipebCq8PLOh/LEO6S8+7NdZAIt\n\tyrdg==","X-Gm-Message-State":"AHPjjUizKkvjXxO0hmXpWad4I04jOZsuxtWkwXINmg0GLrNcmsgbLn1J\n\tlMNN6RP8IjUszg==","X-Google-Smtp-Source":"ADKCNb5EqlhT+mGDijXCGqbvnIUMplZuk8S7n/xwYbfPj4itr0AUrfOCpnWJ8/Dv/bhfYidRFMm8zw==","X-Received":"by 10.202.199.129 with SMTP id x123mr1336035oif.67.1504748470151;\n\tWed, 06 Sep 2017 18:41:10 -0700 (PDT)","Subject":"Re: [PATCH net] Revert \"net: phy: Correctly process PHY_HALTED in\n\tphy_stop_machine()\"","To":"David Daney <ddaney@caviumnetworks.com>,\n\tDavid Daney <ddaney.cavm@gmail.com>, Mason <slash.tmp@free.fr>","Cc":"Marc Gonzalez <marc_gonzalez@sigmadesigns.com>,\n\tnetdev <netdev@vger.kernel.org>,\n\tGeert Uytterhoeven <geert+renesas@glider.be>,\n\tDavid Miller <davem@davemloft.net>,\n\tAndrew Lunn <andrew@lunn.ch>, Mans Rullgard 
<mans@mansr.com>","References":"<1504140569-2063-1-git-send-email-f.fainelli@gmail.com>\n\t<f4bb5ac8-dae8-c0af-7aa6-e546fc0783fa@sigmadesigns.com>\n\t<e24693e8-d8ae-188a-2a38-c9a83fdc94e3@gmail.com>\n\t<931bf454-81ff-94dc-82e6-bc2b889bd43a@gmail.com>\n\t<d6a6b552-95a7-8353-54c8-fa804f9366a1@free.fr>\n\t<f74f1aad-3990-ae54-316f-751c3b15de41@gmail.com>\n\t<ebee6e5d-5bc1-1c5b-b31d-6d50618d6074@free.fr>\n\t<4ea8b432-4968-1616-eff9-48a2689dd3ce@gmail.com>\n\t<ff070239-28b7-d41b-8abe-c9f810561372@caviumnetworks.com>\n\t<572f49fd-f623-f064-a551-e243c57cef7f@gmail.com>\n\t<6890a27f-e87e-62c1-a676-e5ddf968adb6@caviumnetworks.com>\n\t<4a65e53c-f13b-9cc3-bffa-f2f2aae423b9@gmail.com>\n\t<a4b70bf9-ec48-314d-b63d-e44f7cbb4bab@gmail.com>\n\t<64800ff2-201b-eb26-304e-1c4c6e0a6d5e@caviumnetworks.com>","From":"Florian Fainelli <f.fainelli@gmail.com>","Message-ID":"<167efcdf-51bb-e460-caaf-6819d9026053@gmail.com>","Date":"Wed, 6 Sep 2017 18:41:06 -0700","User-Agent":"Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101\n\tThunderbird/52.2.1","MIME-Version":"1.0","In-Reply-To":"<64800ff2-201b-eb26-304e-1c4c6e0a6d5e@caviumnetworks.com>","Content-Type":"text/plain; charset=windows-1252","Content-Language":"en-US","Content-Transfer-Encoding":"8bit","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}}]