{"id":804867,"url":"http://patchwork.ozlabs.org/api/1.2/patches/804867/?format=json","web_url":"http://patchwork.ozlabs.org/project/linux-pci/patch/599D3410.9050504@intel.com/","project":{"id":28,"url":"http://patchwork.ozlabs.org/api/1.2/projects/28/?format=json","name":"Linux PCI development","link_name":"linux-pci","list_id":"linux-pci.vger.kernel.org","list_email":"linux-pci@vger.kernel.org","web_url":null,"scm_url":null,"webscm_url":null,"list_archive_url":"","list_archive_url_format":"","commit_url_format":""},"msgid":"<599D3410.9050504@intel.com>","list_archive_url":null,"date":"2017-08-23T07:51:44","name":"Possible regression between 4.9 and 4.13","commit_ref":null,"pull_url":null,"state":"not-applicable","archived":false,"hash":"cfbce005f46806ee75389b5a8c0c2dd088ce77df","submitter":{"id":62746,"url":"http://patchwork.ozlabs.org/api/1.2/people/62746/?format=json","name":"Mathias Nyman","email":"mathias.nyman@intel.com"},"delegate":null,"mbox":"http://patchwork.ozlabs.org/project/linux-pci/patch/599D3410.9050504@intel.com/mbox/","series":[],"comments":"http://patchwork.ozlabs.org/api/patches/804867/comments/","check":"pending","checks":"http://patchwork.ozlabs.org/api/patches/804867/checks/","tags":{},"related":[],"headers":{"Return-Path":"<linux-pci-owner@vger.kernel.org>","X-Original-To":"incoming@patchwork.ozlabs.org","Delivered-To":"patchwork-incoming@bilbo.ozlabs.org","Authentication-Results":"ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=linux-pci-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xcfgb073Zz9s9Y\n\tfor <incoming@patchwork.ozlabs.org>;\n\tWed, 23 Aug 2017 17:48:43 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1753416AbdHWHsl (ORCPT <rfc822;incoming@patchwork.ozlabs.org>);\n\tWed, 23 Aug 2017 03:48:41 
-0400","from mga07.intel.com ([134.134.136.100]:19806 \"EHLO\n\tmga07.intel.com\"\n\trhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP\n\tid S1753386AbdHWHsk (ORCPT <rfc822;linux-pci@vger.kernel.org>);\n\tWed, 23 Aug 2017 03:48:40 -0400","from fmsmga002.fm.intel.com ([10.253.24.26])\n\tby orsmga105.jf.intel.com with ESMTP; 23 Aug 2017 00:48:39 -0700","from mattu-haswell.fi.intel.com (HELO [10.237.72.164])\n\t([10.237.72.164])\n\tby fmsmga002.fm.intel.com with ESMTP; 23 Aug 2017 00:48:16 -0700"],"X-ExtLoop1":"1","X-IronPort-AV":"E=Sophos;i=\"5.41,415,1498546800\"; d=\"scan'208\";a=\"1209297573\"","Subject":"Re: Possible regression between 4.9 and 4.13","To":"Felipe Balbi <felipe.balbi@linux.intel.com>,\n\tMason <slash.tmp@free.fr>, linux-pci <linux-pci@vger.kernel.org>,\n\tlinux-usb <linux-usb@vger.kernel.org>,\n\tLinux ARM <linux-arm-kernel@lists.infradead.org>","References":"<4dee5523-2d76-e731-6e81-f3027e88827f@free.fr>\n\t<87a82qbyv5.fsf@linux.intel.com>","Cc":"Bjorn Helgaas <helgaas@kernel.org>,\n\tAlan Stern <stern@rowland.harvard.edu>,\n\tGreg Kroah-Hartman <gregkh@linuxfoundation.org>","From":"Mathias Nyman <mathias.nyman@intel.com>","Message-ID":"<599D3410.9050504@intel.com>","Date":"Wed, 23 Aug 2017 10:51:44 +0300","User-Agent":"Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101\n\tThunderbird/38.8.0","MIME-Version":"1.0","In-Reply-To":"<87a82qbyv5.fsf@linux.intel.com>","Content-Type":"text/plain; charset=windows-1252; format=flowed","Content-Transfer-Encoding":"7bit","Sender":"linux-pci-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<linux-pci.vger.kernel.org>","X-Mailing-List":"linux-pci@vger.kernel.org"},"content":"On 23.08.2017 09:07, Felipe Balbi wrote:\n>\n> Hi,\n>\n> Mason <slash.tmp@free.fr> writes:\n>> Hello,\n>>\n>> The driver for my system's PCIe host bridge landed recently\n>> (in 4.13) but it was developed on 4.9\n>>\n>> I tested the PCIe host bridge by plugging a 4-port USB3 adapter\n>> into the PCIe slot (system at rest) and 
plugging an USB3 Flash\n>> drive into the USB3 adapter (at run-time).\n>>\n>> On 4.9, the setup works (almost perfectly, see below).\n>> On 4.13, once I unplug the Flash drive, the controller port\n>> remains unresponsive.\n>>\n>>\n>> On 4.9, I said *almost* perfectly, because the pcieport driver\n>> does report a few non-fatal errors when I unplug:\n>>\n>> [  193.838504] usb 2-2: new SuperSpeed USB device number 2 using xhci_hcd\n>> [  193.878081] usb-storage 2-2:1.0: USB Mass Storage device detected\n>> [  193.884547] scsi host0: usb-storage 2-2:1.0\n>> [  194.907936] scsi 0:0:0:0: Direct-Access     Kingston DataTraveler 3.0      PQ: 0 ANSI: 6\n>> [  194.920296] sd 0:0:0:0: [sda] 15109516 512-byte logical blocks: (7.74 GB/7.20 GiB)\n>> [  194.928666] sd 0:0:0:0: [sda] Write Protect is off\n>> [  194.933755] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA\n>> [  194.946074]  sda: sda1\n>> [  194.953608] sd 0:0:0:0: [sda] Attached SCSI removable disk\n>>\n>> [  208.930260] pcieport 0000:00:00.0: AER: Uncorrected (Non-Fatal) error received: id=0000\n>> [  208.938342] pcieport 0000:00:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0000(Requester ID)\n>> [  208.950163] pcieport 0000:00:00.0:   device [1105:0024] error status/mask=00004000/00000000\n>> [  208.958577] pcieport 0000:00:00.0:    [14] Completion Timeout     (First)\n>> [  208.965432] pcieport 0000:00:00.0: AER: Device recovery failed\n>> [  209.663733] xhci_hcd 0000:01:00.0: Cannot set link state.\n>> [  209.669194] usb usb2-port2: cannot disable (err = -32)\n>> [  209.674376] usb 2-2: USB disconnect, device number 2\n>> [  209.680481] pcieport 0000:00:00.0: AER: Uncorrected (Non-Fatal) error received: id=0000\n>> [  209.688689] pcieport 0000:00:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0000(Requester ID)\n>> [  209.700555] pcieport 0000:00:00.0:   device [1105:0024] error 
status/mask=00004000/00000000\n>> [  209.708978] pcieport 0000:00:00.0:    [14] Completion Timeout     (First)\n>> [  209.715845] pcieport 0000:00:00.0: AER: Device recovery failed\n>> [  209.721722] pcieport 0000:00:00.0: AER: Uncorrected (Non-Fatal) error received: id=0000\n>> [  209.729785] pcieport 0000:00:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0000(Requester ID)\n>> [  209.741602] pcieport 0000:00:00.0:   device [1105:0024] error status/mask=00004000/00000000\n>> [  209.750027] pcieport 0000:00:00.0:    [14] Completion Timeout     (First)\n>> [  209.756866] pcieport 0000:00:00.0: AER: Device recovery failed\n>>\n>> After that, I can still plug the drive into the same port.\n>>\n>> But on 4.13, I get\n>>\n>> [   27.330378] usb 2-2: new SuperSpeed USB device number 2 using xhci_hcd\n>> [   27.369383] usb-storage 2-2:1.0: USB Mass Storage device detected\n>> [   27.375840] scsi host0: usb-storage 2-2:1.0\n>> [   28.403035] scsi 0:0:0:0: Direct-Access     Kingston DataTraveler 3.0      PQ: 0 ANSI: 6\n>> [   28.413326] sd 0:0:0:0: [sda] 15109516 512-byte logical blocks: (7.74 GB/7.20 GiB)\n>> [   28.423653] sd 0:0:0:0: [sda] Write Protect is off\n>> [   28.429139] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA\n>> [   28.441529]  sda: sda1\n>> [   28.449431] sd 0:0:0:0: [sda] Attached SCSI removable disk\n>>\n>> [   90.592134] xhci_hcd 0000:01:00.0: xHCI host controller not responding, assume dead\n>> [   90.599857] xhci_hcd 0000:01:00.0: HC died; cleaning up\n>> [   90.605336] usb 2-2: USB disconnect, device number 2\n>> [   90.630414] udevd[955]: inotify_add_watch(6, /dev/sda, 10) failed: No such file or directory\n>>\n>> Trying to replug into the same port = nothing happens\n>> (Linux did say \"assume dead\")\n>>\n>> Any idea what could have changed between 4.9 and 4.13 ?\n>>\n>\n> Quite a bit:\n>\n> $ git rev-list --no-merges  --count v4.13-rc6 ^v4.9 -- drivers/usb/host/xhci 
drivers/usb/core/\n> 58\n>\n\nA very likely cause is the more aggressive detection of PCI-removed xhci hosts.\n\nSee commit d9f11ba9f107aa335091ab8d7ba5eea714e46e8b\n     xhci: Rework how we handle unresponsive or hoptlug removed hosts\n\nIt checks whether an xhci register read returns 0xffffffff and assumes the xhci\nhost died in that case.\n\nCould you add something like the below to check what is killing the host?\nOr add a BUG()/WARN() in xhci_hc_died() to get a backtrace of who called it.\n\n\n\nThanks\nMathias","diff":"diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c\nindex 51cd4b8..ade2ad6 100644\n--- a/drivers/usb/host/xhci-ring.c\n+++ b/drivers/usb/host/xhci-ring.c\n@@ -922,7 +922,8 @@ void xhci_hc_died(struct xhci_hcd *xhci)\n         if (xhci->xhc_state & XHCI_STATE_DYING)\n                 return;\n  \n-       xhci_err(xhci, \"xHCI host controller not responding, assume dead\\n\");\n+       xhci_err(xhci, \"xHC not responding in %pf, assume controller is dead\\n\",\n+                __builtin_return_address(0));\n         xhci->xhc_state |= XHCI_STATE_DYING;\n  \n         xhci_cleanup_command_queue(xhci);\n","prefixes":[]}