From patchwork Mon Jun 25 07:47:38 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pingfan Liu
X-Patchwork-Id: 934100
From: Pingfan Liu
To: linux-kernel@vger.kernel.org
Subject: [PATCHv2 1/2] drivers/base: only reordering consumer device when probing
Date: Mon, 25 Jun 2018 15:47:38 +0800
Message-Id: <1529912859-10475-2-git-send-email-kernelfans@gmail.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1529912859-10475-1-git-send-email-kernelfans@gmail.com>
References: <1529912859-10475-1-git-send-email-kernelfans@gmail.com>
X-BeenThere: linuxppc-dev@lists.ozlabs.org
X-Mailman-Version: 2.1.26
Precedence: list
List-Id: Linux on PowerPC Developers Mail List
Cc: Grygorii Strashko, Greg Kroah-Hartman, linuxppc-dev@lists.ozlabs.org, Pingfan Liu, Christoph Hellwig, Bjorn Helgaas, linux-pci@vger.kernel.org, Dave Young
Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org
Sender: "Linuxppc-dev"

Commit 52cdbdd49853 ("driver core: correct device's shutdown order")
assumes a supplier<-consumer order at probe time. However, it turns out
to break the parent<-child order in some cases. For example, in PCI, a
bridge is enabled by the PCI core and the devices behind it have already
been probed. Then the bridge's module is loaded, which enables extra
features (such as hotplug) on the bridge. This breaks the
parent<-children order and causes a failure on "kexec -e" in some
scenarios.

Detailed description of the scenario: on an IBM Power9 machine, two
drivers, portdrv_pci and shpchp (a module), match PCI_CLASS_BRIDGE_PCI,
but neither of them probes successfully due to some issue. In this case,
the bridge is still moved after its children in devices_kset. Then, on
"kexec -e", an ATA disk behind the bridge cannot write back its in-flight
buffers, because the bridge has already been shut down, which clears the
BusMaster bit.

To fix this issue, only reorder a device and all of its children if it
is a consumer.

Note, the bridge involved:
0004:00:00.0 PCI bridge: IBM Device 04c1 (prog-if 00 [Normal decode])
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- Reset- FastB2B-
                PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [48] Express (v2) Root Port (Slot-), MSI 00
                DevCap: MaxPayload 512 bytes, PhantFunc 0
                        ExtTag- RBE+
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                        MaxPayload 256 bytes, MaxReadReq 128 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed 8GT/s, Width x8, ASPM not supported, Exit Latency L0s <64ns, L1 <1us
                        ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp-
                LnkCtl: ASPM Disabled; RCB 128 bytes Disabled- CommClk-
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 8GT/s, Width x4, TrErr- Train- SlotClk- DLActive+ BWMgmt- ABWMgmt+
                RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
                RootCap: CRSVisible-
                RootSta: PME ReqID 0000, PMEStatus- PMEPending-
                DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported ARIFwd+
                DevCtl2: Completion Timeout: 16ms to 55ms, TimeoutDis+, LTR-, OBFF Disabled ARIFwd+
                LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+, EqualizationPhase1+
                         EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
        Capabilities: [100 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP+ FCP- CmpltTO+ CmpltAbrt+ UnxCmplt- RxOF- MalfTLP- ECRC+ UnsupReq- ACSViol-
                UESvrt: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                AERCap: First Error Pointer: 00, GenCap+ CGenEn+ ChkCap+ ChkEn+
        Capabilities: [148 v1] #19

Cc: Greg Kroah-Hartman
Cc: Grygorii Strashko
Cc: Christoph Hellwig
Cc: Bjorn Helgaas
Cc: Dave Young
Cc: linux-pci@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Pingfan Liu
---
Note: this patch points out the code where the bug was introduced, and
I hope it sketches out the scene. I will fold the series into one patch.
---
 drivers/base/base.h | 1 +
 drivers/base/core.c | 9 +++++++++
 drivers/base/dd.c   | 9 ++-------
 3 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/base/base.h b/drivers/base/base.h
index a75c302..37f86ca 100644
--- a/drivers/base/base.h
+++ b/drivers/base/base.h
@@ -135,6 +135,7 @@ extern void device_unblock_probing(void);
 
 /* /sys/devices directory */
 extern struct kset *devices_kset;
+extern int device_reorder_consumer(struct device *dev);
 extern void devices_kset_move_last(struct device *dev);
 
 #if defined(CONFIG_MODULES) && defined(CONFIG_SYSFS)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index 36622b5..66f06ff 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -123,6 +123,15 @@ static int device_is_dependent(struct device *dev, void *target)
 	return ret;
 }
 
+/* A temporary placeholder to mark the root cause of the bug.
+ * The proposed algorithm will come in the next patch.
+ */
+int device_reorder_consumer(struct device *dev)
+{
+	devices_kset_move_last(dev);
+	return 0;
+}
+
 static int device_reorder_to_tail(struct device *dev, void *not_used)
 {
 	struct device_link *link;
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 1435d72..c74f23c 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -434,13 +434,8 @@ static int really_probe(struct device *dev, struct device_driver *drv)
 		goto probe_failed;
 	}
 
-	/*
-	 * Ensure devices are listed in devices_kset in correct order
-	 * It's important to move Dev to the end of devices_kset before
-	 * calling .probe, because it could be recursive and parent Dev
-	 * should always go first
-	 */
-	devices_kset_move_last(dev);
+	/* only reorder a consumer and its children after its suppliers */
+	device_reorder_consumer(dev);
 
 	if (dev->bus->probe) {
 		ret = dev->bus->probe(dev);
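
The ordering problem described in the commit message can be modelled in
user space. The sketch below is a toy model, not kernel code:
devices_kset is reduced to an array that shutdown walks from tail to
head, and every toy_* name, the four-device topology, and the
is_consumer flag are assumptions made purely for illustration.
toy_probe_consumer_only() only mirrors the one-line description "reorder
a device and all of its children if it is a consumer"; the actual
algorithm is deferred to the next patch in the series and is not
reproduced here.

```c
#include <assert.h>
#include <string.h>

/* Toy user-space model; all names here are hypothetical, not kernel API. */
#define NDEV 4
enum { BRIDGE, DISK, CONSUMER, CONSUMER_CHILD };

struct toy_dev {
	int parent;       /* index of the parent device, -1 if none */
	int is_consumer;  /* nonzero if the device has a supplier */
};

/* Assumed topology: a disk behind a bridge, plus a consumer with a child. */
static const struct toy_dev toy_devs[NDEV] = {
	[BRIDGE]         = { .parent = -1,       .is_consumer = 0 },
	[DISK]           = { .parent = BRIDGE,   .is_consumer = 0 },
	[CONSUMER]       = { .parent = -1,       .is_consumer = 1 },
	[CONSUMER_CHILD] = { .parent = CONSUMER, .is_consumer = 0 },
};

/* Stand-in for devices_kset: head first, shutdown walks tail to head. */
struct toy_kset {
	int order[NDEV];
};

/* Registration order: parents are registered before their children. */
static void toy_kset_init(struct toy_kset *ks)
{
	for (int i = 0; i < NDEV; i++)
		ks->order[i] = i;
}

/* Move one device to the tail, keeping the relative order of the rest. */
static void toy_move_last(struct toy_kset *ks, int dev)
{
	int j = 0;

	assert(dev >= 0 && dev < NDEV);
	for (int i = 0; i < NDEV; i++)
		if (ks->order[i] != dev)
			ks->order[j++] = ks->order[i];
	ks->order[j] = dev;
}

/* Move a device to the tail, then each of its children after it. */
static void toy_move_last_with_children(struct toy_kset *ks, int dev)
{
	int snapshot[NDEV];

	toy_move_last(ks, dev);
	memcpy(snapshot, ks->order, sizeof(snapshot));
	for (int i = 0; i < NDEV; i++)
		if (toy_devs[snapshot[i]].parent == dev)
			toy_move_last_with_children(ks, snapshot[i]);
}

/* Behaviour removed by this patch: every probe moves the device last. */
static void toy_probe_always(struct toy_kset *ks, int dev)
{
	toy_move_last(ks, dev);
}

/* Idea from the commit message: only consumers are reordered. */
static void toy_probe_consumer_only(struct toy_kset *ks, int dev)
{
	if (toy_devs[dev].is_consumer)
		toy_move_last_with_children(ks, dev);
}

/* Returns 1 if device a is shut down before device b, else 0. */
static int toy_shut_down_before(const struct toy_kset *ks, int a, int b)
{
	for (int i = NDEV - 1; i >= 0; i--) {
		if (ks->order[i] == a)
			return 1;
		if (ks->order[i] == b)
			return 0;
	}
	return 0;
}
```

In this model, calling toy_probe_always() on BRIDGE after
toy_kset_init() leaves the bridge at the tail, so
toy_shut_down_before(&ks, BRIDGE, DISK) returns 1, which corresponds to
the failing "kexec -e" case. With toy_probe_consumer_only() the bridge
stays in place and the same query returns 0, while a probed CONSUMER
still ends up ahead of its child in the list.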