[1/3] core/pci: Fix lost NVMe adapter behind PMC 8546 switch

Message ID 1490151841-13124-1-git-send-email-gwshan@linux.vnet.ibm.com
State Superseded

Commit Message

Gavin Shan March 22, 2017, 3:03 a.m. UTC
The NVMe adapter in the PCI topology below is not detected. The root
cause is that the presence bit on its PCI slot is not set even though
the PCIe link is up. The PCI core therefore doesn't probe the adapter
behind the slot, so the NVMe adapter is lost in this particular case.

   PHB3 root port
   PLX switch 8748 (10b5:8748)
   PLX switch 9733 (10b5:9733)
   PMC 8546 switch (11f8:8546)
   NVMe adapter (1c58:0023)

This fixes the issue by letting the PCIe link state bit override the
PCI slot presence bit.

Reported-by: Mark E Schreiter <markes@us.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 core/pci.c | 28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)
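
For readers who want the decision in isolation, here is a minimal
sketch of the check described in the commit message. It is not the
skiboot code: the macro names and should_probe_behind_slot() are
hypothetical, and only the bit positions (taken from the PCI Express
capability layout) are meant to match real hardware. It models only
the probe/no-probe decision and leaves out the slot power handling
that pci_enable_bridge() also performs.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Bit positions follow the PCI Express capability registers. The names
 * are made up for this sketch and are not the ones skiboot defines.
 */
#define LCAP_DLL_ACTIVE_REPORTING	0x00100000u	/* Link Capabilities, bit 20 */
#define LSTAT_DLL_ACTIVE		0x2000u		/* Link Status, bit 13 */
#define SLOTSTAT_PRESENCE_DETECT	0x0040u		/* Slot Status, bit 6 */

/*
 * Hypothetical helper: decide whether to probe behind a downstream port.
 * A link that reports "data link layer active" overrides the slot
 * presence bit; otherwise the presence bit decides.
 */
static bool should_probe_behind_slot(uint32_t lcap, uint16_t lstat,
				     uint16_t slsta)
{
	/* Trust the link state only if the port can report it */
	if ((lcap & LCAP_DLL_ACTIVE_REPORTING) && (lstat & LSTAT_DLL_ACTIVE))
		return true;

	/* Fall back to the slot presence detect state */
	return (slsta & SLOTSTAT_PRESENCE_DETECT) != 0;
}
```

The link state is only trusted when the port advertises link active
reporting, which mirrors the (lcap & ...) && (lstat & ...) test in the
patch. A port without that capability still relies on the presence bit.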

Comments

Gavin Shan March 29, 2017, 3:01 a.m. UTC | #1
On Wed, Mar 22, 2017 at 02:03:59PM +1100, Gavin Shan wrote:
>The NVMe adapter in the PCI topology below is not detected. The root
>cause is that the presence bit on its PCI slot is not set even though
>the PCIe link is up. The PCI core therefore doesn't probe the adapter
>behind the slot, so the NVMe adapter is lost in this particular case.
>
>   PHB3 root port
>   PLX switch 8748 (10b5:8748)
>   PLX switch 9733 (10b5:9733)
>   PMC 8546 switch (11f8:8546)
>   NVMe adapter (1c58:0023)
>
>This fixes the issue by letting the PCIe link state bit override the
>PCI slot presence bit.
>
>Reported-by: Mark E Schreiter <markes@us.ibm.com>
>Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>

Please hold off on merging. With PATCH[2/3] and PATCH[3/3] applied,
the Samsung NVMe adapter is still missing when it's connected to the
PMC PCIe switch. For this case, the proper fix would be to disable
CRC generation/checking on the root port, not on the endpoints.
I'll rework PATCH[2/3] and PATCH[3/3] and post an updated version.

Thanks,
Gavin

Patch

diff --git a/core/pci.c b/core/pci.c
index ecb94c2..6864e6f 100644
--- a/core/pci.c
+++ b/core/pci.c
@@ -360,12 +360,19 @@  static bool pci_enable_bridge(struct phb *phb, struct pci_device *pd)
 	uint16_t bctl;
 	bool was_reset = false;
 	int64_t ecap = 0;
+	uint32_t lcap = 0;
+	uint16_t lstat;
 
 	/* Disable master aborts, clear errors */
 	pci_cfg_read16(phb, pd->bdfn, PCI_CFG_BRCTL, &bctl);
 	bctl &= ~PCI_CFG_BRCTL_MABORT_REPORT;
 	pci_cfg_write16(phb, pd->bdfn, PCI_CFG_BRCTL, bctl);
 
+	if (pci_has_cap(pd, PCI_CFG_CAP_ID_EXP, false)) {
+		ecap = pci_cap(pd, PCI_CFG_CAP_ID_EXP, false);
+		pci_cfg_read32(phb, pd->bdfn, ecap+PCICAP_EXP_LCAP, &lcap);
+	}
+
 	/* PCI-E bridge, check the slot state. We don't do that on the
 	 * root complex as this is handled separately and not all our
 	 * RCs implement the standard register set.
@@ -374,7 +381,21 @@  static bool pci_enable_bridge(struct phb *phb, struct pci_device *pd)
 	    pd->dev_type == PCIE_TYPE_SWITCH_DNPORT) {
 		uint16_t slctl, slcap, slsta, lctl;
 
-		ecap = pci_cap(pd, PCI_CFG_CAP_ID_EXP, false);
+		/*
+		 * No need to touch the power supply if the PCIe link is up.
+		 * Furthermore, on this specific PCI topology the slot presence
+		 * bit is clear while the PCIe link is up. In that case, ignore
+		 * the slot presence bit and go ahead with probing. Otherwise,
+		 * the NVMe adapter won't be probed.
+		 *
+		 * PHB3 root port, PLX switch 8748 (10b5:8748), PLX switch 9733
+		 * (10b5:9733), PMC 8546 switch (11f8:8546), NVMe adapter
+		 * (1c58:0023).
+		 */
+		pci_cfg_read16(phb, pd->bdfn, ecap+PCICAP_EXP_LSTAT, &lstat);
+		if ((lcap & PCICAP_EXP_LCAP_DL_ACT_REP) &&
+		    (lstat & PCICAP_EXP_LSTAT_DLLL_ACT))
+			return true;
 
 		/* Read the slot status & check for presence detect */
 		pci_cfg_read16(phb, pd->bdfn, ecap+PCICAP_EXP_SLOTSTAT, &slsta);
@@ -430,11 +451,6 @@  static bool pci_enable_bridge(struct phb *phb, struct pci_device *pd)
 	/* PCI-E bridge, wait for link */
 	if (pd->dev_type == PCIE_TYPE_ROOT_PORT ||
 	    pd->dev_type == PCIE_TYPE_SWITCH_DNPORT) {
-		uint32_t lcap;
-
-		/* Read link caps */
-		pci_cfg_read32(phb, pd->bdfn, ecap+PCICAP_EXP_LCAP, &lcap);
-
 		/* Did link capability say we got reporting ?
 		 *
 		 * If yes, wait up to 10s, if not, wait 1s if we didn't already