diff mbox series

[7/9] powerpc/pseries/eeh: Rework device EEH PE determination

Message ID 20200910054532.2043724-8-oohall@gmail.com (mailing list archive)
State Superseded
Headers show
Series [1/9] powerpc/eeh: Rework EEH initialisation | expand

Checks

Context Check Description
snowpatch_ozlabs/apply_patch success Successfully applied on branch powerpc/merge (4b552a4cbf286ff9dcdab19153f3c1c7d1680fab)
snowpatch_ozlabs/checkpatch success total: 0 errors, 0 warnings, 0 checks, 83 lines checked
snowpatch_ozlabs/needsstable success Patch has no Fixes tags

Commit Message

Oliver O'Halloran Sept. 10, 2020, 5:45 a.m. UTC
The process Linux uses for determining if a device supports EEH or not
appears to be at odds with what PAPR+ says the OS should be doing. The
current flow is something like:

1. Assume pe_config_addr is equal the the device's config_addr.
2. Attempt to enable EEH on that PE
3. Verify EEH was enabled (POWER4 bug workaround)
4. Try find the pe_config_addr using the ibm,get-config-addr-info2 RTAS
   call.
5. If that fails walk the pci_dn tree upwards trying to find a parent
   device with EEH support. If we find one then add the device to that PE.

The first major flaw with this is that in order to enable EEH on a PE we
need to know the PE's configuration address since that's an input to the
ibm,set-eeh-option RTAS call which is used to enable EEH for the PE. We
hack around that by assuming that the PE address is equal to the device's
RTAS config address with the register fields set to zero (see
rtas_config_addr()). This assumption happens to be valid if:

a) The PCI device is the 0th function, and
b) The device is on the PE's root bus.

However, it this does also appear to work for devices where these
conditions are not true. At a guess PowerVM's RTAS has some workarounds to
accommodate Linux's quirks. However, it's a bit sketch and the code is
confusing since it's not implementing what PAPR claims is the correct way.

This patch re-works how we handle EEH init so that we find the PE config
address using the ibm,get-config-addr-info2 RTAS call, then use that to
finish the EEH init process. It also drops the Power4 workaround since as
of commit 471d7ff8b51b ("powerpc/64s: Remove POWER4 support") the kernel
does not support running on a Power4 CPU.

1. Find the pe_config_addr using the RTAS call.
2. Enable the PE (if needed)
3. Insert the edev into the tree and create an eeh_pe if needed.

The other change made here is ignoring unsupported devices entirely.
Currently the device's BARs are saved to the eeh_dev even if the device is
not part of an EEH PE. Not being part of a PE means that an EEH recovery
pass will never see that device so the saving the BARs is pointless.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/platforms/pseries/eeh_pseries.c | 57 ++++++++------------
 1 file changed, 22 insertions(+), 35 deletions(-)
diff mbox series

Patch

diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
index 10303de3d8d5..c2ecc0db2f94 100644
--- a/arch/powerpc/platforms/pseries/eeh_pseries.c
+++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
@@ -357,10 +357,10 @@  static struct eeh_pe *pseries_eeh_pe_get_parent(struct eeh_dev *edev)
  */
 void pseries_eeh_init_edev(struct pci_dn *pdn)
 {
+	struct eeh_pe pe, *parent;
 	struct eeh_dev *edev;
-	struct eeh_pe pe;
+	int addr;
 	u32 pcie_flags;
-	int enable = 0;
 	int ret;
 
 	if (WARN_ON_ONCE(!eeh_has_flag(EEH_PROBE_MODE_DEVTREE)))
@@ -417,51 +417,38 @@  void pseries_eeh_init_edev(struct pci_dn *pdn)
 		}
 	}
 
-	/* Initialize the fake PE */
+	/* first up, find the pe_config_addr for the PE containing the device */
+	addr = pseries_eeh_get_pe_config_addr(pdn);
+	if (addr == 0) {
+		eeh_edev_dbg(edev, "Unable to find pe_config_addr\n");
+		goto err;
+	}
+
+	/* Try enable EEH on the fake PE */
 	memset(&pe, 0, sizeof(struct eeh_pe));
 	pe.phb = pdn->phb;
-	pe.config_addr = (pdn->busno << 16) | (pdn->devfn << 8);
+	pe.addr = addr;
 
-	/* Enable EEH on the device */
 	eeh_edev_dbg(edev, "Enabling EEH on device\n");
 	ret = eeh_ops->set_option(&pe, EEH_OPT_ENABLE);
 	if (ret) {
 		eeh_edev_dbg(edev, "EEH failed to enable on device (code %d)\n", ret);
-	} else {
-		struct eeh_pe *parent;
+		goto err;
+	}
 
-		/* Retrieve PE address */
-		edev->pe_config_addr = pseries_eeh_get_pe_config_addr(pdn);
-		pe.addr = edev->pe_config_addr;
+	edev->pe_config_addr = addr;
 
-		/* Some older systems (Power4) allow the ibm,set-eeh-option
-		 * call to succeed even on nodes where EEH is not supported.
-		 * Verify support explicitly.
-		 */
-		ret = eeh_ops->get_state(&pe, NULL);
-		if (ret > 0 && ret != EEH_STATE_NOT_SUPPORT)
-			enable = 1;
+	eeh_add_flag(EEH_ENABLED);
 
-		/*
-		 * This device doesn't support EEH, but it may have an
-		 * EEH parent. In this case any error on the device will
-		 * freeze the PE of it's upstream bridge, so added it to
-		 * the upstream PE.
-		 */
-		parent = pseries_eeh_pe_get_parent(edev);
-		if (parent && !enable)
-			edev->pe_config_addr = parent->addr;
+	parent = pseries_eeh_pe_get_parent(edev);
+	eeh_pe_tree_insert(edev, parent);
+	eeh_save_bars(edev);
+	eeh_edev_dbg(edev, "EEH enabled for device");
 
-		if (enable || parent) {
-			eeh_add_flag(EEH_ENABLED);
-			eeh_pe_tree_insert(edev, parent);
-		}
-		eeh_edev_dbg(edev, "EEH is %s on device (code %d)\n",
-			     (enable ? "enabled" : "unsupported"), ret);
-	}
+	return;
 
-	/* Save memory bars */
-	eeh_save_bars(edev);
+err:
+	eeh_edev_dbg(edev, "EEH is unsupported on device (code = %d)\n", ret);
 }
 
 static struct eeh_dev *pseries_eeh_probe(struct pci_dev *pdev)