diff mbox

powerpc/eeh: Dump PHB diag-data for non-existing PE

Message ID 1431414332-29061-1-git-send-email-gwshan@linux.vnet.ibm.com (mailing list archive)
State Accepted
Delegated to: Michael Ellerman
Headers show

Commit Message

Gavin Shan May 12, 2015, 7:05 a.m. UTC
When detecting EEH error on non-existing PE, including the reserved
one, the PE is simply unfrozen without dumping the PHB diag-data,
which is useful for locating the root cause of the EEH error. The
patch dumps the PHB diag-data when non-existing PE reports error.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/eeh-powernv.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

Comments

Michael Ellerman July 22, 2015, 9:44 a.m. UTC | #1
On Tue, 2015-12-05 at 07:05:32 UTC, Gavin Shan wrote:
> When detecting EEH error on non-existing PE, including the reserved
> one, the PE is simply unfrozen without dumping the PHB diag-data,
> which is useful for locating the root cause of the EEH error. The
> patch dumps the PHB diag-data when non-existing PE reports error.
> 
> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/79cd95200035fb4b39b0

cheers
diff mbox

Patch

diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 9e001e1..bbcf81f 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -1394,11 +1394,19 @@  static int pnv_eeh_next_error(struct eeh_pe **pe)
 			 */
 			if (pnv_eeh_get_pe(hose,
 				be64_to_cpu(frozen_pe_no), pe)) {
-				/* Try best to clear it */
 				pr_info("EEH: Clear non-existing PHB#%x-PE#%llx\n",
 					hose->global_number, be64_to_cpu(frozen_pe_no));
 				pr_info("EEH: PHB location: %s\n",
 					eeh_pe_loc_get(phb_pe));
+
+				/* Dump PHB diag-data */
+				rc = opal_pci_get_phb_diag_data2(phb->opal_id,
+					phb->diag.blob, PNV_PCI_DIAG_BUF_SIZE);
+				if (rc == OPAL_SUCCESS)
+					pnv_pci_dump_phb_diag_data(hose,
+							phb->diag.blob);
+
+				/* Try best to clear it */
 				opal_pci_eeh_freeze_clear(phb->opal_id,
 					frozen_pe_no,
 					OPAL_EEH_ACTION_CLEAR_FREEZE_ALL);