[v2,11/15] opal/hmi: Fix handling of TFMR parity/corrupt error.

Message ID 152390004962.2566.14004738930530913719.stgit@jupiter.in.ibm.com
State Accepted
Series
  • opal/hmi: Rework HMI handling.

Commit Message

Mahesh Jagannath Salgaonkar April 16, 2018, 5:34 p.m.
From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

While testing the TFMR parity/corrupt error, it was observed that HMIs are
delivered twice for this error:
- The first HMI is delivered with HMER[4,5]=1 and TFMR[60]=1.
- The second HMI is delivered with HMER[4,5]=1 and TFMR[60]=0, with a valid TB.

On the second HMI we end up emitting the error message below even though the
TB is in a valid state:

	"HMI: TB invalid without core error reported"

This patch fixes this issue by ignoring HMER[5] and checking only for
TFMR[60] before setting this_cpu()->tb_invalid to true.
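
To illustrate the difference, here is a minimal standalone sketch of the old
and new conditions. It is not the skiboot code: the macro and function names
are hypothetical, and the bit positions are simply those quoted above
(HMER[5], TFMR[60]) in IBM bit numbering, where bit 0 is the most significant
bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* IBM bit numbering: bit 0 is the most significant bit of a 64-bit value. */
#define PPC_BIT(n)		(1ULL << (63 - (n)))

/* Hypothetical names; bit positions as quoted in the commit message. */
#define HMER_TFMR_PARITY_ERROR	PPC_BIT(5)
#define TFMR_CORRUPT		PPC_BIT(60)

/* Old condition: either HMER[5] or TFMR[60] marks the TB as invalid. */
static bool tb_invalid_old(uint64_t hmer, uint64_t tfmr)
{
	return (hmer & HMER_TFMR_PARITY_ERROR) || (tfmr & TFMR_CORRUPT);
}

/* New condition: only TFMR[60] is consulted, HMER[5] is ignored. */
static bool tb_invalid_new(uint64_t tfmr)
{
	return (tfmr & TFMR_CORRUPT) != 0;
}
```

With the second HMI described above (HMER[5]=1, TFMR[60]=0), the old condition
still marks the TB invalid while the new one does not. For the first HMI
(TFMR[60]=1) both conditions agree.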

Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
---
 core/hmi.c |    9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

Patch

diff --git a/core/hmi.c b/core/hmi.c
index d9dd83c62..b01a2bf32 100644
--- a/core/hmi.c
+++ b/core/hmi.c
@@ -1045,14 +1045,13 @@  error_out:
 	return recover;
 }
 
-static int handle_tfac_errors(uint64_t hmer, struct OpalHMIEvent *hmi_evt,
-			      uint64_t *out_flags)
+static int handle_tfac_errors(struct OpalHMIEvent *hmi_evt, uint64_t *out_flags)
 {
 	int recover = -1;
 	uint64_t tfmr = mfspr(SPR_TFMR);
 
-	/* A TFMR parity error makes us ignore all the local stuff */
-	if ((hmer & SPR_HMER_TFMR_PARITY_ERROR) || (tfmr & SPR_TFMR_TFMR_CORRUPT)) {
+	/* A TFMR parity/corrupt error makes us ignore all the local stuff.*/
+	if (tfmr & SPR_TFMR_TFMR_CORRUPT) {
 		/* Mark TB as invalid for now as we don't trust TFMR, we'll fix
 		 * it up later
 		 */
@@ -1160,7 +1159,7 @@  static int handle_hmi_exception(uint64_t hmer, struct OpalHMIEvent *hmi_evt,
 		hmi_print_debug("Timer Facility Error", hmer);
 		handled = hmer & (SPR_HMER_TFAC_ERROR | SPR_HMER_TFMR_PARITY_ERROR);
 		mtspr(SPR_HMER, ~handled);
-		recover = handle_tfac_errors(hmer, hmi_evt, out_flags);
+		recover = handle_tfac_errors(hmi_evt, out_flags);
 		handled = 0;
 	}