diff mbox series

[v4] powerpc/pseries: Handle UE event for memcpy_mcsafe

Message ID 20200326184916.31172-1-ganeshgr@linux.ibm.com (mailing list archive)
State Accepted
Commit efbc4303b255bb80ab1283794b36dd5fe1fb0ec3
Headers show
Series [v4] powerpc/pseries: Handle UE event for memcpy_mcsafe | expand

Checks

Context Check Description
snowpatch_ozlabs/apply_patch success Successfully applied on branch powerpc/merge (c6624071c338732402e8c726df6a4074473eaa0e)
snowpatch_ozlabs/build-ppc64le success Build succeeded
snowpatch_ozlabs/build-ppc64be success Build succeeded
snowpatch_ozlabs/build-ppc64e success Build succeeded
snowpatch_ozlabs/build-pmac32 success Build succeeded
snowpatch_ozlabs/checkpatch warning total: 0 errors, 0 warnings, 1 checks, 59 lines checked
snowpatch_ozlabs/needsstable success Patch has no Fixes tags

Commit Message

Ganesh Goudar March 26, 2020, 6:49 p.m. UTC
memcpy_mcsafe has been implemented for power machines which is used
by pmem infrastructure, so that an UE encountered during memcpy from
pmem devices would not result in panic instead a right error code
is returned. The implementation expects machine check handler to ignore
the event and set nip to continue the execution from fixup code.

Appropriate changes are already made to powernv machine check handler,
make similar changes to pseries machine check handler to ignore the
the event and set nip to continue execution at the fixup entry if we
hit UE at an instruction with a fixup entry.

while we are at it, have a common function which searches the exception
table entry and updates nip with fixup address, and any future common
changes can be made in this function that are valid for both architectures.

powernv changes are made by
commit 895e3dceeb97 ("powerpc/mce: Handle UE event for memcpy_mcsafe")

Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Reviewed-by: Santosh S <santosh@fossix.org>
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
---
V2: Fixes a trivial checkpatch error in commit msg.
V3: Use proper subject prefix.
V4: Rephrase the commit message.
    Define a common function to update nip with fixup address.
---
 arch/powerpc/include/asm/mce.h       |  2 ++
 arch/powerpc/kernel/mce.c            | 14 ++++++++++++++
 arch/powerpc/kernel/mce_power.c      |  8 ++------
 arch/powerpc/platforms/pseries/ras.c |  3 +++
 4 files changed, 21 insertions(+), 6 deletions(-)

Comments

Michael Ellerman April 1, 2020, 12:53 p.m. UTC | #1
On Thu, 2020-03-26 at 18:49:16 UTC, Ganesh Goudar wrote:
> memcpy_mcsafe has been implemented for power machines which is used
> by pmem infrastructure, so that an UE encountered during memcpy from
> pmem devices would not result in panic instead a right error code
> is returned. The implementation expects machine check handler to ignore
> the event and set nip to continue the execution from fixup code.
> 
> Appropriate changes are already made to powernv machine check handler,
> make similar changes to pseries machine check handler to ignore the
> the event and set nip to continue execution at the fixup entry if we
> hit UE at an instruction with a fixup entry.
> 
> while we are at it, have a common function which searches the exception
> table entry and updates nip with fixup address, and any future common
> changes can be made in this function that are valid for both architectures.
> 
> powernv changes are made by
> commit 895e3dceeb97 ("powerpc/mce: Handle UE event for memcpy_mcsafe")
> 
> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> Reviewed-by: Santosh S <santosh@fossix.org>
> Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/efbc4303b255bb80ab1283794b36dd5fe1fb0ec3

cheers
diff mbox series

Patch

diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
index 6a6ddaabdb34..376a395daf32 100644
--- a/arch/powerpc/include/asm/mce.h
+++ b/arch/powerpc/include/asm/mce.h
@@ -218,6 +218,8 @@  extern void machine_check_queue_event(void);
 extern void machine_check_print_event_info(struct machine_check_event *evt,
 					   bool user_mode, bool in_guest);
 unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr);
+extern void mce_common_process_ue(struct pt_regs *regs,
+				  struct mce_error_info *mce_err);
 #ifdef CONFIG_PPC_BOOK3S_64
 void flush_and_reload_slb(void);
 #endif /* CONFIG_PPC_BOOK3S_64 */
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index 34c1001e9e8b..8077b5fb18a7 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -15,6 +15,7 @@ 
 #include <linux/percpu.h>
 #include <linux/export.h>
 #include <linux/irq_work.h>
+#include <linux/extable.h>
 
 #include <asm/machdep.h>
 #include <asm/mce.h>
@@ -251,6 +252,19 @@  void machine_check_queue_event(void)
 	/* Queue irq work to process this event later. */
 	irq_work_queue(&mce_event_process_work);
 }
+
+void mce_common_process_ue(struct pt_regs *regs,
+			   struct mce_error_info *mce_err)
+{
+	const struct exception_table_entry *entry;
+
+	entry = search_kernel_exception_table(regs->nip);
+	if (entry) {
+		mce_err->ignore_event = true;
+		regs->nip = extable_fixup(entry);
+	}
+}
+
 /*
  * process pending MCE event from the mce event queue. This function will be
  * called during syscall exit.
diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
index 1cbf7f1a4e3d..067b094bfeff 100644
--- a/arch/powerpc/kernel/mce_power.c
+++ b/arch/powerpc/kernel/mce_power.c
@@ -579,14 +579,10 @@  static long mce_handle_ue_error(struct pt_regs *regs,
 				struct mce_error_info *mce_err)
 {
 	long handled = 0;
-	const struct exception_table_entry *entry;
 
-	entry = search_kernel_exception_table(regs->nip);
-	if (entry) {
-		mce_err->ignore_event = true;
-		regs->nip = extable_fixup(entry);
+	mce_common_process_ue(regs, mce_err);
+	if (mce_err->ignore_event)
 		return 1;
-	}
 
 	/*
 	 * On specific SCOM read via MMIO we may get a machine check
diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
index 43710b69e09e..1d1da639b8b7 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -558,6 +558,9 @@  static int mce_handle_error(struct pt_regs *regs, struct rtas_error_log *errp)
 	switch (mce_log->error_type) {
 	case MC_ERROR_TYPE_UE:
 		mce_err.error_type = MCE_ERROR_TYPE_UE;
+		mce_common_process_ue(regs, &mce_err);
+		if (mce_err.ignore_event)
+			disposition = RTAS_DISP_FULLY_RECOVERED;
 		switch (err_sub_type) {
 		case MC_ERROR_UE_IFETCH:
 			mce_err.u.ue_error_type = MCE_UE_ERROR_IFETCH;