diff mbox series

[3/4] powerpc/tm: Unset MSR[TS] if not recheckpointing

Message ID 1543263121-9590-3-git-send-email-leitao@debian.org (mailing list archive)
State Accepted
Commit 6f5b9f018f4c7686fd944d920209d1382d320e4e
Headers show
Series [1/4] powerpc/tm: Save MSR to PACA before RFID | expand

Checks

Context Check Description
snowpatch_ozlabs/apply_patch success next/apply_patch Successfully applied
snowpatch_ozlabs/checkpatch success total: 0 errors, 0 warnings, 0 checks, 62 lines checked

Commit Message

Breno Leitao Nov. 26, 2018, 8:12 p.m. UTC
There is a TM Bad Thing bug that can be caused when you return from a
signal context in a suspended transaction but with ucontext MSR[TS] unset.

This forces regs->msr[TS] to be set at syscall entrance (since the CPU
state is transactional). It also calls treclaim() to flush the transaction
state, which is done based on the live (mfmsr) MSR state.

Since user context MSR[TS] is not set, then restore_tm_sigcontexts() is not
called, thus, not executing recheckpoint, keeping the CPU state as not
transactional. When calling rfid, SRR1 will have MSR[TS] set, but the CPU
state is non transactional, causing the TM Bad Thing with the following
stack:

	[   33.862316] Bad kernel stack pointer 3fffd9dce3e0 at c00000000000c47c
	cpu 0x8: Vector: 700 (Program Check) at [c00000003ff7fd40]
	    pc: c00000000000c47c: fast_exception_return+0xac/0xb4
	    lr: 00003fff865f442c
	    sp: 3fffd9dce3e0
	   msr: 8000000102a03031
	  current = 0xc00000041f68b700
	  paca    = 0xc00000000fb84800   softe: 0        irq_happened: 0x01
	    pid   = 1721, comm = tm-signal-sigre
	Linux version 4.9.0-3-powerpc64le (debian-kernel@lists.debian.org) (gcc version 6.3.0 20170516 (Debian 6.3.0-18) ) #1 SMP Debian 4.9.30-2+deb9u2 (2017-06-26)
	WARNING: exception is not recoverable, can't continue

The same problem happens on 32-bits signal handler, and the fix is very
similar, if tm_recheckpoint() is not executed, then regs->msr[TS] should be
zeroed.

This patch also fixes a sparse warning related to lack of indentation when
CONFIG_PPC_TRANSACTIONAL_MEM is set.

Fixes: 2b0a576d15e0e ("powerpc: Add new transactional memory state to the signal context")
CC: Stable <stable@vger.kernel.org>	# 3.10+
Signed-off-by: Breno Leitao <leitao@debian.org>
---
 arch/powerpc/kernel/signal_32.c | 18 +++++++++++++-----
 arch/powerpc/kernel/signal_64.c | 20 ++++++++++++++++----
 2 files changed, 29 insertions(+), 9 deletions(-)

Comments

Michal Suchánek Dec. 7, 2018, 1:48 p.m. UTC | #1
On Mon, 26 Nov 2018 18:12:00 -0200
Breno Leitao <leitao@debian.org> wrote:

> There is a TM Bad Thing bug that can be caused when you return from a
> signal context in a suspended transaction but with ucontext MSR[TS] unset.
> 
> This forces regs->msr[TS] to be set at syscall entrance (since the CPU
> state is transactional). It also calls treclaim() to flush the transaction
> state, which is done based on the live (mfmsr) MSR state.
> 
> Since user context MSR[TS] is not set, then restore_tm_sigcontexts() is not
> called, thus, not executing recheckpoint, keeping the CPU state as not
> transactional. When calling rfid, SRR1 will have MSR[TS] set, but the CPU
> state is non transactional, causing the TM Bad Thing with the following
> stack:
> 

Works for me on Linux 4.4 and 4.12

Tested-by: Michal Suchánek <msuchanek@suse.de>

Thanks
diff mbox series

Patch

diff --git a/arch/powerpc/kernel/signal_32.c b/arch/powerpc/kernel/signal_32.c
index e6474a45cef5..6327fd79b0fb 100644
--- a/arch/powerpc/kernel/signal_32.c
+++ b/arch/powerpc/kernel/signal_32.c
@@ -1140,11 +1140,11 @@  SYSCALL_DEFINE0(rt_sigreturn)
 {
 	struct rt_sigframe __user *rt_sf;
 	struct pt_regs *regs = current_pt_regs();
+	int tm_restore = 0;
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 	struct ucontext __user *uc_transact;
 	unsigned long msr_hi;
 	unsigned long tmp;
-	int tm_restore = 0;
 #endif
 	/* Always make any pending restarted system calls return -EINTR */
 	current->restart_block.fn = do_no_restart_syscall;
@@ -1192,11 +1192,19 @@  SYSCALL_DEFINE0(rt_sigreturn)
 				goto bad;
 		}
 	}
-	if (!tm_restore)
-		/* Fall through, for non-TM restore */
+	if (!tm_restore) {
+		/*
+		 * Unset regs->msr because ucontext MSR TS is not
+		 * set, and recheckpoint was not called. This avoid
+		 * hitting a TM Bad thing at RFID
+		 */
+		regs->msr &= ~MSR_TS_MASK;
+	}
+	/* Fall through, for non-TM restore */
 #endif
-	if (do_setcontext(&rt_sf->uc, regs, 1))
-		goto bad;
+	if (!tm_restore)
+		if (do_setcontext(&rt_sf->uc, regs, 1))
+			goto bad;
 
 	/*
 	 * It's not clear whether or why it is desirable to save the
diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
index 83d51bf586c7..daa28cb72272 100644
--- a/arch/powerpc/kernel/signal_64.c
+++ b/arch/powerpc/kernel/signal_64.c
@@ -740,11 +740,23 @@  SYSCALL_DEFINE0(rt_sigreturn)
 					   &uc_transact->uc_mcontext))
 			goto badframe;
 	}
-	else
-	/* Fall through, for non-TM restore */
 #endif
-	if (restore_sigcontext(current, NULL, 1, &uc->uc_mcontext))
-		goto badframe;
+	/* Fall through, for non-TM restore */
+	if (!MSR_TM_ACTIVE(msr)) {
+		/*
+		 * Unset MSR[TS] on the thread regs since MSR from user
+		 * context does not have MSR active, and recheckpoint was
+		 * not called since restore_tm_sigcontexts() was not called
+		 * also.
+		 *
+		 * If not unsetting it, the code can RFID to userspace with
+		 * MSR[TS] set, but without CPU in the proper state,
+		 * causing a TM bad thing.
+		 */
+		current->thread.regs->msr &= ~MSR_TS_MASK;
+		if (restore_sigcontext(current, NULL, 1, &uc->uc_mcontext))
+			goto badframe;
+	}
 
 	if (restore_altstack(&uc->uc_stack))
 		goto badframe;