[v2,2/2] powerpc: tm: Always reclaim in start_thread() for exec() class syscalls

Message ID 20160617045834.22275-2-cyrilbur@gmail.com
State Accepted

Commit Message

Cyril Bur June 17, 2016, 4:58 a.m. UTC
Userspace can quite legitimately perform an exec() syscall with a
suspended transaction. exec() does not return to the old process;
rather, it loads a new one and starts that. The expectation therefore is
that the new process starts not in a transaction. Currently exec() is
not treated any differently from any other syscall, which creates
problems.

Firstly it could allow a new process to start with a suspended
transaction for a binary that no longer exists. This means that the
checkpointed state won't be valid and if the suspended transaction
were ever to be resumed and subsequently aborted (a possibility which
is exceedingly likely as exec()ing will likely doom the transaction)
the new process will jump to invalid state.

Secondly the incorrect attempt to keep the transactional state while
still zeroing state for the new process creates at least two TM Bad
Things. The first triggers on the rfid to return to userspace as
start_thread() has given the new process a 'clean' MSR but the suspend
will still be set in the hardware MSR. The second TM Bad Thing
triggers in __switch_to() as the processor is still transactionally
suspended but __switch_to() wants to zero the TM sprs for the new
process.

This is an example of the outcome of calling exec() with a suspended
transaction. Note the first 700 is likely the first TM Bad Thing
described earlier, only the kernel can't report it as we've loaded
userspace registers. c000000000009980 is the rfid in
fast_exception_return().

Bad kernel stack pointer 3fffcfa1a370 at c000000000009980
Oops: Bad kernel stack pointer, sig: 6 [#1]
SMP NR_CPUS=2048 NUMA pSeries
Modules linked in:
CPU: 0 PID: 2006 Comm: tm-execed Not tainted
4.6.0-rc3cyrilb769744c1efb74735f687b36ba6f97b5668e0f515 #1
task: c0000000fbea6d80 ti: c00000003ffec000 task.ti: c0000000fb7ec000
NIP: c000000000009980 LR: 0000000000000000 CTR: 0000000000000000
REGS: c00000003ffefd40 TRAP: 0700   Not tainted
(4.6.0-rc3cyrilb769744c1efb74735f687b36ba6f97b5668e0f515)
MSR: 8000000300201031 <SF,ME,IR,DR,LE,TM[SE]>  CR: 00000000  XER: 00000000
CFAR: c0000000000098b4 SOFTE: 0
PACATMSCRATCH: b00000010000d033
GPR00: 0000000000000000 00003fffcfa1a370 0000000000000000 0000000000000000
GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR12: 00003fff966611c0 0000000000000000 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR28: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
NIP [c000000000009980] fast_exception_return+0xb0/0xb8
LR [0000000000000000]           (null)
Call Trace:
Instruction dump:
f84d0278 e9a100d8 7c7b03a6 e84101a0 7c4ff120 e8410170 7c5a03a6 e8010070
e8410080 e8610088 e8810090 e8210078 <4c000024> 48000000 e8610178 88ed023b
---[ end trace 4d79afb454bb5313 ]---

------------[ cut here ]------------
Kernel BUG at c000000000043e80 [verbose debug info unavailable]
Unexpected TM Bad Thing exception at c000000000043e80 (msr 0x201033)
Oops: Unrecoverable exception, sig: 6 [#2]
SMP NR_CPUS=2048 NUMA pSeries
Modules linked in:
CPU: 0 PID: 2006 Comm: tm-execed Tainted: G      D
4.6.0-rc3cyrilb769744c1efb74735f687b36ba6f97b5668e0f515 #1
task: c0000000fbea6d80 ti: c00000003ffec000 task.ti: c0000000fb7ec000
NIP: c000000000043e80 LR: c000000000015a24 CTR: 0000000000000000
REGS: c00000003ffef7e0 TRAP: 0700   Tainted: G      D
(4.6.0-rc3cyrilb769744c1efb74735f687b36ba6f97b5668e0f515)
MSR: 8000000300201033 <SF,ME,IR,DR,RI,LE,TM[SE]>  CR: 28002828  XER: 00000000
CFAR: c000000000015a20 SOFTE: 0
PACATMSCRATCH: b00000010000d033
GPR00: 0000000000000000 c00000003ffefa60 c000000000db5500 c0000000fbead000
GPR04: 8000000300001033 2222222222222222 2222222222222222 00000000ff160000
GPR08: 0000000000000000 800000010000d033 c0000000fb7e3ea0 c00000000fe00004
GPR12: 0000000000002200 c00000000fe00000 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 c0000000fbea7410 00000000ff160000
GPR24: c0000000ffe1f600 c0000000fbea8700 c0000000fbea8700 c0000000fbead000
GPR28: c000000000e20198 c0000000fbea6d80 c0000000fbeab680 c0000000fbea6d80
NIP [c000000000043e80] tm_restore_sprs+0xc/0x1c
LR [c000000000015a24] __switch_to+0x1f4/0x420
Call Trace:
Instruction dump:
7c800164 4e800020 7c0022a6 f80304a8 7c0222a6 f80304b0 7c0122a6 f80304b8
4e800020 e80304a8 7c0023a6 e80304b0 <7c0223a6> e80304b8 7c0123a6 4e800020
---[ end trace 4d79afb454bb5314 ]---

Fixes: bc2a940 ("powerpc: Hook in new transactional memory code")
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
---
V2: Wrap the entire thing in #ifdef to avoid breaking 32bit builds.

 arch/powerpc/kernel/process.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

Comments

Michael Ellerman June 29, 2016, 6:43 a.m. UTC | #1
On Fri, 2016-17-06 at 04:58:34 UTC, Cyril Bur wrote:
> Userspace can quite legitimately perform an exec() syscall with a
> suspended transaction. exec() does not return to the old process,
...
> 
> Fixes: bc2a940 ("powerpc: Hook in new transactional memory code")
> Signed-off-by: Cyril Bur <cyrilbur@gmail.com>

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/8e96a87c5431c256feb65bcfc5

cheers

Patch

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index c5c3ae2..e9b1efd 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1513,6 +1513,16 @@  void start_thread(struct pt_regs *regs, unsigned long start, unsigned long sp)
 		current->thread.regs = regs - 1;
 	}
 
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+	/*
+	 * Clear any transactional state, we're exec()ing. The cause is
+	 * not important as there will never be a recheckpoint so it's not
+	 * user visible.
+	 */
+	if (MSR_TM_SUSPENDED(mfmsr()))
+		tm_reclaim_current(0);
+#endif
+
 	memset(regs->gpr, 0, sizeof(regs->gpr));
 	regs->ctr = 0;
 	regs->link = 0;
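
For readers decoding the MSR values in the oops output, here is a minimal
userspace sketch of the test that MSR_TM_SUSPENDED() performs. The bit
positions are assumed to mirror the TM definitions in
arch/powerpc/include/asm/reg.h and should be checked against the tree in
use. The enum and the lower-case helper names are hypothetical and exist
only for illustration; they are not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed bit positions, mirroring the TM definitions in
 * arch/powerpc/include/asm/reg.h. Verify against your tree.
 */
#define MSR_TM_LG	32	/* TM facility enabled */
#define MSR_TS_S_LG	33	/* transaction state: suspended */
#define MSR_TS_T_LG	34	/* transaction state: transactional */

#define MSR_TM		(1ULL << MSR_TM_LG)
#define MSR_TS_S	(1ULL << MSR_TS_S_LG)
#define MSR_TS_T	(1ULL << MSR_TS_T_LG)
#define MSR_TS_MASK	(MSR_TS_T | MSR_TS_S)

/* Hypothetical enum, for illustration only. */
enum tm_state { TM_NONE, TM_SUSPENDED, TM_TRANSACTIONAL, TM_RESERVED };

/* Same test as the kernel macro: the TS field equals "suspended". */
static inline bool msr_tm_suspended(uint64_t msr)
{
	return (msr & MSR_TS_MASK) == MSR_TS_S;
}

/* Classify the two-bit TS field of an MSR value. */
static inline enum tm_state msr_tm_state(uint64_t msr)
{
	switch (msr & MSR_TS_MASK) {
	case 0:
		return TM_NONE;
	case MSR_TS_S:
		return TM_SUSPENDED;
	case MSR_TS_T:
		return TM_TRANSACTIONAL;
	default:
		return TM_RESERVED;	/* both TS bits set */
	}
}
```

Under these assumptions, the MSR from the first oops, 0x8000000300201031,
classifies as TM_SUSPENDED, which agrees with the TM[SE] annotation the
kernel printed. The value 0x800000010000d033 (GPR09 in the second oops)
has the TM bit set but classifies as TM_NONE.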