Message ID | 20190117113510.4265-1-mpe@ellerman.id.au |
---|---|
State | Accepted |
Commit | e7fda7e569e1776d4dccbcef52d34882b62b0654 |
Series | powerpc/64s: Remove MSR_RI optimisation in system_call_exit() |
Context | Check | Description |
---|---|---|
snowpatch_ozlabs/apply_patch | success | next/apply_patch Successfully applied |
snowpatch_ozlabs/build-ppc64le | success | build succeeded & removed 0 sparse warning(s) |
snowpatch_ozlabs/build-ppc64be | success | build succeeded & removed 0 sparse warning(s) |
snowpatch_ozlabs/build-ppc64e | success | build succeeded & removed 0 sparse warning(s) |
snowpatch_ozlabs/build-pmac32 | success | build succeeded & removed 0 sparse warning(s) |
snowpatch_ozlabs/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 63 lines checked |
Michael Ellerman's on January 17, 2019 9:35 pm:
> Currently in system_call_exit() we have an optimisation where we
> disable MSR_RI (recoverable interrupt) and MSR_EE (external interrupt
> enable) in a single mtmsrd instruction.
>
> Unfortunately this will no longer work with THREAD_INFO_IN_TASK,
> because then the load of TI_FLAGS might fault and faulting with MSR_RI
> clear is treated as an unrecoverable exception which leads to a
> panic().
>
> So change the code to only clear MSR_EE prior to loading TI_FLAGS,
> leaving the clear of MSR_RI until later. We have some latitude in
> where do the clear of MSR_RI. A bit of experimentation has shown that
> this location gives the least slow down.
>
> This still causes a noticeable slow down in our null_syscall
> performance. On a Power9 DD2.2:
>
>   Before       After        Delta    Delta %
>   955 cycles   999 cycles   -44      -4.6%
>
> On the plus side this does simplify the code somewhat, because we
> don't have to reenable MSR_RI on the restore_math() or
> syscall_exit_work() paths which was necessitated previously by the
> optimisation.
>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>

But only because spectre and meltdown broke my spirit.
Nicholas Piggin <npiggin@gmail.com> writes:
> Michael Ellerman's on January 17, 2019 9:35 pm:
>> Currently in system_call_exit() we have an optimisation where we
>> disable MSR_RI (recoverable interrupt) and MSR_EE (external interrupt
>> enable) in a single mtmsrd instruction.
>>
>> Unfortunately this will no longer work with THREAD_INFO_IN_TASK,
>> because then the load of TI_FLAGS might fault and faulting with MSR_RI
>> clear is treated as an unrecoverable exception which leads to a
>> panic().
>>
>> So change the code to only clear MSR_EE prior to loading TI_FLAGS,
>> leaving the clear of MSR_RI until later. We have some latitude in
>> where do the clear of MSR_RI. A bit of experimentation has shown that
>> this location gives the least slow down.
>>
>> This still causes a noticeable slow down in our null_syscall
>> performance. On a Power9 DD2.2:
>>
>>   Before       After        Delta    Delta %
>>   955 cycles   999 cycles   -44      -4.6%
>>
>> On the plus side this does simplify the code somewhat, because we
>> don't have to reenable MSR_RI on the restore_math() or
>> syscall_exit_work() paths which was necessitated previously by the
>> optimisation.
>>
>> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
>
> Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
>
> But only because spectre and meltdown broke my spirit.

😭😭😭😭😭😭😭😭😭😭

Thanks for reviewing it anyway.

cheers
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 83bddacd7a17..b1aea6680a9d 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -236,18 +236,14 @@ system_call:		/* label this so stack traces look sane */
 	/*
 	 * Disable interrupts so current_thread_info()->flags can't change,
 	 * and so that we don't get interrupted after loading SRR0/1.
+	 *
+	 * Leave MSR_RI enabled for now, because with THREAD_INFO_IN_TASK we
+	 * could fault on the load of the TI_FLAGS below.
 	 */
 #ifdef CONFIG_PPC_BOOK3E
 	wrteei	0
 #else
-	/*
-	 * For performance reasons we clear RI the same time that we
-	 * clear EE. We only need to clear RI just before we restore r13
-	 * below, but batching it with EE saves us one expensive mtmsrd call.
-	 * We have to be careful to restore RI if we branch anywhere from
-	 * here (eg syscall_exit_work).
-	 */
-	li	r11,0
+	li	r11,MSR_RI
 	mtmsrd	r11,1
 #endif /* CONFIG_PPC_BOOK3E */
@@ -263,15 +259,7 @@ system_call:		/* label this so stack traces look sane */
 	bne	3f
 #endif
 2:	addi	r3,r1,STACK_FRAME_OVERHEAD
-#ifdef CONFIG_PPC_BOOK3S
-	li	r10,MSR_RI
-	mtmsrd	r10,1		/* Restore RI */
-#endif
 	bl	restore_math
-#ifdef CONFIG_PPC_BOOK3S
-	li	r11,0
-	mtmsrd	r11,1
-#endif
 	ld	r8,_MSR(r1)
 	ld	r3,RESULT(r1)
 	li	r11,-MAX_ERRNO
@@ -287,6 +275,16 @@ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
 	andi.	r6,r8,MSR_PR
 	ld	r4,_LINK(r1)
+#ifdef CONFIG_PPC_BOOK3S
+	/*
+	 * Clear MSR_RI, MSR_EE is already and remains disabled. We could do
+	 * this later, but testing shows that doing it here causes less slow
+	 * down than doing it closer to the rfid.
+	 */
+	li	r11,0
+	mtmsrd	r11,1
+#endif
+
 	beq-	1f
 	ACCOUNT_CPU_USER_EXIT(r13, r11, r12)
@@ -363,10 +361,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 	b	.Lsyscall_exit

.Lsyscall_exit_work:
-#ifdef CONFIG_PPC_BOOK3S
-	li	r10,MSR_RI
-	mtmsrd	r10,1		/* Restore RI */
-#endif
 	/* If TIF_RESTOREALL is set, don't scribble on either r3 or ccr.
 	   If TIF_NOERROR is set, just save r3 as it is. */
Currently in system_call_exit() we have an optimisation where we
disable MSR_RI (recoverable interrupt) and MSR_EE (external interrupt
enable) in a single mtmsrd instruction.

Unfortunately this will no longer work with THREAD_INFO_IN_TASK,
because then the load of TI_FLAGS might fault and faulting with MSR_RI
clear is treated as an unrecoverable exception which leads to a
panic().

So change the code to only clear MSR_EE prior to loading TI_FLAGS,
leaving the clear of MSR_RI until later. We have some latitude in
where we do the clear of MSR_RI. A bit of experimentation has shown
that this location gives the least slow down.

This still causes a noticeable slow down in our null_syscall
performance. On a Power9 DD2.2:

  Before       After        Delta    Delta %
  955 cycles   999 cycles   -44      -4.6%

On the plus side this does simplify the code somewhat, because we
don't have to reenable MSR_RI on the restore_math() or
syscall_exit_work() paths, which was previously necessitated by the
optimisation.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/kernel/entry_64.S | 34 ++++++++++++++--------------------
 1 file changed, 14 insertions(+), 20 deletions(-)