diff mbox

powerpc: Fix possible unrecoverable exception caused by '70fe3d9'

Message ID 1458095370-26731-1-git-send-email-cyrilbur@gmail.com (mailing list archive)
State Accepted
Headers show

Commit Message

Cyril Bur March 16, 2016, 2:29 a.m. UTC
Commit 70fe3d9 "powerpc: Restore FPU/VEC/VSX if previously used" introduces a
call to restore_math() late in the syscall return path, after MSR_RI has been
cleared. The MSR_RI flag is used to indicate whether the kernel can take
another exception or not. A cleared MSR_RI flag indicates that the kernel
cannot.

Unfortunately when a machine is under high load an SLB miss can occur in
restore_math() which (with MSR_RI cleared) leads to an unrecoverable exception.

Unrecoverable exception trace:
 powerpc: Restore FPU/VEC/VSX if previously used
  Unrecoverable exception 4100 at c0000000000088d8
  cpu 0x0: Vector: 4100  at [c0000003fa473b20]
      pc: c0000000000088d8: .load_vr_state+0x70/0x110
      lr: c00000000000f710: .restore_math+0x130/0x188
      sp: c0000003fa473da0
     msr: 9000000002003030
    current = 0xc0000007f876f180
    paca    = 0xc00000000fff0000	 softe: 0	 irq_happened: 0x01
      pid   = 1944, comm = K08umountfs
  Linux version 4.5.0-rc3-g22ccd98 (kerkins@alpine1-p1) (gcc version 5.2.1
  20151001 (GCC) ) #1 SMP Tue Mar 15 21:33:26 AEDT 2016
  WARNING: exception is not recoverable, can't continue
  enter ? for help
  [link register   ] c00000000000f710 .restore_math+0x130/0x188
  [c0000003fa473da0] c0000003fa473e30 (unreliable)
  [c0000003fa473e30] c000000000007b6c system_call+0x84/0xfc
  --- Exception: c00 (System Call) at 000000000fe84328
  0:mon>

The clearing of MSR_RI is actually an optimisation to avoid multiple MSR
writes, what must be disabled are interrupts. See comment in entry_64.S:
  /*
   * For performance reasons we clear RI the same time that we
   * clear EE. We only need to clear RI just before we restore r13
   * below, but batching it with EE saves us one expensive mtmsrd call.
   * We have to be careful to restore RI if we branch anywhere from
   * here (eg syscall_exit_work).
   */
At the point of calling restore_math() r13 has not been restored, as such, the
quick fix of turning MSR_RI back on for the call to restore_math() will
eliminate the occurrence of an unrecoverable exception.

We'd like to do a better fix in future.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
---
 arch/powerpc/kernel/entry_64.S | 9 +++++++++
 1 file changed, 9 insertions(+)

Comments

Michael Ellerman March 16, 2016, 10:24 a.m. UTC | #1
On Wed, 2016-16-03 at 02:29:30 UTC, Cyril Bur wrote:
> Commit 70fe3d9 "powerpc: Restore FPU/VEC/VSX if previously used" introduces a
> call to restore_math() late in the syscall return path, after MSR_RI has been
> cleared. The MSR_RI flag is used to indicate whether the kernel can take
> another exception or not. A cleared MSR_RI flag indicates that the kernel
> cannot.
> 
> Unfortunately when a machine is under high load an SLB miss can occur in
> restore_math() which (with MSR_RI cleared) leads to an unrecoverable exception.
...
> 
> Signed-off-by: Cyril Bur <cyrilbur@gmail.com>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/6e669f085d595cb6053920832c

cheers
Benjamin Herrenschmidt March 16, 2016, 11:45 a.m. UTC | #2
On Wed, 2016-03-16 at 13:29 +1100, Cyril Bur wrote:
> +#ifdef CONFIG_PPC_BOOK3S
> +       ld      r10,PACAKMSR(r13)
> +       li      r9,MSR_RI
> +       andc    r11,r10,r9 /* Re-clear RI */
> +       mtmsrd  r11,1
> +#endif

Do you need that ? IE, mtmsrd with "1" will only update RI and EE, is
the value of EE in PACAMSR variable ?

Cheers,
Ben.
Cyril Bur March 16, 2016, 10:32 p.m. UTC | #3
On Wed, 16 Mar 2016 22:45:20 +1100
Benjamin Herrenschmidt <benh@au1.ibm.com> wrote:

> On Wed, 2016-03-16 at 13:29 +1100, Cyril Bur wrote:
> > +#ifdef CONFIG_PPC_BOOK3S
> > +       ld      r10,PACAKMSR(r13)
> > +       li      r9,MSR_RI
> > +       andc    r11,r10,r9 /* Re-clear RI */
> > +       mtmsrd  r11,1
> > +#endif
> 
> Do you need that ? IE, mtmsrd with "1" will only update RI and EE, is
> the value of EE in PACAMSR variable ?
> 

Indeed, so, we should check with Anton who did the disable EE and clear RI just
a bit before that.

I had a quick look and to the best of my knowledge EE is not set in PACAKMSR.

> Cheers,
> Ben.
>
Michael Ellerman March 17, 2016, 2:52 a.m. UTC | #4
On Wed, 2016-03-16 at 22:45 +1100, Benjamin Herrenschmidt wrote:

> On Wed, 2016-03-16 at 13:29 +1100, Cyril Bur wrote:

> > +#ifdef CONFIG_PPC_BOOK3S
> > +       ld      r10,PACAKMSR(r13)
> > +       li      r9,MSR_RI
> > +       andc    r11,r10,r9 /* Re-clear RI */
> > +       mtmsrd  r11,1
> > +#endif
>
> Do you need that ? IE, mtmsrd with "1" will only update RI and EE, is
> the value of EE in PACAMSR variable ?

Yeah that's a bit brain dead when you look at it. But in Cyril's defence he
just copied it from the existing code.

PACAKMSR does not include EE, so the code is correct, just suboptimal.

I've already merged it though in my rush to get my next branch working. Cyril
is looking at reworking the EE/RI clearing in there so we'll fix it up as part
of that.

cheers
diff mbox

Patch

diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 038e0a1..f3aa4b4 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -218,7 +218,16 @@  system_call:			/* label this so stack traces look sane */
 	bne	3f
 #endif
 2:	addi    r3,r1,STACK_FRAME_OVERHEAD
+#ifdef CONFIG_PPC_BOOK3S
+	mtmsrd	r10,1		/* Restore RI */
+#endif
 	bl	restore_math
+#ifdef CONFIG_PPC_BOOK3S
+	ld	r10,PACAKMSR(r13)
+	li	r9,MSR_RI
+	andc	r11,r10,r9 /* Re-clear RI */
+	mtmsrd	r11,1
+#endif
 	ld	r8,_MSR(r1)
 	ld	r3,RESULT(r1)
 	li	r11,-MAX_ERRNO