diff mbox series

[2/2] powerpc/rtas: Fix RTAS MSR[HV] handling for Cell

Message ID 20220823115952.1203106-2-mpe@ellerman.id.au (mailing list archive)
State Accepted
Headers show
Series [1/2] Revert "powerpc: Remove unused FW_FEATURE_NATIVE references" | expand

Checks

Context Check Description
snowpatch_ozlabs/github-powerpc_ppctests success Successfully ran 10 jobs.
snowpatch_ozlabs/github-powerpc_clang success Successfully ran 6 jobs.
snowpatch_ozlabs/github-powerpc_selftests success Successfully ran 10 jobs.
snowpatch_ozlabs/github-powerpc_kernel_qemu success Successfully ran 23 jobs.
snowpatch_ozlabs/github-powerpc_sparse success Successfully ran 4 jobs.

Commit Message

Michael Ellerman Aug. 23, 2022, 11:59 a.m. UTC
The semi-recent changes to MSR handling when entering RTAS (firmware)
cause crashes on IBM Cell machines. An example trace:

  kernel tried to execute user page (2fff01a8) - exploit attempt? (uid: 0)
  BUG: Unable to handle kernel instruction fetch
  Faulting instruction address: 0x2fff01a8
  Oops: Kernel access of bad area, sig: 11 [#1]
  BE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=4 NUMA Cell
  Modules linked in:
  CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W          6.0.0-rc2-00433-gede0a8d3307a #207
  NIP:  000000002fff01a8 LR: 0000000000032608 CTR: 0000000000000000
  REGS: c0000000015236b0 TRAP: 0400   Tainted: G        W           (6.0.0-rc2-00433-gede0a8d3307a)
  MSR:  0000000008001002 <ME,RI>  CR: 00000000  XER: 20000000
  ...
  NIP 0x2fff01a8
  LR  0x32608
  Call Trace:
    0xc00000000143c5f8 (unreliable)
    .rtas_call+0x224/0x320
    .rtas_get_boot_time+0x70/0x150
    .read_persistent_clock64+0x114/0x140
    .read_persistent_wall_and_boot_offset+0x24/0x80
    .timekeeping_init+0x40/0x29c
    .start_kernel+0x674/0x8f0
    start_here_common+0x1c/0x50

Unlike PAPR platforms where RTAS is only used in guests, on the IBM Cell
machines Linux runs with MSR[HV] set but also uses RTAS, provided by
SLOF.

Fix it by copying the MSR[HV] bit from the MSR value we've just read
using mfmsr into the value used for RTAS.

It seems like we could also fix it using an #ifdef CELL to set MSR[HV],
but that doesn't work because it's possible to build a single kernel
image that runs on both Cell native and pseries.

Fixes: b6b1c3ce06ca ("powerpc/rtas: Keep MSR[RI] set when calling RTAS")
Cc: stable@vger.kernel.org # v5.19+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/kernel/rtas_entry.S | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Jordan Niethe Aug. 24, 2022, 1:50 a.m. UTC | #1
On Tue, 2022-08-23 at 21:59 +1000, Michael Ellerman wrote:
> The semi-recent changes to MSR handling when entering RTAS (firmware)
> cause crashes on IBM Cell machines. An example trace:
> 
>   kernel tried to execute user page (2fff01a8) - exploit attempt? (uid: 0)
>   BUG: Unable to handle kernel instruction fetch
>   Faulting instruction address: 0x2fff01a8
>   Oops: Kernel access of bad area, sig: 11 [#1]
>   BE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=4 NUMA Cell
>   Modules linked in:
>   CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W          6.0.0-rc2-00433-gede0a8d3307a #207
>   NIP:  000000002fff01a8 LR: 0000000000032608 CTR: 0000000000000000
>   REGS: c0000000015236b0 TRAP: 0400   Tainted: G        W           (6.0.0-rc2-00433-gede0a8d3307a)
>   MSR:  0000000008001002 <ME,RI>  CR: 00000000  XER: 20000000
>   ...
>   NIP 0x2fff01a8
>   LR  0x32608
>   Call Trace:
>     0xc00000000143c5f8 (unreliable)
>     .rtas_call+0x224/0x320
>     .rtas_get_boot_time+0x70/0x150
>     .read_persistent_clock64+0x114/0x140
>     .read_persistent_wall_and_boot_offset+0x24/0x80
>     .timekeeping_init+0x40/0x29c
>     .start_kernel+0x674/0x8f0
>     start_here_common+0x1c/0x50
> 
> Unlike PAPR platforms where RTAS is only used in guests, on the IBM Cell
> machines Linux runs with MSR[HV] set but also uses RTAS, provided by
> SLOF.
> 
> Fix it by copying the MSR[HV] bit from the MSR value we've just read
> using mfmsr into the value used for RTAS.
> 
> It seems like we could also fix it using an #ifdef CELL to set MSR[HV],
> but that doesn't work because it's possible to build a single kernel
> image that runs on both Cell native and pseries.
> 
> Fixes: b6b1c3ce06ca ("powerpc/rtas: Keep MSR[RI] set when calling RTAS")
> Cc: stable@vger.kernel.org # v5.19+
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> ---
>  arch/powerpc/kernel/rtas_entry.S | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/arch/powerpc/kernel/rtas_entry.S b/arch/powerpc/kernel/rtas_entry.S
> index 9a434d42e660..6ce95ddadbcd 100644
> --- a/arch/powerpc/kernel/rtas_entry.S
> +++ b/arch/powerpc/kernel/rtas_entry.S
> @@ -109,8 +109,12 @@ _GLOBAL(enter_rtas)
>  	 * its critical regions (as specified in PAPR+ section 7.2.1). MSR[S]
>  	 * is not impacted by RFI_TO_KERNEL (only urfid can unset it). So if
>  	 * MSR[S] is set, it will remain when entering RTAS.
> +	 * If we're in HV mode, RTAS must also run in HV mode, so extract MSR_HV
> +	 * from the saved MSR value and insert into the value RTAS will use.
>  	 */

Interestingly it looks like these are the first uses of these extended
mnemonics in the kernel?

> +	extrdi	r0, r6, 1, 63 - MSR_HV_LG

Or in non-mnemonic form...
rldicl  r0, r6, 64 - MSR_HV_LG, 63

>  	LOAD_REG_IMMEDIATE(r6, MSR_ME | MSR_RI)
> +	insrdi	r6, r0, 1, 63 - MSR_HV_LG

Or in non-mnemonic form...
rldimi	r6, r0, MSR_HV_LG, 63 - MSR_HV_LG

It is ok to use r0 as a scratch register as it is loaded with 0 afterwards anyway.

>  
>  	li      r0,0
>  	mtmsrd  r0,1                    /* disable RI before using SRR0/1 */

Reviewed-by: Jordan Niethe <jniethe5@gmail.com>
Michael Ellerman Aug. 24, 2022, 12:04 p.m. UTC | #2
Jordan Niethe <jniethe5@gmail.com> writes:
> On Tue, 2022-08-23 at 21:59 +1000, Michael Ellerman wrote:
>> The semi-recent changes to MSR handling when entering RTAS (firmware)
>> cause crashes on IBM Cell machines. An example trace:
...
>> diff --git a/arch/powerpc/kernel/rtas_entry.S b/arch/powerpc/kernel/rtas_entry.S
>> index 9a434d42e660..6ce95ddadbcd 100644
>> --- a/arch/powerpc/kernel/rtas_entry.S
>> +++ b/arch/powerpc/kernel/rtas_entry.S
>> @@ -109,8 +109,12 @@ _GLOBAL(enter_rtas)
>>  	 * its critical regions (as specified in PAPR+ section 7.2.1). MSR[S]
>>  	 * is not impacted by RFI_TO_KERNEL (only urfid can unset it). So if
>>  	 * MSR[S] is set, it will remain when entering RTAS.
>> +	 * If we're in HV mode, RTAS must also run in HV mode, so extract MSR_HV
>> +	 * from the saved MSR value and insert into the value RTAS will use.
>>  	 */
>
> Interestingly it looks like these are the first uses of these extended
> mnemonics in the kernel?

We used to have at least one use I know of in TM code, but it's since
been converted to C.

>> +	extrdi	r0, r6, 1, 63 - MSR_HV_LG
>
> Or in non-mnemonic form...
> rldicl  r0, r6, 64 - MSR_HV_LG, 63

It's rldicl all the way down.

>>  	LOAD_REG_IMMEDIATE(r6, MSR_ME | MSR_RI)
>> +	insrdi	r6, r0, 1, 63 - MSR_HV_LG
>
> Or in non-mnemonic form...
> rldimi	r6, r0, MSR_HV_LG, 63 - MSR_HV_LG

I think the extended mnemonics are slightly more readable than the
open-coded versions?

> It is ok to use r0 as a scratch register as it is loaded with 0 afterwards anyway.

I originally used r7, but r0 is more obviously safe.

>>
>>  	li      r0,0
>>  	mtmsrd  r0,1                    /* disable RI before using SRR0/1 */
>
> Reviewed-by: Jordan Niethe <jniethe5@gmail.com>

Thanks.

cheers
Jordan Niethe Aug. 25, 2022, 1:22 a.m. UTC | #3
On Wed, 2022-08-24 at 22:04 +1000, Michael Ellerman wrote:
> Jordan Niethe <jniethe5@gmail.com> writes:
> > On Tue, 2022-08-23 at 21:59 +1000, Michael Ellerman wrote:
> > > The semi-recent changes to MSR handling when entering RTAS (firmware)
> > > cause crashes on IBM Cell machines. An example trace:
> ...
> > > diff --git a/arch/powerpc/kernel/rtas_entry.S b/arch/powerpc/kernel/rtas_entry.S
> > > index 9a434d42e660..6ce95ddadbcd 100644
> > > --- a/arch/powerpc/kernel/rtas_entry.S
> > > +++ b/arch/powerpc/kernel/rtas_entry.S
> > > @@ -109,8 +109,12 @@ _GLOBAL(enter_rtas)
> > >  	 * its critical regions (as specified in PAPR+ section 7.2.1). MSR[S]
> > >  	 * is not impacted by RFI_TO_KERNEL (only urfid can unset it). So if
> > >  	 * MSR[S] is set, it will remain when entering RTAS.
> > > +	 * If we're in HV mode, RTAS must also run in HV mode, so extract MSR_HV
> > > +	 * from the saved MSR value and insert into the value RTAS will use.
> > >  	 */
> > 
> > Interestingly it looks like these are the first uses of these extended
> > mnemonics in the kernel?
> 
> We used to have at least one use I know of in TM code, but it's since
> been converted to C.
> 
> > > +	extrdi	r0, r6, 1, 63 - MSR_HV_LG
> > 
> > Or in non-mnemonic form...
> > rldicl  r0, r6, 64 - MSR_HV_LG, 63
> 
> It's rldicl all the way down.
> 
> > >  	LOAD_REG_IMMEDIATE(r6, MSR_ME | MSR_RI)
> > > +	insrdi	r6, r0, 1, 63 - MSR_HV_LG
> > 
> > Or in non-mnemonic form...
> > rldimi	r6, r0, MSR_HV_LG, 63 - MSR_HV_LG
> 
> I think the extended mnemonics are slightly more readable than the
> open-coded versions?

Yeah definitely. I was just noting the plain instruction as I think we
have some existing patterns that may be potential candidates for conversion to the
extended version. Like in exceptions-64s.S

	rldicl. r0, r12, (64-MSR_TS_LG), (64-2) 
	to 
	extrdi. r0, r12, 2, 63 - MSR_TS_LG - 1

Would it be worth changing these?

> 
> > It is ok to use r0 as a scratch register as it is loaded with 0 afterwards anyway.
> 
> I originally used r7, but r0 is more obviously safe.
> 
> > >  	li      r0,0
> > >  	mtmsrd  r0,1                    /* disable RI before using SRR0/1 */
> > 
> > Reviewed-by: Jordan Niethe <jniethe5@gmail.com>
> 
> Thanks.
> 
> cheers
Michael Ellerman Aug. 25, 2022, 8:03 a.m. UTC | #4
Jordan Niethe <jniethe5@gmail.com> writes:
> On Wed, 2022-08-24 at 22:04 +1000, Michael Ellerman wrote:
>> Jordan Niethe <jniethe5@gmail.com> writes:
>> > On Tue, 2022-08-23 at 21:59 +1000, Michael Ellerman wrote:
>> > > The semi-recent changes to MSR handling when entering RTAS (firmware)
>> > > cause crashes on IBM Cell machines. An example trace:
>> ...
>> > > diff --git a/arch/powerpc/kernel/rtas_entry.S b/arch/powerpc/kernel/rtas_entry.S
>> > > index 9a434d42e660..6ce95ddadbcd 100644
>> > > --- a/arch/powerpc/kernel/rtas_entry.S
>> > > +++ b/arch/powerpc/kernel/rtas_entry.S
>> > > @@ -109,8 +109,12 @@ _GLOBAL(enter_rtas)
>> > >  	 * its critical regions (as specified in PAPR+ section 7.2.1). MSR[S]
>> > >  	 * is not impacted by RFI_TO_KERNEL (only urfid can unset it). So if
>> > >  	 * MSR[S] is set, it will remain when entering RTAS.
>> > > +	 * If we're in HV mode, RTAS must also run in HV mode, so extract MSR_HV
>> > > +	 * from the saved MSR value and insert into the value RTAS will use.
>> > >  	 */
>> > 
>> > Interestingly it looks like these are the first uses of these extended
>> > mnemonics in the kernel?
>> 
>> We used to have at least one use I know of in TM code, but it's since
>> been converted to C.
>> 
>> > > +	extrdi	r0, r6, 1, 63 - MSR_HV_LG
>> > 
>> > Or in non-mnemonic form...
>> > rldicl  r0, r6, 64 - MSR_HV_LG, 63
>> 
>> It's rldicl all the way down.
>> 
>> > >  	LOAD_REG_IMMEDIATE(r6, MSR_ME | MSR_RI)
>> > > +	insrdi	r6, r0, 1, 63 - MSR_HV_LG
>> > 
>> > Or in non-mnemonic form...
>> > rldimi	r6, r0, MSR_HV_LG, 63 - MSR_HV_LG
>> 
>> I think the extended mnemonics are slightly more readable than the
>> open-coded versions?
>
> Yeah definitely. I was just noting the plain instruction as I think we
> have some existing patterns that may be potential candidates for conversion to the
> extended version. Like in exceptions-64s.S
>
> 	rldicl. r0, r12, (64-MSR_TS_LG), (64-2) 
> 	to 
> 	extrdi. r0, r12, 2, 63 - MSR_TS_LG - 1
>
> Would it be worth changing these?

Some folks are very comfortable with rldicl and probably prefer the
former, but I'm not sure there's many of those people around anymore :)

I think the extrdi is a bit more readable.

You could use MSR_TS_T_LG to avoid the - 1? All those uses have a
comment about it being 2 bits already.

cheers
diff mbox series

Patch

diff --git a/arch/powerpc/kernel/rtas_entry.S b/arch/powerpc/kernel/rtas_entry.S
index 9a434d42e660..6ce95ddadbcd 100644
--- a/arch/powerpc/kernel/rtas_entry.S
+++ b/arch/powerpc/kernel/rtas_entry.S
@@ -109,8 +109,12 @@  _GLOBAL(enter_rtas)
 	 * its critical regions (as specified in PAPR+ section 7.2.1). MSR[S]
 	 * is not impacted by RFI_TO_KERNEL (only urfid can unset it). So if
 	 * MSR[S] is set, it will remain when entering RTAS.
+	 * If we're in HV mode, RTAS must also run in HV mode, so extract MSR_HV
+	 * from the saved MSR value and insert into the value RTAS will use.
 	 */
+	extrdi	r0, r6, 1, 63 - MSR_HV_LG
 	LOAD_REG_IMMEDIATE(r6, MSR_ME | MSR_RI)
+	insrdi	r6, r0, 1, 63 - MSR_HV_LG
 
 	li      r0,0
 	mtmsrd  r0,1                    /* disable RI before using SRR0/1 */