diff mbox series

[v2,09/12] powerpc/mce: Enable MCE notifiers in external modules

Message ID 20190702051932.511-10-santosh@fossix.org (mailing list archive)
State Superseded
Headers show
Series powerpc: implement machine check safe memcpy | expand

Checks

Context Check Description
snowpatch_ozlabs/apply_patch success Successfully applied on branch next (c7d64b560ce80d8c44f082eee8352f0778a73195)
snowpatch_ozlabs/checkpatch warning total: 0 errors, 1 warnings, 0 checks, 25 lines checked

Commit Message

Santosh Sivaraj July 2, 2019, 5:19 a.m. UTC
From: Reza Arbab <arbab@linux.ibm.com>

Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
---
 arch/powerpc/kernel/exceptions-64s.S | 6 ++++++
 arch/powerpc/kernel/mce.c            | 2 ++
 2 files changed, 8 insertions(+)

Comments

Nicholas Piggin July 2, 2019, 6:17 a.m. UTC | #1
Santosh Sivaraj's on July 2, 2019 3:19 pm:
> From: Reza Arbab <arbab@linux.ibm.com>
> 
> Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
> ---
>  arch/powerpc/kernel/exceptions-64s.S | 6 ++++++
>  arch/powerpc/kernel/mce.c            | 2 ++
>  2 files changed, 8 insertions(+)
> 
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index c83e38a403fd..311f1392a2ec 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -458,6 +458,12 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
>  	bl	machine_check_early
>  	std	r3,RESULT(r1)	/* Save result */
>  
> +	/* Notifiers may be in a module, so enable virtual addressing. */
> +	mfmsr	r11
> +	ori	r11,r11,MSR_IR
> +	ori	r11,r11,MSR_DR
> +	mtmsr	r11

Can't do this, we could take a machine check somewhere the MMU is
not sane (in fact the guest early mce handling that was added recently
should not be enabling virtual mode either, which needs to be fixed).

Thanks,
Nick
Mahesh J Salgaonkar July 2, 2019, 9:33 a.m. UTC | #2
On 7/2/19 11:47 AM, Nicholas Piggin wrote:
> Santosh Sivaraj's on July 2, 2019 3:19 pm:
>> From: Reza Arbab <arbab@linux.ibm.com>
>>
>> Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
>> ---
>>  arch/powerpc/kernel/exceptions-64s.S | 6 ++++++
>>  arch/powerpc/kernel/mce.c            | 2 ++
>>  2 files changed, 8 insertions(+)
>>
>> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
>> index c83e38a403fd..311f1392a2ec 100644
>> --- a/arch/powerpc/kernel/exceptions-64s.S
>> +++ b/arch/powerpc/kernel/exceptions-64s.S
>> @@ -458,6 +458,12 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
>>  	bl	machine_check_early
>>  	std	r3,RESULT(r1)	/* Save result */
>>  
>> +	/* Notifiers may be in a module, so enable virtual addressing. */
>> +	mfmsr	r11
>> +	ori	r11,r11,MSR_IR
>> +	ori	r11,r11,MSR_DR
>> +	mtmsr	r11
> 
> Can't do this, we could take a machine check somewhere the MMU is
> not sane (in fact the guest early mce handling that was added recently
> should not be enabling virtual mode either, which needs to be fixed).

Looks like they need this to be able to run notifier chain which may
fail in real mode.

> 
> Thanks,
> Nick
>
Reza Arbab July 3, 2019, 5:20 p.m. UTC | #3
On Tue, Jul 02, 2019 at 04:17:11PM +1000, Nicholas Piggin wrote:
>Santosh Sivaraj's on July 2, 2019 3:19 pm:
>> --- a/arch/powerpc/kernel/exceptions-64s.S
>> +++ b/arch/powerpc/kernel/exceptions-64s.S
>> @@ -458,6 +458,12 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
>>  	bl	machine_check_early
>>  	std	r3,RESULT(r1)	/* Save result */
>>
>> +	/* Notifiers may be in a module, so enable virtual addressing. */
>> +	mfmsr	r11
>> +	ori	r11,r11,MSR_IR
>> +	ori	r11,r11,MSR_DR
>> +	mtmsr	r11
>
>Can't do this, we could take a machine check somewhere the MMU is
>not sane (in fact the guest early mce handling that was added recently
>should not be enabling virtual mode either, which needs to be fixed).

Rats. So in machine_check_handle_early() there are two options; either 

1. The mc is unhandled/unrecoverable. Stay in real mode, proceed to 
unrecover_mce(), the fatal path of no return (panic, reboot, etc).

2. The mc is handled/recovered. Return from MCE where any further action 
can be done by processing the machine check event workqueue. Am I  
understanding you correctly that this is the absolute earliest we can 
get back to virtual mode?

Since the notifier chain is actually part of the decision between (1) 
and (2), it's a hard limitation then that callbacks be in real address 
space. Is there any way to structure things so that's not the case?

Luckily this patch isn't really necessary for memcpy_mcsafe(), but we 
have a couple of other potential users of the notifier from external 
modules (so their callbacks would require virtual mode).
Nicholas Piggin July 4, 2019, 2:36 a.m. UTC | #4
Reza Arbab's on July 4, 2019 3:20 am:
> On Tue, Jul 02, 2019 at 04:17:11PM +1000, Nicholas Piggin wrote:
>>Santosh Sivaraj's on July 2, 2019 3:19 pm:
>>> --- a/arch/powerpc/kernel/exceptions-64s.S
>>> +++ b/arch/powerpc/kernel/exceptions-64s.S
>>> @@ -458,6 +458,12 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
>>>  	bl	machine_check_early
>>>  	std	r3,RESULT(r1)	/* Save result */
>>>
>>> +	/* Notifiers may be in a module, so enable virtual addressing. */
>>> +	mfmsr	r11
>>> +	ori	r11,r11,MSR_IR
>>> +	ori	r11,r11,MSR_DR
>>> +	mtmsr	r11
>>
>>Can't do this, we could take a machine check somewhere the MMU is
>>not sane (in fact the guest early mce handling that was added recently
>>should not be enabling virtual mode either, which needs to be fixed).
> 
> Rats. So in machine_check_handle_early() there are two options; either 
> 
> 1. The mc is unhandled/unrecoverable. Stay in real mode, proceed to 
> unrecover_mce(), the fatal path of no return (panic, reboot, etc).
> 
> 2. The mc is handled/recovered. Return from MCE where any further action 
> can be done by processing the machine check event workqueue. Am I  
> understanding you correctly that this is the absolute earliest we can 
> get back to virtual mode?

Yes.

> Since the notifier chain is actually part of the decision between (1) 
> and (2), it's a hard limitation then that callbacks be in real address 
> space. Is there any way to structure things so that's not the case?

If we tested for KVM guest first, and went through and marked (maybe
in a paca flag) everywhere else that put the MMU into a bad / non-host
state, and had the notifiers use the machine check stack, then it
would be possible to enable MMU here.

Hmm, testing for IR|DR after testing for KVM guest might actually be
enough without requiring changes outside the machine check handler...
Actually no that may not quite work because the handler could take a
SLB miss and it might have been triggered inside the SLB miss handler.

All in all I'm pretty against turning on MMU in the MCE handler
anywhere.

> Luckily this patch isn't really necessary for memcpy_mcsafe(), but we 
> have a couple of other potential users of the notifier from external 
> modules (so their callbacks would require virtual mode).

What users are there? Do they do any significant amount of logic that
can not be moved to vmlinux?

Thanks,
Nick
Reza Arbab July 5, 2019, 2:50 a.m. UTC | #5
On Thu, Jul 04, 2019 at 12:36:18PM +1000, Nicholas Piggin wrote:
>Reza Arbab's on July 4, 2019 3:20 am:
>> Since the notifier chain is actually part of the decision between (1)
>> and (2), it's a hard limitation then that callbacks be in real address
>> space. Is there any way to structure things so that's not the case?
>
>If we tested for KVM guest first, and went through and marked (maybe
>in a paca flag) everywhere else that put the MMU into a bad / non-host
>state, and had the notifiers use the machine check stack, then it
>would be possible to enable MMU here.
>
>Hmm, testing for IR|DR after testing for KVM guest might actually be
>enough without requiring changes outside the machine check handler...
>Actually no that may not quite work because the handler could take a
>SLB miss and it might have been triggered inside the SLB miss handler.
>
>All in all I'm pretty against turning on MMU in the MCE handler
>anywhere.

Hey, fair enough. Just making sure there really isnt't any room to make 
things work the way I was trying.

>> Luckily this patch isn't really necessary for memcpy_mcsafe(), but we
>> have a couple of other potential users of the notifier from external
>> modules (so their callbacks would require virtual mode).
>
>What users are there? Do they do any significant amount of logic that
>can not be moved to vmlinux?

One I had in mind was the NVIDIA driver. When taking a UE from defective 
GPU memory, it could use the notifier to save the bad address to a 
blacklist in their nvram. Not so much recovering the machine check, just 
logging before the system reboots.

The other user is a prototype driver for the IBM Research project we had 
a talk about offline a while back.

We can make this patchset work for memcpy_mcsafe(), but I think it's 
back to the drawing board for the others.
Nicholas Piggin July 5, 2019, 5:29 a.m. UTC | #6
Reza Arbab's on July 5, 2019 12:50 pm:
> On Thu, Jul 04, 2019 at 12:36:18PM +1000, Nicholas Piggin wrote:
>>Reza Arbab's on July 4, 2019 3:20 am:
>>> Since the notifier chain is actually part of the decision between (1)
>>> and (2), it's a hard limitation then that callbacks be in real address
>>> space. Is there any way to structure things so that's not the case?
>>
>>If we tested for KVM guest first, and went through and marked (maybe
>>in a paca flag) everywhere else that put the MMU into a bad / non-host
>>state, and had the notifiers use the machine check stack, then it
>>would be possible to enable MMU here.
>>
>>Hmm, testing for IR|DR after testing for KVM guest might actually be
>>enough without requiring changes outside the machine check handler...
>>Actually no that may not quite work because the handler could take a
>>SLB miss and it might have been triggered inside the SLB miss handler.
>>
>>All in all I'm pretty against turning on MMU in the MCE handler
>>anywhere.
> 
> Hey, fair enough. Just making sure there really isnt't any room to make 
> things work the way I was trying.

Understand.

> 
>>> Luckily this patch isn't really necessary for memcpy_mcsafe(), but we
>>> have a couple of other potential users of the notifier from external
>>> modules (so their callbacks would require virtual mode).
>>
>>What users are there? Do they do any significant amount of logic that
>>can not be moved to vmlinux?
> 
> One I had in mind was the NVIDIA driver. When taking a UE from defective 
> GPU memory, it could use the notifier to save the bad address to a 
> blacklist in their nvram. Not so much recovering the machine check, just 
> logging before the system reboots.
> 
> The other user is a prototype driver for the IBM Research project we had 
> a talk about offline a while back.

Okay. It might be possible to save the address in the kernel and
then notify the driver afterward. For user-mode and any non-atomic
user copy AFAIK the irq_work should practically run synchronously
after the machine check returns so it might be enough to have a
notifier in the irq work processing.

> We can make this patchset work for memcpy_mcsafe(), but I think it's 
> back to the drawing board for the others.

For the first stage that would be preferable.

Thanks,
Nick
Reza Arbab July 8, 2019, 3:23 p.m. UTC | #7
On Fri, Jul 05, 2019 at 03:29:39PM +1000, Nicholas Piggin wrote:
>Okay. It might be possible to save the address in the kernel and
>then notify the driver afterward. For user-mode and any non-atomic
>user copy AFAIK the irq_work should practically run synchronously
>after the machine check returns so it might be enough to have a
>notifier in the irq work processing.

We can pick up this thread later, but if I remember correctly the 
sticking point we ran into was that we never got that far. Instead of 
returning from the MCE, we went down the fatal codepath.
diff mbox series

Patch

diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index c83e38a403fd..311f1392a2ec 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -458,6 +458,12 @@  EXC_COMMON_BEGIN(machine_check_handle_early)
 	bl	machine_check_early
 	std	r3,RESULT(r1)	/* Save result */
 
+	/* Notifiers may be in a module, so enable virtual addressing. */
+	mfmsr	r11
+	ori	r11,r11,MSR_IR
+	ori	r11,r11,MSR_DR
+	mtmsr	r11
+
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	machine_check_notify
 	ld	r11,RESULT(r1)
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index a8348a9bea5b..9e4d497837d8 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -50,11 +50,13 @@  int mce_register_notifier(struct notifier_block *nb)
 {
 	return blocking_notifier_chain_register(&mce_notifier_list, nb);
 }
+EXPORT_SYMBOL_GPL(mce_register_notifier);
 
 int mce_unregister_notifier(struct notifier_block *nb)
 {
 	return blocking_notifier_chain_unregister(&mce_notifier_list, nb);
 }
+EXPORT_SYMBOL_GPL(mce_unregister_notifier);
 
 static int check_memcpy_mcsafe(struct notifier_block *nb, unsigned long val,
 			       void *data)