Message ID | 20190702051932.511-10-santosh@fossix.org (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | powerpc: implement machine check safe memcpy | expand |
Context | Check | Description |
---|---|---|
snowpatch_ozlabs/apply_patch | success | Successfully applied on branch next (c7d64b560ce80d8c44f082eee8352f0778a73195) |
snowpatch_ozlabs/checkpatch | warning | total: 0 errors, 1 warnings, 0 checks, 25 lines checked |
Santosh Sivaraj's on July 2, 2019 3:19 pm: > From: Reza Arbab <arbab@linux.ibm.com> > > Signed-off-by: Reza Arbab <arbab@linux.ibm.com> > --- > arch/powerpc/kernel/exceptions-64s.S | 6 ++++++ > arch/powerpc/kernel/mce.c | 2 ++ > 2 files changed, 8 insertions(+) > > diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S > index c83e38a403fd..311f1392a2ec 100644 > --- a/arch/powerpc/kernel/exceptions-64s.S > +++ b/arch/powerpc/kernel/exceptions-64s.S > @@ -458,6 +458,12 @@ EXC_COMMON_BEGIN(machine_check_handle_early) > bl machine_check_early > std r3,RESULT(r1) /* Save result */ > > + /* Notifiers may be in a module, so enable virtual addressing. */ > + mfmsr r11 > + ori r11,r11,MSR_IR > + ori r11,r11,MSR_DR > + mtmsr r11 Can't do this, we could take a machine check somewhere the MMU is not sane (in fact the guest early mce handling that was added recently should not be enabling virtual mode either, which needs to be fixed). Thanks, Nick
On 7/2/19 11:47 AM, Nicholas Piggin wrote: > Santosh Sivaraj's on July 2, 2019 3:19 pm: >> From: Reza Arbab <arbab@linux.ibm.com> >> >> Signed-off-by: Reza Arbab <arbab@linux.ibm.com> >> --- >> arch/powerpc/kernel/exceptions-64s.S | 6 ++++++ >> arch/powerpc/kernel/mce.c | 2 ++ >> 2 files changed, 8 insertions(+) >> >> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S >> index c83e38a403fd..311f1392a2ec 100644 >> --- a/arch/powerpc/kernel/exceptions-64s.S >> +++ b/arch/powerpc/kernel/exceptions-64s.S >> @@ -458,6 +458,12 @@ EXC_COMMON_BEGIN(machine_check_handle_early) >> bl machine_check_early >> std r3,RESULT(r1) /* Save result */ >> >> + /* Notifiers may be in a module, so enable virtual addressing. */ >> + mfmsr r11 >> + ori r11,r11,MSR_IR >> + ori r11,r11,MSR_DR >> + mtmsr r11 > > Can't do this, we could take a machine check somewhere the MMU is > not sane (in fact the guest early mce handling that was added recently > should not be enabling virtual mode either, which needs to be fixed). Looks like they need this to be able to run notifier chain which may fail in real mode. > > Thanks, > Nick >
On Tue, Jul 02, 2019 at 04:17:11PM +1000, Nicholas Piggin wrote: >Santosh Sivaraj's on July 2, 2019 3:19 pm: >> --- a/arch/powerpc/kernel/exceptions-64s.S >> +++ b/arch/powerpc/kernel/exceptions-64s.S >> @@ -458,6 +458,12 @@ EXC_COMMON_BEGIN(machine_check_handle_early) >> bl machine_check_early >> std r3,RESULT(r1) /* Save result */ >> >> + /* Notifiers may be in a module, so enable virtual addressing. */ >> + mfmsr r11 >> + ori r11,r11,MSR_IR >> + ori r11,r11,MSR_DR >> + mtmsr r11 > >Can't do this, we could take a machine check somewhere the MMU is >not sane (in fact the guest early mce handling that was added recently >should not be enabling virtual mode either, which needs to be fixed). Rats. So in machine_check_handle_early() there are two options; either 1. The mc is unhandled/unrecoverable. Stay in real mode, proceed to unrecover_mce(), the fatal path of no return (panic, reboot, etc). 2. The mc is handled/recovered. Return from MCE where any further action can be done by processing the machine check event workqueue. Am I understanding you correctly that this is the absolute earliest we can get back to virtual mode? Since the notifier chain is actually part of the decision between (1) and (2), it's a hard limitation then that callbacks be in real address space. Is there any way to structure things so that's not the case? Luckily this patch isn't really necessary for memcpy_mcsafe(), but we have a couple of other potential users of the notifier from external modules (so their callbacks would require virtual mode).
Reza Arbab's on July 4, 2019 3:20 am: > On Tue, Jul 02, 2019 at 04:17:11PM +1000, Nicholas Piggin wrote: >>Santosh Sivaraj's on July 2, 2019 3:19 pm: >>> --- a/arch/powerpc/kernel/exceptions-64s.S >>> +++ b/arch/powerpc/kernel/exceptions-64s.S >>> @@ -458,6 +458,12 @@ EXC_COMMON_BEGIN(machine_check_handle_early) >>> bl machine_check_early >>> std r3,RESULT(r1) /* Save result */ >>> >>> + /* Notifiers may be in a module, so enable virtual addressing. */ >>> + mfmsr r11 >>> + ori r11,r11,MSR_IR >>> + ori r11,r11,MSR_DR >>> + mtmsr r11 >> >>Can't do this, we could take a machine check somewhere the MMU is >>not sane (in fact the guest early mce handling that was added recently >>should not be enabling virtual mode either, which needs to be fixed). > > Rats. So in machine_check_handle_early() there are two options; either > > 1. The mc is unhandled/unrecoverable. Stay in real mode, proceed to > unrecover_mce(), the fatal path of no return (panic, reboot, etc). > > 2. The mc is handled/recovered. Return from MCE where any further action > can be done by processing the machine check event workqueue. Am I > understanding you correctly that this is the absolute earliest we can > get back to virtual mode? Yes. > Since the notifier chain is actually part of the decision between (1) > and (2), it's a hard limitation then that callbacks be in real address > space. Is there any way to structure things so that's not the case? If we tested for KVM guest first, and went through and marked (maybe in a paca flag) everywhere else that put the MMU into a bad / non-host state, and had the notifiers use the machine check stack, then it would be possible to enable MMU here. Hmm, testing for IR|DR after testing for KVM guest might actually be enough without requiring changes outside the machine check handler... Actually no that may not quite work because the handler could take a SLB miss and it might have been triggered inside the SLB miss handler. All in all I'm pretty against turning on MMU in the MCE handler anywhere. > Luckily this patch isn't really necessary for memcpy_mcsafe(), but we > have a couple of other potential users of the notifier from external > modules (so their callbacks would require virtual mode). What users are there? Do they do any significant amount of logic that can not be moved to vmlinux? Thanks, Nick
On Thu, Jul 04, 2019 at 12:36:18PM +1000, Nicholas Piggin wrote: >Reza Arbab's on July 4, 2019 3:20 am: >> Since the notifier chain is actually part of the decision between (1) >> and (2), it's a hard limitation then that callbacks be in real address >> space. Is there any way to structure things so that's not the case? > >If we tested for KVM guest first, and went through and marked (maybe >in a paca flag) everywhere else that put the MMU into a bad / non-host >state, and had the notifiers use the machine check stack, then it >would be possible to enable MMU here. > >Hmm, testing for IR|DR after testing for KVM guest might actually be >enough without requiring changes outside the machine check handler... >Actually no that may not quite work because the handler could take a >SLB miss and it might have been triggered inside the SLB miss handler. > >All in all I'm pretty against turning on MMU in the MCE handler >anywhere. Hey, fair enough. Just making sure there really isnt't any room to make things work the way I was trying. >> Luckily this patch isn't really necessary for memcpy_mcsafe(), but we >> have a couple of other potential users of the notifier from external >> modules (so their callbacks would require virtual mode). > >What users are there? Do they do any significant amount of logic that >can not be moved to vmlinux? One I had in mind was the NVIDIA driver. When taking a UE from defective GPU memory, it could use the notifier to save the bad address to a blacklist in their nvram. Not so much recovering the machine check, just logging before the system reboots. The other user is a prototype driver for the IBM Research project we had a talk about offline a while back. We can make this patchset work for memcpy_mcsafe(), but I think it's back to the drawing board for the others.
Reza Arbab's on July 5, 2019 12:50 pm: > On Thu, Jul 04, 2019 at 12:36:18PM +1000, Nicholas Piggin wrote: >>Reza Arbab's on July 4, 2019 3:20 am: >>> Since the notifier chain is actually part of the decision between (1) >>> and (2), it's a hard limitation then that callbacks be in real address >>> space. Is there any way to structure things so that's not the case? >> >>If we tested for KVM guest first, and went through and marked (maybe >>in a paca flag) everywhere else that put the MMU into a bad / non-host >>state, and had the notifiers use the machine check stack, then it >>would be possible to enable MMU here. >> >>Hmm, testing for IR|DR after testing for KVM guest might actually be >>enough without requiring changes outside the machine check handler... >>Actually no that may not quite work because the handler could take a >>SLB miss and it might have been triggered inside the SLB miss handler. >> >>All in all I'm pretty against turning on MMU in the MCE handler >>anywhere. > > Hey, fair enough. Just making sure there really isnt't any room to make > things work the way I was trying. Understand. > >>> Luckily this patch isn't really necessary for memcpy_mcsafe(), but we >>> have a couple of other potential users of the notifier from external >>> modules (so their callbacks would require virtual mode). >> >>What users are there? Do they do any significant amount of logic that >>can not be moved to vmlinux? > > One I had in mind was the NVIDIA driver. When taking a UE from defective > GPU memory, it could use the notifier to save the bad address to a > blacklist in their nvram. Not so much recovering the machine check, just > logging before the system reboots. > > The other user is a prototype driver for the IBM Research project we had > a talk about offline a while back. Okay. It might be possible to save the address in the kernel and then notify the driver afterward. For user-mode and any non-atomic user copy AFAIK the irq_work should practically run synchronously after the machine check returns so it might be enough to have a notifier in the irq work processing. > We can make this patchset work for memcpy_mcsafe(), but I think it's > back to the drawing board for the others. For the first stage that would be preferable. Thanks, Nick
On Fri, Jul 05, 2019 at 03:29:39PM +1000, Nicholas Piggin wrote: >Okay. It might be possible to save the address in the kernel and >then notify the driver afterward. For user-mode and any non-atomic >user copy AFAIK the irq_work should practically run synchronously >after the machine check returns so it might be enough to have a >notifier in the irq work processing. We can pick up this thread later, but if I remember correctly the sticking point we ran into was that we never got that far. Instead of returning from the MCE, we went down the fatal codepath.
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index c83e38a403fd..311f1392a2ec 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -458,6 +458,12 @@ EXC_COMMON_BEGIN(machine_check_handle_early) bl machine_check_early std r3,RESULT(r1) /* Save result */ + /* Notifiers may be in a module, so enable virtual addressing. */ + mfmsr r11 + ori r11,r11,MSR_IR + ori r11,r11,MSR_DR + mtmsr r11 + addi r3,r1,STACK_FRAME_OVERHEAD bl machine_check_notify ld r11,RESULT(r1) diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c index a8348a9bea5b..9e4d497837d8 100644 --- a/arch/powerpc/kernel/mce.c +++ b/arch/powerpc/kernel/mce.c @@ -50,11 +50,13 @@ int mce_register_notifier(struct notifier_block *nb) { return blocking_notifier_chain_register(&mce_notifier_list, nb); } +EXPORT_SYMBOL_GPL(mce_register_notifier); int mce_unregister_notifier(struct notifier_block *nb) { return blocking_notifier_chain_unregister(&mce_notifier_list, nb); } +EXPORT_SYMBOL_GPL(mce_unregister_notifier); static int check_memcpy_mcsafe(struct notifier_block *nb, unsigned long val, void *data)