Message ID | 20180403071548.19829-2-clg@kaod.org (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | powerpc/xive: add support for H_INT_RESET | expand |
Cédric Le Goater <clg@kaod.org> writes: > The kexec_state KEXEC_STATE_IRQS_OFF barrier is reached by all > secondary CPUs before the kexec_cpu_down() operation is called on > secondaries. This can raise conflicts and provoque errors in the XIVE > hcalls when XIVE is shutdowned with H_INT_RESET on the primary CPU. > > To synchronize the kexec_cpu_down() operations and make sure the > secondaries have completed their task before the primary starts doing > the same, let's move the primary kexec_cpu_down() after the > KEXEC_STATE_REAL_MODE barrier. This sounds reasonable, I'm sure you've tested it. I'm just a bit worried that it could potentially break on other platforms because it changes the sequence of operations. Looking we only have kexec_cpu_down() implemented for pseries, powernv, ps3 and 85xx. We can easily test the first two. ps3 doesn't do much so hopefully that's safe. mpc85xx_smp_kexec_cpu_down() does very little on 32-bit, and on 64-bit it seems to already wait for at least one other CPU to get into KEXEC_STATE_REAL_MODE, so that's probably safe too. So I guess I'm OK to merge this, and we'll fix any fallout. It would be good for the change log to call out the change though, and that we think it's a sensible change for all platforms. cheers > Signed-off-by: Cédric Le Goater <clg@kaod.org> > --- > arch/powerpc/kernel/machine_kexec_64.c | 8 ++++---- > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/arch/powerpc/kernel/machine_kexec_64.c b/arch/powerpc/kernel/machine_kexec_64.c > index 49d34d7271e7..212ecb8e829c 100644 > --- a/arch/powerpc/kernel/machine_kexec_64.c > +++ b/arch/powerpc/kernel/machine_kexec_64.c > @@ -230,16 +230,16 @@ static void kexec_prepare_cpus(void) > /* we are sure every CPU has IRQs off at this point */ > kexec_all_irq_disabled = 1; > > - /* after we tell the others to go down */ > - if (ppc_md.kexec_cpu_down) > - ppc_md.kexec_cpu_down(0, 0); > - > /* > * Before removing MMU mappings make sure all CPUs have entered real > * mode: > */ > kexec_prepare_cpus_wait(KEXEC_STATE_REAL_MODE); > > + /* after we tell the others to go down */ > + if (ppc_md.kexec_cpu_down) > + ppc_md.kexec_cpu_down(0, 0); > + > put_cpu(); > } > > -- > 2.13.6
On 05/04/2018 12:41 PM, Michael Ellerman wrote: > Cédric Le Goater <clg@kaod.org> writes: > >> The kexec_state KEXEC_STATE_IRQS_OFF barrier is reached by all >> secondary CPUs before the kexec_cpu_down() operation is called on >> secondaries. This can raise conflicts and provoque errors in the XIVE >> hcalls when XIVE is shutdowned with H_INT_RESET on the primary CPU. >> >> To synchronize the kexec_cpu_down() operations and make sure the >> secondaries have completed their task before the primary starts doing >> the same, let's move the primary kexec_cpu_down() after the >> KEXEC_STATE_REAL_MODE barrier. > > This sounds reasonable, I'm sure you've tested it. I'm just a bit > worried that it could potentially break on other platforms because it > changes the sequence of operations. yes. We are adding a last barrier to be exact making the full sequence a little slower. > Looking we only have kexec_cpu_down() implemented for pseries, powernv, > ps3 and 85xx. > > We can easily test the first two. > ps3 doesn't do much so hopefully that's safe. > > mpc85xx_smp_kexec_cpu_down() does very little on 32-bit, and on 64-bit > it seems to already wait for at least one other CPU to get into > KEXEC_STATE_REAL_MODE, so that's probably safe too. > > So I guess I'm OK to merge this, and we'll fix any fallout. It would be > good for the change log to call out the change though, and that we think > it's a sensible change for all platforms. OK. Thanks, C. > cheers > >> Signed-off-by: Cédric Le Goater <clg@kaod.org> >> --- >> arch/powerpc/kernel/machine_kexec_64.c | 8 ++++---- >> 1 file changed, 4 insertions(+), 4 deletions(-) >> >> diff --git a/arch/powerpc/kernel/machine_kexec_64.c b/arch/powerpc/kernel/machine_kexec_64.c >> index 49d34d7271e7..212ecb8e829c 100644 >> --- a/arch/powerpc/kernel/machine_kexec_64.c >> +++ b/arch/powerpc/kernel/machine_kexec_64.c >> @@ -230,16 +230,16 @@ static void kexec_prepare_cpus(void) >> /* we are sure every CPU has IRQs off at this point */ >> kexec_all_irq_disabled = 1; >> >> - /* after we tell the others to go down */ >> - if (ppc_md.kexec_cpu_down) >> - ppc_md.kexec_cpu_down(0, 0); >> - >> /* >> * Before removing MMU mappings make sure all CPUs have entered real >> * mode: >> */ >> kexec_prepare_cpus_wait(KEXEC_STATE_REAL_MODE); >> >> + /* after we tell the others to go down */ >> + if (ppc_md.kexec_cpu_down) >> + ppc_md.kexec_cpu_down(0, 0); >> + >> put_cpu(); >> } >> >> -- >> 2.13.6
On 05/04/2018 01:13 PM, Cédric Le Goater wrote: > On 05/04/2018 12:41 PM, Michael Ellerman wrote: >> Cédric Le Goater <clg@kaod.org> writes: >> >>> The kexec_state KEXEC_STATE_IRQS_OFF barrier is reached by all >>> secondary CPUs before the kexec_cpu_down() operation is called on >>> secondaries. This can raise conflicts and provoque errors in the XIVE >>> hcalls when XIVE is shutdowned with H_INT_RESET on the primary CPU. >>> >>> To synchronize the kexec_cpu_down() operations and make sure the >>> secondaries have completed their task before the primary starts doing >>> the same, let's move the primary kexec_cpu_down() after the >>> KEXEC_STATE_REAL_MODE barrier. >> >> This sounds reasonable, I'm sure you've tested it. I'm just a bit >> worried that it could potentially break on other platforms because it >> changes the sequence of operations. > > yes. We are adding a last barrier to be exact making the full sequence > a little slower. > >> Looking we only have kexec_cpu_down() implemented for pseries, powernv, >> ps3 and 85xx. >> >> We can easily test the first two. > ps3 doesn't do much so hopefully that's safe. >> >> mpc85xx_smp_kexec_cpu_down() does very little on 32-bit, and on 64-bit >> it seems to already wait for at least one other CPU to get into >> KEXEC_STATE_REAL_MODE, so that's probably safe too. >> >> So I guess I'm OK to merge this, and we'll fix any fallout. It would be >> good for the change log to call out the change though, and that we think >> it's a sensible change for all platforms. > > OK. Ah and can you please fix the 'shutdowned' spelling ? it has been bugging me since I sent the patch :) thx C.
diff --git a/arch/powerpc/kernel/machine_kexec_64.c b/arch/powerpc/kernel/machine_kexec_64.c index 49d34d7271e7..212ecb8e829c 100644 --- a/arch/powerpc/kernel/machine_kexec_64.c +++ b/arch/powerpc/kernel/machine_kexec_64.c @@ -230,16 +230,16 @@ static void kexec_prepare_cpus(void) /* we are sure every CPU has IRQs off at this point */ kexec_all_irq_disabled = 1; - /* after we tell the others to go down */ - if (ppc_md.kexec_cpu_down) - ppc_md.kexec_cpu_down(0, 0); - /* * Before removing MMU mappings make sure all CPUs have entered real * mode: */ kexec_prepare_cpus_wait(KEXEC_STATE_REAL_MODE); + /* after we tell the others to go down */ + if (ppc_md.kexec_cpu_down) + ppc_md.kexec_cpu_down(0, 0); + put_cpu(); }
The kexec_state KEXEC_STATE_IRQS_OFF barrier is reached by all secondary CPUs before the kexec_cpu_down() operation is called on secondaries. This can raise conflicts and provoque errors in the XIVE hcalls when XIVE is shutdowned with H_INT_RESET on the primary CPU. To synchronize the kexec_cpu_down() operations and make sure the secondaries have completed their task before the primary starts doing the same, let's move the primary kexec_cpu_down() after the KEXEC_STATE_REAL_MODE barrier. Signed-off-by: Cédric Le Goater <clg@kaod.org> --- arch/powerpc/kernel/machine_kexec_64.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)