diff mbox

[RFC,14/17] powerpc/book3e-64/kexec: Enable SMP release

Message ID 1437250134-307-15-git-send-email-scottwood@freescale.com (mailing list archive)
State RFC
Delegated to: Scott Wood
Headers show

Commit Message

Scott Wood July 18, 2015, 8:08 p.m. UTC
booted_from_exec is similar to __run_at_load, except that it is set for
regular kexec as well as kdump.

The flag is needed because the SMP release mechanism for FSL book3e is
different from when booting with normal hardware.  In theory we could
simulate the normal spin table mechanism, but not at the addresses
U-Boot put in the device tree -- so there'd need to be even more
communication between the kernel and kexec to set that up.  Since
there's already a similar flag being set (for kdump only), this seemed
like a reasonable approach.

Unlike __run_at_kexec in http://patchwork.ozlabs.org/patch/257657/
("book3e/kexec/kdump: introduce a kexec kernel flag"), this flag is at
a fixed address for ABI stability, and actually gets set properly in
the kdump case (i.e. on the crash kernel, not on the crashing kernel).

Signed-off-by: Scott Wood <scottwood@freescale.com>
---
This depends on the kexec-tools patch "ppc64: Add a flag to tell the
kernel it's booting from kexec":
http://lists.infradead.org/pipermail/kexec/2015-July/014048.html
---
 arch/powerpc/include/asm/smp.h    |  1 +
 arch/powerpc/kernel/head_64.S     | 15 +++++++++++++++
 arch/powerpc/kernel/setup_64.c    | 14 +++++++++++++-
 arch/powerpc/platforms/85xx/smp.c | 16 ++++++++++++----
 4 files changed, 41 insertions(+), 5 deletions(-)

Comments

Michael Ellerman Aug. 18, 2015, 4:51 a.m. UTC | #1
On Sat, 2015-18-07 at 20:08:51 UTC, Scott Wood wrote:
> booted_from_exec is similar to __run_at_load, except that it is set for
              ^
	      missing k.

Also do you mind using __booted_from_kexec to keep the naming similar to the
other variables down there, and also make it clear it's low level guts.

I see you asked for them to be removed on the original patch but all the other
vars in there are named that way.

> regular kexec as well as kdump.
> 
> The flag is needed because the SMP release mechanism for FSL book3e is
> different from when booting with normal hardware.  In theory we could
> simulate the normal spin table mechanism, but not at the addresses
> U-Boot put in the device tree -- so there'd need to be even more
> communication between the kernel and kexec to set that up.  Since
> there's already a similar flag being set (for kdump only), this seemed
> like a reasonable approach.

Yeah I guess it is. Obviously it'd be nicer if we didn't have to do it though.

> 
> Unlike __run_at_kexec in http://patchwork.ozlabs.org/patch/257657/
> ("book3e/kexec/kdump: introduce a kexec kernel flag"), this flag is at
> a fixed address for ABI stability, and actually gets set properly in
> the kdump case (i.e. on the crash kernel, not on the crashing kernel).
> 
> Signed-off-by: Scott Wood <scottwood@freescale.com>
> ---
> This depends on the kexec-tools patch "ppc64: Add a flag to tell the
> kernel it's booting from kexec":
> http://lists.infradead.org/pipermail/kexec/2015-July/014048.html
> ---
>  arch/powerpc/include/asm/smp.h    |  1 +
>  arch/powerpc/kernel/head_64.S     | 15 +++++++++++++++
>  arch/powerpc/kernel/setup_64.c    | 14 +++++++++++++-
>  arch/powerpc/platforms/85xx/smp.c | 16 ++++++++++++----
>  4 files changed, 41 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
> index 825663c..f9245df 100644
> --- a/arch/powerpc/include/asm/smp.h
> +++ b/arch/powerpc/include/asm/smp.h
> @@ -197,6 +197,7 @@ extern void generic_secondary_thread_init(void);
>  extern unsigned long __secondary_hold_spinloop;
>  extern unsigned long __secondary_hold_acknowledge;
>  extern char __secondary_hold;
> +extern u32 booted_from_kexec;
>  
>  extern void __early_start(void);
>  #endif /* __ASSEMBLY__ */
> diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
> index 1b77956..ae2d6b5 100644
> --- a/arch/powerpc/kernel/head_64.S
> +++ b/arch/powerpc/kernel/head_64.S
> @@ -91,6 +91,21 @@ __secondary_hold_spinloop:
>  __secondary_hold_acknowledge:
>  	.llong	0x0
>  
> +	/* Do not move this variable as kexec-tools knows about it. */
> +	. = 0x58
> +	.globl	booted_from_kexec
> +booted_from_kexec:
> +	/*
> +	 * "nkxc" -- not (necessarily) from kexec by default
> +	 *
> +	 * This flag is set to 1 by a loader if the kernel is being
> +	 * booted by kexec.  Older kexec-tools don't know about this
> +	 * flag, so platforms other than fsl-book3e should treat a value
> +	 * of "nkxc" as inconclusive.  fsl-book3e relies on this to
> +	 * know how to release secondary cpus.
> +	 */
> +	.long	0x6e6b7863

Couldn't we say that "nkxc" (whatever that stands for) means "unknown", and
have kexec-tools write "yes" to indicate yes. I realise that's not 100% bullet
proof, but it seems like it would be good enough. And it would mean we could
use the flag on other platforms if we ever want to.

Also "nkxc" ? "bfkx" ?

> diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
> index 505ec2c..baeddcc 100644
> --- a/arch/powerpc/kernel/setup_64.c
> +++ b/arch/powerpc/kernel/setup_64.c
> @@ -340,11 +340,23 @@ void early_setup_secondary(void)
>  #endif /* CONFIG_SMP */
>  
>  #if defined(CONFIG_SMP) || defined(CONFIG_KEXEC)
> +static bool use_spinloop(void)
> +{
> +#ifdef CONFIG_PPC_FSL_BOOK3E
> +	return booted_from_kexec == 1;
> +#else
> +	return true;
> +#endif

Ugh, more ifdefs.

What about:

	return IS_ENABLED(CONFIG_PPC_FSL_BOOK3E) && (booted_from_kexec == 1);

If that works, I haven't checked. It's slightly less ugly?

> +}
> +
>  void smp_release_cpus(void)
>  {
>  	unsigned long *ptr;
>  	int i;
>  
> +	if (!use_spinloop())
> +		return;
> +
>  	DBG(" -> smp_release_cpus()\n");
>  
>  	/* All secondary cpus are spinning on a common spinloop, release them
> @@ -524,7 +536,7 @@ void __init setup_system(void)
>  	 * Freescale Book3e parts spin in a loop provided by firmware,
>  	 * so smp_release_cpus() does nothing for them
>  	 */
> -#if defined(CONFIG_SMP) && !defined(CONFIG_PPC_FSL_BOOK3E)
> +#if defined(CONFIG_SMP)

Can you make that just #ifdef CONFIG_SMP.

>  	/* Release secondary cpus out of their spinloops at 0x60 now that
>  	 * we can map physical -> logical CPU ids
>  	 */

cheers
Scott Wood Aug. 18, 2015, 5:09 a.m. UTC | #2
On Tue, 2015-08-18 at 14:51 +1000, Michael Ellerman wrote:
> On Sat, 2015-18-07 at 20:08:51 UTC, Scott Wood wrote:
> > booted_from_exec is similar to __run_at_load, except that it is set for
>               ^
>             missing k.
> 
> Also do you mind using __booted_from_kexec to keep the naming similar to the
> other variables down there, and also make it clear it's low level guts.
> 
> I see you asked for them to be removed on the original patch but all the 
> other
> vars in there are named that way.

I'm not a fan of it as it isn't distinguishing from a non-underscore version, 
isn't there for namespacing reasons, and it's not even a private 
implementation detail -- it's part of the interface with kexec tools.  I'll 
change it if you want, though.

> > diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
> > index 1b77956..ae2d6b5 100644
> > --- a/arch/powerpc/kernel/head_64.S
> > +++ b/arch/powerpc/kernel/head_64.S
> > @@ -91,6 +91,21 @@ __secondary_hold_spinloop:
> >  __secondary_hold_acknowledge:
> >     .llong  0x0
> >  
> > +   /* Do not move this variable as kexec-tools knows about it. */
> > +   . = 0x58
> > +   .globl  booted_from_kexec
> > +booted_from_kexec:
> > +   /*
> > +    * "nkxc" -- not (necessarily) from kexec by default
> > +    *
> > +    * This flag is set to 1 by a loader if the kernel is being
> > +    * booted by kexec.  Older kexec-tools don't know about this
> > +    * flag, so platforms other than fsl-book3e should treat a value
> > +    * of "nkxc" as inconclusive.  fsl-book3e relies on this to
> > +    * know how to release secondary cpus.
> > +    */
> > +   .long   0x6e6b7863
> 
> Couldn't we say that "nkxc" (whatever that stands for)

It stands for "no kexec", which is true if that value is not overwritten.

>  means "unknown", and
> have kexec-tools write "yes" to indicate yes. I realise that's not 100% 
> bullet

That is what I implemented (other than "1" versus "yes").

> > diff --git a/arch/powerpc/kernel/setup_64.c 
> > b/arch/powerpc/kernel/setup_64.c
> > index 505ec2c..baeddcc 100644
> > --- a/arch/powerpc/kernel/setup_64.c
> > +++ b/arch/powerpc/kernel/setup_64.c
> > @@ -340,11 +340,23 @@ void early_setup_secondary(void)
> >  #endif /* CONFIG_SMP */
> >  
> >  #if defined(CONFIG_SMP) || defined(CONFIG_KEXEC)
> > +static bool use_spinloop(void)
> > +{
> > +#ifdef CONFIG_PPC_FSL_BOOK3E
> > +   return booted_from_kexec == 1;
> > +#else
> > +   return true;
> > +#endif
> 
> Ugh, more ifdefs.
> 
> What about:
> 
>       return IS_ENABLED(CONFIG_PPC_FSL_BOOK3E) && (booted_from_kexec == 1);
> 
> If that works, I haven't checked. It's slightly less ugly?

That would return "false" for non-book3e which isn't correct.

If it has to be done with a single expression, then it'd be:

        return !IS_ENABLED(CONFIG_PPC_FSL_BOOK3E) || booted_from_kexec == 1;

-Scott
Michael Ellerman Aug. 20, 2015, 4:54 a.m. UTC | #3
Hi Scott,

Sorry for the delay. So I'm back to square one on this patch.

On Sat, 2015-07-18 at 15:08 -0500, Scott Wood wrote:
> booted_from_exec is similar to __run_at_load, except that it is set for
> regular kexec as well as kdump.
> 
> The flag is needed because the SMP release mechanism for FSL book3e is
> different from when booting with normal hardware.  In theory we could
> simulate the normal spin table mechanism, but not at the addresses
> U-Boot put in the device tree -- so there'd need to be even more
> communication between the kernel and kexec to set that up.  Since
> there's already a similar flag being set (for kdump only), this seemed
> like a reasonable approach.

Although this is a reasonable approach, I don't think it's the best approach.

AFAICS there's no reason why we can't use a device tree property for this, so I
think we should do that.

It avoids using up space in the low memory area, and also any ambiguities about
whether the value has been set or not.

The reason we used a flag like this for __run_at_load is we need to access that
very early, ie. before we've even relocated the kernel, well before we have
(easy) access to the flattened device tree.

In contrast for this, you don't need to know you're booted from kexec until
much later, so using a device tree property is cleaner and just as easy.

If you want to call it "linux,kexec-boot" or similar that's fine. Equally you
could make it more specific, something like "fsl,avoid-spin-table".


Also below ...

> diff --git a/arch/powerpc/platforms/85xx/smp.c b/arch/powerpc/platforms/85xx/smp.c
> index 5152289..4abda43 100644
> --- a/arch/powerpc/platforms/85xx/smp.c
> +++ b/arch/powerpc/platforms/85xx/smp.c
> @@ -305,10 +310,13 @@ static int smp_85xx_kick_cpu(int nr)
>  		__secondary_hold_acknowledge = -1;
>  	}
>  #endif
> -	flush_spin_table(spin_table);
> -	out_be32(&spin_table->pir, hw_cpu);
> -	out_be32(&spin_table->addr_l, __pa(__early_start));
> -	flush_spin_table(spin_table);
> +
> +	if (have_spin_table) {
> +		flush_spin_table(spin_table);
> +		out_be32(&spin_table->pir, hw_cpu);
> +		out_be32(&spin_table->addr_l, __pa(__early_start));
> +		flush_spin_table(spin_table);
> +	}
>  
>  	/* Wait a bit for the CPU to ack. */
>  	if (!spin_event_timeout(__secondary_hold_acknowledge == hw_cpu,

This looks like it's inside an #ifdef CONFIG_PPC32 block, which doesn't make
sense, so I must be missing a lead-up patch or something? (I looked on the list
but didn't find anything immediately)

cheers
Scott Wood Aug. 24, 2015, 8:25 p.m. UTC | #4
On Thu, 2015-08-20 at 14:54 +1000, Michael Ellerman wrote:
> Hi Scott,
> 
> Sorry for the delay. So I'm back to square one on this patch.
> 
> On Sat, 2015-07-18 at 15:08 -0500, Scott Wood wrote:
> > booted_from_exec is similar to __run_at_load, except that it is set for
> > regular kexec as well as kdump.
> > 
> > The flag is needed because the SMP release mechanism for FSL book3e is
> > different from when booting with normal hardware.  In theory we could
> > simulate the normal spin table mechanism, but not at the addresses
> > U-Boot put in the device tree -- so there'd need to be even more
> > communication between the kernel and kexec to set that up.  Since
> > there's already a similar flag being set (for kdump only), this seemed
> > like a reasonable approach.
> 
> Although this is a reasonable approach, I don't think it's the best 
> approach.
> 
> AFAICS there's no reason why we can't use a device tree property for this, 
> so I
> think we should do that.

OK, I'll look into that.

> 
> > diff --git a/arch/powerpc/platforms/85xx/smp.c 
> > b/arch/powerpc/platforms/85xx/smp.c
> > index 5152289..4abda43 100644
> > --- a/arch/powerpc/platforms/85xx/smp.c
> > +++ b/arch/powerpc/platforms/85xx/smp.c
> > @@ -305,10 +310,13 @@ static int smp_85xx_kick_cpu(int nr)
> >             __secondary_hold_acknowledge = -1;
> >     }
> >  #endif
> > -   flush_spin_table(spin_table);
> > -   out_be32(&spin_table->pir, hw_cpu);
> > -   out_be32(&spin_table->addr_l, __pa(__early_start));
> > -   flush_spin_table(spin_table);
> > +
> > +   if (have_spin_table) {
> > +           flush_spin_table(spin_table);
> > +           out_be32(&spin_table->pir, hw_cpu);
> > +           out_be32(&spin_table->addr_l, __pa(__early_start));
> > +           flush_spin_table(spin_table);
> > +   }
> >  
> >     /* Wait a bit for the CPU to ack. */
> >     if (!spin_event_timeout(__secondary_hold_acknowledge == hw_cpu,
> 
> This looks like it's inside an #ifdef CONFIG_PPC32 block, which doesn't make
> sense, so I must be missing a lead-up patch or something? (I looked on the 
> list
> but didn't find anything immediately)

Thanks for catching this.

This is apparently a mismerge due to the code having been previously worked 
on in the context of the SDK tree, which does not have that code inside 
#ifdef CONFIG_PPC32.  When I then applied the result to mainline, everything 
still appeared to work, because there's no real consequence to writing to the 
spin table in this case -- it's just a no-op.  setup_64.c is the part where 
checking booted_from_kexec (or devicetree equivalent) really matters.

-Scott
Michael Ellerman Aug. 25, 2015, 1:57 a.m. UTC | #5
On Mon, 2015-08-24 at 15:25 -0500, Scott Wood wrote:
> On Thu, 2015-08-20 at 14:54 +1000, Michael Ellerman wrote:
> > Hi Scott,
> > 
> > Sorry for the delay. So I'm back to square one on this patch.
> > 
> > On Sat, 2015-07-18 at 15:08 -0500, Scott Wood wrote:
> > > booted_from_exec is similar to __run_at_load, except that it is set for
> > > regular kexec as well as kdump.
> > > 
> > > The flag is needed because the SMP release mechanism for FSL book3e is
> > > different from when booting with normal hardware.  In theory we could
> > > simulate the normal spin table mechanism, but not at the addresses
> > > U-Boot put in the device tree -- so there'd need to be even more
> > > communication between the kernel and kexec to set that up.  Since
> > > there's already a similar flag being set (for kdump only), this seemed
> > > like a reasonable approach.
> > 
> > Although this is a reasonable approach, I don't think it's the best 
> > approach.
> > 
> > AFAICS there's no reason why we can't use a device tree property for this, 
> > so I think we should do that.
> 
> OK, I'll look into that.

Thanks.

> > > diff --git a/arch/powerpc/platforms/85xx/smp.c 
> > > b/arch/powerpc/platforms/85xx/smp.c
> > > index 5152289..4abda43 100644
> > > --- a/arch/powerpc/platforms/85xx/smp.c
> > > +++ b/arch/powerpc/platforms/85xx/smp.c
> > > @@ -305,10 +310,13 @@ static int smp_85xx_kick_cpu(int nr)
> > >             __secondary_hold_acknowledge = -1;
> > >     }
> > >  #endif
> > > -   flush_spin_table(spin_table);
> > > -   out_be32(&spin_table->pir, hw_cpu);
> > > -   out_be32(&spin_table->addr_l, __pa(__early_start));
> > > -   flush_spin_table(spin_table);
> > > +
> > > +   if (have_spin_table) {
> > > +           flush_spin_table(spin_table);
> > > +           out_be32(&spin_table->pir, hw_cpu);
> > > +           out_be32(&spin_table->addr_l, __pa(__early_start));
> > > +           flush_spin_table(spin_table);
> > > +   }
> > >  
> > >     /* Wait a bit for the CPU to ack. */
> > >     if (!spin_event_timeout(__secondary_hold_acknowledge == hw_cpu,
> > 
> > This looks like it's inside an #ifdef CONFIG_PPC32 block, which doesn't make
> > sense, so I must be missing a lead-up patch or something? (I looked on the 
> > list
> > but didn't find anything immediately)
> 
> Thanks for catching this.
> 
> This is apparently a mismerge due to the code having been previously worked 
> on in the context of the SDK tree, which does not have that code inside 
> #ifdef CONFIG_PPC32.  When I then applied the result to mainline, everything 
> still appeared to work, because there's no real consequence to writing to the 
> spin table in this case -- it's just a no-op.  

Aha, that's good, I stared at it for ages thinking I was going mad, but I wasn't!

> setup_64.c is the part where checking booted_from_kexec (or devicetree
> equivalent) really matters.

OK. Can we avoid that too?

All smp_release_cpus() does is whack __secondary_hold_spinloop and then spin
for a while. For the non-kexec case writing to __secondary_hold_spinloop should
be harmless I think, so the only problem is we'll get stuck for a while in the
udelay() loop.

But you could avoid that by preemptively setting spinning_secondaries to 0 in
platform code.

That'd have to be in ppc_md.init_early(), but that's actually not very early,
the device tree is already unflattened.

I guess it's arguable whether that's more or less horrible than adding an
#ifdef'ed booted_from_kexec check, but I think I'd prefer the
spinning_secondaries solution.

cheers
Scott Wood Aug. 25, 2015, 11:40 p.m. UTC | #6
On Tue, 2015-08-25 at 11:57 +1000, Michael Ellerman wrote:
> On Mon, 2015-08-24 at 15:25 -0500, Scott Wood wrote:
> > On Thu, 2015-08-20 at 14:54 +1000, Michael Ellerman wrote:
> > > Hi Scott,
> > > 
> > > Sorry for the delay. So I'm back to square one on this patch.
> > > 
> > > On Sat, 2015-07-18 at 15:08 -0500, Scott Wood wrote:
> > > > booted_from_exec is similar to __run_at_load, except that it is set 
> > > > for
> > > > regular kexec as well as kdump.
> > > > 
> > > > The flag is needed because the SMP release mechanism for FSL book3e is
> > > > different from when booting with normal hardware.  In theory we could
> > > > simulate the normal spin table mechanism, but not at the addresses
> > > > U-Boot put in the device tree -- so there'd need to be even more
> > > > communication between the kernel and kexec to set that up.  Since
> > > > there's already a similar flag being set (for kdump only), this seemed
> > > > like a reasonable approach.
> > > 
> > > Although this is a reasonable approach, I don't think it's the best 
> > > approach.
> > > 
> > > AFAICS there's no reason why we can't use a device tree property for 
> > > this, 
> > > so I think we should do that.
> > 
> > OK, I'll look into that.
> 
> Thanks.
> 
> > > > diff --git a/arch/powerpc/platforms/85xx/smp.c 
> > > > b/arch/powerpc/platforms/85xx/smp.c
> > > > index 5152289..4abda43 100644
> > > > --- a/arch/powerpc/platforms/85xx/smp.c
> > > > +++ b/arch/powerpc/platforms/85xx/smp.c
> > > > @@ -305,10 +310,13 @@ static int smp_85xx_kick_cpu(int nr)
> > > >             __secondary_hold_acknowledge = -1;
> > > >     }
> > > >  #endif
> > > > -   flush_spin_table(spin_table);
> > > > -   out_be32(&spin_table->pir, hw_cpu);
> > > > -   out_be32(&spin_table->addr_l, __pa(__early_start));
> > > > -   flush_spin_table(spin_table);
> > > > +
> > > > +   if (have_spin_table) {
> > > > +           flush_spin_table(spin_table);
> > > > +           out_be32(&spin_table->pir, hw_cpu);
> > > > +           out_be32(&spin_table->addr_l, __pa(__early_start));
> > > > +           flush_spin_table(spin_table);
> > > > +   }
> > > >  
> > > >     /* Wait a bit for the CPU to ack. */
> > > >     if (!spin_event_timeout(__secondary_hold_acknowledge == hw_cpu,
> > > 
> > > This looks like it's inside an #ifdef CONFIG_PPC32 block, which doesn't 
> > > make
> > > sense, so I must be missing a lead-up patch or something? (I looked on 
> > > the 
> > > list
> > > but didn't find anything immediately)
> > 
> > Thanks for catching this.
> > 
> > This is apparently a mismerge due to the code having been previously 
> > worked 
> > on in the context of the SDK tree, which does not have that code inside 
> > #ifdef CONFIG_PPC32.  When I then applied the result to mainline, 
> > everything 
> > still appeared to work, because there's no real consequence to writing to 
> > the 
> > spin table in this case -- it's just a no-op.  
> 
> Aha, that's good, I stared at it for ages thinking I was going mad, but I 
> wasn't!
> 
> > setup_64.c is the part where checking booted_from_kexec (or devicetree
> > equivalent) really matters.
> 
> OK. Can we avoid that too?
> 
> All smp_release_cpus() does is whack __secondary_hold_spinloop and then spin
> for a while. For the non-kexec case writing to __secondary_hold_spinloop 
> should
> be harmless I think, so the only problem is we'll get stuck for a while in 
> the
> udelay() loop.
> 
> But you could avoid that by preemptively setting spinning_secondaries to 0 
> in
> platform code.
> 
> That'd have to be in ppc_md.init_early(), but that's actually not very 
> early,
> the device tree is already unflattened.
> 
> I guess it's arguable whether that's more or less horrible than adding an
> #ifdef'ed booted_from_kexec check, but I think I'd prefer the
> spinning_secondaries solution.

We'd still need the device tree property regardless of whether we keep 
use_spinloop() or set spinning_secondaries to zero.

use_spinloop() (with a device tree property rather than booted_from_kexec) 
seems cleaner:
 - Avoids depending on the fact that some piece of platform code executes 
after spinning_secondaries is initialized but before smp_release_cpus().
 - Doesn't put a different requirement on platform code based on 32 versus 64 
bit (we have too many 32 versus 64 bit differences as is).
 - Doesn't require the change in all relevant platform code files (we have 
both corenet_generic and qemu_e500, both of which support both 32 and 64 bit, 
and custom boards might not all use corenet_generic), whether the platform 
supports kexec or not.  I doesn't look like there's any non-Freescale book3e-
64 left in the kernel[1], but if it ever gets added, it would also be 
affected by a solution that requires platform code to do something to 
preserve the current behavior.

-Scott

[1] If this is true, and won't likely change, can the non-fsl book3e-64 TLB 
miss handlers and such come out?
Michael Ellerman Aug. 26, 2015, 1:13 a.m. UTC | #7
On Tue, 2015-08-25 at 18:40 -0500, Scott Wood wrote:
> On Tue, 2015-08-25 at 11:57 +1000, Michael Ellerman wrote:
> > I guess it's arguable whether that's more or less horrible than adding an
> > #ifdef'ed booted_from_kexec check, but I think I'd prefer the
> > spinning_secondaries solution.
> 
> We'd still need the device tree property regardless of whether we keep 
> use_spinloop() or set spinning_secondaries to zero.

Yep.

> use_spinloop() (with a device tree property rather than booted_from_kexec) 
> seems cleaner:
>  - Avoids depending on the fact that some piece of platform code executes 
> after spinning_secondaries is initialized but before smp_release_cpus().

True, that is a bit fragile.

>  - Doesn't put a different requirement on platform code based on 32 versus 64 
> bit (we have too many 32 versus 64 bit differences as is).

Yeah I didn't think of that.

>  - Doesn't require the change in all relevant platform code files (we have 
> both corenet_generic and qemu_e500, both of which support both 32 and 64 bit, 
> and custom boards might not all use corenet_generic), whether the platform 
> supports kexec or not.

Yep, though they could all call a common implementation of init_early().

So I guess do it with use_spinloop(). I was just hoping to avoid more platform
specific ifdefs in the "generic" code.

> I doesn't look like there's any non-Freescale book3e-64 left in the kernel[1]
...
> [1] If this is true, and won't likely change, can the non-fsl book3e-64 TLB 
> miss handlers and such come out?

It is true, see fb5a515704d7 ("powerpc: Remove platforms/wsp and associated
pieces").

It will not change as far as I'm aware, and all the code's in the git history
anyway, so if there's unused code in there please rip it out.

cheers
diff mbox

Patch

diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index 825663c..f9245df 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -197,6 +197,7 @@  extern void generic_secondary_thread_init(void);
 extern unsigned long __secondary_hold_spinloop;
 extern unsigned long __secondary_hold_acknowledge;
 extern char __secondary_hold;
+extern u32 booted_from_kexec;
 
 extern void __early_start(void);
 #endif /* __ASSEMBLY__ */
diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index 1b77956..ae2d6b5 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -91,6 +91,21 @@  __secondary_hold_spinloop:
 __secondary_hold_acknowledge:
 	.llong	0x0
 
+	/* Do not move this variable as kexec-tools knows about it. */
+	. = 0x58
+	.globl	booted_from_kexec
+booted_from_kexec:
+	/*
+	 * "nkxc" -- not (necessarily) from kexec by default
+	 *
+	 * This flag is set to 1 by a loader if the kernel is being
+	 * booted by kexec.  Older kexec-tools don't know about this
+	 * flag, so platforms other than fsl-book3e should treat a value
+	 * of "nkxc" as inconclusive.  fsl-book3e relies on this to
+	 * know how to release secondary cpus.
+	 */
+	.long	0x6e6b7863
+
 #ifdef CONFIG_RELOCATABLE
 	/* This flag is set to 1 by a loader if the kernel should run
 	 * at the loaded address instead of the linked address.  This
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 505ec2c..baeddcc 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -340,11 +340,23 @@  void early_setup_secondary(void)
 #endif /* CONFIG_SMP */
 
 #if defined(CONFIG_SMP) || defined(CONFIG_KEXEC)
+static bool use_spinloop(void)
+{
+#ifdef CONFIG_PPC_FSL_BOOK3E
+	return booted_from_kexec == 1;
+#else
+	return true;
+#endif
+}
+
 void smp_release_cpus(void)
 {
 	unsigned long *ptr;
 	int i;
 
+	if (!use_spinloop())
+		return;
+
 	DBG(" -> smp_release_cpus()\n");
 
 	/* All secondary cpus are spinning on a common spinloop, release them
@@ -524,7 +536,7 @@  void __init setup_system(void)
 	 * Freescale Book3e parts spin in a loop provided by firmware,
 	 * so smp_release_cpus() does nothing for them
 	 */
-#if defined(CONFIG_SMP) && !defined(CONFIG_PPC_FSL_BOOK3E)
+#if defined(CONFIG_SMP)
 	/* Release secondary cpus out of their spinloops at 0x60 now that
 	 * we can map physical -> logical CPU ids
 	 */
diff --git a/arch/powerpc/platforms/85xx/smp.c b/arch/powerpc/platforms/85xx/smp.c
index 5152289..4abda43 100644
--- a/arch/powerpc/platforms/85xx/smp.c
+++ b/arch/powerpc/platforms/85xx/smp.c
@@ -20,6 +20,7 @@ 
 #include <linux/highmem.h>
 #include <linux/cpu.h>
 
+#include <asm/kexec.h>
 #include <asm/machdep.h>
 #include <asm/pgtable.h>
 #include <asm/page.h>
@@ -203,6 +204,7 @@  static int smp_85xx_kick_cpu(int nr)
 	int hw_cpu = get_hard_smp_processor_id(nr);
 	int ioremappable;
 	int ret = 0;
+	bool have_spin_table = true;
 
 	WARN_ON(nr < 0 || nr >= NR_CPUS);
 	WARN_ON(hw_cpu < 0 || hw_cpu >= NR_CPUS);
@@ -210,6 +212,9 @@  static int smp_85xx_kick_cpu(int nr)
 	pr_debug("smp_85xx_kick_cpu: kick CPU #%d\n", nr);
 
 #ifdef CONFIG_PPC64
+	if (booted_from_kexec == 1 && system_state != SYSTEM_RUNNING)
+		have_spin_table = false;
+
 	/* Threads don't use the spin table */
 	if (cpu_thread_in_core(nr) != 0) {
 		int primary = cpu_first_thread_sibling(nr);
@@ -305,10 +310,13 @@  static int smp_85xx_kick_cpu(int nr)
 		__secondary_hold_acknowledge = -1;
 	}
 #endif
-	flush_spin_table(spin_table);
-	out_be32(&spin_table->pir, hw_cpu);
-	out_be32(&spin_table->addr_l, __pa(__early_start));
-	flush_spin_table(spin_table);
+
+	if (have_spin_table) {
+		flush_spin_table(spin_table);
+		out_be32(&spin_table->pir, hw_cpu);
+		out_be32(&spin_table->addr_l, __pa(__early_start));
+		flush_spin_table(spin_table);
+	}
 
 	/* Wait a bit for the CPU to ack. */
 	if (!spin_event_timeout(__secondary_hold_acknowledge == hw_cpu,