diff mbox

[2,1/4] powerpc: drop the ability to tweak SMT mode at boot time

Message ID 20141205151341.11028.47570.stgit@bahia.lab.toulouse-stg.fr.ibm.com (mailing list archive)
State Changes Requested
Delegated to: Michael Ellerman
Headers show

Commit Message

Greg Kurz Dec. 5, 2014, 3:14 p.m. UTC
The smt-enabled kernel parameter basically leaves unwanted cpus executing
in firmware or wherever they happen to be. The very same applies to the
ibm,smt-enabled DT property which is no more used by anything known. These
are hacks that shoudn't be used in a production environment.

Quoting mpe, "there are better ways for firmware to disable SMT".

It also has an evil side effect on the split-core feature for powernv. The
code needs all the cpus to participate to the split mode update: it relies
on smp_send_reschedule() to get offline ones to do so. This doesn't work with
cpus that haven't come up... The consequence is a kernel hang on powernv when
trying to limit the number of hw threads at boot time (e.g. smt-enabled to
anything but 8 on POWER8).

This patch simply removes both the smt-enabled kernel parameter and the
ibm,smt-enabled property for all platforms. The new default is to start
all hw threads. That leaves /sys the only supported API to change SMT
settings.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
---

v2: also drop ibm,smt-enabled

 arch/powerpc/kernel/setup_64.c |   46 ----------------------------------------
 1 file changed, 46 deletions(-)

Comments

Scott Wood Dec. 5, 2014, 6:52 p.m. UTC | #1
On Fri, 2014-12-05 at 16:14 +0100, Greg Kurz wrote:
> The smt-enabled kernel parameter basically leaves unwanted cpus executing
> in firmware or wherever they happen to be. The very same applies to the
> ibm,smt-enabled DT property which is no more used by anything known. These
> are hacks that shoudn't be used in a production environment.
> 
> Quoting mpe, "there are better ways for firmware to disable SMT".

Those "better ways" don't apply to Freescale chips, where the OS enables
(or not) SMT without any interaction with firmware.  I don't care about
the ibm,smt-enabled property, but can we please keep the smt-enabled
boot option?

> It also has an evil side effect on the split-core feature for powernv. The
> code needs all the cpus to participate to the split mode update: it relies
> on smp_send_reschedule() to get offline ones to do so. This doesn't work with
> cpus that haven't come up... The consequence is a kernel hang on powernv when
> trying to limit the number of hw threads at boot time (e.g. smt-enabled to
> anything but 8 on POWER8).

In that case could you disable the option only on that hardware?

> This patch simply removes both the smt-enabled kernel parameter and the
> ibm,smt-enabled property for all platforms. The new default is to start
> all hw threads. That leaves /sys the only supported API to change SMT
> settings.

How would you use /sys for this?  Are you talking about CPU hotplug?

-Scott
Greg Kurz Dec. 8, 2014, 8:23 a.m. UTC | #2
On Fri, 5 Dec 2014 12:52:45 -0600
Scott Wood <scottwood@freescale.com> wrote:

> On Fri, 2014-12-05 at 16:14 +0100, Greg Kurz wrote:
> > The smt-enabled kernel parameter basically leaves unwanted cpus executing
> > in firmware or wherever they happen to be. The very same applies to the
> > ibm,smt-enabled DT property which is no more used by anything known. These
> > are hacks that shoudn't be used in a production environment.
> > 
> > Quoting mpe, "there are better ways for firmware to disable SMT".
> 

Hi Scott,

> Those "better ways" don't apply to Freescale chips, where the OS enables
> (or not) SMT without any interaction with firmware.  I don't care about
> the ibm,smt-enabled property, but can we please keep the smt-enabled
> boot option?
> 

Fair enough for the firmware side, what about CPU hot(un)plug then ?

> > It also has an evil side effect on the split-core feature for powernv. The
> > code needs all the cpus to participate to the split mode update: it relies
> > on smp_send_reschedule() to get offline ones to do so. This doesn't work with
> > cpus that haven't come up... The consequence is a kernel hang on powernv when
> > trying to limit the number of hw threads at boot time (e.g. smt-enabled to
> > anything but 8 on POWER8).
> 
> In that case could you disable the option only on that hardware?
> 

The fact it breaks only powernv doesn't mean it is a powernv only issue.
The smt-enabled feature is a hack because it leaves some cpus in a undefined
state from a kernel POV. Moreover it drags about 80 lines of code and sits
entirely in common ppc64 code. I would reverse the question then ? Why not
moving smt-enabled code to freescale only ?

> > This patch simply removes both the smt-enabled kernel parameter and the
> > ibm,smt-enabled property for all platforms. The new default is to start
> > all hw threads. That leaves /sys the only supported API to change SMT
> > settings.
> 
> How would you use /sys for this?  Are you talking about CPU hotplug?
> 

Yes. This is the safer way to offline cpus.

> -Scott
> 

Cheers.

--
Greg
Scott Wood Dec. 8, 2014, 10:39 p.m. UTC | #3
On Mon, 2014-12-08 at 09:23 +0100, Greg Kurz wrote:
> On Fri, 5 Dec 2014 12:52:45 -0600
> Scott Wood <scottwood@freescale.com> wrote:
> 
> > On Fri, 2014-12-05 at 16:14 +0100, Greg Kurz wrote:
> > > The smt-enabled kernel parameter basically leaves unwanted cpus executing
> > > in firmware or wherever they happen to be. The very same applies to the
> > > ibm,smt-enabled DT property which is no more used by anything known. These
> > > are hacks that shoudn't be used in a production environment.
> > > 
> > > Quoting mpe, "there are better ways for firmware to disable SMT".
> > 
> 
> Hi Scott,
> 
> > Those "better ways" don't apply to Freescale chips, where the OS enables
> > (or not) SMT without any interaction with firmware.  I don't care about
> > the ibm,smt-enabled property, but can we please keep the smt-enabled
> > boot option?
> > 
> 
> Fair enough for the firmware side, what about CPU hot(un)plug then ?

Not yet supported in mainline for e6500 (or maybe it works with
generic_mach_cpu_die which would not be helpful).

Plus, it's more complicated (both to use and how it works internally)
and doesn't avoid having the secondary thread ever run.  Sometimes it's
useful to ensure that the second thread has never run when debugging a
problem.

> > > It also has an evil side effect on the split-core feature for powernv. The
> > > code needs all the cpus to participate to the split mode update: it relies
> > > on smp_send_reschedule() to get offline ones to do so. This doesn't work with
> > > cpus that haven't come up... The consequence is a kernel hang on powernv when
> > > trying to limit the number of hw threads at boot time (e.g. smt-enabled to
> > > anything but 8 on POWER8).
> > 
> > In that case could you disable the option only on that hardware?
> > 
> 
> The fact it breaks only powernv doesn't mean it is a powernv only issue.
> The smt-enabled feature is a hack because it leaves some cpus in a undefined
> state from a kernel POV.

I'm aware of an issue where per-cpu threads get created for these CPUs,
which seems like a bug if they were never marked online (it's on my todo
list to investigate further).  Are there other issues?  It seems like
there ought to be some way to do this right.

>  Moreover it drags about 80 lines of code and sits entirely in common
> ppc64 code. I would reverse the question then ? Why not moving
> smt-enabled code to freescale only ?

I'm fine with making it Freescale-only.

-Scott
Michael Ellerman Dec. 9, 2014, 4:11 a.m. UTC | #4
On Fri, 2014-12-05 at 12:52 -0600, Scott Wood wrote:
> On Fri, 2014-12-05 at 16:14 +0100, Greg Kurz wrote:
> > The smt-enabled kernel parameter basically leaves unwanted cpus executing
> > in firmware or wherever they happen to be. The very same applies to the
> > ibm,smt-enabled DT property which is no more used by anything known. These
> > are hacks that shoudn't be used in a production environment.
> > 
> > Quoting mpe, "there are better ways for firmware to disable SMT".
> 
> Those "better ways" don't apply to Freescale chips, where the OS enables
> (or not) SMT without any interaction with firmware.

But how does it know there even are SMT threads? From the device tree? So
just don't present the threads in the device tree?

cheers
Greg Kurz Dec. 9, 2014, 8:53 a.m. UTC | #5
On Tue, 09 Dec 2014 15:11:02 +1100
Michael Ellerman <mpe@ellerman.id.au> wrote:

> On Fri, 2014-12-05 at 12:52 -0600, Scott Wood wrote:
> > On Fri, 2014-12-05 at 16:14 +0100, Greg Kurz wrote:
> > > The smt-enabled kernel parameter basically leaves unwanted cpus executing
> > > in firmware or wherever they happen to be. The very same applies to the
> > > ibm,smt-enabled DT property which is no more used by anything known. These
> > > are hacks that shoudn't be used in a production environment.
> > > 
> > > Quoting mpe, "there are better ways for firmware to disable SMT".
> > 
> > Those "better ways" don't apply to Freescale chips, where the OS enables
> > (or not) SMT without any interaction with firmware.
> 
> But how does it know there even are SMT threads? From the device tree? So
> just don't present the threads in the device tree?
> 
> cheers
> 
> 

Michael,

Maybe we can first kill the cpu_bootable hook in powernv only, for bug fix.
Then we can take time to do the thing right for all platforms. Thoughts ?

--
Greg
Scott Wood Dec. 9, 2014, 9:04 p.m. UTC | #6
On Tue, 2014-12-09 at 15:11 +1100, Michael Ellerman wrote:
> On Fri, 2014-12-05 at 12:52 -0600, Scott Wood wrote:
> > On Fri, 2014-12-05 at 16:14 +0100, Greg Kurz wrote:
> > > The smt-enabled kernel parameter basically leaves unwanted cpus executing
> > > in firmware or wherever they happen to be. The very same applies to the
> > > ibm,smt-enabled DT property which is no more used by anything known. These
> > > are hacks that shoudn't be used in a production environment.
> > > 
> > > Quoting mpe, "there are better ways for firmware to disable SMT".
> > 
> > Those "better ways" don't apply to Freescale chips, where the OS enables
> > (or not) SMT without any interaction with firmware.
> 
> But how does it know there even are SMT threads? From the device tree? So
> just don't present the threads in the device tree?

The device tree is for hardware description, not configuration...

-Scott
Michael Ellerman Dec. 9, 2014, 11:56 p.m. UTC | #7
On Tue, 2014-12-09 at 15:04 -0600, Scott Wood wrote:
> On Tue, 2014-12-09 at 15:11 +1100, Michael Ellerman wrote:
> > On Fri, 2014-12-05 at 12:52 -0600, Scott Wood wrote:
> > > On Fri, 2014-12-05 at 16:14 +0100, Greg Kurz wrote:
> > > > The smt-enabled kernel parameter basically leaves unwanted cpus executing
> > > > in firmware or wherever they happen to be. The very same applies to the
> > > > ibm,smt-enabled DT property which is no more used by anything known. These
> > > > are hacks that shoudn't be used in a production environment.
> > > > 
> > > > Quoting mpe, "there are better ways for firmware to disable SMT".
> > > 
> > > Those "better ways" don't apply to Freescale chips, where the OS enables
> > > (or not) SMT without any interaction with firmware.
> > 
> > But how does it know there even are SMT threads? From the device tree? So
> > just don't present the threads in the device tree?
> 
> The device tree is for hardware description, not configuration...

Oh please, you're quoting device tree scripture to me now?

cheers
Scott Wood Dec. 10, 2014, 12:14 a.m. UTC | #8
On Wed, 2014-12-10 at 10:56 +1100, Michael Ellerman wrote:
> On Tue, 2014-12-09 at 15:04 -0600, Scott Wood wrote:
> > On Tue, 2014-12-09 at 15:11 +1100, Michael Ellerman wrote:
> > > On Fri, 2014-12-05 at 12:52 -0600, Scott Wood wrote:
> > > > On Fri, 2014-12-05 at 16:14 +0100, Greg Kurz wrote:
> > > > > The smt-enabled kernel parameter basically leaves unwanted cpus executing
> > > > > in firmware or wherever they happen to be. The very same applies to the
> > > > > ibm,smt-enabled DT property which is no more used by anything known. These
> > > > > are hacks that shoudn't be used in a production environment.
> > > > > 
> > > > > Quoting mpe, "there are better ways for firmware to disable SMT".
> > > > 
> > > > Those "better ways" don't apply to Freescale chips, where the OS enables
> > > > (or not) SMT without any interaction with firmware.
> > > 
> > > But how does it know there even are SMT threads? From the device tree? So
> > > just don't present the threads in the device tree?
> > 
> > The device tree is for hardware description, not configuration...
> 
> Oh please, you're quoting device tree scripture to me now?

What benefit is there to ignoring "scripture" here?  Going from an easy
to use command line option to needing to mess around with the dts file
is not a usability improvement.  If you want to make it Freescale-only,
fine.  If you want to push me to fix the problems with the
implementation, fine.

-Scott
Michael Ellerman Dec. 10, 2014, 2:14 a.m. UTC | #9
On Tue, 2014-12-09 at 18:14 -0600, Scott Wood wrote:
> On Wed, 2014-12-10 at 10:56 +1100, Michael Ellerman wrote:
> > On Tue, 2014-12-09 at 15:04 -0600, Scott Wood wrote:
> > > On Tue, 2014-12-09 at 15:11 +1100, Michael Ellerman wrote:
> > > > On Fri, 2014-12-05 at 12:52 -0600, Scott Wood wrote:
> > > > > On Fri, 2014-12-05 at 16:14 +0100, Greg Kurz wrote:
> > > > > > The smt-enabled kernel parameter basically leaves unwanted cpus executing
> > > > > > in firmware or wherever they happen to be. The very same applies to the
> > > > > > ibm,smt-enabled DT property which is no more used by anything known. These
> > > > > > are hacks that shoudn't be used in a production environment.
> > > > > > 
> > > > > > Quoting mpe, "there are better ways for firmware to disable SMT".
> > > > > 
> > > > > Those "better ways" don't apply to Freescale chips, where the OS enables
> > > > > (or not) SMT without any interaction with firmware.
> > > > 
> > > > But how does it know there even are SMT threads? From the device tree? So
> > > > just don't present the threads in the device tree?
> > > 
> > > The device tree is for hardware description, not configuration...
> > 
> > Oh please, you're quoting device tree scripture to me now?
> 
> What benefit is there to ignoring "scripture" here?  Going from an easy
> to use command line option to needing to mess around with the dts file
> is not a usability improvement.  If you want to make it Freescale-only,
> fine.  If you want to push me to fix the problems with the
> implementation, fine.

It's easy to use but it doesn't necessarily work.

You said in your other mail to Greg "Sometimes it's useful to ensure that the
second thread has never run when debugging a problem.".

But you don't know that, for all you know your firmware has started the thread
and it's busy looping somewhere. Perhaps you guys know that your firmware
doesn't do that, but it's still a hack.

We end up with cpus in the present map, but we have no idea where they are or
what they are doing.

So as far as I'm concerned it's only useful as a debugging hack, and one that
we don't really use anymore. But if you guys think it's useful then we'll keep
it.

I'll work out with Greg what the cleanest solution is.

It looks like you only need it on e6500? Which is platforms/85xx I think.
Anywhere else?

cheers
Scott Wood Dec. 10, 2014, 11:50 p.m. UTC | #10
On Wed, 2014-12-10 at 13:14 +1100, Michael Ellerman wrote:
> On Tue, 2014-12-09 at 18:14 -0600, Scott Wood wrote:
> > What benefit is there to ignoring "scripture" here?  Going from an easy
> > to use command line option to needing to mess around with the dts file
> > is not a usability improvement.  If you want to make it Freescale-only,
> > fine.  If you want to push me to fix the problems with the
> > implementation, fine.
> 
> It's easy to use but it doesn't necessarily work.
> 
> You said in your other mail to Greg "Sometimes it's useful to ensure that the
> second thread has never run when debugging a problem.".
> 
> But you don't know that, for all you know your firmware has started the thread
> and it's busy looping somewhere. Perhaps you guys know that your firmware
> doesn't do that, but it's still a hack.

I know that our firmware doesn't do that, and I can verify by reading
the relevant register.

> We end up with cpus in the present map, but we have no idea where they are or
> what they are doing.

Can we check smt-enabled a little earlier and refrain from marking the
secondary threads as present if smt is disabled?

> So as far as I'm concerned it's only useful as a debugging hack, and one that
> we don't really use anymore. But if you guys think it's useful then we'll keep
> it.
> 
> I'll work out with Greg what the cleanest solution is.
> 
> It looks like you only need it on e6500? Which is platforms/85xx I think.
> Anywhere else?

Yes, just e6500.

-Scott
Michael Ellerman Dec. 12, 2014, 5:19 a.m. UTC | #11
On Wed, 2014-12-10 at 17:50 -0600, Scott Wood wrote:
> On Wed, 2014-12-10 at 13:14 +1100, Michael Ellerman wrote:
> > On Tue, 2014-12-09 at 18:14 -0600, Scott Wood wrote:
> > > What benefit is there to ignoring "scripture" here?  Going from an easy
> > > to use command line option to needing to mess around with the dts file
> > > is not a usability improvement.  If you want to make it Freescale-only,
> > > fine.  If you want to push me to fix the problems with the
> > > implementation, fine.
> > 
> > It's easy to use but it doesn't necessarily work.
> > 
> > You said in your other mail to Greg "Sometimes it's useful to ensure that the
> > second thread has never run when debugging a problem.".
> > 
> > But you don't know that, for all you know your firmware has started the thread
> > and it's busy looping somewhere. Perhaps you guys know that your firmware
> > doesn't do that, but it's still a hack.
> 
> I know that our firmware doesn't do that, and I can verify by reading
> the relevant register.

Lucky you :)

> > We end up with cpus in the present map, but we have no idea where they are or
> > what they are doing.
> 
> Can we check smt-enabled a little earlier and refrain from marking the
> secondary threads as present if smt is disabled?

We could, and that would make the semantics much saner. But it would actually
be exactly the opposite of what the folks who originally hit this bug want to
happen.

They are using it as a shortcut for cpu hotunplug, ie. they want the threads
asleep in Linux ready to be hotplugged back in.

This is why I wanted to remove it, because those are both valid expectations of
what smt-enabled=off should do, but they are mutually exclusive. Not to mention
that the current code doesn't implement either of those properly.

Anyway for now we should just ignore it on powernv. We can look at doing
something saner in general in future.

cheers
diff mbox

Patch

diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 49f553b..29c1845 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -131,57 +131,11 @@  static void setup_tlb_core_data(void)
 
 #ifdef CONFIG_SMP
 
-static char *smt_enabled_cmdline;
-
-/* Look for ibm,smt-enabled OF option */
 static void check_smt_enabled(void)
 {
-	struct device_node *dn;
-	const char *smt_option;
-
 	/* Default to enabling all threads */
 	smt_enabled_at_boot = threads_per_core;
-
-	/* Allow the command line to overrule the OF option */
-	if (smt_enabled_cmdline) {
-		if (!strcmp(smt_enabled_cmdline, "on"))
-			smt_enabled_at_boot = threads_per_core;
-		else if (!strcmp(smt_enabled_cmdline, "off"))
-			smt_enabled_at_boot = 0;
-		else {
-			int smt;
-			int rc;
-
-			rc = kstrtoint(smt_enabled_cmdline, 10, &smt);
-			if (!rc)
-				smt_enabled_at_boot =
-					min(threads_per_core, smt);
-		}
-	} else {
-		dn = of_find_node_by_path("/options");
-		if (dn) {
-			smt_option = of_get_property(dn, "ibm,smt-enabled",
-						     NULL);
-
-			if (smt_option) {
-				if (!strcmp(smt_option, "on"))
-					smt_enabled_at_boot = threads_per_core;
-				else if (!strcmp(smt_option, "off"))
-					smt_enabled_at_boot = 0;
-			}
-
-			of_node_put(dn);
-		}
-	}
-}
-
-/* Look for smt-enabled= cmdline option */
-static int __init early_smt_enabled(char *p)
-{
-	smt_enabled_cmdline = p;
-	return 0;
 }
-early_param("smt-enabled", early_smt_enabled);
 
 #else
 #define check_smt_enabled()