[2/2] powerpc: Enable ASYM_SMT on interleaved big-core systems

Message ID 1526037444-22876-3-git-send-email-ego@linux.vnet.ibm.com
State Changes Requested
Headers show
Series
  • powerpc: Scheduler optimization for POWER9 bigcores
Related show

Commit Message

Gautham R Shenoy May 11, 2018, 11:17 a.m.
From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

Each of the SMT4 cores forming a fused-core are more or less
independent units. Thus when multiple tasks are scheduled to run on
the fused core, we get the best performance when the tasks are spread
across the pair of SMT4 cores.

Since the threads in the pair of SMT4 cores of an interleaved big-core
are numbered {0,2,4,6} and {1,3,5,7} respectively, enable ASYM_SMT on
such interleaved big-cores that will bias the load-balancing of tasks
on smaller numbered threads, which will automatically result in
spreading the tasks uniformly across the associated pair of SMT4
cores.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/smp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Michael Neuling May 14, 2018, 3:22 a.m. | #1
On Fri, 2018-05-11 at 16:47 +0530, Gautham R. Shenoy wrote:
> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> 
> Each of the SMT4 cores forming a fused-core are more or less
> independent units. Thus when multiple tasks are scheduled to run on
> the fused core, we get the best performance when the tasks are spread
> across the pair of SMT4 cores.
> 
> Since the threads in the pair of SMT4 cores of an interleaved big-core
> are numbered {0,2,4,6} and {1,3,5,7} respectively, enable ASYM_SMT on
> such interleaved big-cores that will bias the load-balancing of tasks
> on smaller numbered threads, which will automatically result in
> spreading the tasks uniformly across the associated pair of SMT4
> cores.
> 
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> ---
>  arch/powerpc/kernel/smp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> index 9ca7148..0153f01 100644
> --- a/arch/powerpc/kernel/smp.c
> +++ b/arch/powerpc/kernel/smp.c
> @@ -1082,7 +1082,7 @@ static int powerpc_smt_flags(void)
>  {
>  	int flags = SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES;
>  
> -	if (cpu_has_feature(CPU_FTR_ASYM_SMT)) {
> +	if (cpu_has_feature(CPU_FTR_ASYM_SMT) || has_interleaved_big_core) {

Shouldn't we just set CPU_FTR_ASYM_SMT and leave this code unchanged?


>  		printk_once(KERN_INFO "Enabling Asymmetric SMT
> scheduling\n");
>  		flags |= SD_ASYM_PACKING;
>  	}
Gautham R Shenoy May 16, 2018, 5:05 a.m. | #2
On Mon, May 14, 2018 at 01:22:07PM +1000, Michael Neuling wrote:
> On Fri, 2018-05-11 at 16:47 +0530, Gautham R. Shenoy wrote:
> > From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> > 
> > Each of the SMT4 cores forming a fused-core are more or less
> > independent units. Thus when multiple tasks are scheduled to run on
> > the fused core, we get the best performance when the tasks are spread
> > across the pair of SMT4 cores.
> > 
> > Since the threads in the pair of SMT4 cores of an interleaved big-core
> > are numbered {0,2,4,6} and {1,3,5,7} respectively, enable ASYM_SMT on
> > such interleaved big-cores that will bias the load-balancing of tasks
> > on smaller numbered threads, which will automatically result in
> > spreading the tasks uniformly across the associated pair of SMT4
> > cores.
> > 
> > Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> > ---
> >  arch/powerpc/kernel/smp.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> > index 9ca7148..0153f01 100644
> > --- a/arch/powerpc/kernel/smp.c
> > +++ b/arch/powerpc/kernel/smp.c
> > @@ -1082,7 +1082,7 @@ static int powerpc_smt_flags(void)
> >  {
> >  	int flags = SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES;
> >  
> > -	if (cpu_has_feature(CPU_FTR_ASYM_SMT)) {
> > +	if (cpu_has_feature(CPU_FTR_ASYM_SMT) || has_interleaved_big_core) {
> 
> Shouldn't we just set CPU_FTR_ASYM_SMT and leave this code
unchanged?

Yes, that would have the same effect. I refrained from doing that
since I thought CPU_FTR_ASYM_SMT has the "lower numbered threads
expedite thread-folding" connotation from the POWER7 generation.

If it is ok to overload CPU_FTR_ASYM_SMT, we can do what you suggest
and have all the changes in setup-common.c

--
Thanks and Regards
gautham.
Michael Ellerman May 18, 2018, 1:05 p.m. | #3
Gautham R Shenoy <ego@linux.vnet.ibm.com> writes:

> On Mon, May 14, 2018 at 01:22:07PM +1000, Michael Neuling wrote:
>> On Fri, 2018-05-11 at 16:47 +0530, Gautham R. Shenoy wrote:
>> > From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
>> > 
>> > Each of the SMT4 cores forming a fused-core are more or less
>> > independent units. Thus when multiple tasks are scheduled to run on
>> > the fused core, we get the best performance when the tasks are spread
>> > across the pair of SMT4 cores.
>> > 
>> > Since the threads in the pair of SMT4 cores of an interleaved big-core
>> > are numbered {0,2,4,6} and {1,3,5,7} respectively, enable ASYM_SMT on
>> > such interleaved big-cores that will bias the load-balancing of tasks
>> > on smaller numbered threads, which will automatically result in
>> > spreading the tasks uniformly across the associated pair of SMT4
>> > cores.
>> > 
>> > Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
>> > ---
>> >  arch/powerpc/kernel/smp.c | 2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> > 
>> > diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
>> > index 9ca7148..0153f01 100644
>> > --- a/arch/powerpc/kernel/smp.c
>> > +++ b/arch/powerpc/kernel/smp.c
>> > @@ -1082,7 +1082,7 @@ static int powerpc_smt_flags(void)
>> >  {
>> >  	int flags = SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES;
>> >  
>> > -	if (cpu_has_feature(CPU_FTR_ASYM_SMT)) {
>> > +	if (cpu_has_feature(CPU_FTR_ASYM_SMT) || has_interleaved_big_core) {
>> 
>> Shouldn't we just set CPU_FTR_ASYM_SMT and leave this code
> unchanged?
>
> Yes, that would have the same effect. I refrained from doing that
> since I thought CPU_FTR_ASYM_SMT has the "lower numbered threads
> expedite thread-folding" connotation from the POWER7 generation.

The above code is the only use of the feature, so I don't think we need
to worry about any other connotations.

> If it is ok to overload CPU_FTR_ASYM_SMT, we can do what you suggest
> and have all the changes in setup-common.c

Yeah let's do that.

cheers

Patch

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 9ca7148..0153f01 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1082,7 +1082,7 @@  static int powerpc_smt_flags(void)
 {
 	int flags = SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES;
 
-	if (cpu_has_feature(CPU_FTR_ASYM_SMT)) {
+	if (cpu_has_feature(CPU_FTR_ASYM_SMT) || has_interleaved_big_core) {
 		printk_once(KERN_INFO "Enabling Asymmetric SMT scheduling\n");
 		flags |= SD_ASYM_PACKING;
 	}