
[4/5] powerpc/smp: add cpu_cache_mask

Message ID 20170302004920.21948-4-oohall@gmail.com (mailing list archive)
State Changes Requested
Headers show

Commit Message

Oliver O'Halloran March 2, 2017, 12:49 a.m. UTC
Traditionally we have only ever tracked which CPUs are in the same core
(cpu_sibling_mask) and on the same die (cpu_core_mask). For Power9 we
need to be aware of which CPUs share cache with each other so this patch
adds cpu_cache_mask and the underlying cpu_cache_map variable to track
this.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/include/asm/smp.h | 6 ++++++
 arch/powerpc/kernel/smp.c      | 5 +++++
 2 files changed, 11 insertions(+)
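As context for the commit message above, here is a toy userspace sketch of the per-CPU mask idea: each CPU carries a bitmask naming every CPU it shares a cache with. A plain `uint64_t` stands in for `cpumask_var_t`, and the helper names (`cache_mask_of`, `link_cache_siblings`) are illustrative, not kernel API.

```c
#include <stdint.h>

#define NR_CPUS 8

/* One mask per CPU, analogous to DEFINE_PER_CPU(cpumask_var_t, cpu_cache_map). */
static uint64_t cpu_cache_map[NR_CPUS];

/* Analogue of the patch's cpu_cache_mask() accessor. */
static uint64_t *cache_mask_of(int cpu)
{
	return &cpu_cache_map[cpu];
}

/* Mark two CPUs as sharing a cache; each mask includes the CPU's own
 * bit and the relation is kept symmetric, as the kernel masks are. */
static void link_cache_siblings(int a, int b)
{
	cpu_cache_map[a] |= (1ULL << a) | (1ULL << b);
	cpu_cache_map[b] |= (1ULL << a) | (1ULL << b);
}
```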

Comments

Michael Ellerman March 15, 2017, 11:26 a.m. UTC | #1
Oliver O'Halloran <oohall@gmail.com> writes:

> Traditionally we have only ever tracked which CPUs are in the same core
> (cpu_sibling_mask) and on the same die (cpu_core_mask). For Power9 we
> need to be aware of which CPUs share cache with each other so this patch
> adds cpu_cache_mask and the underlying cpu_cache_map variable to track
> this.

But which cache?

Some CPUs on Power8 share L3, or L4.

I think just call it cpu_l2cache_map to make it explicit.

cheers
Oliver O'Halloran March 23, 2017, 3:33 a.m. UTC | #2
On Wed, Mar 15, 2017 at 10:26 PM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> Oliver O'Halloran <oohall@gmail.com> writes:
>
>> Traditionally we have only ever tracked which CPUs are in the same core
>> (cpu_sibling_mask) and on the same die (cpu_core_mask). For Power9 we
>> need to be aware of which CPUs share cache with each other so this patch
>> adds cpu_cache_mask and the underlying cpu_cache_map variable to track
>> this.
>
> But which cache?

I'm not sure it matters. All the scheduler really wants to know is
that migrating between CPUs with a shared cache is cheaper than
migrating elsewhere.
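That scheduler-facing question boils down to a single predicate: do two CPUs share a cache? A minimal userspace sketch of that predicate over a made-up eight-CPU topology (the real kernel equivalent in spirit is cpus_share_cache(); the array values here are invented for illustration):

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy topology: cpu_cache_map[c] is a bitmask of the CPUs that share
 * a cache with c, with c's own bit included. */
static const uint64_t cpu_cache_map[8] = {
	0x03, 0x03,             /* CPUs 0-1 share a cache    */
	0x0c, 0x0c,             /* CPUs 2-3 share a cache    */
	0x10, 0x20, 0x40, 0x80, /* CPUs 4-7 each stand alone */
};

/* The question the scheduler asks: is migrating a task between
 * `src` and `dst` cache-hot? */
static bool share_cache(int src, int dst)
{
	return (cpu_cache_map[src] >> dst) & 1;
}
```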

> Some CPUs on Power8 share L3, or L4.

Eh... it's not really the same. The "L4" is part of the memory buffers
and its function is conceptually different to the processor caches.
The L3 on P8 is only shared when the core that owns it is offline (or
sleeping), so the scheduler doesn't really need to be aware of it. Even
if the scheduler was aware I don't think it could take advantage of it
without some terrible hacks.

>
> I think just call it cpu_l2cache_map to make it explicit.

I was being deliberately vague. I know it's only a shared L2 currently,
but it's possible we might have a (real) shared L3 in the future. The
latest high-end x86 chips have some L3 sharing across the entire
chip, so you never know. I'm not particularly attached to the name
though, so I'll rename it if you really want.

Oliver
Michael Ellerman March 28, 2017, 1:05 a.m. UTC | #3
Oliver O'Halloran <oohall@gmail.com> writes:

> On Wed, Mar 15, 2017 at 10:26 PM, Michael Ellerman <mpe@ellerman.id.au> wrote:
>> Oliver O'Halloran <oohall@gmail.com> writes:
>>
>>> Traditionally we have only ever tracked which CPUs are in the same core
>>> (cpu_sibling_mask) and on the same die (cpu_core_mask). For Power9 we
>>> need to be aware of which CPUs share cache with each other so this patch
>>> adds cpu_cache_mask and the underlying cpu_cache_map variable to track
>>> this.
>
>> Some CPUs on Power8 share L3, or L4.
>
> Eh... it's not really the same. The "L4" is part of the memory buffers
> and its function is conceptually different to the processor caches.
> The L3 on P8 is only shared when the core that owns it is offline (or
> sleeping), so the scheduler doesn't really need to be aware of it.

But that's exactly my point, this mask only tracks whether CPUs share an
L2, so it should be named as such.

>> I think just call it cpu_l2cache_map to make it explicit.
>
> I was being deliberately vague. I know it's only a shared L2 currently,
> but it's possible we might have a (real) shared L3 in the future. The
> latest high-end x86 chips have some L3 sharing across the entire
> chip, so you never know.

Sure, but in that case we'd probably want a new mask to track which CPUs
share L3, because we'd probably also have CPUs sharing L2, and those
masks might not be equal.

But if I'm wrong we can just rename it *then*.
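Michael's point can be made concrete: on a hypothetical chip with per-pair L2s and one chip-wide L3, the two masks differ, so a single mask cannot describe both levels. A toy four-CPU model (all values invented for illustration):

```c
#include <stdbool.h>
#include <stdint.h>

/* Four-CPU toy chip: CPUs 0-1 and 2-3 each share an L2,
 * while all four CPUs share one chip-wide L3. */
static const uint64_t cpu_l2_map[4] = { 0x3, 0x3, 0xc, 0xc };
static const uint64_t cpu_l3_map[4] = { 0xf, 0xf, 0xf, 0xf };

/* The masks are not equal, which is the argument for naming the
 * current mask cpu_l2cache_map and leaving room for a separate
 * L3 mask later. */
static bool masks_equal(int cpu)
{
	return cpu_l2_map[cpu] == cpu_l3_map[cpu];
}
```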

> I'm not particularly attached to the name though, so I'll rename it
> if you really want.

Yes please.

cheers

Patch

diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index 32db16d2e7ad..a7fc3a105d61 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -94,6 +94,7 @@  static inline void set_hard_smp_processor_id(int cpu, int phys)
 #endif
 
 DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_map);
+DECLARE_PER_CPU(cpumask_var_t, cpu_cache_map);
 DECLARE_PER_CPU(cpumask_var_t, cpu_core_map);
 
 static inline struct cpumask *cpu_sibling_mask(int cpu)
@@ -106,6 +107,11 @@  static inline struct cpumask *cpu_core_mask(int cpu)
 	return per_cpu(cpu_core_map, cpu);
 }
 
+static inline struct cpumask *cpu_cache_mask(int cpu)
+{
+	return per_cpu(cpu_cache_map, cpu);
+}
+
 extern int cpu_to_core_id(int cpu);
 
 /* Since OpenPIC has only 4 IPIs, we use slightly different message numbers.
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 3922cace927e..5571f30ff72d 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -72,9 +72,11 @@  static DEFINE_PER_CPU(int, cpu_state) = { 0 };
 struct thread_info *secondary_ti;
 
 DEFINE_PER_CPU(cpumask_var_t, cpu_sibling_map);
+DEFINE_PER_CPU(cpumask_var_t, cpu_cache_map);
 DEFINE_PER_CPU(cpumask_var_t, cpu_core_map);
 
 EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
+EXPORT_PER_CPU_SYMBOL(cpu_cache_map);
 EXPORT_PER_CPU_SYMBOL(cpu_core_map);
 
 /* SMP operations for this machine */
@@ -415,6 +417,8 @@  void __init smp_prepare_cpus(unsigned int max_cpus)
 	for_each_possible_cpu(cpu) {
 		zalloc_cpumask_var_node(&per_cpu(cpu_sibling_map, cpu),
 					GFP_KERNEL, cpu_to_node(cpu));
+		zalloc_cpumask_var_node(&per_cpu(cpu_cache_map, cpu),
+					GFP_KERNEL, cpu_to_node(cpu));
 		zalloc_cpumask_var_node(&per_cpu(cpu_core_map, cpu),
 					GFP_KERNEL, cpu_to_node(cpu));
 		/*
@@ -428,6 +432,7 @@  void __init smp_prepare_cpus(unsigned int max_cpus)
 	}
 
 	cpumask_set_cpu(boot_cpuid, cpu_sibling_mask(boot_cpuid));
+	cpumask_set_cpu(boot_cpuid, cpu_cache_mask(boot_cpuid));
 	cpumask_set_cpu(boot_cpuid, cpu_core_mask(boot_cpuid));
 
 	if (smp_ops && smp_ops->probe)
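The smp_prepare_cpus() hunk above zero-allocates each mask and then seeds the boot CPU's own bit. A userspace sketch of just that initialization step, with a plain `uint64_t` standing in for `cpumask_var_t` (illustrative only, not kernel code):

```c
#include <stdint.h>

#define NR_CPUS 8

static uint64_t cpu_cache_map[NR_CPUS];

/* Mirrors the smp_prepare_cpus() hunk: every mask starts empty
 * (zalloc_cpumask_var_node), then the boot CPU marks itself in its
 * own mask (cpumask_set_cpu).  Secondary CPUs are added as they
 * come online. */
static void prepare_masks(int boot_cpuid)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		cpu_cache_map[cpu] = 0;
	cpu_cache_map[boot_cpuid] |= 1ULL << boot_cpuid;
}
```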