diff mbox

[RFC,v2,1/2] powerpc: numa: enable USE_PERCPU_NUMA_NODE_ID

Message ID 20140519181423.GL8941@linux.vnet.ibm.com
State Accepted
Commit 8c272261194dfda11cc046fbe808e052f6f284eb
Headers show

Commit Message

Nishanth Aravamudan May 19, 2014, 6:14 p.m. UTC
Hi Andrew,

I found one issue with my patch, fixed below...

On 16.05.2014 [16:39:45 -0700], Nishanth Aravamudan wrote:
> Based off 3bccd996 for ia64, convert powerpc to use the generic per-CPU
> topology tracking, specifically:
>     
> 	initialize per cpu numa_node entry in start_secondary
>     	remove the powerpc cpu_to_node()
>     	define CONFIG_USE_PERCPU_NUMA_NODE_ID if NUMA
>     
> Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>

<snip>

> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> index e2a4232..b95be24 100644
> --- a/arch/powerpc/kernel/smp.c
> +++ b/arch/powerpc/kernel/smp.c
> @@ -750,6 +750,11 @@ void start_secondary(void *unused)
>  	}
>  	traverse_core_siblings(cpu, true);
>  
> +	/*
> +	 * numa_node_id() works after this.
> +	 */
> +	set_numa_node(numa_cpu_lookup_table[cpu]);
> +

Similar change is needed for the boot CPU. Update patch:


powerpc: numa: enable USE_PERCPU_NUMA_NODE_ID
    
Based off 3bccd996 for ia64, convert powerpc to use the generic per-CPU
topology tracking, specifically:
    
    initialize per cpu numa_node entry in start_secondary
    remove the powerpc cpu_to_node()
    define CONFIG_USE_PERCPU_NUMA_NODE_ID if NUMA
    
Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>

Comments

Nishanth Aravamudan May 27, 2014, 11:44 p.m. UTC | #1
On 19.05.2014 [11:14:23 -0700], Nishanth Aravamudan wrote:
> Hi Andrew,
> 
> I found one issue with my patch, fixed below...
> 
> On 16.05.2014 [16:39:45 -0700], Nishanth Aravamudan wrote:
> > Based off 3bccd996 for ia64, convert powerpc to use the generic per-CPU
> > topology tracking, specifically:
> >     
> > 	initialize per cpu numa_node entry in start_secondary
> >     	remove the powerpc cpu_to_node()
> >     	define CONFIG_USE_PERCPU_NUMA_NODE_ID if NUMA
> >     
> > Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
> 
> <snip>
> 
> > diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> > index e2a4232..b95be24 100644
> > --- a/arch/powerpc/kernel/smp.c
> > +++ b/arch/powerpc/kernel/smp.c
> > @@ -750,6 +750,11 @@ void start_secondary(void *unused)
> >  	}
> >  	traverse_core_siblings(cpu, true);
> >  
> > +	/*
> > +	 * numa_node_id() works after this.
> > +	 */
> > +	set_numa_node(numa_cpu_lookup_table[cpu]);
> > +
> 
> Similar change is needed for the boot CPU. Update patch:
> 
> 
> powerpc: numa: enable USE_PERCPU_NUMA_NODE_ID
>     
> Based off 3bccd996 for ia64, convert powerpc to use the generic per-CPU
> topology tracking, specifically:
>     
>     initialize per cpu numa_node entry in start_secondary
>     remove the powerpc cpu_to_node()
>     define CONFIG_USE_PERCPU_NUMA_NODE_ID if NUMA
>     
> Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>

Ping on this and patch 2/2. Ben, would you be willing to pull these into
your -next branch so they'd get some testing?

http://patchwork.ozlabs.org/patch/350368/
http://patchwork.ozlabs.org/patch/349838/

Without any further changes, these two help quite a bit with the slab
consumption on CONFIG_SLUB kernels when memoryless nodes are present.

Thanks,
Nish
Benjamin Herrenschmidt May 27, 2014, 11:56 p.m. UTC | #2
On Tue, 2014-05-27 at 16:44 -0700, Nishanth Aravamudan wrote:
> > Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
> 
> Ping on this and patch 2/2. Ben, would you be willing to pull these
> into
> your -next branch so they'd get some testing?
> 
> http://patchwork.ozlabs.org/patch/350368/
> http://patchwork.ozlabs.org/patch/349838/
> 
> Without any further changes, these two help quite a bit with the slab
> consumption on CONFIG_SLUB kernels when memoryless nodes are present.

I don't mind at all :-) I haven't really been following that story
so I was waiting for the dust to settle and maybe acks from MM people
but if you tell me they are good I'm prepared to trust you.

Cheers,
Ben.
Nishanth Aravamudan May 28, 2014, 12:09 a.m. UTC | #3
On 28.05.2014 [09:56:14 +1000], Benjamin Herrenschmidt wrote:
> On Tue, 2014-05-27 at 16:44 -0700, Nishanth Aravamudan wrote:
> > > Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
> > 
> > Ping on this and patch 2/2. Ben, would you be willing to pull these
> > into
> > your -next branch so they'd get some testing?
> > 
> > http://patchwork.ozlabs.org/patch/350368/
> > http://patchwork.ozlabs.org/patch/349838/
> > 
> > Without any further changes, these two help quite a bit with the slab
> > consumption on CONFIG_SLUB kernels when memoryless nodes are present.
> 
> I don't mind at all :-) I haven't really been following that story
> so I was waiting for the dust to settle and maybe acks from MM people
> but if you tell me they are good I'm prepared to trust you.

The patches themselves are pretty minimal and similar to the ia64
changes (and the affected code seems like it hasn't changed in some
time, beyond in the common code). I'd mostly like to get some
broad-range build & boot testing.

Also, is NUMA a sufficient symbol to depend, you think? I figure most of
the PPC NUMA systems are the pSeries/IBM variety, which is where I've
run into memoryless nodes in the first place.

Thanks,
Nish
Andrew Morton May 28, 2014, 11:36 p.m. UTC | #4
On Tue, 27 May 2014 17:09:58 -0700 Nishanth Aravamudan <nacc@linux.vnet.ibm.com> wrote:

> On 28.05.2014 [09:56:14 +1000], Benjamin Herrenschmidt wrote:
> > On Tue, 2014-05-27 at 16:44 -0700, Nishanth Aravamudan wrote:
> > > > Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
> > > 
> > > Ping on this and patch 2/2. Ben, would you be willing to pull these
> > > into
> > > your -next branch so they'd get some testing?
> > > 
> > > http://patchwork.ozlabs.org/patch/350368/
> > > http://patchwork.ozlabs.org/patch/349838/
> > > 
> > > Without any further changes, these two help quite a bit with the slab
> > > consumption on CONFIG_SLUB kernels when memoryless nodes are present.
> > 
> > I don't mind at all :-) I haven't really been following that story
> > so I was waiting for the dust to settle and maybe acks from MM people
> > but if you tell me they are good I'm prepared to trust you.
> 
> The patches themselves are pretty minimal and similar to the ia64
> changes (and the affected code seems like it hasn't changed in some
> time, beyond in the common code). I'd mostly like to get some
> broad-range build & boot testing.
> 
> Also, is NUMA a sufficient symbol to depend, you think? I figure most of
> the PPC NUMA systems are the pSeries/IBM variety, which is where I've
> run into memoryless nodes in the first place.

It's a powerpc-only patchset so I didn't do anything.

Nits:

- When referring to git commits, use 12 digits of hash and include
  the name of the patch as well (because patches can have different
  hashes in different trees).  So

       Based on 3bccd996276b ("numa: ia64: use generic percpu
       var numa_node_id() implementation") for ia64.

- It's nice to include a diffstat.  If there's a [0/n] patch then
  that's a great place for it, so people can see the overall impact at
  a glance.
Nishanth Aravamudan May 28, 2014, 11:43 p.m. UTC | #5
On 28.05.2014 [16:36:07 -0700], Andrew Morton wrote:
> On Tue, 27 May 2014 17:09:58 -0700 Nishanth Aravamudan <nacc@linux.vnet.ibm.com> wrote:
> 
> > On 28.05.2014 [09:56:14 +1000], Benjamin Herrenschmidt wrote:
> > > On Tue, 2014-05-27 at 16:44 -0700, Nishanth Aravamudan wrote:
> > > > > Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
> > > > 
> > > > Ping on this and patch 2/2. Ben, would you be willing to pull these
> > > > into
> > > > your -next branch so they'd get some testing?
> > > > 
> > > > http://patchwork.ozlabs.org/patch/350368/
> > > > http://patchwork.ozlabs.org/patch/349838/
> > > > 
> > > > Without any further changes, these two help quite a bit with the slab
> > > > consumption on CONFIG_SLUB kernels when memoryless nodes are present.
> > > 
> > > I don't mind at all :-) I haven't really been following that story
> > > so I was waiting for the dust to settle and maybe acks from MM people
> > > but if you tell me they are good I'm prepared to trust you.
> > 
> > The patches themselves are pretty minimal and similar to the ia64
> > changes (and the affected code seems like it hasn't changed in some
> > time, beyond in the common code). I'd mostly like to get some
> > broad-range build & boot testing.
> > 
> > Also, is NUMA a sufficient symbol to depend, you think? I figure most of
> > the PPC NUMA systems are the pSeries/IBM variety, which is where I've
> > run into memoryless nodes in the first place.
> 
> It's a powerpc-only patchset so I didn't do anything.
> 
> Nits:
> 
> - When referring to git commits, use 12 digits of hash and include
>   the name of the patch as well (because patches can have different
>   hashes in different trees).  So
> 
>        Based on 3bccd996276b ("numa: ia64: use generic percpu
>        var numa_node_id() implementation") for ia64.
> 
> - It's nice to include a diffstat.  If there's a [0/n] patch then
>   that's a great place for it, so people can see the overall impact at
>   a glance.

Thanks for the notes, I'll include those in any updated patches.

-Nish
diff mbox

Patch

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index e099899..9125964 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -453,6 +453,10 @@  config NODES_SHIFT
 	default "4"
 	depends on NEED_MULTIPLE_NODES
 
+config USE_PERCPU_NUMA_NODE_ID
+	def_bool y
+	depends on NUMA
+
 config ARCH_SELECT_MEMORY_MODEL
 	def_bool y
 	depends on PPC64
diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
index c920215..5ecf7ea 100644
--- a/arch/powerpc/include/asm/topology.h
+++ b/arch/powerpc/include/asm/topology.h
@@ -20,19 +20,6 @@  struct device_node;
 
 #include <asm/mmzone.h>
 
-static inline int cpu_to_node(int cpu)
-{
-	int nid;
-
-	nid = numa_cpu_lookup_table[cpu];
-
-	/*
-	 * During early boot, the numa-cpu lookup table might not have been
-	 * setup for all CPUs yet. In such cases, default to node 0.
-	 */
-	return (nid < 0) ? 0 : nid;
-}
-
 #define parent_node(node)	(node)
 
 #define cpumask_of_node(node) ((node) == -1 ?				\
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index e2a4232..d7252ad 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -390,6 +390,7 @@  void smp_prepare_boot_cpu(void)
 #ifdef CONFIG_PPC64
 	paca[boot_cpuid].__current = current;
 #endif
+	set_numa_node(numa_cpu_lookup_table[boot_cpuid]);
 	current_set[boot_cpuid] = task_thread_info(current);
 }
 
@@ -750,6 +751,11 @@  void start_secondary(void *unused)
 	}
 	traverse_core_siblings(cpu, true);
 
+	/*
+	 * numa_node_id() works after this.
+	 */
+	set_numa_node(numa_cpu_lookup_table[cpu]);
+
 	smp_wmb();
 	notify_cpu_starting(cpu);
 	set_cpu_online(cpu, true);