Enable hashdist by default on 64bit NUMA

Submitter: Anton Blanchard
Date: Feb. 26, 2009, 11:24 a.m.
Message ID: <20090226112431.GA25330@kryten>
Permalink: /patch/23771/
State: Accepted, archived
Commit: c2fdf3a9b2d52842808a8e551b53b55dd9b45030
Delegated to: Benjamin Herrenschmidt

Comments

Anton Blanchard - Feb. 26, 2009, 11:24 a.m.
Hi David,
 
> Hmmm... my bad, I think you need to keep the CONFIG_NUMA
> there too as there is a TLB usage penalty for non-NUMA
> systems if you only use CONFIG_64BIT there.

Sorry that was my screwup, here's a fixed version.

Anton

--

On PowerPC we allocate large boot time hashes on node 0. This leads to
an imbalance in the free memory, for example on a 64GB box (4 x 16GB
nodes):

Free memory:
Node 0: 97.03%
Node 1: 98.54%
Node 2: 98.42%
Node 3: 98.53%

If we switch to using vmalloc (like ia64 and x86-64) things are more
balanced:

Free memory:
Node 0: 97.53%
Node 1: 98.35%
Node 2: 98.33%
Node 3: 98.33%

For many HPC applications we are limited by the free available memory on
the smallest node, so even though the same amount of memory is used the
better balancing helps.

Since all 64bit NUMA capable architectures should have sufficient
vmalloc space, it makes sense to enable it via CONFIG_64BIT.

Signed-off-by: Anton Blanchard <anton@samba.org>
---

David Miller - Feb. 26, 2009, 11:34 a.m.
From: Anton Blanchard <anton@samba.org>
Date: Thu, 26 Feb 2009 22:24:32 +1100

> On PowerPC we allocate large boot time hashes on node 0. This leads to
> an imbalance in the free memory, for example on a 64GB box (4 x 16GB
> nodes):
> 
> Free memory:
> Node 0: 97.03%
> Node 1: 98.54%
> Node 2: 98.42%
> Node 3: 98.53%
> 
> If we switch to using vmalloc (like ia64 and x86-64) things are more
> balanced:
> 
> Free memory:
> Node 0: 97.53%
> Node 1: 98.35%
> Node 2: 98.33%
> Node 3: 98.33%
> 
> For many HPC applications we are limited by the free available memory on
> the smallest node, so even though the same amount of memory is used the
> better balancing helps.
> 
> Since all 64bit NUMA capable architectures should have sufficient
> vmalloc space, it makes sense to enable it via CONFIG_64BIT.
> 
> Signed-off-by: Anton Blanchard <anton@samba.org>

Acked-by: David S. Miller <davem@davemloft.net>

Benjamin Herrenschmidt - March 3, 2009, 5:27 a.m.
On Thu, 2009-02-26 at 22:24 +1100, Anton Blanchard wrote:
> Hi David,
>  
> > Hmmm... my bad, I think you need to keep the CONFIG_NUMA
> > there too as there is a TLB usage penalty for non-NUMA
> > systems if you only use CONFIG_64BIT there.
> 
> Sorry that was my screwup, here's a fixed version.

Sounds good. How do we proceed with merging that? Andrew? Should it hop
by linux-mm?

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---

> Anton
> 
> --
> 
> On PowerPC we allocate large boot time hashes on node 0. This leads to
> an imbalance in the free memory, for example on a 64GB box (4 x 16GB
> nodes):
> 
> Free memory:
> Node 0: 97.03%
> Node 1: 98.54%
> Node 2: 98.42%
> Node 3: 98.53%
> 
> If we switch to using vmalloc (like ia64 and x86-64) things are more
> balanced:
> 
> Free memory:
> Node 0: 97.53%
> Node 1: 98.35%
> Node 2: 98.33%
> Node 3: 98.33%
> 
> For many HPC applications we are limited by the free available memory on
> the smallest node, so even though the same amount of memory is used the
> better balancing helps.
> 
> Since all 64bit NUMA capable architectures should have sufficient
> vmalloc space, it makes sense to enable it via CONFIG_64BIT.
> 
> Signed-off-by: Anton Blanchard <anton@samba.org>
> ---
> 
> diff --git a/include/linux/bootmem.h b/include/linux/bootmem.h
> index 95837bf..0c4d4b7 100644
> --- a/include/linux/bootmem.h
> +++ b/include/linux/bootmem.h
> @@ -144,10 +144,10 @@ extern void *alloc_large_system_hash(const char *tablename,
>  
>  #define HASH_EARLY	0x00000001	/* Allocating during early boot? */
>  
> -/* Only NUMA needs hash distribution.
> - * IA64 and x86_64 have sufficient vmalloc space.
> +/* Only NUMA needs hash distribution. 64bit NUMA architectures have
> + * sufficient vmalloc space.
>   */
> -#if defined(CONFIG_NUMA) && (defined(CONFIG_IA64) || defined(CONFIG_X86_64))
> +#if defined(CONFIG_NUMA) && defined(CONFIG_64BIT)
>  #define HASHDIST_DEFAULT 1
>  #else
>  #define HASHDIST_DEFAULT 0

Patch

diff --git a/include/linux/bootmem.h b/include/linux/bootmem.h
index 95837bf..0c4d4b7 100644
--- a/include/linux/bootmem.h
+++ b/include/linux/bootmem.h
@@ -144,10 +144,10 @@  extern void *alloc_large_system_hash(const char *tablename,
 
 #define HASH_EARLY	0x00000001	/* Allocating during early boot? */
 
-/* Only NUMA needs hash distribution.
- * IA64 and x86_64 have sufficient vmalloc space.
+/* Only NUMA needs hash distribution. 64bit NUMA architectures have
+ * sufficient vmalloc space.
  */
-#if defined(CONFIG_NUMA) && (defined(CONFIG_IA64) || defined(CONFIG_X86_64))
+#if defined(CONFIG_NUMA) && defined(CONFIG_64BIT)
 #define HASHDIST_DEFAULT 1
 #else
 #define HASHDIST_DEFAULT 0
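
As a rough illustration of the change above, the following userspace sketch
mirrors the new preprocessor condition and compares the free-memory figures
quoted in the commit message. It is not kernel code: in a real build the
CONFIG_* symbols come from Kconfig, whereas here they are assumed to be passed
by hand with -D. The helper min_free_percent() and the HASHDIST_DEMO_MAIN guard
are hypothetical names introduced only for this example.

```c
/* Illustrative sketch only, not kernel code. */
#include <assert.h>
#include <stdio.h>

/* Same condition as the patched include/linux/bootmem.h. The CONFIG_*
 * symbols are assumed to be supplied on the compiler command line. */
#if defined(CONFIG_NUMA) && defined(CONFIG_64BIT)
#define HASHDIST_DEFAULT 1
#else
#define HASHDIST_DEFAULT 0
#endif

/* Hypothetical helper: smallest free percentage across all nodes, which
 * is the figure the commit message says limits many HPC applications. */
static double min_free_percent(const double *free_pct, int nodes)
{
	assert(nodes > 0);

	double min = free_pct[0];

	for (int i = 1; i < nodes; i++)
		if (free_pct[i] < min)
			min = free_pct[i];
	return min;
}

#ifdef HASHDIST_DEMO_MAIN
int main(void)
{
	/* Per-node free memory taken from the commit message. */
	static const double before[] = { 97.03, 98.54, 98.42, 98.53 };
	static const double after[]  = { 97.53, 98.35, 98.33, 98.33 };

	printf("HASHDIST_DEFAULT = %d\n", HASHDIST_DEFAULT);
	printf("smallest node free: %.2f%% -> %.2f%%\n",
	       min_free_percent(before, 4), min_free_percent(after, 4));
	return 0;
}
#endif
```

Building with -DHASHDIST_DEMO_MAIN -DCONFIG_NUMA -DCONFIG_64BIT selects the
distributed default; leaving out either CONFIG_ symbol falls back to 0. With
the numbers from the commit message, the smallest node goes from 97.03% to
97.53% free, even though total usage is unchanged.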