Enable hashdist by default on PowerPC

Submitter Anton Blanchard
Date Feb. 18, 2009, 5:11 a.m.
Message ID <20090218051112.GA32195@kryten>
Permalink /patch/23322/
State Superseded

Comments

Anton Blanchard - Feb. 18, 2009, 5:11 a.m.
On PowerPC we allocate large boot time hashes on node 0. This leads to
an imbalance in the free memory, for example on a 64GB box (4 x 16GB nodes):

Free memory:
Node 0: 97.03%
Node 1: 98.54%
Node 2: 98.42%
Node 3: 98.53%

If we switch to using vmalloc (like ia64 and x86-64) things are more
balanced:

Free memory:
Node 0: 97.53%
Node 1: 98.35%
Node 2: 98.33%
Node 3: 98.33%

For many HPC applications we are limited by the free available memory on
the smallest node, so even though the same amount of memory is used the
better balancing helps.
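
To put the percentages above into absolute terms, here is a minimal sketch
that computes the free memory on the least-free node. It assumes each node
is exactly 16384 MB and that the percentages are relative to node size; the
function and array names are made up for illustration and are not part of
the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Free memory on the least-free node, in MB.
 * free_pct[i] is the percentage of node i that is free; node_mb is the
 * size of each node (assumed to be the same for every node).
 */
double smallest_node_free_mb(const double *free_pct, size_t nr_nodes,
			     double node_mb)
{
	double min_pct;
	size_t i;

	assert(nr_nodes > 0);
	min_pct = free_pct[0];
	for (i = 1; i < nr_nodes; i++)
		if (free_pct[i] < min_pct)
			min_pct = free_pct[i];
	return node_mb * min_pct / 100.0;
}

/* Figures quoted in this message (4 x 16GB nodes). */
const double free_before[] = { 97.03, 98.54, 98.42, 98.53 };
const double free_after[]  = { 97.53, 98.35, 98.33, 98.33 };
```

Under those assumptions, smallest_node_free_mb(free_before, 4, 16384.0) and
smallest_node_free_mb(free_after, 4, 16384.0) differ by roughly 82 MB in
favour of the vmalloc case, all of it on node 0.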

Signed-off-by: Anton Blanchard <anton@samba.org>
---
Benjamin Herrenschmidt - Feb. 18, 2009, 5:41 a.m.
> For many HPC applications we are limited by the free available memory on
> the smallest node, so even though the same amount of memory is used the
> better balancing helps.
> 
> Signed-off-by: Anton Blanchard <anton@samba.org>
> ---

Do you have numbers? :-) I'm asking mostly because I've been wondering
whether it offsets the 16M pages vs. 4K or 64K pages in terms of TLB/ERAT
impact.

Cheers,
Ben.

Anton Blanchard - Feb. 18, 2009, 6:20 a.m.
Hi Ben,

> Do you have numbers? :-) I'm asking mostly because I've been wondering
> whether it offsets the 16M pages vs. 4K or 64K pages in terms of TLB/ERAT
> impact.

The speedup is application dependent. Things like linpack usually
improve when you throw more memory at them.

The potential slowdown will be in heavy dcache use (eg fileserving). We
originally added the large boot time hash code when we were benchmarking
SPECsfs (an NFS benchmark).

We can go back to the old behaviour with the hashdist=0 boot option, so
it's mostly a question of what the default should be.
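
As an illustration of how the boot option overrides the compile-time
default, here is a userspace sketch. It is not the kernel's own parsing
code; hashdist_from_cmdline() is a hypothetical helper, and
HASHDIST_DEFAULT is hard-coded to 1 on the assumption of a NUMA build with
this patch applied.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define HASHDIST_DEFAULT 1	/* assumed: NUMA build with the patch applied */

/*
 * Userspace illustration only: scan a space-separated command line for
 * "hashdist=<n>" and return n, or HASHDIST_DEFAULT if the option is absent.
 */
int hashdist_from_cmdline(const char *cmdline)
{
	static const char opt[] = "hashdist=";
	const char *p = cmdline;

	assert(cmdline != NULL);
	while ((p = strstr(p, opt)) != NULL) {
		/* Only accept a match at the start of a word. */
		if (p == cmdline || p[-1] == ' ')
			return (int)strtoul(p + strlen(opt), NULL, 0);
		p += strlen(opt);
	}
	return HASHDIST_DEFAULT;
}
```

With this sketch, a command line containing "hashdist=0" yields 0 (the old
node-0 allocation behaviour), and one without the option falls back to the
default.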

Anton
David Miller - Feb. 18, 2009, 9:19 a.m.
From: Anton Blanchard <anton@samba.org>
Date: Wed, 18 Feb 2009 16:11:12 +1100

> @@ -145,9 +145,10 @@ extern void *alloc_large_system_hash(const char *tablename,
>  #define HASH_EARLY	0x00000001	/* Allocating during early boot? */
>  
>  /* Only NUMA needs hash distribution.
> - * IA64 and x86_64 have sufficient vmalloc space.
> + * IA64, x86_64 and PowerPC have sufficient vmalloc space.
>   */
> -#if defined(CONFIG_NUMA) && (defined(CONFIG_IA64) || defined(CONFIG_X86_64))
> +#if defined(CONFIG_NUMA) && (defined(CONFIG_IA64) || defined(CONFIG_X86_64) || \
> +	defined(CONFIG_PPC64))
>  #define HASHDIST_DEFAULT 1
>  #else
>  #define HASHDIST_DEFAULT 0

I should probably do this on sparc64 too.

Why don't we just change this thing to CONFIG_64BIT?
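
One possible reading of that suggestion, sketched as a standalone file: the
per-architecture list is replaced by a single CONFIG_64BIT test. This is
only a guess at the shape such a change could take, not a posted patch. The
two CONFIG_ defines stand in for values that Kconfig would normally
provide, and hashdist_default() is a hypothetical helper added so the
result can be inspected.

```c
#include <assert.h>

/*
 * Stand-ins for kernel configuration symbols; in a real build these come
 * from Kconfig, not from the source file.
 */
#define CONFIG_NUMA 1
#define CONFIG_64BIT 1

/* Only NUMA needs hash distribution.
 * 64-bit architectures are assumed to have sufficient vmalloc space.
 */
#if defined(CONFIG_NUMA) && defined(CONFIG_64BIT)
#define HASHDIST_DEFAULT 1
#else
#define HASHDIST_DEFAULT 0
#endif

/* Compile-time check of the combination defined above. */
static_assert(HASHDIST_DEFAULT == 1, "NUMA + 64BIT should enable hashdist");

int hashdist_default(void)
{
	return HASHDIST_DEFAULT;
}
```

Removing either stand-in define flips HASHDIST_DEFAULT to 0, which is the
behaviour 32-bit or non-NUMA builds would keep.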

Patch

diff --git a/include/linux/bootmem.h b/include/linux/bootmem.h
index 95837bf..c0c63ee 100644
--- a/include/linux/bootmem.h
+++ b/include/linux/bootmem.h
@@ -145,9 +145,10 @@  extern void *alloc_large_system_hash(const char *tablename,
 #define HASH_EARLY	0x00000001	/* Allocating during early boot? */
 
 /* Only NUMA needs hash distribution.
- * IA64 and x86_64 have sufficient vmalloc space.
+ * IA64, x86_64 and PowerPC have sufficient vmalloc space.
  */
-#if defined(CONFIG_NUMA) && (defined(CONFIG_IA64) || defined(CONFIG_X86_64))
+#if defined(CONFIG_NUMA) && (defined(CONFIG_IA64) || defined(CONFIG_X86_64) || \
+	defined(CONFIG_PPC64))
 #define HASHDIST_DEFAULT 1
 #else
 #define HASHDIST_DEFAULT 0