[SRU,ZESTY] UBUNTU: SAUCE: use CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y as default

Message ID 20171123142948.25993-1-colin.king@canonical.com
State New

Commit Message

Colin Ian King Nov. 23, 2017, 2:29 p.m. UTC
From: Colin Ian King <colin.king@canonical.com>

BugLink: https://bugs.launchpad.net/bugs/1703742

The current configuration is set to always use transparent hugepages
by default. There is plenty of anecdotal evidence that this is a
less than perfect choice and that in some scenarios it leads to
performance issues.

My own investigations with the stress-ng stream and malloc tests show
that the current default impacts performance. I ran various test
scenarios on different MADVISE configurations. Each result below is
the average of 5 runs on an i7-3770 CPU @ 3.4GHz with 8GB memory,
8MB L3 cache, 256K L2 cache, 32K/32K L1 cache.

malloc allocation stressor:

     malloc     always    madvise
    size (MB)   ops/sec   ops/sec
         32     1254.43   2422.49
         64     2100.36   4300.28
        128     3768.57   7215.38
        256     7940.73  14893.85
        512    17618.62  26861.29
       1024    32777.17  48029.37

Clearly madvise is more performant, by roughly 1.5x to 2x at every
size tested.
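
As a quick cross-check of the table above, the madvise/always ratio can be
computed per allocation size. This is only a sketch: the figures are copied
from the table, and the helper name `speedups` is made up for illustration.

```python
# (malloc size in MB, always ops/sec, madvise ops/sec), copied from the table.
MALLOC = [
    (32, 1254.43, 2422.49),
    (64, 2100.36, 4300.28),
    (128, 3768.57, 7215.38),
    (256, 7940.73, 14893.85),
    (512, 17618.62, 26861.29),
    (1024, 32777.17, 48029.37),
]


def speedups(rows):
    """Map each size to the madvise/always throughput ratio."""
    return {size: madvise / always for size, always, madvise in rows}


for size, ratio in speedups(MALLOC).items():
    print(f"{size:>5} MB  madvise/always = {ratio:.2f}x")
```

The ratio is largest at 64 MB and smallest at 1024 MB, so the gain shrinks
as the allocation size grows but never disappears in these runs.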

stream bandwidth/compute stressor:

    stream      always    madvise
                        NOHUGEPAGE
    size (MB)   MB/sec     MB/sec
          1   17713.54   18439.69
          2   12460.34   13015.46
          4   12195.81   12694.51
          8   12085.11   12674.26
         16   12054.09   12649.00
         32   12082.42   12409.65
         64   12262.88   12084.85
        128   12235.25   11788.49
        256   11808.69   11283.69
        512   11970.01   12434.82

For small allocations, always is less performant. Applications with
large allocations can still opt in to the more performant transparent
huge pages with madvise(2) once always is no longer the default.
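
To see where the two settings cross over in the stream results, the rows can
be compared directly. This is a sketch using the figures from the table; the
helper names `always_wins` and `percent_gain` are hypothetical.

```python
# (stream size in MB, always MB/sec, madvise MB/sec), copied from the table.
STREAM = [
    (1, 17713.54, 18439.69),
    (2, 12460.34, 13015.46),
    (4, 12195.81, 12694.51),
    (8, 12085.11, 12674.26),
    (16, 12054.09, 12649.00),
    (32, 12082.42, 12409.65),
    (64, 12262.88, 12084.85),
    (128, 12235.25, 11788.49),
    (256, 11808.69, 11283.69),
    (512, 11970.01, 12434.82),
]


def always_wins(rows):
    """Return the sizes at which the 'always' setting has higher bandwidth."""
    return [size for size, always, madvise in rows if always > madvise]


def percent_gain(always, madvise):
    """Percentage change in bandwidth when moving from always to madvise."""
    return 100.0 * (madvise - always) / always


for size, always, madvise in STREAM:
    print(f"{size:>4} MB  {percent_gain(always, madvise):+6.2f}%")
print("always is faster at:", always_wins(STREAM))
```

In these figures always comes out ahead only at 64, 128 and 256 MB; every
size up to 32 MB, and the 512 MB run, favours madvise.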

Other stress-ng memory allocation/writing/freeing and madvise
operations showed little significant difference.

I have also experimented with boot testing Ubuntu with kernels
configured with different MADVISE configs and found there is
little noticeable difference in performance, so I believe that
there is little scope for any kitten killer performance regressions
with this change.
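
When comparing kernels like this it helps to confirm which mode is actually
active. A minimal sketch, assuming the usual sysfs file
/sys/kernel/mm/transparent_hugepage/enabled, which marks the active mode with
square brackets (for example "always [madvise] never"). The function names
are made up for illustration.

```python
import re
from pathlib import Path

# Standard sysfs location of the THP mode on Linux.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text):
    """Return the bracketed (active) mode, or None if none is marked."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


def current_thp_mode(path=THP_ENABLED):
    """Read the active THP mode, or None if the file cannot be read."""
    try:
        return parse_thp_mode(path.read_text())
    except OSError:
        # No THP support in this kernel, or not running on Linux.
        return None


print("active THP mode:", current_thp_mode())
```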

With this change, transparent huge pages are not used by default; an
application must call madvise(2) with MADV_HUGEPAGE on a memory
mapping to ask the kernel for them. According to the madvise(2)
manual, this only takes effect on private anonymous mappings.
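
For illustration, this is roughly how an application could opt in under the
madvise default. It is a sketch, not a recommendation: it assumes Python 3.8
or later on Linux (for mmap.madvise and mmap.MADV_HUGEPAGE), assumes a 2 MiB
huge page size, and the function name `map_with_hugepage_hint` is
hypothetical.

```python
import mmap

HUGE_PAGE_SIZE = 2 * 1024 * 1024  # assumption: 2 MiB huge pages (common on x86-64)


def map_with_hugepage_hint(length=4 * HUGE_PAGE_SIZE):
    """Create a private anonymous mapping and request huge pages for it.

    Returns (mapping, advised); advised is False when the hint is
    unavailable or the kernel rejects it.
    """
    buf = mmap.mmap(-1, length, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    advice = getattr(mmap, "MADV_HUGEPAGE", None)  # absent on non-Linux builds
    advised = False
    if advice is not None and hasattr(buf, "madvise"):
        try:
            buf.madvise(advice)  # advise the whole mapping
            advised = True
        except OSError:
            # For example EINVAL on a kernel built without THP support.
            pass
    return buf, advised


buf, advised = map_with_hugepage_hint()
print("MADV_HUGEPAGE accepted:", advised)
buf.close()
```

A successful call is only a hint. Whether huge pages actually back the
mapping still depends on the kernel and on memory being available.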

Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
 debian.master/config/annotations          | 4 ++--
 debian.master/config/config.common.ubuntu | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

Comments

Kleber Sacilotto de Souza Jan. 5, 2018, 3:42 p.m. UTC | #1
Good test results, makes sense to me.

Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

Kleber Sacilotto de Souza Jan. 23, 2018, 4:13 p.m. UTC | #2

Zesty is EOL.
Patch

diff --git a/debian.master/config/annotations b/debian.master/config/annotations
index faf8c8a..47e7783 100644
--- a/debian.master/config/annotations
+++ b/debian.master/config/annotations
@@ -10528,8 +10528,8 @@  CONFIG_HZ_200                                   policy<{'armhf': 'n'}>
 CONFIG_HZ_500                                   policy<{'armhf': 'n'}>
 
 # Menu: Processor type and features >> Transparent Hugepage Support sysfs defaults
-CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS              policy<{'amd64': 'y', 'arm64': 'y', 'armhf-generic-lpae': 'y', 'i386': 'y', 'ppc64el': 'y', 's390x': 'y'}>
-CONFIG_TRANSPARENT_HUGEPAGE_MADVISE             policy<{'amd64': 'n', 'arm64': 'n', 'armhf-generic-lpae': 'n', 'i386': 'n', 'ppc64el': 'n', 's390x': 'n'}>
+CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS              policy<{'amd64': 'n', 'arm64': 'n', 'armhf-generic-lpae': 'n', 'i386': 'n', 'ppc64el': 'n', 's390x': 'n'}>
+CONFIG_TRANSPARENT_HUGEPAGE_MADVISE             policy<{'amd64': 'y', 'arm64': 'y', 'armhf-generic-lpae': 'y', 'i386': 'y', 'ppc64el': 'y', 's390x': 'y'}>
 
 # Menu: Processor type and features >> Tune code generation >> Architecture: s390
 CONFIG_TUNE_DEFAULT                             policy<{'s390x': 'n'}>
diff --git a/debian.master/config/config.common.ubuntu b/debian.master/config/config.common.ubuntu
index f35a86a..fb78bb4 100644
--- a/debian.master/config/config.common.ubuntu
+++ b/debian.master/config/config.common.ubuntu
@@ -8601,8 +8601,8 @@  CONFIG_TRACING_EVENTS_GPIO=y
 CONFIG_TRACING_MAP=y
 CONFIG_TRACING_SUPPORT=y
 CONFIG_TRANSPARENT_HUGEPAGE=y
-CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
-# CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
+# CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS is not set
+CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y
 CONFIG_TRANSPARENT_HUGE_PAGECACHE=y
 CONFIG_TREE_RCU=y
 # CONFIG_TREE_RCU_TRACE is not set