[unstable] UBUNTU: [Config] Enable CONFIG_PAGE_POISONING configs

Message ID 20180824161805.20260-1-colin.king@canonical.com
State New

Commit Message

Colin Ian King Aug. 24, 2018, 4:18 p.m. UTC
From: Colin Ian King <colin.king@canonical.com>

BugLink: https://bugs.launchpad.net/bugs/1783651

As requested by Kees, enable the following to help with finding certain
types of memory corruption:

CONFIG_PAGE_POISONING=y
CONFIG_PAGE_POISONING_ZERO=y
CONFIG_PAGE_POISONING_NO_SANITY=y

"this should have no impact on regular boots, and if someone boots with
"page_poison=1" then they get page wiping when page_alloc pages are freed
(and then GFP_ZERO is a no-op since it was already freed), so it becomes
a reasonable trade-off on performance vs gaining the wipe-on-free ability
of the buddy allocator."
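
As a rough illustration (not part of the patch), the sketch below checks whether a kernel config text has the three options enabled and whether page_poison=1 is present on a kernel command line. The function names are hypothetical; the usage note assumes the usual Ubuntu locations /boot/config-<release> and /proc/cmdline.

```python
# Hypothetical helpers for illustration only; not part of the patch.
REQUIRED = (
    "CONFIG_PAGE_POISONING",
    "CONFIG_PAGE_POISONING_ZERO",
    "CONFIG_PAGE_POISONING_NO_SANITY",
)


def enabled_options(config_text):
    """Return the set of option names that appear as NAME=y in a config."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Lines such as "# CONFIG_FOO is not set" are comments, so skip them.
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value == "y":
            enabled.add(name)
    return enabled


def missing_options(config_text):
    """List the page poisoning options that are not built in."""
    on = enabled_options(config_text)
    return [name for name in REQUIRED if name not in on]


def poisoning_requested(cmdline):
    """True if page_poison=1 is one of the kernel command line tokens."""
    return "page_poison=1" in cmdline.split()
```

On a running system this could be fed the contents of /boot/config-<release> and /proc/cmdline; an empty list from missing_options() together with True from poisoning_requested() would correspond to the wipe-on-free case described in the quote above.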

This has been benchmarked, with the tests comparing kernels without the
config, with the config and with the config with page_poison=1 for the
4.18 and 4.15 kernels. I ran nearly 200 stress-ng stress tests and
gathered the throughput (based on bogo ops per second on the usr+sys time
consumed) for each stress test. Each stress test was run for 60
seconds on an idle 8-thread Intel Core i7-3770.

The bogo-ops data was then normalized against the kernel that didn't have
the config changes. The data to look at is the geometric means of all the
normalized test results:

4.18 kernel, geometric mean of normalized bogo-ops throughput:

No page poisoning: 1.000
Config page poisoning: 1.003
Config page poisoning + page_poison=1: 0.991

4.15 kernel, geometric mean of normalized bogo-ops throughput:

No page poisoning: 1.000
Config page poisoning: 1.025
Config page poisoning + page_poison=1: 0.977

where > 1.000 indicates higher throughput and < 1.000 indicates degraded throughput.

So it appears that enabling the page poisoning configs does not degrade
performance, and setting page_poison=1 degrades throughput only slightly
(about 0.9% on 4.18 and 2.3% on 4.15, by geometric mean).
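
For readers who want to reproduce the summary figures from the raw data, the following is a minimal sketch of the normalization and geometric mean described above. The function names and the sample numbers are made up for illustration; the real per-test values are in the spreadsheet linked below.

```python
import math


def normalize(results, baseline):
    """Divide each test's bogo-ops rate by the baseline kernel's rate."""
    return {
        name: results[name] / baseline[name]
        for name in baseline
        if name in results
    }


def geometric_mean(values):
    """Geometric mean via logs, which avoids overflow on long lists."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs at least one positive value")
    return math.exp(sum(math.log(v) for v in values) / len(values))


if __name__ == "__main__":
    # Made-up throughput figures, NOT taken from the benchmark run.
    baseline = {"cpu": 1000.0, "vm": 500.0, "fork": 200.0}
    poisoned = {"cpu": 1010.0, "vm": 490.0, "fork": 198.0}
    ratios = normalize(poisoned, baseline)
    print("geometric mean: %.3f" % geometric_mean(ratios.values()))
```

The geometric mean is used rather than the arithmetic mean because the inputs are ratios: a test that doubles and a test that halves cancel out to 1.000 instead of averaging to 1.25.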

Raw data for this can be found at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1783651/+attachment/5170997/+files/kernel-poison-page-analysis.ods

Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
 debian.master/config/annotations          | 4 +++-
 debian.master/config/config.common.ubuntu | 4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

Comments

Stefan Bader Aug. 27, 2018, 1:03 p.m. UTC | #1
On 24.08.2018 18:18, Colin King wrote:
> From: Colin Ian King <colin.king@canonical.com>
> 
> BugLink: https://bugs.launchpad.net/bugs/1783651
> 
> As requested by Kees, enable the following to help with finding certain
> types of memory corruption:
> 
> CONFIG_PAGE_POISONING=y
> CONFIG_PAGE_POISONING_ZERO=y
> CONFIG_PAGE_POISONING_NO_SANITY=y
> 
> "this should have no impact on regular boots, and if someone boots with
> "page_poison=1" then they get page wiping when page_alloc pages are freed
> (and then GFP_ZERO is a no-op since it was already freed), so it becomes
> a reasonable trade-off on performance vs gaining the wipe-on-free ability
> of the buddy allocator."
> 
> This has been benchmarked, with the tests comparing kernels without the
> config, with the config and with the config with page_poison=1 for the
> 4.18 and 4.15 kernels. I ran nearly 200 stress-ng stress tests and
> gathered the throughput (based on bogo ops per second on the usr+sys time
> consumed) for each stress test. Each of the stress tests were run for 60
> seconds on an idle 8 thread Xeon i7-3770.
> 
> The bogo-ops data was then normalized against the kernel that didn't have
> the config changes. The data to look at is the geometric means of all the
> normalized test results:
> 
> 4.18 kernel, geometric mean of normalized bogo/ops throughput:
> 
> No page poisoning: 1.000
> Config page poisoning: 1.003
> Config page poionsing + page_poison=1: 0.991
> 
> 4.15 kernel, geometric mean of normalized bogo/ops throughput:
> 
> No page poisoning: 1.000
> Config page poisoning: 1.025
> Config page poionsing + page_poison=1: 0.977
> 
> where > 1.000 shows more throughput and < 1.000 shows degraded throughput
> 
> So it appears that enabling page poisoning configs does not degrade performance
> and setting page_poison=1 degrades performance by a very small amount.
> 
> Raw data for this can be found at:
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1783651/+attachment/5170997/+files/kernel-poison-page-analysis.ods
> 
> Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
> ---

I'd leave it to Seth/Thadeu whether they would want to add
enforcement/commentary to the annotations file.

-Stefan

>  debian.master/config/annotations          | 4 +++-
>  debian.master/config/config.common.ubuntu | 4 +++-
>  2 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/debian.master/config/annotations b/debian.master/config/annotations
> index 2079eb4..e3c6e3a 100644
> --- a/debian.master/config/annotations
> +++ b/debian.master/config/annotations
> @@ -9876,7 +9876,9 @@ CONFIG_WW_MUTEX_SELFTEST                        policy<{'amd64': 'n', 'arm64': '
>  # Menu: Kernel hacking >> Memory Debugging
>  CONFIG_PAGE_EXTENSION                           policy<{'amd64': 'n', 'arm64': 'n', 'armhf': 'y', 'i386': 'y', 'ppc64el': 'n', 's390x': 'n'}>
>  CONFIG_DEBUG_PAGEALLOC                          policy<{'amd64': 'n', 'arm64': 'n', 'i386': 'n', 'ppc64el': 'n', 's390x': 'n'}>
> -CONFIG_PAGE_POISONING                           policy<{'amd64': 'n', 'arm64': 'n', 'armhf': 'n', 'i386': 'n', 'ppc64el': 'n', 's390x': 'n'}>
> +CONFIG_PAGE_POISONING                           policy<{'amd64': 'y', 'arm64': 'y', 'armhf': 'y', 'i386': 'y', 'ppc64el': 'y', 's390x': 'y'}>
> +CONFIG_PAGE_POISONING_ZERO                      policy<{'amd64': 'y', 'arm64': 'y', 'armhf': 'y', 'i386': 'y', 'ppc64el': 'y', 's390x': 'y'}>
> +CONFIG_PAGE_POISONING_NO_SANITY                 policy<{'amd64': 'y', 'arm64': 'y', 'armhf': 'y', 'i386': 'y', 'ppc64el': 'y', 's390x': 'y'}>
>  CONFIG_DEBUG_PAGE_REF                           policy<{'amd64': 'n', 'arm64': 'n', 'armhf': 'n', 'i386': 'n', 'ppc64el': 'n', 's390x': 'n'}>
>  CONFIG_DEBUG_RODATA_TEST                        policy<{'amd64': 'n', 'arm64': 'n', 'armhf': 'n', 'i386': 'n', 's390x': 'n'}>
>  CONFIG_SLUB_DEBUG_ON                            policy<{'amd64': 'n', 'arm64': 'n', 'armhf': 'n', 'i386': 'n', 'ppc64el': 'n', 's390x': 'n'}>
> diff --git a/debian.master/config/config.common.ubuntu b/debian.master/config/config.common.ubuntu
> index d248512..7ce3acd 100644
> --- a/debian.master/config/config.common.ubuntu
> +++ b/debian.master/config/config.common.ubuntu
> @@ -6510,7 +6510,9 @@ CONFIG_PACK_STACK=y
>  CONFIG_PADATA=y
>  CONFIG_PAGE_COUNTER=y
>  # CONFIG_PAGE_OWNER is not set
> -# CONFIG_PAGE_POISONING is not set
> +CONFIG_PAGE_POISONING=y
> +CONFIG_PAGE_POISONING_ZERO=y
> +CONFIG_PAGE_POISONING_NO_SANITY=y
>  CONFIG_PAGE_POOL=y
>  CONFIG_PAGE_TABLE_ISOLATION=y
>  CONFIG_PALMAS_GPADC=m
>
Seth Forshee Aug. 29, 2018, 8:47 p.m. UTC | #2
On Fri, Aug 24, 2018 at 05:18:05PM +0100, Colin King wrote:
> From: Colin Ian King <colin.king@canonical.com>
> 
> BugLink: https://bugs.launchpad.net/bugs/1783651
> 
> As requested by Kees, enable the following to help with finding certain
> types of memory corruption:
> 
> CONFIG_PAGE_POISONING=y
> CONFIG_PAGE_POISONING_ZERO=y
> CONFIG_PAGE_POISONING_NO_SANITY=y
> 
> "this should have no impact on regular boots, and if someone boots with
> "page_poison=1" then they get page wiping when page_alloc pages are freed
> (and then GFP_ZERO is a no-op since it was already freed), so it becomes
> a reasonable trade-off on performance vs gaining the wipe-on-free ability
> of the buddy allocator."
> 
> This has been benchmarked, with the tests comparing kernels without the
> config, with the config and with the config with page_poison=1 for the
> 4.18 and 4.15 kernels. I ran nearly 200 stress-ng stress tests and
> gathered the throughput (based on bogo ops per second on the usr+sys time
> consumed) for each stress test. Each of the stress tests were run for 60
> seconds on an idle 8 thread Xeon i7-3770.
> 
> The bogo-ops data was then normalized against the kernel that didn't have
> the config changes. The data to look at is the geometric means of all the
> normalized test results:
> 
> 4.18 kernel, geometric mean of normalized bogo/ops throughput:
> 
> No page poisoning: 1.000
> Config page poisoning: 1.003
> Config page poionsing + page_poison=1: 0.991
> 
> 4.15 kernel, geometric mean of normalized bogo/ops throughput:
> 
> No page poisoning: 1.000
> Config page poisoning: 1.025
> Config page poionsing + page_poison=1: 0.977
> 
> where > 1.000 shows more throughput and < 1.000 shows degraded throughput
> 
> So it appears that enabling page poisoning configs does not degrade performance
> and setting page_poison=1 degrades performance by a very small amount.
> 
> Raw data for this can be found at:
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1783651/+attachment/5170997/+files/kernel-poison-page-analysis.ods
> 
> Signed-off-by: Colin Ian King <colin.king@canonical.com>

Applied to unstable/master. Also added enforcement to the annotations
file. Thanks!

Patch

diff --git a/debian.master/config/annotations b/debian.master/config/annotations
index 2079eb4..e3c6e3a 100644
--- a/debian.master/config/annotations
+++ b/debian.master/config/annotations
@@ -9876,7 +9876,9 @@  CONFIG_WW_MUTEX_SELFTEST                        policy<{'amd64': 'n', 'arm64': '
 # Menu: Kernel hacking >> Memory Debugging
 CONFIG_PAGE_EXTENSION                           policy<{'amd64': 'n', 'arm64': 'n', 'armhf': 'y', 'i386': 'y', 'ppc64el': 'n', 's390x': 'n'}>
 CONFIG_DEBUG_PAGEALLOC                          policy<{'amd64': 'n', 'arm64': 'n', 'i386': 'n', 'ppc64el': 'n', 's390x': 'n'}>
-CONFIG_PAGE_POISONING                           policy<{'amd64': 'n', 'arm64': 'n', 'armhf': 'n', 'i386': 'n', 'ppc64el': 'n', 's390x': 'n'}>
+CONFIG_PAGE_POISONING                           policy<{'amd64': 'y', 'arm64': 'y', 'armhf': 'y', 'i386': 'y', 'ppc64el': 'y', 's390x': 'y'}>
+CONFIG_PAGE_POISONING_ZERO                      policy<{'amd64': 'y', 'arm64': 'y', 'armhf': 'y', 'i386': 'y', 'ppc64el': 'y', 's390x': 'y'}>
+CONFIG_PAGE_POISONING_NO_SANITY                 policy<{'amd64': 'y', 'arm64': 'y', 'armhf': 'y', 'i386': 'y', 'ppc64el': 'y', 's390x': 'y'}>
 CONFIG_DEBUG_PAGE_REF                           policy<{'amd64': 'n', 'arm64': 'n', 'armhf': 'n', 'i386': 'n', 'ppc64el': 'n', 's390x': 'n'}>
 CONFIG_DEBUG_RODATA_TEST                        policy<{'amd64': 'n', 'arm64': 'n', 'armhf': 'n', 'i386': 'n', 's390x': 'n'}>
 CONFIG_SLUB_DEBUG_ON                            policy<{'amd64': 'n', 'arm64': 'n', 'armhf': 'n', 'i386': 'n', 'ppc64el': 'n', 's390x': 'n'}>
diff --git a/debian.master/config/config.common.ubuntu b/debian.master/config/config.common.ubuntu
index d248512..7ce3acd 100644
--- a/debian.master/config/config.common.ubuntu
+++ b/debian.master/config/config.common.ubuntu
@@ -6510,7 +6510,9 @@  CONFIG_PACK_STACK=y
 CONFIG_PADATA=y
 CONFIG_PAGE_COUNTER=y
 # CONFIG_PAGE_OWNER is not set
-# CONFIG_PAGE_POISONING is not set
+CONFIG_PAGE_POISONING=y
+CONFIG_PAGE_POISONING_ZERO=y
+CONFIG_PAGE_POISONING_NO_SANITY=y
 CONFIG_PAGE_POOL=y
 CONFIG_PAGE_TABLE_ISOLATION=y
 CONFIG_PALMAS_GPADC=m
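
As an aside, the policy<{...}> entries in the annotations hunk above can be inspected with a few lines of Python. This is a sketch that assumes the entries always follow the "NAME policy<{dict}>" shape shown in this diff; the function names are hypothetical and this is not the tooling used by the Ubuntu kernel team.

```python
import ast
import re

# Assumed shape, based only on the lines in this diff: NAME  policy<{...}>
POLICY_RE = re.compile(r"^(CONFIG_\w+)\s+policy<(\{.*\})>$")


def parse_policy(line):
    """Return (option, {arch: value}) or None if the line is not a policy."""
    # Drop a leading diff marker ('+', '-' or ' ') and surrounding whitespace.
    text = line.lstrip("+- ").strip()
    match = POLICY_RE.match(text)
    if match is None:
        return None
    # The braces hold a Python-style dict literal, so literal_eval is enough.
    return match.group(1), ast.literal_eval(match.group(2))


def disabled_arches(line):
    """List the architectures whose policy value is not 'y'."""
    parsed = parse_policy(line)
    if parsed is None:
        return []
    _, policy = parsed
    return sorted(arch for arch, value in policy.items() if value != "y")
```

Run against the three added CONFIG_PAGE_POISONING* lines, disabled_arches() would be expected to return an empty list, since every listed architecture is set to 'y'.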