
init: fix false positives in W+X checking

Message ID 20180515200636.10802-2-manoj.iyer@canonical.com
State New
Series init: fix false positives in W+X checking

Commit Message

Manoj Iyer May 15, 2018, 8:06 p.m. UTC
From: Jeffrey Hugo <jhugo@codeaurora.org>

load_module() creates W+X mappings via __vmalloc_node_range() (from
layout_and_allocate()->move_module()->module_alloc()) by using
PAGE_KERNEL_EXEC.  These mappings are later cleaned up via
"call_rcu_sched(&freeinit->rcu, do_free_init)" from do_init_module().

This is a problem because call_rcu_sched() queues work that can run
after debug_checkwx() does, resulting in a race condition.  If hit,
the race produces a nasty splat about insecure W+X mappings, which
makes for a poor user experience, as these are not the mappings that
debug_checkwx() is intended to catch.
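
Schematically, the losing interleaving looks like this (an
illustrative timeline, not actual code):

    do_init_module()                    kernel_init()
    ----------------                    -------------
    call_rcu_sched(&freeinit->rcu,
                   do_free_init);
                                        free_initmem();
                                        mark_rodata_ro();
                                          -> debug_checkwx()  /* splats on the
                                             still-live W+X module init pages */
    ...grace period elapses...
    do_free_init();                     /* tears the mappings down, too late */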

This issue is observed on multiple arm64 platforms, and has been
artificially triggered on an x86 platform.

Address the race by flushing the queued work before running the
arch-defined mark_rodata_ro() which then calls debug_checkwx().
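
A note on the primitive chosen: rcu_barrier_sched() is needed rather
than synchronize_sched(), because only the former waits for callbacks
that are already queued.  A minimal sketch of the distinction:

    synchronize_sched();   /* waits for a grace period only; a queued
                              do_free_init() may still be pending after */
    rcu_barrier_sched();   /* additionally waits until every callback
                              already queued via call_rcu_sched() has run */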

Link: http://lkml.kernel.org/r/1525103946-29526-1-git-send-email-jhugo@codeaurora.org
Fixes: e1a58320a38d ("x86/mm: Warn on W^X mappings")
BugLink: https://launchpad.net/bugs/1769696
Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
Reported-by: Timur Tabi <timur@codeaurora.org>
Reported-by: Jan Glauber <jan.glauber@caviumnetworks.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Laura Abbott <labbott@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(backported from commit ae646f0b9ca135b87bc73ff606ef996c3029780a)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
---
 init/main.c     | 7 +++++++
 kernel/module.c | 5 +++++
 2 files changed, 12 insertions(+)

Comments

Kleber Sacilotto de Souza May 16, 2018, 2:21 p.m. UTC | #1
On 05/15/18 22:06, Manoj Iyer wrote:
> [...]

The backport looks correct and the fix has been tested.

Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

Stefan Bader May 23, 2018, 2:41 p.m. UTC | #2
On 15.05.2018 22:06, Manoj Iyer wrote:
> [...]
Acked-by: Stefan Bader <stefan.bader@canonical.com>


Patch

diff --git a/init/main.c b/init/main.c
index 49926d95442f..ae5ab5f339eb 100644
--- a/init/main.c
+++ b/init/main.c
@@ -939,6 +939,13 @@ static int __ref kernel_init(void *unused)
 	/* need to finish all async __init code before freeing the memory */
 	async_synchronize_full();
 	free_initmem();
+	/*
+	 * load_module() results in W+X mappings, which are cleaned up
+	 * with call_rcu_sched().  Let's make sure that queued work is
+	 * flushed so that we don't hit false positives looking for
+	 * insecure pages which are W+X.
+	 */
+	rcu_barrier_sched();
 	mark_rodata_ro();
 	system_state = SYSTEM_RUNNING;
 	numa_default_policy();
diff --git a/kernel/module.c b/kernel/module.c
index de7bf8c52fb3..e1b5320943ad 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -3311,6 +3311,11 @@ static noinline int do_init_module(struct module *mod)
 	 * walking this with preempt disabled.  In all the failure paths, we
 	 * call synchronize_sched(), but we don't want to slow down the success
 	 * path, so use actual RCU here.
+	 * Note that module_alloc() on most architectures creates W+X page
+	 * mappings which won't be cleaned up until do_free_init() runs.  Any
+	 * code such as mark_rodata_ro() which depends on those mappings to
+	 * be cleaned up needs to sync with the queued work - ie
+	 * rcu_barrier_sched()
 	 */
 	call_rcu_sched(&freeinit->rcu, do_free_init);
 	mutex_unlock(&module_mutex);