diff mbox series

[SRU,Trusty,Bionic,1/1] proc: restrict kernel stack dumps to root

Message ID 20181204022104.13665-2-khalid.elmously@canonical.com
State New
Headers show
Series Fix for CVE-2018-17972 | expand

Commit Message

Khalid Elmously Dec. 4, 2018, 2:21 a.m. UTC
From: Jann Horn <jannh@google.com>

CVE-2018-17972

Currently, you can use /proc/self/task/*/stack to cause a stack walk on
a task you control while it is running on another CPU.  That means that
the stack can change under the stack walker.  The stack walker does
have guards against going completely off the rails and into random
kernel memory, but it can interpret random data from your kernel stack
as instruction pointers and stack pointers.  This can cause exposure of
kernel stack contents to userspace.

Restrict the ability to inspect kernel stacks of arbitrary tasks to root
in order to prevent a local attacker from exploiting racy stack unwinding
to leak kernel task stack contents.  See the added comment for a longer
rationale.

There don't seem to be any users of this userspace API that can't
gracefully bail out if reading from the file fails.  Therefore, I believe
that this change is unlikely to break things.  In the case that this patch
does end up needing a revert, the next-best solution might be to fake a
single-entry stack based on wchan.

Link: http://lkml.kernel.org/r/20180927153316.200286-1-jannh@google.com
Fixes: 2ec220e27f50 ("proc: add /proc/*/stack")
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Ken Chen <kenchen@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(backported from f8a00cef17206ecd1b30d3d9f99e10d9fa707aa7)
[ kmously: Minor context adjustment ]
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
---
 fs/proc/base.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

Comments

Juerg Haefliger Dec. 4, 2018, 11:12 a.m. UTC | #1
On Mon,  3 Dec 2018 21:21:04 -0500
Khalid Elmously <khalid.elmously@canonical.com> wrote:

> From: Jann Horn <jannh@google.com>
> 
> CVE-2018-17972
> 
> Currently, you can use /proc/self/task/*/stack to cause a stack walk on
> a task you control while it is running on another CPU.  That means that
> the stack can change under the stack walker.  The stack walker does
> have guards against going completely off the rails and into random
> kernel memory, but it can interpret random data from your kernel stack
> as instruction pointers and stack pointers.  This can cause exposure of
> kernel stack contents to userspace.
> 
> Restrict the ability to inspect kernel stacks of arbitrary tasks to root
> in order to prevent a local attacker from exploiting racy stack unwinding
> to leak kernel task stack contents.  See the added comment for a longer
> rationale.
> 
> There don't seem to be any users of this userspace API that can't
> gracefully bail out if reading from the file fails.  Therefore, I believe
> that this change is unlikely to break things.  In the case that this patch
> does end up needing a revert, the next-best solution might be to fake a
> single-entry stack based on wchan.
> 
> Link: http://lkml.kernel.org/r/20180927153316.200286-1-jannh@google.com
> Fixes: 2ec220e27f50 ("proc: add /proc/*/stack")
> Signed-off-by: Jann Horn <jannh@google.com>
> Acked-by: Kees Cook <keescook@chromium.org>
> Cc: Alexey Dobriyan <adobriyan@gmail.com>
> Cc: Ken Chen <kenchen@google.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Laura Abbott <labbott@redhat.com>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Josh Poimboeuf <jpoimboe@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H . Peter Anvin" <hpa@zytor.com>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Empty line here please.

> (backported from f8a00cef17206ecd1b30d3d9f99e10d9fa707aa7)

(backported from commit ...

...Juerg


> [ kmously: Minor context adjustment ]
> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
> ---
>  fs/proc/base.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index 4dd9d5541088..7c798e95a69e 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -304,6 +304,20 @@ static int proc_pid_stack(struct seq_file *m, struct pid_namespace *ns,
>  	int err;
>  	int i;
>  
> +	/*
> +	 * The ability to racily run the kernel stack unwinder on a running task
> +	 * and then observe the unwinder output is scary; while it is useful for
> +	 * debugging kernel issues, it can also allow an attacker to leak kernel
> +	 * stack contents.
> +	 * Doing this in a manner that is at least safe from races would require
> +	 * some work to ensure that the remote task can not be scheduled; and
> +	 * even then, this would still expose the unwinder as local attack
> +	 * surface.
> +	 * Therefore, this interface is restricted to root.
> +	 */
> +	if (!file_ns_capable(m->file, &init_user_ns, CAP_SYS_ADMIN))
> +		return -EACCES;
> +
>  	entries = kmalloc(MAX_STACK_TRACE_DEPTH * sizeof(*entries), GFP_KERNEL);
>  	if (!entries)
>  		return -ENOMEM;
Po-Hsu Lin Dec. 17, 2018, 9:37 a.m. UTC | #2
Backport looks ok with changes limited to the original patch.
Test build OK.

I think we don't have a convention to keep an empty line before (backport ...),
don't we?

https://wiki.ubuntu.com/Kernel/Dev/StablePatchFormat

Acked-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Kleber Sacilotto de Souza Jan. 7, 2019, 3:27 p.m. UTC | #3
On 12/4/18 3:21 AM, Khalid Elmously wrote:
> From: Jann Horn <jannh@google.com>
>
> CVE-2018-17972
>
> Currently, you can use /proc/self/task/*/stack to cause a stack walk on
> a task you control while it is running on another CPU.  That means that
> the stack can change under the stack walker.  The stack walker does
> have guards against going completely off the rails and into random
> kernel memory, but it can interpret random data from your kernel stack
> as instruction pointers and stack pointers.  This can cause exposure of
> kernel stack contents to userspace.
>
> Restrict the ability to inspect kernel stacks of arbitrary tasks to root
> in order to prevent a local attacker from exploiting racy stack unwinding
> to leak kernel task stack contents.  See the added comment for a longer
> rationale.
>
> There don't seem to be any users of this userspace API that can't
> gracefully bail out if reading from the file fails.  Therefore, I believe
> that this change is unlikely to break things.  In the case that this patch
> does end up needing a revert, the next-best solution might be to fake a
> single-entry stack based on wchan.
>
> Link: http://lkml.kernel.org/r/20180927153316.200286-1-jannh@google.com
> Fixes: 2ec220e27f50 ("proc: add /proc/*/stack")
> Signed-off-by: Jann Horn <jannh@google.com>
> Acked-by: Kees Cook <keescook@chromium.org>
> Cc: Alexey Dobriyan <adobriyan@gmail.com>
> Cc: Ken Chen <kenchen@google.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Laura Abbott <labbott@redhat.com>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Josh Poimboeuf <jpoimboe@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H . Peter Anvin" <hpa@zytor.com>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> (backported from f8a00cef17206ecd1b30d3d9f99e10d9fa707aa7)

As Juerg pointed out:

(backported from commit ...)

This can be fixed when applying the patch.

> [ kmously: Minor context adjustment ]
> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

> ---
>  fs/proc/base.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
>
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index 4dd9d5541088..7c798e95a69e 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -304,6 +304,20 @@ static int proc_pid_stack(struct seq_file *m, struct pid_namespace *ns,
>  	int err;
>  	int i;
>  
> +	/*
> +	 * The ability to racily run the kernel stack unwinder on a running task
> +	 * and then observe the unwinder output is scary; while it is useful for
> +	 * debugging kernel issues, it can also allow an attacker to leak kernel
> +	 * stack contents.
> +	 * Doing this in a manner that is at least safe from races would require
> +	 * some work to ensure that the remote task can not be scheduled; and
> +	 * even then, this would still expose the unwinder as local attack
> +	 * surface.
> +	 * Therefore, this interface is restricted to root.
> +	 */
> +	if (!file_ns_capable(m->file, &init_user_ns, CAP_SYS_ADMIN))
> +		return -EACCES;
> +
>  	entries = kmalloc(MAX_STACK_TRACE_DEPTH * sizeof(*entries), GFP_KERNEL);
>  	if (!entries)
>  		return -ENOMEM;
diff mbox series

Patch

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 4dd9d5541088..7c798e95a69e 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -304,6 +304,20 @@  static int proc_pid_stack(struct seq_file *m, struct pid_namespace *ns,
 	int err;
 	int i;
 
+	/*
+	 * The ability to racily run the kernel stack unwinder on a running task
+	 * and then observe the unwinder output is scary; while it is useful for
+	 * debugging kernel issues, it can also allow an attacker to leak kernel
+	 * stack contents.
+	 * Doing this in a manner that is at least safe from races would require
+	 * some work to ensure that the remote task can not be scheduled; and
+	 * even then, this would still expose the unwinder as local attack
+	 * surface.
+	 * Therefore, this interface is restricted to root.
+	 */
+	if (!file_ns_capable(m->file, &init_user_ns, CAP_SYS_ADMIN))
+		return -EACCES;
+
 	entries = kmalloc(MAX_STACK_TRACE_DEPTH * sizeof(*entries), GFP_KERNEL);
 	if (!entries)
 		return -ENOMEM;