From patchwork Thu Apr 5 05:51:35 2018
X-Patchwork-Submitter: Tyler Hicks
X-Patchwork-Id: 895268
From: Tyler Hicks
To: kernel-team@lists.ubuntu.com
Subject: [PATCH 1/3] Revert "x86/mm: Only set IBPB when the new thread cannot ptrace current thread"
Date: Thu, 5 Apr 2018 05:51:35 +0000
Message-Id: <1522907497-14743-2-git-send-email-tyhicks@canonical.com>
In-Reply-To: <1522907497-14743-1-git-send-email-tyhicks@canonical.com>
References: <1522907497-14743-1-git-send-email-tyhicks@canonical.com>

BugLink: https://bugs.launchpad.net/bugs/1759920

This reverts commit 96d520d0fd4994643216f30fe91eea770ba934bc.

Using a ptrace access check in the middle of a task switch was causing a
hard lockup in some cases when the old task was confined by AppArmor. If
the AppArmor policy for the old task didn't allow the task to ptrace the
new task, AppArmor would attempt to emit an audit message and would then
deadlock on the task's pi_lock.

The fix is to revert this change and go with upstream's implementation,
which uses the task's dumpable state to determine whether IBPB should be
used.
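For reference, the upstream approach that this revert clears the way for
(and that patch 2 of this series backports) keys the barrier off the mm's
dumpable state and the last user mm's context id rather than a ptrace
access check. A rough sketch of that decision, folded into one helper for
readability -- the helper name and the last_ctx_id parameter are
illustrative; the real code open-codes this test in switch_mm_irqs_off():

#include <linux/mm_types.h>		/* struct mm_struct, mm->context.ctx_id */
#include <linux/sched.h>		/* struct task_struct */
#include <linux/sched/coredump.h>	/* get_dumpable(), SUID_DUMP_USER */

/*
 * Illustrative sketch only.  Nothing here calls into ptrace or LSM code,
 * so the context-switch path can no longer recurse into AppArmor auditing
 * the way the reverted ___ptrace_may_access() check could.
 */
static bool next_task_wants_ibpb(struct task_struct *tsk, u64 last_ctx_id)
{
	/* A NULL tsk or no mm means a kernel thread: no user/user IBPB needed. */
	if (!tsk || !tsk->mm)
		return false;

	/* Process A -> idle -> process A: same mm again, skip the barrier. */
	if (tsk->mm->context.ctx_id == last_ctx_id)
		return false;

	/* Only pay for the expensive IBPB when entering a non-dumpable task. */
	return get_dumpable(tsk->mm) != SUID_DUMP_USER;
}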
Signed-off-by: Tyler Hicks
---
 arch/x86/mm/tlb.c      |  5 +----
 include/linux/ptrace.h |  6 ------
 kernel/ptrace.c        | 18 ++++--------------
 3 files changed, 5 insertions(+), 24 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 6365f76..fdcfa70 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -6,7 +6,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
@@ -220,9 +219,7 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 		u16 new_asid;
 		bool need_flush;
 
-		/* Null tsk means switching to kernel, so that's safe */
-		if (ibpb_inuse && tsk &&
-		    ___ptrace_may_access(tsk, current, PTRACE_MODE_IBPB))
+		if (ibpb_inuse && boot_cpu_has(X86_FEATURE_SPEC_CTRL))
 			native_wrmsrl(MSR_IA32_PRED_CMD, FEATURE_SET_IBPB);
 
 		if (IS_ENABLED(CONFIG_VMAP_STACK)) {
diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index d6afefd..0e5fcc1 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -63,15 +63,12 @@ extern void exit_ptrace(struct task_struct *tracer, struct list_head *dead);
 #define PTRACE_MODE_NOAUDIT	0x04
 #define PTRACE_MODE_FSCREDS	0x08
 #define PTRACE_MODE_REALCREDS	0x10
-#define PTRACE_MODE_NOACCESS_CHK	0x20
 
 /* shorthands for READ/ATTACH and FSCREDS/REALCREDS combinations */
 #define PTRACE_MODE_READ_FSCREDS (PTRACE_MODE_READ | PTRACE_MODE_FSCREDS)
 #define PTRACE_MODE_READ_REALCREDS (PTRACE_MODE_READ | PTRACE_MODE_REALCREDS)
 #define PTRACE_MODE_ATTACH_FSCREDS (PTRACE_MODE_ATTACH | PTRACE_MODE_FSCREDS)
 #define PTRACE_MODE_ATTACH_REALCREDS (PTRACE_MODE_ATTACH | PTRACE_MODE_REALCREDS)
-#define PTRACE_MODE_IBPB (PTRACE_MODE_ATTACH | PTRACE_MODE_NOAUDIT \
-			  | PTRACE_MODE_NOACCESS_CHK | PTRACE_MODE_REALCREDS)
 
 /**
  * ptrace_may_access - check whether the caller is permitted to access
@@ -89,9 +86,6 @@ extern void exit_ptrace(struct task_struct *tracer, struct list_head *dead);
  */
 extern bool ptrace_may_access(struct task_struct *task, unsigned int mode);
 
-extern int ___ptrace_may_access(struct task_struct *cur, struct task_struct *task,
-				unsigned int mode);
-
 static inline int ptrace_reparented(struct task_struct *child)
 {
 	return !same_thread_group(child->real_parent, child->parent);
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index f2f0f1a..60f356d 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -268,10 +268,9 @@ static int ptrace_has_cap(struct user_namespace *ns, unsigned int mode)
 }
 
 /* Returns 0 on success, -errno on denial.
  */
-int ___ptrace_may_access(struct task_struct *cur, struct task_struct *task,
-			 unsigned int mode)
+static int __ptrace_may_access(struct task_struct *task, unsigned int mode)
 {
-	const struct cred *cred = __task_cred(cur), *tcred;
+	const struct cred *cred = current_cred(), *tcred;
 	struct mm_struct *mm;
 	kuid_t caller_uid;
 	kgid_t caller_gid;
@@ -291,7 +290,7 @@ int ___ptrace_may_access(struct task_struct *cur, struct task_struct *task,
 	 */
 
 	/* Don't let security modules deny introspection */
-	if (same_thread_group(task, cur))
+	if (same_thread_group(task, current))
 		return 0;
 	rcu_read_lock();
 	if (mode & PTRACE_MODE_FSCREDS) {
@@ -329,16 +328,7 @@ int ___ptrace_may_access(struct task_struct *cur, struct task_struct *task,
 		     !ptrace_has_cap(mm->user_ns, mode)))
 		return -EPERM;
 
-	if (!(mode & PTRACE_MODE_NOACCESS_CHK))
-		return security_ptrace_access_check(task, mode);
-
-	return 0;
-}
-EXPORT_SYMBOL_GPL(___ptrace_may_access);
-
-static int __ptrace_may_access(struct task_struct *task, unsigned int mode)
-{
-	return ___ptrace_may_access(current, task, mode);
+	return security_ptrace_access_check(task, mode);
 }
 
 bool ptrace_may_access(struct task_struct *task, unsigned int mode)

From patchwork Thu Apr 5 05:51:36 2018
X-Patchwork-Submitter: Tyler Hicks
X-Patchwork-Id: 895269
From: Tyler Hicks
To: kernel-team@lists.ubuntu.com
Subject: [PATCH 2/3] x86/speculation: Use Indirect Branch Prediction Barrier in context switch
Date: Thu, 5 Apr 2018 05:51:36 +0000
Message-Id: <1522907497-14743-3-git-send-email-tyhicks@canonical.com>
In-Reply-To: <1522907497-14743-1-git-send-email-tyhicks@canonical.com>
References: <1522907497-14743-1-git-send-email-tyhicks@canonical.com>

From: Tim Chen

CVE-2017-5715 (Spectre v2 Intel)

Flush indirect branches when switching into a process that marked itself
non dumpable.
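As an aside, "marked itself non dumpable" refers to the mm's dumpable
flag. A process can set that flag directly from userspace with prctl(2);
the snippet below is illustrative only and not part of this patch
(high-value tools such as gpg may instead end up non-dumpable via setuid
or their own hardening):

#include <stdio.h>
#include <sys/prctl.h>

int main(void)
{
	/*
	 * Clear the dumpable flag: get_dumpable() on this mm then stops
	 * returning SUID_DUMP_USER, so with this patch applied the kernel
	 * issues an IBPB when context-switching into this process.
	 */
	if (prctl(PR_SET_DUMPABLE, 0, 0, 0, 0) != 0) {
		perror("prctl(PR_SET_DUMPABLE)");
		return 1;
	}

	/* ... work with long-lived secrets here ... */
	return 0;
}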
This protects high value processes like gpg better, without having too
high a performance overhead.

If done naïvely, we could switch to a kernel idle thread and then back to
the original process, such as:

    process A -> idle -> process A

In such a scenario, we do not have to do IBPB here even though the process
is non-dumpable, as we are switching back to the same process after a
hiatus.

To avoid the redundant IBPB, which is expensive, we track the last mm user
context ID. The cost is an extra u64 mm context id to track the last mm we
were using before switching to the init_mm used by idle. Avoiding the
extra IBPB is probably worth the extra memory for this common scenario.

For those cases where tlb_defer_switch_to_init_mm() returns true
(non-PCID), lazy tlb will defer the switch to init_mm, so we will not be
changing the mm for the process A -> idle -> process A switch. So IBPB
will be skipped for this case.

Thanks to the reviewers and Andy Lutomirski for the suggestion of using
ctx_id, which got rid of the problem of mm pointer recycling.

Signed-off-by: Tim Chen
Signed-off-by: David Woodhouse
Signed-off-by: Thomas Gleixner
Cc: ak@linux.intel.com
Cc: karahmed@amazon.de
Cc: arjan@linux.intel.com
Cc: torvalds@linux-foundation.org
Cc: linux@dominikbrodowski.net
Cc: peterz@infradead.org
Cc: bp@alien8.de
Cc: luto@kernel.org
Cc: pbonzini@redhat.com
Cc: gregkh@linux-foundation.org
Link: https://lkml.kernel.org/r/1517263487-3708-1-git-send-email-dwmw@amazon.co.uk
(backported from commit 18bf3c3ea8ece8f03b6fc58508f2dfd23c7711c7)
Signed-off-by: Tyler Hicks
---
 arch/x86/include/asm/tlbflush.h |  2 ++
 arch/x86/mm/tlb.c               | 33 +++++++++++++++++++++++++++++++--
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 037c4b6..2c75b76 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -164,6 +164,8 @@ struct tlb_state {
 	struct mm_struct *loaded_mm;
 	u16 loaded_mm_asid;
 	u16 next_asid;
+	/* last user mm's ctx id */
+	u64 last_ctx_id;
 
 	/*
 	 * We can be in one of several states:
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index fdcfa70..20c1afb 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -6,14 +6,15 @@
 #include
 #include
 #include
+#include
 #include
 #include
+#include
 #include
 #include
 #include
 #include
-#include
 
 /*
  * TLB flushing, formerly SMP-only
@@ -218,8 +219,27 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 	} else {
 		u16 new_asid;
 		bool need_flush;
+		u64 last_ctx_id = this_cpu_read(cpu_tlbstate.last_ctx_id);
 
-		if (ibpb_inuse && boot_cpu_has(X86_FEATURE_SPEC_CTRL))
+		/*
+		 * Avoid user/user BTB poisoning by flushing the branch
+		 * predictor when switching between processes. This stops
+		 * one process from doing Spectre-v2 attacks on another.
+		 *
+		 * As an optimization, flush indirect branches only when
+		 * switching into processes that disable dumping. This
+		 * protects high value processes like gpg, without having
+		 * too high performance overhead. IBPB is *expensive*!
+		 *
+		 * This will not flush branches when switching into kernel
+		 * threads. It will also not flush if we switch to idle
+		 * thread and back to the same process. It will flush if we
+		 * switch to a different non-dumpable process.
+		 */
+		if (tsk && tsk->mm &&
+		    tsk->mm->context.ctx_id != last_ctx_id &&
+		    get_dumpable(tsk->mm) != SUID_DUMP_USER &&
+		    ibpb_inuse && boot_cpu_has(X86_FEATURE_SPEC_CTRL))
 			native_wrmsrl(MSR_IA32_PRED_CMD, FEATURE_SET_IBPB);
 
 		if (IS_ENABLED(CONFIG_VMAP_STACK)) {
@@ -270,6 +290,14 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 			trace_tlb_flush_rcuidle(TLB_FLUSH_ON_TASK_SWITCH, 0);
 	}
 
+	/*
+	 * Record last user mm's context id, so we can avoid
+	 * flushing branch buffer with IBPB if we switch back
+	 * to the same user.
+	 */
+	if (next != &init_mm)
+		this_cpu_write(cpu_tlbstate.last_ctx_id, next->context.ctx_id);
+
 	this_cpu_write(cpu_tlbstate.loaded_mm, next);
 	this_cpu_write(cpu_tlbstate.loaded_mm_asid, new_asid);
 }
@@ -344,6 +372,7 @@ void initialize_tlbstate_and_flush(void)
 	write_cr3(build_cr3(mm->pgd, 0));
 
 	/* Reinitialize tlbstate. */
+	this_cpu_write(cpu_tlbstate.last_ctx_id, mm->context.ctx_id);
 	this_cpu_write(cpu_tlbstate.loaded_mm_asid, 0);
 	this_cpu_write(cpu_tlbstate.next_asid, 1);
 	this_cpu_write(cpu_tlbstate.ctxs[0].ctx_id, mm->context.ctx_id);

From patchwork Thu Apr 5 05:51:37 2018
X-Patchwork-Submitter: Tyler Hicks
X-Patchwork-Id: 895270
From: Tyler Hicks
To: kernel-team@lists.ubuntu.com
Subject: [PATCH 3/3] x86/mm: Reinitialize TLB state on hotplug and resume
Date: Thu, 5 Apr 2018 05:51:37 +0000
Message-Id: <1522907497-14743-4-git-send-email-tyhicks@canonical.com>
In-Reply-To: <1522907497-14743-1-git-send-email-tyhicks@canonical.com>
References: <1522907497-14743-1-git-send-email-tyhicks@canonical.com>

From: Andy Lutomirski

CVE-2017-5754

When Linux brings a CPU down and back up, it switches to init_mm and then
loads swapper_pg_dir into CR3. With PCID enabled, this has the side effect
of masking off the ASID bits in CR3.

This can result in some confusion in the TLB handling code.
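To make "masking off the ASID bits" concrete: with CR4.PCIDE set, CR3 bits
11:0 select the PCID (which is how the kernel's ASID reaches the hardware)
and the upper bits hold the page-table root. A simplified sketch, assuming
that layout -- sketch_build_cr3() and CR3_ASID_MASK are illustrative
names, not the kernel's actual build_cr3() helper, which may also offset
the ASID value:

/* Illustrative only; not the kernel's real CR3 helpers. */
#define CR3_ASID_MASK	0xfffUL

static unsigned long sketch_build_cr3(unsigned long pgd_pa, unsigned long asid)
{
	/* Page-table root in the upper bits, PCID/ASID in bits 11:0. */
	return pgd_pa | (asid & CR3_ASID_MASK);
}

/*
 * Loading swapper_pg_dir's bare physical address is therefore equivalent
 * to selecting ASID 0, while cpu_tlbstate still records whatever ASID was
 * live before the CPU went down -- hardware and bookkeeping now disagree.
 */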
If we bring a CPU down and back up with any ASID other than 0, we end up
with the wrong ASID active on the CPU after resume. This could cause our
internal state to become corrupt, although major corruption is unlikely
because init_mm doesn't have any user pages. More obviously, if
CONFIG_DEBUG_VM=y, we'll trip over an assertion in the next context
switch. The result of *that* is a failure to resume from suspend with
probability 1 - 1/6^(cpus-1).

Fix it by reinitializing cpu_tlbstate on resume and CPU bringup.

Reported-by: Linus Torvalds
Reported-by: Jiri Kosina
Fixes: 10af6235e0d3 ("x86/mm: Implement PCID based optimization: try to preserve old TLB entries using PCID")
Signed-off-by: Andy Lutomirski
Signed-off-by: Linus Torvalds
(backported from commit 72c0098d92cedb11c7e0151e84918840a4e96b31)
[tyhicks: initialize_tlbstate_and_flush() was added in 72be211ba]
Signed-off-by: Tyler Hicks
---
 arch/x86/include/asm/tlbflush.h | 2 ++
 arch/x86/kernel/cpu/common.c    | 2 ++
 arch/x86/power/cpu.c            | 1 +
 3 files changed, 5 insertions(+)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 2c75b76..c3427e9 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -331,6 +331,8 @@ static inline void invalidate_user_asid(u16 asid)
 		  (unsigned long *)this_cpu_ptr(&cpu_tlbstate.user_pcid_flush_mask));
 }
 
+extern void initialize_tlbstate_and_flush(void);
+
 /*
  * flush the entire current user mapping
  */
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 01abbf6..0fc65de 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1563,6 +1563,7 @@ void cpu_init(void)
 	mmgrab(&init_mm);
 	me->active_mm = &init_mm;
 	BUG_ON(me->mm);
+	initialize_tlbstate_and_flush();
 	enter_lazy_tlb(&init_mm, me);
 
 	/*
@@ -1620,6 +1621,7 @@ void cpu_init(void)
 	mmgrab(&init_mm);
 	curr->active_mm = &init_mm;
 	BUG_ON(curr->mm);
+	initialize_tlbstate_and_flush();
 	enter_lazy_tlb(&init_mm, curr);
 
 	/*
diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c
index 2a717e0..e90f1c7 100644
--- a/arch/x86/power/cpu.c
+++ b/arch/x86/power/cpu.c
@@ -183,6 +183,7 @@ static void fix_processor_context(void)
 #endif
 	load_TR_desc();				/* This does ltr */
 	load_mm_ldt(current->active_mm);	/* This does lldt */
+	initialize_tlbstate_and_flush();
 
 	fpu__resume_cpu();