From patchwork Wed Jan 13 21:09:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Berg X-Patchwork-Id: 1426033 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=lists.infradead.org (client-ip=2001:8b0:10b:1231::1; helo=merlin.infradead.org; envelope-from=linux-um-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=sipsolutions.net Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; secure) header.d=lists.infradead.org header.i=@lists.infradead.org header.a=rsa-sha256 header.s=merlin.20170209 header.b=pELbyVow; dkim-atps=neutral Received: from merlin.infradead.org (merlin.infradead.org [IPv6:2001:8b0:10b:1231::1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4DGKpN5Qv1z9sWF for ; Thu, 14 Jan 2021 08:10:04 +1100 (AEDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=merlin.20170209; h=Sender:Content-Transfer-Encoding: Content-Type:Cc:List-Subscribe:List-Help:List-Post:List-Archive: List-Unsubscribe:List-Id:MIME-Version:Message-Id:Date:Subject:To:From: Reply-To:Content-ID:Content-Description:Resent-Date:Resent-From:Resent-Sender :Resent-To:Resent-Cc:Resent-Message-ID:In-Reply-To:References:List-Owner; bh=sU71tS0Aiwo7CEJV8Qw39iSyNK4Leid+bT7TFLBQ3bc=; b=pELbyVowm04yuUgzslD+7o3GFj PWIV2GlaE7uNNI7gqnynqTOwGrk8b+7w9L7G1CNgqQM1J8c8YyAzGRpB6GAZhwOIh3EzJrzZSp2zY Agd8xHLo3dcAubCqmNzOe6gGHpwDr5TAcHVVRfs1PQmoDseNHFMtI4hpeFZt1U2oT2F54PpKT+9ti eod35bEZmcmmTB2+v4/JLKGSrTnmgigu4HPFrjmrt9hi/kjgnuBaLz1osOkUstafuqroAsV9/9Nin eWq0eDtCBOmErx5rlB56TVazLpSVKcNEtuG/B6KZOqv02PuGSsnOxeYUDmHeS17A52NJK/4FWmoLj 9lib0Lzw==; Received: from localhost ([::1] helo=merlin.infradead.org) by merlin.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1kznOu-0000nR-EQ; Wed, 13 Jan 2021 21:09:56 +0000 Received: from s3.sipsolutions.net ([2a01:4f8:191:4433::2] helo=sipsolutions.net) by merlin.infradead.org with esmtps (Exim 4.92.3 #3 (Red Hat Linux)) id 1kznOr-0000ld-Ew for linux-um@lists.infradead.org; Wed, 13 Jan 2021 21:09:55 +0000 Received: by sipsolutions.net with esmtpsa (TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim 4.94) (envelope-from ) id 1kznOp-005vHf-SI; Wed, 13 Jan 2021 22:09:51 +0100 From: Johannes Berg To: linux-um@lists.infradead.org Subject: [PATCH 1/3] um: separate child and parent errors in clone stub Date: Wed, 13 Jan 2021 22:09:42 +0100 Message-Id: <20210113220944.7732f6bfd3bb.Ib87c91b49d57d27314cf444696273da6d8463e9c@changeid> X-Mailer: git-send-email 2.26.2 MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20210113_160953_574106_B1A20998 X-CRM114-Status: GOOD ( 16.66 ) X-Spam-Score: 0.4 (/) X-Spam-Report: SpamAssassin version 3.4.4 on merlin.infradead.org summary: Content analysis details: (0.4 points) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 SPF_NONE SPF: sender does not publish an SPF Record 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record 0.4 KHOP_HELO_FCRDNS Relay HELO differs from its IP's reverse DNS X-BeenThere: linux-um@lists.infradead.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-arch@vger.kernel.org, Johannes Berg Sender: "linux-um" Errors-To: linux-um-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org From: Johannes Berg If the two are mixed up, then it looks as though the parent returned an error if the child failed (before) the mmap(), and then the resulting process never gets killed. Fix this by splitting the child and parent errors, reporting and using them appropriately. Signed-off-by: Johannes Berg --- arch/um/include/shared/skas/stub-data.h | 2 +- arch/um/kernel/skas/clone.c | 25 +++++++++++-------------- arch/um/os-Linux/skas/process.c | 23 +++++++++++++---------- arch/x86/um/shared/sysdep/stub_32.h | 2 +- arch/x86/um/shared/sysdep/stub_64.h | 2 +- 5 files changed, 27 insertions(+), 27 deletions(-) diff --git a/arch/um/include/shared/skas/stub-data.h b/arch/um/include/shared/skas/stub-data.h index 6b01d97a9386..5e3ade3fb38b 100644 --- a/arch/um/include/shared/skas/stub-data.h +++ b/arch/um/include/shared/skas/stub-data.h @@ -11,7 +11,7 @@ struct stub_data { unsigned long offset; int fd; - long err; + long parent_err, child_err; }; #endif diff --git a/arch/um/kernel/skas/clone.c b/arch/um/kernel/skas/clone.c index bfb70c456b30..7c592c788cbf 100644 --- a/arch/um/kernel/skas/clone.c +++ b/arch/um/kernel/skas/clone.c @@ -24,29 +24,26 @@ void __attribute__ ((__section__ (".__syscall_stub"))) stub_clone_handler(void) { - struct stub_data *data = (struct stub_data *) STUB_DATA; + int stack; + struct stub_data *data = (void *) ((unsigned long)&stack & ~(UM_KERN_PAGE_SIZE - 1)); long err; err = stub_syscall2(__NR_clone, CLONE_PARENT | CLONE_FILES | SIGCHLD, - STUB_DATA + UM_KERN_PAGE_SIZE / 2 - sizeof(void *)); - if (err != 0) - goto out; + (unsigned long)data + UM_KERN_PAGE_SIZE / 2 - sizeof(void *)); + if (err) { + data->parent_err = err; + goto done; + } err = stub_syscall4(__NR_ptrace, PTRACE_TRACEME, 0, 0, 0); - if (err) - goto out; + if (err) { + data->child_err = err; + goto done; + } remap_stack(data->fd, data->offset); goto done; - out: - /* - * save current result. - * Parent: pid; - * child: retcode of mmap already saved and it jumps around this - * assignment - */ - data->err = err; done: trap_myself(); } diff --git a/arch/um/os-Linux/skas/process.c b/arch/um/os-Linux/skas/process.c index d910e25c273e..623b0aeadf4c 100644 --- a/arch/um/os-Linux/skas/process.c +++ b/arch/um/os-Linux/skas/process.c @@ -545,8 +545,14 @@ int copy_context_skas0(unsigned long new_stack, int pid) * and child's mmap2 calls */ *data = ((struct stub_data) { - .offset = MMAP_OFFSET(new_offset), - .fd = new_fd + .offset = MMAP_OFFSET(new_offset), + .fd = new_fd, + .parent_err = -ESRCH, + .child_err = 0, + }); + + *child_data = ((struct stub_data) { + .child_err = -ESRCH, }); err = ptrace_setregs(pid, thread_regs); @@ -564,9 +570,6 @@ int copy_context_skas0(unsigned long new_stack, int pid) return err; } - /* set a well known return code for detection of child write failure */ - child_data->err = 12345678; - /* * Wait, until parent has finished its work: read child's pid from * parent's stack, and check, if bad result. @@ -581,7 +584,7 @@ int copy_context_skas0(unsigned long new_stack, int pid) wait_stub_done(pid); - pid = data->err; + pid = data->parent_err; if (pid < 0) { printk(UM_KERN_ERR "copy_context_skas0 - stub-parent reports " "error %d\n", -pid); @@ -593,10 +596,10 @@ int copy_context_skas0(unsigned long new_stack, int pid) * child's stack and check it. */ wait_stub_done(pid); - if (child_data->err != STUB_DATA) { - printk(UM_KERN_ERR "copy_context_skas0 - stub-child reports " - "error %ld\n", child_data->err); - err = child_data->err; + if (child_data->child_err != STUB_DATA) { + printk(UM_KERN_ERR "copy_context_skas0 - stub-child %d reports " + "error %ld\n", pid, data->child_err); + err = data->child_err; goto out_kill; } diff --git a/arch/x86/um/shared/sysdep/stub_32.h b/arch/x86/um/shared/sysdep/stub_32.h index 51fd256c75f0..8ea69211e53c 100644 --- a/arch/x86/um/shared/sysdep/stub_32.h +++ b/arch/x86/um/shared/sysdep/stub_32.h @@ -86,7 +86,7 @@ static inline void remap_stack(int fd, unsigned long offset) "d" (PROT_READ | PROT_WRITE), "S" (MAP_FIXED | MAP_SHARED), "D" (fd), "a" (offset), - "i" (&((struct stub_data *) STUB_DATA)->err) + "i" (&((struct stub_data *) STUB_DATA)->child_err) : "memory"); } diff --git a/arch/x86/um/shared/sysdep/stub_64.h b/arch/x86/um/shared/sysdep/stub_64.h index 994df93c5ed3..b7b8b8e4359d 100644 --- a/arch/x86/um/shared/sysdep/stub_64.h +++ b/arch/x86/um/shared/sysdep/stub_64.h @@ -92,7 +92,7 @@ static inline void remap_stack(long fd, unsigned long offset) "d" (PROT_READ | PROT_WRITE), "g" (MAP_FIXED | MAP_SHARED), "g" (fd), "g" (offset), - "i" (&((struct stub_data *) STUB_DATA)->err) + "i" (&((struct stub_data *) STUB_DATA)->child_err) : __syscall_clobber, "r10", "r8", "r9" ); } From patchwork Wed Jan 13 21:09:43 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Berg X-Patchwork-Id: 1426034 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=lists.infradead.org (client-ip=2001:8b0:10b:1231::1; helo=merlin.infradead.org; envelope-from=linux-um-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=sipsolutions.net Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; secure) header.d=lists.infradead.org header.i=@lists.infradead.org header.a=rsa-sha256 header.s=merlin.20170209 header.b=fIMahFhO; dkim-atps=neutral Received: from merlin.infradead.org (merlin.infradead.org [IPv6:2001:8b0:10b:1231::1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4DGKpS4tyYz9sWF for ; Thu, 14 Jan 2021 08:10:08 +1100 (AEDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=merlin.20170209; h=Sender:Content-Transfer-Encoding: Content-Type:Cc:List-Subscribe:List-Help:List-Post:List-Archive: List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To:Message-Id:Date: Subject:To:From:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=zzjAN3Pb/UGQTuPxpgkgLj7UxAzE0S31WFVfRAiMzd4=; b=fIMahFhOKyezjlQHOWb/+N1Y6 KToRYqohqZSTFlk/Sj5OhAZnMiUxYIxgBLAkE1FGyM3AyNw32UC+eaj8DZE3UNEn54go9TaoQwhq9 mjuZUKqpzTOkdbjehRZC4yaAgQohycuGt4rMQOhQzSCr6rXDp5KOT3kZJfEGfCqf4Yprz6CBmMRZA lQs3AuEJLSbaEcf0YaTL7JCjuKrRxrfCMqEubgQ0Prp6QDNDCkw/J0aE1OwBXjLjGAAg9tcl9FcKJ cfGPxWuTU3PnzUOblSd1T8fCWQtpdOHGvvCRfFYeSTRdzlkR0RnXHP/+fzo1xqyv+swTgDrU9MFFF xWZlkaEpg==; Received: from localhost ([::1] helo=merlin.infradead.org) by merlin.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1kznOu-0000na-TP; Wed, 13 Jan 2021 21:09:56 +0000 Received: from s3.sipsolutions.net ([2a01:4f8:191:4433::2] helo=sipsolutions.net) by merlin.infradead.org with esmtps (Exim 4.92.3 #3 (Red Hat Linux)) id 1kznOr-0000ls-OA for linux-um@lists.infradead.org; Wed, 13 Jan 2021 21:09:55 +0000 Received: by sipsolutions.net with esmtpsa (TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim 4.94) (envelope-from ) id 1kznOq-005vHf-5i; Wed, 13 Jan 2021 22:09:52 +0100 From: Johannes Berg To: linux-um@lists.infradead.org Subject: [PATCH 2/3] um: rework userspace stubs to not hard-code stub location Date: Wed, 13 Jan 2021 22:09:43 +0100 Message-Id: <20210113220944.5a2ed3c1e619.I08d3297d2a88fda957e5139b9803a20bc600b0eb@changeid> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210113220944.7732f6bfd3bb.Ib87c91b49d57d27314cf444696273da6d8463e9c@changeid> References: <20210113220944.7732f6bfd3bb.Ib87c91b49d57d27314cf444696273da6d8463e9c@changeid> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20210113_160953_981040_2489DDA7 X-CRM114-Status: GOOD ( 19.54 ) X-Spam-Score: 0.4 (/) X-Spam-Report: SpamAssassin version 3.4.4 on merlin.infradead.org summary: Content analysis details: (0.4 points) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 SPF_NONE SPF: sender does not publish an SPF Record 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record 0.4 KHOP_HELO_FCRDNS Relay HELO differs from its IP's reverse DNS X-BeenThere: linux-um@lists.infradead.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-arch@vger.kernel.org, Johannes Berg Sender: "linux-um" Errors-To: linux-um-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org From: Johannes Berg The userspace stacks mostly have a stack (and in the case of the syscall stub we can just set their stack pointer) that points to the location of the stub data page already. Rework the stubs to use the stack pointer to derive the start of the data page, rather than requiring it to be hard-coded. In the clone stub, also integrate the int3 into the stack remap, since we really must not use the stack while we remap it. This prepares for putting the stub at a variable location that's not part of the normal address space of the userspace processes running inside the UML machine. Signed-off-by: Johannes Berg --- arch/um/include/shared/as-layout.h | 16 +++-------- arch/um/include/shared/common-offsets.h | 6 +++++ arch/um/kernel/skas/clone.c | 3 +-- arch/um/os-Linux/skas/mem.c | 2 ++ arch/x86/um/shared/sysdep/stub_32.h | 33 +++++++++++++++-------- arch/x86/um/shared/sysdep/stub_64.h | 36 ++++++++++++++++--------- arch/x86/um/stub_32.S | 17 +++++++----- arch/x86/um/stub_64.S | 5 ++-- arch/x86/um/stub_segv.c | 5 ++-- 9 files changed, 75 insertions(+), 48 deletions(-) diff --git a/arch/um/include/shared/as-layout.h b/arch/um/include/shared/as-layout.h index 5f286ef2721b..56408bf3480d 100644 --- a/arch/um/include/shared/as-layout.h +++ b/arch/um/include/shared/as-layout.h @@ -20,18 +20,10 @@ * 'UL' and other type specifiers unilaterally. We * use the following macros to deal with this. */ - -#ifdef __ASSEMBLY__ -#define _UML_AC(X, Y) (Y) -#else -#define __UML_AC(X, Y) (X(Y)) -#define _UML_AC(X, Y) __UML_AC(X, Y) -#endif - -#define STUB_START _UML_AC(, 0x100000) -#define STUB_CODE _UML_AC((unsigned long), STUB_START) -#define STUB_DATA _UML_AC((unsigned long), STUB_CODE + UM_KERN_PAGE_SIZE) -#define STUB_END _UML_AC((unsigned long), STUB_DATA + UM_KERN_PAGE_SIZE) +#define STUB_START 0x100000UL +#define STUB_CODE STUB_START +#define STUB_DATA (STUB_CODE + UM_KERN_PAGE_SIZE) +#define STUB_END (STUB_DATA + UM_KERN_PAGE_SIZE) #ifndef __ASSEMBLY__ diff --git a/arch/um/include/shared/common-offsets.h b/arch/um/include/shared/common-offsets.h index 16a51a8c800f..edc90ab73734 100644 --- a/arch/um/include/shared/common-offsets.h +++ b/arch/um/include/shared/common-offsets.h @@ -1,5 +1,6 @@ /* SPDX-License-Identifier: GPL-2.0 */ /* for use by sys-$SUBARCH/kernel-offsets.c */ +#include DEFINE(KERNEL_MADV_REMOVE, MADV_REMOVE); @@ -43,3 +44,8 @@ DEFINE(UML_CONFIG_64BIT, CONFIG_64BIT); #ifdef CONFIG_UML_TIME_TRAVEL_SUPPORT DEFINE(UML_CONFIG_UML_TIME_TRAVEL_SUPPORT, CONFIG_UML_TIME_TRAVEL_SUPPORT); #endif + +/* for stub */ +DEFINE(UML_STUB_FIELD_OFFSET, offsetof(struct stub_data, offset)); +DEFINE(UML_STUB_FIELD_CHILD_ERR, offsetof(struct stub_data, child_err)); +DEFINE(UML_STUB_FIELD_FD, offsetof(struct stub_data, fd)); diff --git a/arch/um/kernel/skas/clone.c b/arch/um/kernel/skas/clone.c index 7c592c788cbf..592cdb138441 100644 --- a/arch/um/kernel/skas/clone.c +++ b/arch/um/kernel/skas/clone.c @@ -41,8 +41,7 @@ stub_clone_handler(void) goto done; } - remap_stack(data->fd, data->offset); - goto done; + remap_stack_and_trap(); done: trap_myself(); diff --git a/arch/um/os-Linux/skas/mem.c b/arch/um/os-Linux/skas/mem.c index c546d16f8dfe..3b4975ee67e2 100644 --- a/arch/um/os-Linux/skas/mem.c +++ b/arch/um/os-Linux/skas/mem.c @@ -40,6 +40,8 @@ static int __init init_syscall_regs(void) syscall_regs[REGS_IP_INDEX] = STUB_CODE + ((unsigned long) batch_syscall_stub - (unsigned long) __syscall_stub_start); + syscall_regs[REGS_SP_INDEX] = STUB_DATA; + return 0; } diff --git a/arch/x86/um/shared/sysdep/stub_32.h b/arch/x86/um/shared/sysdep/stub_32.h index 8ea69211e53c..c3891c1ada26 100644 --- a/arch/x86/um/shared/sysdep/stub_32.h +++ b/arch/x86/um/shared/sysdep/stub_32.h @@ -7,8 +7,8 @@ #define __SYSDEP_STUB_H #include +#include -#define STUB_SYSCALL_RET EAX #define STUB_MMAP_NR __NR_mmap2 #define MMAP_OFFSET(o) ((o) >> UM_KERN_PAGE_SHIFT) @@ -77,17 +77,28 @@ static inline void trap_myself(void) __asm("int3"); } -static inline void remap_stack(int fd, unsigned long offset) +static void inline remap_stack_and_trap(void) { - __asm__ volatile ("movl %%eax,%%ebp ; movl %0,%%eax ; int $0x80 ;" - "movl %7, %%ebx ; movl %%eax, (%%ebx)" - : : "g" (STUB_MMAP_NR), "b" (STUB_DATA), - "c" (UM_KERN_PAGE_SIZE), - "d" (PROT_READ | PROT_WRITE), - "S" (MAP_FIXED | MAP_SHARED), "D" (fd), - "a" (offset), - "i" (&((struct stub_data *) STUB_DATA)->child_err) - : "memory"); + __asm__ volatile ( + "movl %%esp,%%ebx ;" + "andl %0,%%ebx ;" + "movl %1,%%eax ;" + "movl %%ebx,%%edi ; addl %2,%%edi ; movl (%%edi),%%edi ;" + "movl %%ebx,%%ebp ; addl %3,%%ebp ; movl (%%ebp),%%ebp ;" + "int $0x80 ;" + "addl %4,%%ebx ; movl %%eax, (%%ebx) ;" + "int $3" + : : + "g" (~(UM_KERN_PAGE_SIZE - 1)), + "g" (STUB_MMAP_NR), + "g" (UML_STUB_FIELD_FD), + "g" (UML_STUB_FIELD_OFFSET), + "g" (UML_STUB_FIELD_CHILD_ERR), + "c" (UM_KERN_PAGE_SIZE), + "d" (PROT_READ | PROT_WRITE), + "S" (MAP_FIXED | MAP_SHARED) + : + "memory"); } #endif diff --git a/arch/x86/um/shared/sysdep/stub_64.h b/arch/x86/um/shared/sysdep/stub_64.h index b7b8b8e4359d..6e2626b77a2e 100644 --- a/arch/x86/um/shared/sysdep/stub_64.h +++ b/arch/x86/um/shared/sysdep/stub_64.h @@ -7,8 +7,8 @@ #define __SYSDEP_STUB_H #include +#include -#define STUB_SYSCALL_RET PT_INDEX(RAX) #define STUB_MMAP_NR __NR_mmap #define MMAP_OFFSET(o) (o) @@ -82,18 +82,30 @@ static inline void trap_myself(void) __asm("int3"); } -static inline void remap_stack(long fd, unsigned long offset) +static inline void remap_stack_and_trap(void) { - __asm__ volatile ("movq %4,%%r10 ; movq %5,%%r8 ; " - "movq %6, %%r9; " __syscall "; movq %7, %%rbx ; " - "movq %%rax, (%%rbx)": - : "a" (STUB_MMAP_NR), "D" (STUB_DATA), - "S" (UM_KERN_PAGE_SIZE), - "d" (PROT_READ | PROT_WRITE), - "g" (MAP_FIXED | MAP_SHARED), "g" (fd), - "g" (offset), - "i" (&((struct stub_data *) STUB_DATA)->child_err) - : __syscall_clobber, "r10", "r8", "r9" ); + __asm__ volatile ( + "movq %0,%%rax ;" + "movq %%rsp,%%rdi ;" + "andq %1,%%rdi ;" + "movq %2,%%r10 ;" + "movq %%rdi,%%r8 ; addq %3,%%r8 ; movq (%%r8),%%r8 ;" + "movq %%rdi,%%r9 ; addq %4,%%r9 ; movq (%%r9),%%r9 ;" + __syscall ";" + "movq %%rsp,%%rdi ; andq %1,%%rdi ;" + "addq %5,%%rdi ; movq %%rax, (%%rdi) ;" + "int3" + : : + "g" (STUB_MMAP_NR), + "g" (~(UM_KERN_PAGE_SIZE - 1)), + "g" (MAP_FIXED | MAP_SHARED), + "g" (UML_STUB_FIELD_FD), + "g" (UML_STUB_FIELD_OFFSET), + "g" (UML_STUB_FIELD_CHILD_ERR), + "S" (UM_KERN_PAGE_SIZE), + "d" (PROT_READ | PROT_WRITE) + : + __syscall_clobber, "r10", "r8", "r9"); } #endif diff --git a/arch/x86/um/stub_32.S b/arch/x86/um/stub_32.S index a193e88536a9..8291899e6aaf 100644 --- a/arch/x86/um/stub_32.S +++ b/arch/x86/um/stub_32.S @@ -5,21 +5,22 @@ .globl batch_syscall_stub batch_syscall_stub: - /* load pointer to first operation */ - mov $(STUB_DATA+8), %esp - + /* %esp comes in as "top of page" */ + mov %esp, %ecx + /* %esp has pointer to first operation */ + add $8, %esp again: /* load length of additional data */ mov 0x0(%esp), %eax /* if(length == 0) : end of list */ /* write possible 0 to header */ - mov %eax, STUB_DATA+4 + mov %eax, 0x4(%ecx) cmpl $0, %eax jz done /* save current pointer */ - mov %esp, STUB_DATA+4 + mov %esp, 0x4(%ecx) /* skip additional data */ add %eax, %esp @@ -38,6 +39,10 @@ again: /* execute syscall */ int $0x80 + /* restore top of page pointer in %ecx */ + mov %esp, %ecx + andl $(~UM_KERN_PAGE_SIZE) + 1, %ecx + /* check return value */ pop %ebx cmp %ebx, %eax @@ -45,7 +50,7 @@ again: done: /* save return value */ - mov %eax, STUB_DATA + mov %eax, (%ecx) /* stop */ int3 diff --git a/arch/x86/um/stub_64.S b/arch/x86/um/stub_64.S index 8a95c5b2eaf9..f3404640197a 100644 --- a/arch/x86/um/stub_64.S +++ b/arch/x86/um/stub_64.S @@ -4,9 +4,8 @@ .section .__syscall_stub, "ax" .globl batch_syscall_stub batch_syscall_stub: - mov $(STUB_DATA), %rbx - /* load pointer to first operation */ - mov %rbx, %rsp + /* %rsp has the pointer to first operation */ + mov %rsp, %rbx add $0x10, %rsp again: /* load length of additional data */ diff --git a/arch/x86/um/stub_segv.c b/arch/x86/um/stub_segv.c index 27361cbb7ca9..21836eaf1725 100644 --- a/arch/x86/um/stub_segv.c +++ b/arch/x86/um/stub_segv.c @@ -11,10 +11,11 @@ void __attribute__ ((__section__ (".__syscall_stub"))) stub_segv_handler(int sig, siginfo_t *info, void *p) { + int stack; ucontext_t *uc = p; + struct faultinfo *f = (void *)(((unsigned long)&stack) & ~(UM_KERN_PAGE_SIZE - 1)); - GET_FAULTINFO_FROM_MC(*((struct faultinfo *) STUB_DATA), - &uc->uc_mcontext); + GET_FAULTINFO_FROM_MC(*f, &uc->uc_mcontext); trap_myself(); } From patchwork Wed Jan 13 21:09:44 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Berg X-Patchwork-Id: 1426035 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=lists.infradead.org (client-ip=2001:8b0:10b:1231::1; helo=merlin.infradead.org; envelope-from=linux-um-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=sipsolutions.net Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; secure) header.d=lists.infradead.org header.i=@lists.infradead.org header.a=rsa-sha256 header.s=merlin.20170209 header.b=w12ZQtsd; dkim-atps=neutral Received: from merlin.infradead.org (merlin.infradead.org [IPv6:2001:8b0:10b:1231::1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4DGKpW1TGvz9sWF for ; Thu, 14 Jan 2021 08:10:11 +1100 (AEDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=merlin.20170209; h=Sender:Content-Transfer-Encoding: Content-Type:Cc:List-Subscribe:List-Help:List-Post:List-Archive: List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To:Message-Id:Date: Subject:To:From:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=KAb7lvrlLeYpDIOAukDS7FV8Rf3T/9f/SsBq0ny2Vkc=; b=w12ZQtsdwoi/PNBo8H+3erX0t ATAPBRf8Tiy6pK6q5qWzIaESw8D2Z66ajRNvSbIQHZ4tudsQyabZcoQGKHqr06HBjIOG3RwxJT766 xWGoamVdUFBfm30x3fvrQdv9Nqy6+lbfEXXj0UIof6cm5kBloAQ6liuRXrrK6uJGyNqq+zidsdqgh e8I6Ww9Jk3wLWmGkFW4sx1+tuqoHR/4ocUMNcltP/IIuHKD4Z0QPrnBstkjzl5Kq3hvOat3MKiLZH PlbOG1zRbT4FsLRMewXbNCWl0hU+o7PUD/0UrGl2DKCe7dkepX3XE39V8Lvd1h2VnWlJNibeZVQr1 /dJ1MyYxg==; Received: from localhost ([::1] helo=merlin.infradead.org) by merlin.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1kznOv-0000nw-GE; Wed, 13 Jan 2021 21:09:57 +0000 Received: from s3.sipsolutions.net ([2a01:4f8:191:4433::2] helo=sipsolutions.net) by merlin.infradead.org with esmtps (Exim 4.92.3 #3 (Red Hat Linux)) id 1kznOr-0000lu-QE for linux-um@lists.infradead.org; Wed, 13 Jan 2021 21:09:55 +0000 Received: by sipsolutions.net with esmtpsa (TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim 4.94) (envelope-from ) id 1kznOq-005vHf-HO; Wed, 13 Jan 2021 22:09:52 +0100 From: Johannes Berg To: linux-um@lists.infradead.org Subject: [PATCH 3/3] um: remove process stub VMA Date: Wed, 13 Jan 2021 22:09:44 +0100 Message-Id: <20210113220944.ba020fe73c75.Ia71e6fe697af9b196278084a9f730c30e20773c9@changeid> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210113220944.7732f6bfd3bb.Ib87c91b49d57d27314cf444696273da6d8463e9c@changeid> References: <20210113220944.7732f6bfd3bb.Ib87c91b49d57d27314cf444696273da6d8463e9c@changeid> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20210113_160954_036437_D1635F41 X-CRM114-Status: GOOD ( 26.51 ) X-Spam-Score: 0.4 (/) X-Spam-Report: SpamAssassin version 3.4.4 on merlin.infradead.org summary: Content analysis details: (0.4 points) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 SPF_NONE SPF: sender does not publish an SPF Record 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record 0.4 KHOP_HELO_FCRDNS Relay HELO differs from its IP's reverse DNS X-BeenThere: linux-um@lists.infradead.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-arch@vger.kernel.org, Johannes Berg Sender: "linux-um" Errors-To: linux-um-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org From: Johannes Berg This mostly reverts the old commit 3963333fe676 ("uml: cover stubs with a VMA") which had added a VMA to the existing PTEs. However, there's no real reason to have the PTEs in the first place and the VMA cannot be 'fixed' in place, which leads to bugs that userspace could try to unmap them and be forcefully killed, or such. Also, there's a bit of an ugly hole in userspace's address space. Simplify all this: just install the stub code/page at the top of the (inner) address space, i.e. put it just above TASK_SIZE. The pages are simply hard-coded to be mapped in the userspace process we use to implement an mm context, and they're out of reach of the inner mmap/munmap/mprotect etc. since they're above TASK_SIZE. Getting rid of the VMA also makes vma_merge() no longer hit one of the VM_WARN_ON()s there because we installed a VMA while the code assumes the stack VMA is the first one. It also removes a lockdep warning about mmap_sem usage since we no longer have uml_setup_stubs() and thus no longer need to do any manipulation that would require mmap_sem in activate_mm(). Signed-off-by: Johannes Berg --- arch/um/include/asm/Kbuild | 1 + arch/um/include/asm/mmu_context.h | 30 +---------- arch/um/include/shared/as-layout.h | 3 +- arch/um/kernel/exec.c | 4 +- arch/um/kernel/skas/mmu.c | 87 ------------------------------ arch/um/kernel/tlb.c | 15 ------ arch/um/kernel/um_arch.c | 5 ++ arch/um/os-Linux/skas/process.c | 4 -- arch/x86/um/os-Linux/task_size.c | 2 +- 9 files changed, 11 insertions(+), 140 deletions(-) diff --git a/arch/um/include/asm/Kbuild b/arch/um/include/asm/Kbuild index 1c63b260ecc4..da50d0bf1060 100644 --- a/arch/um/include/asm/Kbuild +++ b/arch/um/include/asm/Kbuild @@ -26,3 +26,4 @@ generic-y += topology.h generic-y += trace_clock.h generic-y += word-at-a-time.h generic-y += kprobes.h +generic-y += mm_hooks.h diff --git a/arch/um/include/asm/mmu_context.h b/arch/um/include/asm/mmu_context.h index 17ddd4edf875..6db0c8b54330 100644 --- a/arch/um/include/asm/mmu_context.h +++ b/arch/um/include/asm/mmu_context.h @@ -9,34 +9,9 @@ #include #include #include - +#include #include -extern void uml_setup_stubs(struct mm_struct *mm); -/* - * Needed since we do not use the asm-generic/mm_hooks.h: - */ -static inline int arch_dup_mmap(struct mm_struct *oldmm, struct mm_struct *mm) -{ - uml_setup_stubs(mm); - return 0; -} -extern void arch_exit_mmap(struct mm_struct *mm); -static inline void arch_unmap(struct mm_struct *mm, - unsigned long start, unsigned long end) -{ -} -static inline bool arch_vma_access_permitted(struct vm_area_struct *vma, - bool write, bool execute, bool foreign) -{ - /* by default, allow everything */ - return true; -} - -/* - * end asm-generic/mm_hooks.h functions - */ - #define deactivate_mm(tsk,mm) do { } while (0) extern void force_flush_all(void); @@ -48,9 +23,6 @@ static inline void activate_mm(struct mm_struct *old, struct mm_struct *new) * when the new ->mm is used for the first time. */ __switch_mm(&new->context.id); - mmap_write_lock_nested(new, SINGLE_DEPTH_NESTING); - uml_setup_stubs(new); - mmap_write_unlock(new); } static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next, diff --git a/arch/um/include/shared/as-layout.h b/arch/um/include/shared/as-layout.h index 56408bf3480d..9a0bd648d872 100644 --- a/arch/um/include/shared/as-layout.h +++ b/arch/um/include/shared/as-layout.h @@ -20,7 +20,7 @@ * 'UL' and other type specifiers unilaterally. We * use the following macros to deal with this. */ -#define STUB_START 0x100000UL +#define STUB_START stub_start #define STUB_CODE STUB_START #define STUB_DATA (STUB_CODE + UM_KERN_PAGE_SIZE) #define STUB_END (STUB_DATA + UM_KERN_PAGE_SIZE) @@ -46,6 +46,7 @@ extern unsigned long long highmem; extern unsigned long brk_start; extern unsigned long host_task_size; +extern unsigned long stub_start; extern int linux_main(int argc, char **argv); extern void uml_finishsetup(void); diff --git a/arch/um/kernel/exec.c b/arch/um/kernel/exec.c index e8fd5d540b05..4d8498100341 100644 --- a/arch/um/kernel/exec.c +++ b/arch/um/kernel/exec.c @@ -26,9 +26,7 @@ void flush_thread(void) arch_flush_thread(¤t->thread.arch); - ret = unmap(¤t->mm->context.id, 0, STUB_START, 0, &data); - ret = ret || unmap(¤t->mm->context.id, STUB_END, - host_task_size - STUB_END, 1, &data); + ret = unmap(¤t->mm->context.id, 0, TASK_SIZE, 1, &data); if (ret) { printk(KERN_ERR "flush_thread - clearing address space failed, " "err = %d\n", ret); diff --git a/arch/um/kernel/skas/mmu.c b/arch/um/kernel/skas/mmu.c index d9961163da66..125df465e8ea 100644 --- a/arch/um/kernel/skas/mmu.c +++ b/arch/um/kernel/skas/mmu.c @@ -14,47 +14,6 @@ #include #include -static int init_stub_pte(struct mm_struct *mm, unsigned long proc, - unsigned long kernel) -{ - pgd_t *pgd; - p4d_t *p4d; - pud_t *pud; - pmd_t *pmd; - pte_t *pte; - - pgd = pgd_offset(mm, proc); - - p4d = p4d_alloc(mm, pgd, proc); - if (!p4d) - goto out; - - pud = pud_alloc(mm, p4d, proc); - if (!pud) - goto out_pud; - - pmd = pmd_alloc(mm, pud, proc); - if (!pmd) - goto out_pmd; - - pte = pte_alloc_map(mm, pmd, proc); - if (!pte) - goto out_pte; - - *pte = mk_pte(virt_to_page(kernel), __pgprot(_PAGE_PRESENT)); - *pte = pte_mkread(*pte); - return 0; - - out_pte: - pmd_free(mm, pmd); - out_pmd: - pud_free(mm, pud); - out_pud: - p4d_free(mm, p4d); - out: - return -ENOMEM; -} - int init_new_context(struct task_struct *task, struct mm_struct *mm) { struct mm_context *from_mm = NULL; @@ -98,52 +57,6 @@ int init_new_context(struct task_struct *task, struct mm_struct *mm) return ret; } -void uml_setup_stubs(struct mm_struct *mm) -{ - int err, ret; - - ret = init_stub_pte(mm, STUB_CODE, - (unsigned long) __syscall_stub_start); - if (ret) - goto out; - - ret = init_stub_pte(mm, STUB_DATA, mm->context.id.stack); - if (ret) - goto out; - - mm->context.stub_pages[0] = virt_to_page(__syscall_stub_start); - mm->context.stub_pages[1] = virt_to_page(mm->context.id.stack); - - /* dup_mmap already holds mmap_lock */ - err = install_special_mapping(mm, STUB_START, STUB_END - STUB_START, - VM_READ | VM_MAYREAD | VM_EXEC | - VM_MAYEXEC | VM_DONTCOPY | VM_PFNMAP, - mm->context.stub_pages); - if (err) { - printk(KERN_ERR "install_special_mapping returned %d\n", err); - goto out; - } - return; - -out: - force_sigsegv(SIGSEGV); -} - -void arch_exit_mmap(struct mm_struct *mm) -{ - pte_t *pte; - - pte = virt_to_pte(mm, STUB_CODE); - if (pte != NULL) - pte_clear(mm, STUB_CODE, pte); - - pte = virt_to_pte(mm, STUB_DATA); - if (pte == NULL) - return; - - pte_clear(mm, STUB_DATA, pte); -} - void destroy_context(struct mm_struct *mm) { struct mm_context *mmu = &mm->context; diff --git a/arch/um/kernel/tlb.c b/arch/um/kernel/tlb.c index 5be1b0da9f3b..bc38f79ca3a3 100644 --- a/arch/um/kernel/tlb.c +++ b/arch/um/kernel/tlb.c @@ -125,9 +125,6 @@ static int add_mmap(unsigned long virt, unsigned long phys, unsigned long len, struct host_vm_op *last; int fd = -1, ret = 0; - if (virt + len > STUB_START && virt < STUB_END) - return -EINVAL; - if (hvc->userspace) fd = phys_mapping(phys, &offset); else @@ -165,9 +162,6 @@ static int add_munmap(unsigned long addr, unsigned long len, struct host_vm_op *last; int ret = 0; - if (addr + len > STUB_START && addr < STUB_END) - return -EINVAL; - if (hvc->index != 0) { last = &hvc->ops[hvc->index - 1]; if ((last->type == MUNMAP) && @@ -195,9 +189,6 @@ static int add_mprotect(unsigned long addr, unsigned long len, struct host_vm_op *last; int ret = 0; - if (addr + len > STUB_START && addr < STUB_END) - return -EINVAL; - if (hvc->index != 0) { last = &hvc->ops[hvc->index - 1]; if ((last->type == MPROTECT) && @@ -232,9 +223,6 @@ static inline int update_pte_range(pmd_t *pmd, unsigned long addr, pte = pte_offset_kernel(pmd, addr); do { - if ((addr >= STUB_START) && (addr < STUB_END)) - continue; - r = pte_read(*pte); w = pte_write(*pte); x = pte_exec(*pte); @@ -478,9 +466,6 @@ void flush_tlb_page(struct vm_area_struct *vma, unsigned long address) address &= PAGE_MASK; - if (address >= STUB_START && address < STUB_END) - goto kill; - pgd = pgd_offset(mm, address); if (!pgd_present(*pgd)) goto kill; diff --git a/arch/um/kernel/um_arch.c b/arch/um/kernel/um_arch.c index 9c7e6d7ea1b3..8a0d566f6472 100644 --- a/arch/um/kernel/um_arch.c +++ b/arch/um/kernel/um_arch.c @@ -236,6 +236,7 @@ void uml_finishsetup(void) } /* Set during early boot */ +unsigned long stub_start; unsigned long task_size; EXPORT_SYMBOL(task_size); @@ -267,6 +268,10 @@ int __init linux_main(int argc, char **argv) add_arg(DEFAULT_COMMAND_LINE); host_task_size = os_get_top_address(); + /* reserve two pages for the stubs */ + host_task_size -= 2 * PAGE_SIZE; + stub_start = host_task_size; + /* * TASK_SIZE needs to be PGDIR_SIZE aligned or else exit_mmap craps * out diff --git a/arch/um/os-Linux/skas/process.c b/arch/um/os-Linux/skas/process.c index 623b0aeadf4c..fba674fac8b7 100644 --- a/arch/um/os-Linux/skas/process.c +++ b/arch/um/os-Linux/skas/process.c @@ -251,10 +251,6 @@ static int userspace_tramp(void *stack) signal(SIGTERM, SIG_DFL); signal(SIGWINCH, SIG_IGN); - /* - * This has a pte, but it can't be mapped in with the usual - * tlb_flush mechanism because this is part of that mechanism - */ fd = phys_mapping(to_phys(__syscall_stub_start), &offset); addr = mmap64((void *) STUB_CODE, UM_KERN_PAGE_SIZE, PROT_EXEC, MAP_FIXED | MAP_PRIVATE, fd, offset); diff --git a/arch/x86/um/os-Linux/task_size.c b/arch/x86/um/os-Linux/task_size.c index e62174638f00..1dc9adc20b1c 100644 --- a/arch/x86/um/os-Linux/task_size.c +++ b/arch/x86/um/os-Linux/task_size.c @@ -145,7 +145,7 @@ unsigned long os_get_top_address(void) unsigned long os_get_top_address(void) { /* The old value of CONFIG_TOP_ADDR */ - return 0x7fc0000000; + return 0x7fc0002000; } #endif