From patchwork Tue Aug 27 03:30:06 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1153557 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 46HZGq4302z9sBp for ; Tue, 27 Aug 2019 13:33:55 +1000 (AEST) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="t7yYUtOs"; dkim-atps=neutral Received: from bilbo.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 46HZGp6LLgzDqrM for ; Tue, 27 Aug 2019 13:33:54 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::642; helo=mail-pl1-x642.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="t7yYUtOs"; dkim-atps=neutral Received: from mail-pl1-x642.google.com (mail-pl1-x642.google.com [IPv6:2607:f8b0:4864:20::642]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 46HZCC1ZsdzDqnk for ; Tue, 27 Aug 2019 13:30:46 +1000 (AEST) Received: by mail-pl1-x642.google.com with SMTP id t14so11036168plr.11 for ; Mon, 26 Aug 2019 20:30:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=vgJl0RoJ5CJqO90iMcqyp0FyFvd1BzQf3fB9Yvm2MUk=; b=t7yYUtOslDpm3Hf+u1NR7dNT+DHkDnzAaorU7ZvDWuZAuoyc08NPZG5AxpHm4kQtwz clAxvRF1SvzV41wjwxozBNGPXV5CoaDqHGlyS17pQXk+9Eb5jTcCy3KwdhdBPVUP4Rze RIHA1Znq0N1sdTkzTkvUkQmaZPRYN7ipfNu5z6uX98i96A97bK+incjwYvrCsjFbPL0I IcXXTcEvXvH11xIvqZ+6174a9cwT1SYaWgZGIpCMrp50/0FmYFZfvBrH5kjsxPA8WCGc BkxGebuk3DJtUR6wrz9GnTbx14jaC+geAjqHElikzIXDmJIyjOa2jRi2mNHLRECwhWuC IHLQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=vgJl0RoJ5CJqO90iMcqyp0FyFvd1BzQf3fB9Yvm2MUk=; b=W8viI86RAEhkZSnZeOv60MFqvYh2IJzTZqywLp+gGm6N0ecCscJZVM7vOCkmxNeb2i o8TDWc7PY0scnz9YJpl80BmdKEeKrSlQSPH6Ox0/UyuvxktTOqNhcIiIrljfsPrnPBPc WTh5YrSpY8KZEM7AxRBa0Ek8/H2FvT6qtlGP/mfkn22r7sYdEsHhr4wxWvtv0uplvN4b FxlD6++FKVZ8AaMLR7JX2Mj/Gi9lVZvq9FlGynxj334kc7PZBE1IIVoJ/K1hpFraw2ay SLEhqpBjmNpUgmz3B+doi+ZB1gaQjeCo5TiWWXpdQT3zGa1wg7HGyHiHA2mwV3bsVEOS vJjA== X-Gm-Message-State: APjAAAVghx+BWQzKCcMyZ+jo38ZiTTlrcdTmhbGRYuizCAnFa3vVC6fe VXg5K8e7A01qGSH+PrTX0zyQrNPX X-Google-Smtp-Source: APXvYqz//bmBcgqhjemqiByHKJfTC3U5fIsAz6padNtgAO7nKuAATZ6BWy5aZkP6MR8p0vN1yYaQEQ== X-Received: by 2002:a17:902:524:: with SMTP id 33mr22549692plf.27.1566876644604; Mon, 26 Aug 2019 20:30:44 -0700 (PDT) Received: from bobo.local0.net (14-202-91-55.tpgi.com.au. [14.202.91.55]) by smtp.gmail.com with ESMTPSA id x24sm11383984pgl.84.2019.08.26.20.30.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 26 Aug 2019 20:30:44 -0700 (PDT) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH 1/4] powerpc: convert to copy_thread_tls Date: Tue, 27 Aug 2019 13:30:06 +1000 Message-Id: <20190827033010.28090-2-npiggin@gmail.com> X-Mailer: git-send-email 2.22.0 In-Reply-To: <20190827033010.28090-1-npiggin@gmail.com> References: <20190827033010.28090-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Nicholas Piggin Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Commit 3033f14ab78c3 ("clone: support passing tls argument via C rather than pt_regs magic") introduced the HAVE_COPY_THREAD_TLS option. Use it to avoid a subtle assumption about the argument ordering of clone type syscalls. Signed-off-by: Nicholas Piggin --- arch/powerpc/Kconfig | 1 + arch/powerpc/kernel/process.c | 9 +++++---- 2 files changed, 6 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index d8dcd8820369..7477a3263225 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -182,6 +182,7 @@ config PPC select HAVE_STACKPROTECTOR if PPC64 && $(cc-option,-mstack-protector-guard=tls -mstack-protector-guard-reg=r13) select HAVE_STACKPROTECTOR if PPC32 && $(cc-option,-mstack-protector-guard=tls -mstack-protector-guard-reg=r2) select HAVE_CONTEXT_TRACKING if PPC64 + select HAVE_COPY_THREAD_TLS select HAVE_DEBUG_KMEMLEAK select HAVE_DEBUG_STACKOVERFLOW select HAVE_DYNAMIC_FTRACE diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c index 8fc4de0d22b4..24621e7e5033 100644 --- a/arch/powerpc/kernel/process.c +++ b/arch/powerpc/kernel/process.c @@ -1600,8 +1600,9 @@ static void setup_ksp_vsid(struct task_struct *p, unsigned long sp) /* * Copy architecture-specific thread state */ -int copy_thread(unsigned long clone_flags, unsigned long usp, - unsigned long kthread_arg, struct task_struct *p) +int copy_thread_tls(unsigned long clone_flags, unsigned long usp, + unsigned long kthread_arg, struct task_struct *p, + unsigned long tls) { struct pt_regs *childregs, *kregs; extern void ret_from_fork(void); @@ -1642,10 +1643,10 @@ int copy_thread(unsigned long clone_flags, unsigned long usp, if (clone_flags & CLONE_SETTLS) { #ifdef CONFIG_PPC64 if (!is_32bit_task()) - childregs->gpr[13] = childregs->gpr[6]; + childregs->gpr[13] = tls; else #endif - childregs->gpr[2] = childregs->gpr[6]; + childregs->gpr[2] = tls; } f = ret_from_fork; From patchwork Tue Aug 27 03:30:07 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1153558 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [203.11.71.2]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 46HZJt56Nkz9sBp for ; Tue, 27 Aug 2019 13:35:42 +1000 (AEST) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="SludPViT"; dkim-atps=neutral Received: from bilbo.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 46HZJt3YkPzDqsN for ; Tue, 27 Aug 2019 13:35:42 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::641; helo=mail-pl1-x641.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="SludPViT"; dkim-atps=neutral Received: from mail-pl1-x641.google.com (mail-pl1-x641.google.com [IPv6:2607:f8b0:4864:20::641]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 46HZCF2MhBzDqnT for ; Tue, 27 Aug 2019 13:30:49 +1000 (AEST) Received: by mail-pl1-x641.google.com with SMTP id go14so11039566plb.0 for ; Mon, 26 Aug 2019 20:30:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=4MQv1fZDmpeZG9L25IANzqdKccZIl5dLjm+wLh5C3Oc=; b=SludPViT/8QbNF8i+i3WJud4H8ekVirxX5D1dmT++cZlnqPD/C12/61pUjZgso79oq fTzvEUnqv+6kLZqdLYOTOh7YiWxoBLnyDGphrXSdLgrsW/smBc7a5Vofo5Chd4QdCG/r g7cd7kzTXFoAptjGG7giTQPZj+n7XumzZwAq70yG69li9Skg/iUwoyP4c0Y+C5Jc1eGi GnisBsxtyAApZCd+NFqu63EYdZJ8zEMGk0n7ddF6WQkVLmrrCygxWJv8KXaV+TlaH51r rCU/2u+M5cvVdO1V6dSfKon5Rhouw/omeyyG0kxU2mMFfBMTk8zxc6bpWeF/99AM+YAE RBJQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=4MQv1fZDmpeZG9L25IANzqdKccZIl5dLjm+wLh5C3Oc=; b=ozbGXzZozxO5B0b3If7EMF/qXZs/YeCVHWZLQ3rR7Ew5obUhe+ssSmf2pJMppJ+R0A jgyquuHinMU/oRnVDOqKjjmrJobNgrE+BmN+RHZufNFwt79gPT+jCHjKvOZqmbBe04LX JHTVXb0donlWna8zjICEsAJjlC2PKPp23MKluT2MrDI8sc2aGUcu98l6c7UZNds22eiV BOPGgwx7uIPYCUCazTOXRuKN7+3Vk77sO0ahJoR51qeDxx2YwyDJ0NjUN0hmJHZypUv0 ndqAilzEkOvnWQaPMudWXUkDTAhoCIOexynllIcVvQo3HLI/qPDpsyDWhd2xIcp/fZ0P WnaA== X-Gm-Message-State: APjAAAWJTwGNsaZ1l7vnUdY/Nhcd6bcu2PqtzrukbNE1Ke/ngRkWKfJf AeIAbM0dSq96Z5XKaqALqpUfFvyI X-Google-Smtp-Source: APXvYqxjTOHPRa04u9UMalH6dFvJClzwK+QhPMEt2sk7oJnngBFau9m9PHIQNHjThjITjQ7LXoa7Vg== X-Received: by 2002:a17:902:b702:: with SMTP id d2mr10406427pls.259.1566876646630; Mon, 26 Aug 2019 20:30:46 -0700 (PDT) Received: from bobo.local0.net (14-202-91-55.tpgi.com.au. [14.202.91.55]) by smtp.gmail.com with ESMTPSA id x24sm11383984pgl.84.2019.08.26.20.30.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 26 Aug 2019 20:30:46 -0700 (PDT) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH 2/4] powerpc/64: remove support for kernel-mode syscalls Date: Tue, 27 Aug 2019 13:30:07 +1000 Message-Id: <20190827033010.28090-3-npiggin@gmail.com> X-Mailer: git-send-email 2.22.0 In-Reply-To: <20190827033010.28090-1-npiggin@gmail.com> References: <20190827033010.28090-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Nicholas Piggin Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" There is support for the kernel to execute the 'sc 0' instruction and make a system call to itself. This is a relic that is unused in the tree, therefore untested. It's also highly questionable for modules to be doing this. Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/entry_64.S | 21 ++++++--------------- arch/powerpc/kernel/exceptions-64s.S | 2 -- 2 files changed, 6 insertions(+), 17 deletions(-) diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S index 0a0b5310f54a..6467bdab8d40 100644 --- a/arch/powerpc/kernel/entry_64.S +++ b/arch/powerpc/kernel/entry_64.S @@ -69,24 +69,20 @@ BEGIN_FTR_SECTION bne .Ltabort_syscall END_FTR_SECTION_IFSET(CPU_FTR_TM) #endif - andi. r10,r12,MSR_PR mr r10,r1 - addi r1,r1,-INT_FRAME_SIZE - beq- 1f ld r1,PACAKSAVE(r13) -1: std r10,0(r1) + std r10,0(r1) std r11,_NIP(r1) std r12,_MSR(r1) std r0,GPR0(r1) std r10,GPR1(r1) - beq 2f /* if from kernel mode */ #ifdef CONFIG_PPC_FSL_BOOK3E START_BTB_FLUSH_SECTION BTB_FLUSH(r10) END_BTB_FLUSH_SECTION #endif ACCOUNT_CPU_USER_ENTRY(r13, r10, r11) -2: std r2,GPR2(r1) + std r2,GPR2(r1) std r3,GPR3(r1) mfcr r2 std r4,GPR4(r1) @@ -122,14 +118,13 @@ END_BTB_FLUSH_SECTION #if defined(CONFIG_VIRT_CPU_ACCOUNTING_NATIVE) && defined(CONFIG_PPC_SPLPAR) BEGIN_FW_FTR_SECTION - beq 33f - /* if from user, see if there are any DTL entries to process */ + /* see if there are any DTL entries to process */ ld r10,PACALPPACAPTR(r13) /* get ptr to VPA */ ld r11,PACA_DTL_RIDX(r13) /* get log read index */ addi r10,r10,LPPACA_DTLIDX LDX_BE r10,0,r10 /* get log write index */ - cmpd cr1,r11,r10 - beq+ cr1,33f + cmpd r11,r10 + beq+ 33f bl accumulate_stolen_time REST_GPR(0,r1) REST_4GPRS(3,r1) @@ -203,6 +198,7 @@ system_call: /* label this so stack traces look sane */ mtctr r12 bctrl /* Call handler */ + /* syscall_exit can exit to kernel mode, via ret_from_kernel_thread */ .Lsyscall_exit: std r3,RESULT(r1) @@ -216,11 +212,6 @@ system_call: /* label this so stack traces look sane */ ld r12, PACA_THREAD_INFO(r13) ld r8,_MSR(r1) -#ifdef CONFIG_PPC_BOOK3S - /* No MSR:RI on BookE */ - andi. r10,r8,MSR_RI - beq- .Lunrecov_restore -#endif /* * This is a few instructions into the actual syscall exit path (which actually diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index 6ba3cc2ef8ab..768f133de4f1 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -1521,8 +1521,6 @@ EXC_COMMON(trap_0b_common, 0xb00, unknown_exception) * system call / hypercall (0xc00, 0x4c00) * * The system call exception is invoked with "sc 0" and does not alter HV bit. - * There is support for kernel code to invoke system calls but there are no - * in-tree users. * * The hypercall is invoked with "sc 1" and sets HV=1. * From patchwork Tue Aug 27 03:30:09 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1153560 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [203.11.71.2]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 46HZNt3ly0z9sBF for ; Tue, 27 Aug 2019 13:39:10 +1000 (AEST) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="fiFlN9dT"; dkim-atps=neutral Received: from bilbo.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 46HZNt2fDszDqw4 for ; Tue, 27 Aug 2019 13:39:10 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::641; helo=mail-pl1-x641.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="fiFlN9dT"; dkim-atps=neutral Received: from mail-pl1-x641.google.com (mail-pl1-x641.google.com [IPv6:2607:f8b0:4864:20::641]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 46HZCJ5sjHzDqpf for ; Tue, 27 Aug 2019 13:30:52 +1000 (AEST) Received: by mail-pl1-x641.google.com with SMTP id 4so11040793pld.10 for ; Mon, 26 Aug 2019 20:30:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=y0CEusGeMR6BKaWGMt96JV7gQneFy6WC/pqkhh3twGM=; b=fiFlN9dTSAlJw2Olcs4kGc08jNdnT5JxqeVBuEI5hdnA9gDHKQR2TWYRTsICF7ez6b EK6zBLuu8lXXqroillDAzYwtEz3+MKYunswDvi3lhZEP0ehO/UI0sC+zpNvgednI0K/m NDG6fvex0HJnDV8k6KRUbimuoA8v6fMAkqO7MgjAphqxwN3t2DHNGVS7Ocez270P6CPv +W/sG7TTnSvLhDnUqbysFY/I5fKTY+NsHU51BG1LhaflnxTsleUvbo1CD3OY1dGaoUCA xChcJ/RYgDra7elCz8VzNZKBsyjv/X7ahfBP4J/PW1Y0l9JXk38mzclMBu5nDJYpDjv7 R2Og== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=y0CEusGeMR6BKaWGMt96JV7gQneFy6WC/pqkhh3twGM=; b=EspjTTZB62Hm53LDR4hS2vjQuOVf+geCk5XYXZNkQTULaEP3cQzim3VdpI3x76UmLn kcZGcmnisd8Oizf9mj1OaiwN8rzGMLDXwF8kfSMh4wxRnosIAZTA3Dh8YwX0pWU/QyYU 5YECUeSzapXKY9fe7YTcv3Y2f/U5gKnIiDTUpeEMQJhmtRzTfPEhm9tW6O1JuhcHlUuW cX5hMN0d5TANbsz0ZQW/73YkZASMWtyWpYQc12MdbYcndUjH7Jbe5L0MIUh4kQWaHYYa +3UsG3qfAH742mjkkeDCFQcXSiEiYqqkW4s7k/04Il8L/gNBGSrDscPXky2dfeVAmm3T j1vA== X-Gm-Message-State: APjAAAVib16QZwGNmPa8SZNZywtVMF32nGnRoubFXGVSV8iJO3LQNfgh kk24DNTzjr0KqvxIxboBT/LWWw9a X-Google-Smtp-Source: APXvYqx7vKhJQKSh//p+hNFBWu6GOrupRpwJ2mQUTRXWokaEdU1Hc9VoKpAzW2nJdQXOy1vYQoet3A== X-Received: by 2002:a17:902:4201:: with SMTP id g1mr22381067pld.300.1566876650868; Mon, 26 Aug 2019 20:30:50 -0700 (PDT) Received: from bobo.local0.net (14-202-91-55.tpgi.com.au. [14.202.91.55]) by smtp.gmail.com with ESMTPSA id x24sm11383984pgl.84.2019.08.26.20.30.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 26 Aug 2019 20:30:50 -0700 (PDT) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH 3/4] powerpc/64: system call remove non-volatile GPR save optimisation Date: Tue, 27 Aug 2019 13:30:09 +1000 Message-Id: <20190827033010.28090-5-npiggin@gmail.com> X-Mailer: git-send-email 2.22.0 In-Reply-To: <20190827033010.28090-1-npiggin@gmail.com> References: <20190827033010.28090-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Nicholas Piggin Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" powerpc has an optimisation where interrupts avoid saving the non-volatile (or callee saved) registers to the interrupt stack frame if they are not required. Two problems with this are that an interrupt does not always know whether it will need non-volatiles; and if it does need them, they can only be saved from the entry-scoped asm code (because we don't control what the C compiler does with these registers). system calls are about the worst example of this, some system calls always require all registers (e.g., fork, to copy regs into the child). Sometimes registers are only required under certain conditions (e.g., tracing, signal delivery). This causes weird ugly logic in the call chains (e.g., ppc_fork), and also prevents moving substantial amounts of logic to C. Remove this optimisation for system calls, and always save NVGPRs on entry. On modern CPUs this doesn't appear to be worth the complexity because the stores can be hidden by other work quite well the null syscall selftests benchmark on a POWER9 has (124.40ns before and 123.64ns after, i.e., well within the noise). Other interrupts are a little less complex in NVGPR handling, many do less work at entry and also they don't have as clear a fast path for avoiding the GPR restore as system calls, so this patch does not remove the facility completely until more tests can be done. Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/entry_64.S | 72 +++++------------------- arch/powerpc/kernel/syscalls/syscall.tbl | 22 +++++--- 2 files changed, 28 insertions(+), 66 deletions(-) diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S index 6467bdab8d40..5a3e0b5c9ad1 100644 --- a/arch/powerpc/kernel/entry_64.S +++ b/arch/powerpc/kernel/entry_64.S @@ -98,13 +98,14 @@ END_BTB_FLUSH_SECTION std r11,_XER(r1) std r11,_CTR(r1) std r9,GPR13(r1) + SAVE_NVGPRS(r1) mflr r10 /* * This clears CR0.SO (bit 28), which is the error indication on * return from this system call. */ rldimi r2,r11,28,(63-28) - li r11,0xc01 + li r11,0xc00 std r10,_LINK(r1) std r11,_TRAP(r1) std r3,ORIG_GPR3(r1) @@ -323,7 +324,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) /* Traced system call support */ .Lsyscall_dotrace: - bl save_nvgprs addi r3,r1,STACK_FRAME_OVERHEAD bl do_syscall_trace_enter @@ -408,7 +408,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) mtmsrd r10,1 #endif /* CONFIG_PPC_BOOK3E */ - bl save_nvgprs addi r3,r1,STACK_FRAME_OVERHEAD bl do_syscall_trace_leave b ret_from_except @@ -442,62 +441,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) _ASM_NOKPROBE_SYMBOL(system_call_common); _ASM_NOKPROBE_SYMBOL(system_call_exit); -/* Save non-volatile GPRs, if not already saved. */ -_GLOBAL(save_nvgprs) - ld r11,_TRAP(r1) - andi. r0,r11,1 - beqlr- - SAVE_NVGPRS(r1) - clrrdi r0,r11,1 - std r0,_TRAP(r1) - blr -_ASM_NOKPROBE_SYMBOL(save_nvgprs); - - -/* - * The sigsuspend and rt_sigsuspend system calls can call do_signal - * and thus put the process into the stopped state where we might - * want to examine its user state with ptrace. Therefore we need - * to save all the nonvolatile registers (r14 - r31) before calling - * the C code. Similarly, fork, vfork and clone need the full - * register state on the stack so that it can be copied to the child. - */ - -_GLOBAL(ppc_fork) - bl save_nvgprs - bl sys_fork - b .Lsyscall_exit - -_GLOBAL(ppc_vfork) - bl save_nvgprs - bl sys_vfork - b .Lsyscall_exit - -_GLOBAL(ppc_clone) - bl save_nvgprs - bl sys_clone - b .Lsyscall_exit - -_GLOBAL(ppc_clone3) - bl save_nvgprs - bl sys_clone3 - b .Lsyscall_exit - -_GLOBAL(ppc32_swapcontext) - bl save_nvgprs - bl compat_sys_swapcontext - b .Lsyscall_exit - -_GLOBAL(ppc64_swapcontext) - bl save_nvgprs - bl sys_swapcontext - b .Lsyscall_exit - -_GLOBAL(ppc_switch_endian) - bl save_nvgprs - bl sys_switch_endian - b .Lsyscall_exit - _GLOBAL(ret_from_fork) bl schedule_tail REST_NVGPRS(r1) @@ -516,6 +459,17 @@ _GLOBAL(ret_from_kernel_thread) li r3,0 b .Lsyscall_exit +/* Save non-volatile GPRs, if not already saved. */ +_GLOBAL(save_nvgprs) + ld r11,_TRAP(r1) + andi. r0,r11,1 + beqlr- + SAVE_NVGPRS(r1) + clrrdi r0,r11,1 + std r0,_TRAP(r1) + blr +_ASM_NOKPROBE_SYMBOL(save_nvgprs); + #ifdef CONFIG_PPC_BOOK3S_64 #define FLUSH_COUNT_CACHE \ diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl index 43f736ed47f2..010b9f445586 100644 --- a/arch/powerpc/kernel/syscalls/syscall.tbl +++ b/arch/powerpc/kernel/syscalls/syscall.tbl @@ -9,7 +9,9 @@ # 0 nospu restart_syscall sys_restart_syscall 1 nospu exit sys_exit -2 nospu fork ppc_fork +2 32 fork ppc_fork sys_fork +2 64 fork sys_fork +2 spu fork sys_ni_syscall 3 common read sys_read 4 common write sys_write 5 common open sys_open compat_sys_open @@ -158,7 +160,9 @@ 119 32 sigreturn sys_sigreturn compat_sys_sigreturn 119 64 sigreturn sys_ni_syscall 119 spu sigreturn sys_ni_syscall -120 nospu clone ppc_clone +120 32 clone ppc_clone sys_clone +120 64 clone sys_clone +120 spu clone sys_ni_syscall 121 common setdomainname sys_setdomainname 122 common uname sys_newuname 123 common modify_ldt sys_ni_syscall @@ -240,7 +244,9 @@ 186 spu sendfile sys_sendfile64 187 common getpmsg sys_ni_syscall 188 common putpmsg sys_ni_syscall -189 nospu vfork ppc_vfork +189 32 vfork ppc_vfork sys_vfork +189 64 vfork sys_vfork +189 spu vfork sys_ni_syscall 190 common ugetrlimit sys_getrlimit compat_sys_getrlimit 191 common readahead sys_readahead compat_sys_readahead 192 32 mmap2 sys_mmap2 compat_sys_mmap2 @@ -316,8 +322,8 @@ 248 32 clock_nanosleep sys_clock_nanosleep_time32 248 64 clock_nanosleep sys_clock_nanosleep 248 spu clock_nanosleep sys_clock_nanosleep -249 32 swapcontext ppc_swapcontext ppc32_swapcontext -249 64 swapcontext ppc64_swapcontext +249 32 swapcontext ppc_swapcontext compat_sys_swapcontext +249 64 swapcontext sys_swapcontext 249 spu swapcontext sys_ni_syscall 250 common tgkill sys_tgkill 251 32 utimes sys_utimes_time32 @@ -456,7 +462,7 @@ 361 common bpf sys_bpf 362 nospu execveat sys_execveat compat_sys_execveat 363 32 switch_endian sys_ni_syscall -363 64 switch_endian ppc_switch_endian +363 64 switch_endian sys_switch_endian 363 spu switch_endian sys_ni_syscall 364 common userfaultfd sys_userfaultfd 365 common membarrier sys_membarrier @@ -516,4 +522,6 @@ 432 common fsmount sys_fsmount 433 common fspick sys_fspick 434 common pidfd_open sys_pidfd_open -435 nospu clone3 ppc_clone3 +435 32 clone3 ppc_clone3 sys_clone3 +435 64 clone3 sys_clone3 +435 nospu clone3 sys_ni_syscall From patchwork Tue Aug 27 03:30:10 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1153561 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 46HZRd3Tvgz9sBF for ; Tue, 27 Aug 2019 13:41:33 +1000 (AEST) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="s0QOCDFl"; dkim-atps=neutral Received: from bilbo.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 46HZRd2bcxzDqXM for ; Tue, 27 Aug 2019 13:41:33 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::541; helo=mail-pg1-x541.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="s0QOCDFl"; dkim-atps=neutral Received: from mail-pg1-x541.google.com (mail-pg1-x541.google.com [IPv6:2607:f8b0:4864:20::541]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 46HZCQ17BZzDqnT for ; Tue, 27 Aug 2019 13:30:57 +1000 (AEST) Received: by mail-pg1-x541.google.com with SMTP id k3so11796536pgb.10 for ; Mon, 26 Aug 2019 20:30:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=+hyYRu259BVVhpKUTfQHoyG81wlh7R/UrT6smkfc1y0=; b=s0QOCDFlYLPNLljKs9HEe/sxauMCXnwq6INFp/91qa2JZnGMp6sUmdKWK5yU9gcihC qMVYzUnWra5Up7bg2S653gQVyLzZjQIzeXQT8r/cvvpSfeUJSblf7xO+Q0eEmrZVc+ml sa0b1vmKVSKJ1HTszduyNohzFT+UdhyyGq+BTtRQWkRroog227w00V9pe2yvOjcf9ao1 Zx9G+viPF0QlNW156Z3C4Mb7JeuOd3b6AgjUtH4HyO9D++qtFHjvIxtY5oBbJNsoTTn7 x5dGcwtRjoqmQ9KUArix7lE3g5iYLTpYhWmUZC11Ken+2nws+p6wFrkRcHla2L5LLk3L bx1w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=+hyYRu259BVVhpKUTfQHoyG81wlh7R/UrT6smkfc1y0=; b=n4eOonf39Vk+9I6J5cjpY13zx1ZqdkwH385NJDQe4hnZY5c8LjhkcfY+vKbQmYYMiF ojUqlREuO8LwQQEQ5yVsLxSPvOhnt4MR5KhjazLK7ap9J7lX3rV7kxU8dU6ym9OMnIEv VWeBuK286eUUWlX2ebMKei9FYdxr0VCyw/hWk7Vl8b/Z+Pm8s2wrAh9evjcAGVthWjlJ Y4cREf9F53CP5DMsEztdR3kyS1WbDrikABPf0ncgmJg/ucoMg4hlt2I2Hc0jsKngBDH+ loBxKBef0Vm7A78QbawGxw/Iyg5Vsm9wtYoXMGkixf637csIyR1QTz74aKKb7TBaC+hG tPNw== X-Gm-Message-State: APjAAAUXu5Ib/C/j+bMwgPfHuA003JzhWbVA7p2TlAZy6goYJRqcy52Q +zcqv4PDlYCQ9y+kwa0Zj1Yfp73z X-Google-Smtp-Source: APXvYqwVqv2+dCf7YM2aK+gsvHoICO4gx7tsq+NT1erryQOwOEcIsJLMkev1DHmvZGJ1tOMyohJ3+g== X-Received: by 2002:a63:4a51:: with SMTP id j17mr19230758pgl.284.1566876653378; Mon, 26 Aug 2019 20:30:53 -0700 (PDT) Received: from bobo.local0.net (14-202-91-55.tpgi.com.au. [14.202.91.55]) by smtp.gmail.com with ESMTPSA id x24sm11383984pgl.84.2019.08.26.20.30.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 26 Aug 2019 20:30:52 -0700 (PDT) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH 4/4] powerpc/64: system call implement the bulk of the logic in C Date: Tue, 27 Aug 2019 13:30:10 +1000 Message-Id: <20190827033010.28090-6-npiggin@gmail.com> X-Mailer: git-send-email 2.22.0 In-Reply-To: <20190827033010.28090-1-npiggin@gmail.com> References: <20190827033010.28090-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Nicholas Piggin Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" System call entry and particularly exit code is beyond the limit of what is reasonable to implement in asm. This conversion moves all conditional branches out of the asm code, except a relatively simple test to see whether all GPRs should be restored at exit time. Null syscall test is about 5% faster after this patch, because the exit work is handled under local_irq_disable, and the hard mask and pending interrupt replay is handled after that, which avoids games with MSR. The asm instruction scheduling has not really been analysed and optimised yet, as there are higher level features and improvements to add first. Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/asm-prototypes.h | 11 - arch/powerpc/include/asm/ptrace.h | 3 + arch/powerpc/include/asm/signal.h | 2 + arch/powerpc/include/asm/switch_to.h | 4 + arch/powerpc/include/asm/time.h | 3 + arch/powerpc/kernel/Makefile | 3 +- arch/powerpc/kernel/entry_64.S | 343 ++++------------------ arch/powerpc/kernel/process.c | 6 +- arch/powerpc/kernel/signal.h | 2 - arch/powerpc/kernel/syscall_64.c | 202 +++++++++++++ 10 files changed, 271 insertions(+), 308 deletions(-) create mode 100644 arch/powerpc/kernel/syscall_64.c diff --git a/arch/powerpc/include/asm/asm-prototypes.h b/arch/powerpc/include/asm/asm-prototypes.h index ec1c97a8e8cb..f00ef8924a99 100644 --- a/arch/powerpc/include/asm/asm-prototypes.h +++ b/arch/powerpc/include/asm/asm-prototypes.h @@ -92,14 +92,6 @@ long sys_switch_endian(void); notrace unsigned int __check_irq_replay(void); void notrace restore_interrupts(void); -/* ptrace */ -long do_syscall_trace_enter(struct pt_regs *regs); -void do_syscall_trace_leave(struct pt_regs *regs); - -/* process */ -void restore_math(struct pt_regs *regs); -void restore_tm_state(struct pt_regs *regs); - /* prom_init (OpenFirmware) */ unsigned long __init prom_init(unsigned long r3, unsigned long r4, unsigned long pp, @@ -110,9 +102,6 @@ unsigned long __init prom_init(unsigned long r3, unsigned long r4, void __init early_setup(unsigned long dt_ptr); void early_setup_secondary(void); -/* time */ -void accumulate_stolen_time(void); - /* misc runtime */ extern u64 __bswapdi2(u64); extern s64 __lshrdi3(s64, int); diff --git a/arch/powerpc/include/asm/ptrace.h b/arch/powerpc/include/asm/ptrace.h index feee1b21bbd5..af363086403a 100644 --- a/arch/powerpc/include/asm/ptrace.h +++ b/arch/powerpc/include/asm/ptrace.h @@ -138,6 +138,9 @@ extern unsigned long profile_pc(struct pt_regs *regs); #define profile_pc(regs) instruction_pointer(regs) #endif +long do_syscall_trace_enter(struct pt_regs *regs); +void do_syscall_trace_leave(struct pt_regs *regs); + #define kernel_stack_pointer(regs) ((regs)->gpr[1]) static inline int is_syscall_success(struct pt_regs *regs) { diff --git a/arch/powerpc/include/asm/signal.h b/arch/powerpc/include/asm/signal.h index 0803ca8b9149..0113be8dcb59 100644 --- a/arch/powerpc/include/asm/signal.h +++ b/arch/powerpc/include/asm/signal.h @@ -6,4 +6,6 @@ #include #include +void do_notify_resume(struct pt_regs *regs, unsigned long thread_info_flags); + #endif /* _ASM_POWERPC_SIGNAL_H */ diff --git a/arch/powerpc/include/asm/switch_to.h b/arch/powerpc/include/asm/switch_to.h index 5b03d8a82409..231fcb2058cf 100644 --- a/arch/powerpc/include/asm/switch_to.h +++ b/arch/powerpc/include/asm/switch_to.h @@ -5,6 +5,7 @@ #ifndef _ASM_POWERPC_SWITCH_TO_H #define _ASM_POWERPC_SWITCH_TO_H +#include #include struct thread_struct; @@ -22,6 +23,9 @@ extern void switch_booke_debug_regs(struct debug_reg *new_debug); extern int emulate_altivec(struct pt_regs *); +void restore_math(struct pt_regs *regs); +void restore_tm_state(struct pt_regs *regs); + extern void flush_all_to_thread(struct task_struct *); extern void giveup_all(struct task_struct *); diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h index 54f4ec1f9fab..ba97858c9d76 100644 --- a/arch/powerpc/include/asm/time.h +++ b/arch/powerpc/include/asm/time.h @@ -199,5 +199,8 @@ DECLARE_PER_CPU(u64, decrementers_next_tb); /* Convert timebase ticks to nanoseconds */ unsigned long long tb_to_ns(unsigned long long tb_ticks); +/* SPLPAR */ +void accumulate_stolen_time(void); + #endif /* __KERNEL__ */ #endif /* __POWERPC_TIME_H */ diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile index 56dfa7a2a6f2..7f53cc07f933 100644 --- a/arch/powerpc/kernel/Makefile +++ b/arch/powerpc/kernel/Makefile @@ -52,7 +52,8 @@ obj-y := cputable.o ptrace.o syscalls.o \ of_platform.o prom_parse.o obj-$(CONFIG_PPC64) += setup_64.o sys_ppc32.o \ signal_64.o ptrace32.o \ - paca.o nvram_64.o firmware.o + paca.o nvram_64.o firmware.o \ + syscall_64.o obj-$(CONFIG_VDSO32) += vdso32/ obj-$(CONFIG_PPC_WATCHDOG) += watchdog.o obj-$(CONFIG_HAVE_HW_BREAKPOINT) += hw_breakpoint.o diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S index 5a3e0b5c9ad1..2ec825a85f5b 100644 --- a/arch/powerpc/kernel/entry_64.S +++ b/arch/powerpc/kernel/entry_64.S @@ -63,12 +63,6 @@ exception_marker: .globl system_call_common system_call_common: -#ifdef CONFIG_PPC_TRANSACTIONAL_MEM -BEGIN_FTR_SECTION - extrdi. r10, r12, 1, (63-MSR_TS_T_LG) /* transaction active? */ - bne .Ltabort_syscall -END_FTR_SECTION_IFSET(CPU_FTR_TM) -#endif mr r10,r1 ld r1,PACAKSAVE(r13) std r10,0(r1) @@ -76,350 +70,118 @@ END_FTR_SECTION_IFSET(CPU_FTR_TM) std r12,_MSR(r1) std r0,GPR0(r1) std r10,GPR1(r1) + std r2,GPR2(r1) #ifdef CONFIG_PPC_FSL_BOOK3E + andi. r10,r12,MSR_PR + beq 1f /* if from kernel mode */ START_BTB_FLUSH_SECTION BTB_FLUSH(r10) END_BTB_FLUSH_SECTION +1: #endif - ACCOUNT_CPU_USER_ENTRY(r13, r10, r11) - std r2,GPR2(r1) + ld r2,PACATOC(r13) + mfcr r12 + li r11,0 + /* Can we avoid saving r3-r8 in common case? */ std r3,GPR3(r1) - mfcr r2 std r4,GPR4(r1) std r5,GPR5(r1) std r6,GPR6(r1) std r7,GPR7(r1) std r8,GPR8(r1) - li r11,0 + /* Zero r9-r12, this should only be required when restoring all GPRs */ std r11,GPR9(r1) std r11,GPR10(r1) std r11,GPR11(r1) std r11,GPR12(r1) - std r11,_XER(r1) - std r11,_CTR(r1) std r9,GPR13(r1) SAVE_NVGPRS(r1) + std r11,_XER(r1) + std r11,_CTR(r1) mflr r10 + /* * This clears CR0.SO (bit 28), which is the error indication on * return from this system call. */ - rldimi r2,r11,28,(63-28) + rldimi r12,r11,28,(63-28) li r11,0xc00 std r10,_LINK(r1) std r11,_TRAP(r1) + std r12,_CCR(r1) std r3,ORIG_GPR3(r1) - std r2,_CCR(r1) - ld r2,PACATOC(r13) - addi r9,r1,STACK_FRAME_OVERHEAD + addi r10,r1,STACK_FRAME_OVERHEAD ld r11,exception_marker@toc(r2) - std r11,-16(r9) /* "regshere" marker */ - - kuap_check_amr r10, r11 - -#if defined(CONFIG_VIRT_CPU_ACCOUNTING_NATIVE) && defined(CONFIG_PPC_SPLPAR) -BEGIN_FW_FTR_SECTION - /* see if there are any DTL entries to process */ - ld r10,PACALPPACAPTR(r13) /* get ptr to VPA */ - ld r11,PACA_DTL_RIDX(r13) /* get log read index */ - addi r10,r10,LPPACA_DTLIDX - LDX_BE r10,0,r10 /* get log write index */ - cmpd r11,r10 - beq+ 33f - bl accumulate_stolen_time - REST_GPR(0,r1) - REST_4GPRS(3,r1) - REST_2GPRS(7,r1) - addi r9,r1,STACK_FRAME_OVERHEAD -33: -END_FW_FTR_SECTION_IFSET(FW_FEATURE_SPLPAR) -#endif /* CONFIG_VIRT_CPU_ACCOUNTING_NATIVE && CONFIG_PPC_SPLPAR */ - - /* - * A syscall should always be called with interrupts enabled - * so we just unconditionally hard-enable here. When some kind - * of irq tracing is used, we additionally check that condition - * is correct - */ -#if defined(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG) && defined(CONFIG_BUG) - lbz r10,PACAIRQSOFTMASK(r13) -1: tdnei r10,IRQS_ENABLED - EMIT_BUG_ENTRY 1b,__FILE__,__LINE__,BUGFLAG_WARNING -#endif - -#ifdef CONFIG_PPC_BOOK3E - wrteei 1 -#else - li r11,MSR_RI - ori r11,r11,MSR_EE - mtmsrd r11,1 -#endif /* CONFIG_PPC_BOOK3E */ - -system_call: /* label this so stack traces look sane */ - /* We do need to set SOFTE in the stack frame or the return - * from interrupt will be painful - */ - li r10,IRQS_ENABLED - std r10,SOFTE(r1) - - ld r11, PACA_THREAD_INFO(r13) - ld r10,TI_FLAGS(r11) - andi. r11,r10,_TIF_SYSCALL_DOTRACE - bne .Lsyscall_dotrace /* does not return */ - cmpldi 0,r0,NR_syscalls - bge- .Lsyscall_enosys - -.Lsyscall: -/* - * Need to vector to 32 Bit or default sys_call_table here, - * based on caller's run-mode / personality. - */ - ld r11,SYS_CALL_TABLE@toc(2) - andis. r10,r10,_TIF_32BIT@h - beq 15f - ld r11,COMPAT_SYS_CALL_TABLE@toc(2) - clrldi r3,r3,32 - clrldi r4,r4,32 - clrldi r5,r5,32 - clrldi r6,r6,32 - clrldi r7,r7,32 - clrldi r8,r8,32 -15: - slwi r0,r0,3 - - barrier_nospec_asm - /* - * Prevent the load of the handler below (based on the user-passed - * system call number) being speculatively executed until the test - * against NR_syscalls and branch to .Lsyscall_enosys above has - * committed. - */ + std r11,-16(r10) /* "regshere" marker */ - ldx r12,r11,r0 /* Fetch system call handler [ptr] */ - mtctr r12 - bctrl /* Call handler */ + /* Calling convention has r9 = orig r0, r10 = regs */ + mr r9,r0 + bl system_call_exception - /* syscall_exit can exit to kernel mode, via ret_from_kernel_thread */ .Lsyscall_exit: - std r3,RESULT(r1) - -#ifdef CONFIG_DEBUG_RSEQ - /* Check whether the syscall is issued inside a restartable sequence */ - addi r3,r1,STACK_FRAME_OVERHEAD - bl rseq_syscall - ld r3,RESULT(r1) -#endif - - ld r12, PACA_THREAD_INFO(r13) - - ld r8,_MSR(r1) - -/* - * This is a few instructions into the actual syscall exit path (which actually - * starts at .Lsyscall_exit) to cater to kprobe blacklisting and to reduce the - * number of visible symbols for profiling purposes. - * - * We can probe from system_call until this point as MSR_RI is set. But once it - * is cleared below, we won't be able to take a trap. - * - * This is blacklisted from kprobes further below with _ASM_NOKPROBE_SYMBOL(). - */ -system_call_exit: - /* - * Disable interrupts so current_thread_info()->flags can't change, - * and so that we don't get interrupted after loading SRR0/1. - * - * Leave MSR_RI enabled for now, because with THREAD_INFO_IN_TASK we - * could fault on the load of the TI_FLAGS below. - */ -#ifdef CONFIG_PPC_BOOK3E - wrteei 0 -#else - li r11,MSR_RI - mtmsrd r11,1 -#endif /* CONFIG_PPC_BOOK3E */ + addi r4,r1,STACK_FRAME_OVERHEAD + bl syscall_exit_prepare - ld r9,TI_FLAGS(r12) - li r11,-MAX_ERRNO - andi. r0,r9,(_TIF_SYSCALL_DOTRACE|_TIF_SINGLESTEP|_TIF_USER_WORK_MASK|_TIF_PERSYSCALL_MASK) - bne- .Lsyscall_exit_work + ld r2,_CCR(r1) + ld r4,_NIP(r1) + ld r5,_MSR(r1) + ld r6,_LINK(r1) - andi. r0,r8,MSR_FP - beq 2f -#ifdef CONFIG_ALTIVEC - andis. r0,r8,MSR_VEC@h - bne 3f -#endif -2: addi r3,r1,STACK_FRAME_OVERHEAD - bl restore_math - ld r8,_MSR(r1) - ld r3,RESULT(r1) - li r11,-MAX_ERRNO - -3: cmpld r3,r11 - ld r5,_CCR(r1) - bge- .Lsyscall_error -.Lsyscall_error_cont: - ld r7,_NIP(r1) BEGIN_FTR_SECTION stdcx. r0,0,r1 /* to clear the reservation */ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) - andi. r6,r8,MSR_PR - ld r4,_LINK(r1) - kuap_check_amr r10, r11 - -#ifdef CONFIG_PPC_BOOK3S - /* - * Clear MSR_RI, MSR_EE is already and remains disabled. We could do - * this later, but testing shows that doing it here causes less slow - * down than doing it closer to the rfid. - */ - li r11,0 - mtmsrd r11,1 + mtspr SPRN_SRR0,r4 + mtspr SPRN_SRR1,r5 +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM + std r5,PACATMSCRATCH(r13) #endif + mtlr r6 - beq- 1f - ACCOUNT_CPU_USER_EXIT(r13, r11, r12) + cmpdi r3,0 + bne syscall_restore_regs +.Lsyscall_restore_regs_cont: BEGIN_FTR_SECTION HMT_MEDIUM_LOW END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) -#ifdef CONFIG_PPC_TRANSACTIONAL_MEM - std r8, PACATMSCRATCH(r13) -#endif - /* * We don't need to restore AMR on the way back to userspace for KUAP. * The value of AMR only matters while we're in the kernel. */ - ld r13,GPR13(r1) /* only restore r13 if returning to usermode */ + mtcr r2 ld r2,GPR2(r1) + ld r3,GPR3(r1) + ld r13,GPR13(r1) ld r1,GPR1(r1) - mtlr r4 - mtcr r5 - mtspr SPRN_SRR0,r7 - mtspr SPRN_SRR1,r8 RFI_TO_USER b . /* prevent speculative execution */ +_ASM_NOKPROBE_SYMBOL(system_call_common); -1: /* exit to kernel */ - kuap_restore_amr r2 - - ld r2,GPR2(r1) - ld r1,GPR1(r1) - mtlr r4 - mtcr r5 - mtspr SPRN_SRR0,r7 - mtspr SPRN_SRR1,r8 - RFI_TO_KERNEL - b . /* prevent speculative execution */ - -.Lsyscall_error: - oris r5,r5,0x1000 /* Set SO bit in CR */ - neg r3,r3 - std r5,_CCR(r1) - b .Lsyscall_error_cont - -/* Traced system call support */ -.Lsyscall_dotrace: - addi r3,r1,STACK_FRAME_OVERHEAD - bl do_syscall_trace_enter - - /* - * We use the return value of do_syscall_trace_enter() as the syscall - * number. If the syscall was rejected for any reason do_syscall_trace_enter() - * returns an invalid syscall number and the test below against - * NR_syscalls will fail. - */ - mr r0,r3 - - /* Restore argument registers just clobbered and/or possibly changed. */ - ld r3,GPR3(r1) - ld r4,GPR4(r1) - ld r5,GPR5(r1) - ld r6,GPR6(r1) - ld r7,GPR7(r1) - ld r8,GPR8(r1) - - /* Repopulate r9 and r10 for the syscall path */ - addi r9,r1,STACK_FRAME_OVERHEAD - ld r10, PACA_THREAD_INFO(r13) - ld r10,TI_FLAGS(r10) - - cmpldi r0,NR_syscalls - blt+ .Lsyscall - - /* Return code is already in r3 thanks to do_syscall_trace_enter() */ - b .Lsyscall_exit - - -.Lsyscall_enosys: - li r3,-ENOSYS - b .Lsyscall_exit - -.Lsyscall_exit_work: - /* If TIF_RESTOREALL is set, don't scribble on either r3 or ccr. - If TIF_NOERROR is set, just save r3 as it is. */ - - andi. r0,r9,_TIF_RESTOREALL - beq+ 0f +syscall_restore_regs: + ld r3,_CTR(r1) + ld r4,_XER(r1) REST_NVGPRS(r1) - b 2f -0: cmpld r3,r11 /* r11 is -MAX_ERRNO */ - blt+ 1f - andi. r0,r9,_TIF_NOERROR - bne- 1f - ld r5,_CCR(r1) - neg r3,r3 - oris r5,r5,0x1000 /* Set SO bit in CR */ - std r5,_CCR(r1) -1: std r3,GPR3(r1) -2: andi. r0,r9,(_TIF_PERSYSCALL_MASK) - beq 4f - - /* Clear per-syscall TIF flags if any are set. */ - - li r11,_TIF_PERSYSCALL_MASK - addi r12,r12,TI_FLAGS -3: ldarx r10,0,r12 - andc r10,r10,r11 - stdcx. r10,0,r12 - bne- 3b - subi r12,r12,TI_FLAGS - -4: /* Anything else left to do? */ -BEGIN_FTR_SECTION - lis r3,DEFAULT_PPR@highest /* Set default PPR */ - sldi r3,r3,32 /* bits 11-13 are used for ppr */ - std r3,_PPR(r1) -END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) - - andi. r0,r9,(_TIF_SYSCALL_DOTRACE|_TIF_SINGLESTEP) - beq ret_from_except_lite - - /* Re-enable interrupts */ -#ifdef CONFIG_PPC_BOOK3E - wrteei 1 -#else - li r10,MSR_RI - ori r10,r10,MSR_EE - mtmsrd r10,1 -#endif /* CONFIG_PPC_BOOK3E */ - - addi r3,r1,STACK_FRAME_OVERHEAD - bl do_syscall_trace_leave - b ret_from_except + mtctr r3 + mtspr SPRN_XER,r4 + ld r0,GPR0(r1) + REST_8GPRS(4, r1) + ld r12,GPR12(r1) + b .Lsyscall_restore_regs_cont #ifdef CONFIG_PPC_TRANSACTIONAL_MEM -.Ltabort_syscall: + .globl tabort_syscall +tabort_syscall: /* Firstly we need to enable TM in the kernel */ mfmsr r10 li r9, 1 rldimi r10, r9, MSR_TM_LG, 63-MSR_TM_LG mtmsrd r10, 0 + ld r11,_NIP(r13) + ld r12,_MSR(r13) + /* tabort, this dooms the transaction, nothing else */ li r9, (TM_CAUSE_SYSCALL|TM_CAUSE_PERSISTENT) TABORT(R9) @@ -438,9 +200,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) RFI_TO_USER b . /* prevent speculative execution */ #endif -_ASM_NOKPROBE_SYMBOL(system_call_common); -_ASM_NOKPROBE_SYMBOL(system_call_exit); - _GLOBAL(ret_from_fork) bl schedule_tail REST_NVGPRS(r1) diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c index 24621e7e5033..f444525da9ce 100644 --- a/arch/powerpc/kernel/process.c +++ b/arch/powerpc/kernel/process.c @@ -1609,7 +1609,6 @@ int copy_thread_tls(unsigned long clone_flags, unsigned long usp, extern void ret_from_kernel_thread(void); void (*f)(void); unsigned long sp = (unsigned long)task_stack_page(p) + THREAD_SIZE; - struct thread_info *ti = task_thread_info(p); klp_init_thread_info(p); @@ -1617,6 +1616,8 @@ int copy_thread_tls(unsigned long clone_flags, unsigned long usp, sp -= sizeof(struct pt_regs); childregs = (struct pt_regs *) sp; if (unlikely(p->flags & PF_KTHREAD)) { + struct thread_info *ti = task_thread_info(p); + /* kernel thread */ memset(childregs, 0, sizeof(struct pt_regs)); childregs->gpr[1] = sp + sizeof(struct pt_regs); @@ -1634,12 +1635,13 @@ int copy_thread_tls(unsigned long clone_flags, unsigned long usp, } else { /* user thread */ struct pt_regs *regs = current_pt_regs(); + CHECK_FULL_REGS(regs); *childregs = *regs; if (usp) childregs->gpr[1] = usp; p->thread.regs = childregs; - childregs->gpr[3] = 0; /* Result from fork() */ + /* ret_from_fork sets fork() result to 0 */ if (clone_flags & CLONE_SETTLS) { #ifdef CONFIG_PPC64 if (!is_32bit_task()) diff --git a/arch/powerpc/kernel/signal.h b/arch/powerpc/kernel/signal.h index 800433685888..d396efca4068 100644 --- a/arch/powerpc/kernel/signal.h +++ b/arch/powerpc/kernel/signal.h @@ -10,8 +10,6 @@ #ifndef _POWERPC_ARCH_SIGNAL_H #define _POWERPC_ARCH_SIGNAL_H -extern void do_notify_resume(struct pt_regs *regs, unsigned long thread_info_flags); - extern void __user *get_sigframe(struct ksignal *ksig, unsigned long sp, size_t frame_size, int is_32); diff --git a/arch/powerpc/kernel/syscall_64.c b/arch/powerpc/kernel/syscall_64.c new file mode 100644 index 000000000000..98ed970796d5 --- /dev/null +++ b/arch/powerpc/kernel/syscall_64.c @@ -0,0 +1,202 @@ +#include +#include +#include +#include +#include +#include +#include +#include +#include + +extern void __noreturn tabort_syscall(void); + +#ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE +static inline void account_cpu_user_entry(void) +{ + unsigned long tb = mftb(); + + local_paca->accounting.utime += (tb - local_paca->accounting.starttime_user); + local_paca->accounting.starttime = tb; +} +static inline void account_cpu_user_exit(void) +{ + unsigned long tb = mftb(); + + local_paca->accounting.stime += (tb - local_paca->accounting.starttime); + local_paca->accounting.starttime_user = tb; +} +#else +static inline void account_cpu_user_entry(void) +{ +} +static inline void account_cpu_user_exit(void) +{ +} +#endif + +typedef long (*syscall_fn)(long, long, long, long, long, long); + +long system_call_exception(long r3, long r4, long r5, long r6, long r7, long r8, unsigned long r0, struct pt_regs *regs) +{ + unsigned long ti_flags; + syscall_fn f; + + BUG_ON(!(regs->msr & MSR_PR)); + +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM + if (unlikely(regs->msr & MSR_TS_T)) + tabort_syscall(); +#endif + + account_cpu_user_entry(); + +#if defined(CONFIG_VIRT_CPU_ACCOUNTING_NATIVE) && defined(CONFIG_PPC_SPLPAR) + if (firmware_has_feature(FW_FEATURE_SPLPAR)) { + struct lppaca *lp = get_lppaca(); + + if (unlikely(local_paca->dtl_ridx != lp->dtl_idx)) + accumulate_stolen_time(); + } +#endif + +#ifdef CONFIG_PPC_KUAP_DEBUG + if (mmu_has_feature(MMU_FTR_RADIX_KUAP)) + WARN_ON_ONCE(mfspr(SPRN_AMR) != AMR_KUAP_BLOCKED); +#endif + + /* + * A syscall should always be called with interrupts enabled + * so we just unconditionally hard-enable here. When some kind + * of irq tracing is used, we additionally check that condition + * is correct + */ +#if defined(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG) + WARN_ON(irq_soft_mask_return() != IRQS_ENABLED); + WARN_ON(local_paca->irq_happened); +#endif + + __hard_irq_enable(); + + /* + * We do need to set SOFTE in the stack frame or the return + * from interrupt will be painful + */ + regs->softe = IRQS_ENABLED; + + ti_flags = current_thread_info()->flags; + if (unlikely(ti_flags & _TIF_SYSCALL_DOTRACE)) { + /* + * We use the return value of do_syscall_trace_enter() as the + * syscall number. If the syscall was rejected for any reason + * do_syscall_trace_enter() returns an invalid syscall number + * and the test below against NR_syscalls will fail. + */ + r0 = do_syscall_trace_enter(regs); + } + + if (unlikely(r0 >= NR_syscalls)) + return -ENOSYS; + + /* May be faster to do array_index_nospec? */ + barrier_nospec(); + + if (unlikely(ti_flags & _TIF_32BIT)) { + f = (void *)compat_sys_call_table[r0]; + + r3 &= 0x00000000ffffffffULL; + r4 &= 0x00000000ffffffffULL; + r5 &= 0x00000000ffffffffULL; + r6 &= 0x00000000ffffffffULL; + r7 &= 0x00000000ffffffffULL; + r8 &= 0x00000000ffffffffULL; + + } else { + f = (void *)sys_call_table[r0]; + } + + return f(r3, r4, r5, r6, r7, r8); +} + +unsigned long syscall_exit_prepare(unsigned long r3, struct pt_regs *regs) +{ + unsigned long *ti_flagsp = ¤t_thread_info()->flags; + unsigned long ti_flags; + unsigned long ret = 0; + + regs->result = r3; + + /* Check whether the syscall is issued inside a restartable sequence */ + rseq_syscall(regs); + + ti_flags = *ti_flagsp; + if (unlikely(ti_flags & _TIF_SYSCALL_DOTRACE)) + do_syscall_trace_leave(regs); + + if (unlikely(r3 >= (unsigned long)-MAX_ERRNO)) { + if (likely(!(ti_flags & (_TIF_NOERROR | _TIF_RESTOREALL)))) { + r3 = -r3; + regs->ccr |= 0x10000000; /* Set SO bit in CR */ + } + } + + if (unlikely(ti_flags & _TIF_PERSYSCALL_MASK)) { + if (ti_flags & _TIF_RESTOREALL) + ret = _TIF_RESTOREALL; + else + regs->gpr[3] = r3; + clear_bits(_TIF_PERSYSCALL_MASK, ti_flagsp); + } else { + regs->gpr[3] = r3; + } + +again: + local_irq_disable(); + ti_flags = READ_ONCE(*ti_flagsp); + while (unlikely(ti_flags & _TIF_USER_WORK_MASK)) { + local_irq_enable(); + if (ti_flags & _TIF_NEED_RESCHED) { + schedule(); + } else { + /* + * SIGPENDING must restore signal handler function + * argument GPRs, and some non-volatiles (e.g., r1). + * Restore all for now. This could be made lighter. + */ + if (ti_flags & _TIF_SIGPENDING) + ret |= _TIF_RESTOREALL; + do_notify_resume(regs, ti_flags); + } + local_irq_disable(); + ti_flags = READ_ONCE(*ti_flagsp); + } + +#ifdef CONFIG_ALTIVEC + if ((regs->msr & (MSR_FP|MSR_VEC)) != (MSR_FP|MSR_VEC)) +#else + if ((regs->msr & MSR_FP) != MSR_FP) +#endif + restore_math(regs); + + /* This pattern matches prep_irq_for_idle */ + __mtmsrd(0, 1); /* Disable MSR_EE and MSR_RI */ + if (unlikely(lazy_irq_pending())) { + __mtmsrd(MSR_RI, 1); + local_paca->irq_happened |= PACA_IRQ_HARD_DIS; + local_irq_enable(); + /* Took an interrupt which may have more exit work to do. */ + goto again; + } + trace_hardirqs_on(); + local_paca->irq_happened &= ~PACA_IRQ_HARD_DIS; + irq_soft_mask_set(IRQS_ENABLED); + + account_cpu_user_exit(); + +#ifdef CONFIG_PPC_KUAP_DEBUG + if (mmu_has_feature(MMU_FTR_RADIX_KUAP)) + WARN_ON_ONCE(mfspr(SPRN_AMR) != AMR_KUAP_BLOCKED); +#endif + + return ret; +} +