From patchwork Wed Jun 13 15:54:29 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Joseph Salisbury X-Patchwork-Id: 928953 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=lists.ubuntu.com (client-ip=91.189.94.19; helo=huckleberry.canonical.com; envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=canonical.com Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 415WXh1Vs3z9s2g; Thu, 14 Jun 2018 01:54:44 +1000 (AEST) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.86_2) (envelope-from ) id 1fT86U-0002SS-LO; Wed, 13 Jun 2018 15:54:34 +0000 Received: from youngberry.canonical.com ([91.189.89.112]) by huckleberry.canonical.com with esmtps (TLS1.0:DHE_RSA_AES_128_CBC_SHA1:128) (Exim 4.86_2) (envelope-from ) id 1fT86T-0002Ro-8H for kernel-team@lists.ubuntu.com; Wed, 13 Jun 2018 15:54:33 +0000 Received: from 1.general.jsalisbury.us.vpn ([10.172.67.212] helo=salisbury) by youngberry.canonical.com with esmtpsa (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.76) (envelope-from ) id 1fT86S-0004cr-Qb for kernel-team@lists.ubuntu.com; Wed, 13 Jun 2018 15:54:33 +0000 Received: by salisbury (Postfix, from userid 1000) id A29D87E04A5; Wed, 13 Jun 2018 11:54:31 -0400 (EDT) From: Joseph Salisbury To: kernel-team@lists.ubuntu.com Subject: [SRU][Artful][PATCH 1/3] powerpc: Do not send system reset request through the oops path Date: Wed, 13 Jun 2018 11:54:29 -0400 Message-Id: X-Mailer: git-send-email 2.17.0 In-Reply-To: References: In-Reply-To: References: X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.20 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: "kernel-team" From: Nicholas Piggin BugLink: http://bugs.launchpad.net/bugs/1776211 A system reset is a request to crash / debug the system rather than necessarily caused by encountering a BUG. So there is no need to serialize all CPUs behind the die lock, adding taints to all subsequent traces beyond the first, breaking console locks, etc. The system reset is NMI context which has its own printk buffers to prevent output being interleaved. Then it's better to have all secondaries print out their debug as quickly as possible and the primary will flush out all printk buffers during panic(). So remove the 0x100 path from die, and move it into system_reset. Name the crash/dump reasons "System Reset". This gives "not tained" traces when crashing an untainted kernel. It also gives the panic reason as "System Reset" as opposed to "Fatal exception in interrupt" (or "die oops" for fadump). Signed-off-by: Nicholas Piggin Signed-off-by: Michael Ellerman (cherry picked from commit 4388c9b3a6ee7d6afc36c8a0bb5579b1606229b5) Signed-off-by: Joseph Salisbury --- arch/powerpc/kernel/traps.c | 47 ++++++++++++++++++++++++++++++--------------- 1 file changed, 31 insertions(+), 16 deletions(-) diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c index 5fbe81d..8332c77e 100644 --- a/arch/powerpc/kernel/traps.c +++ b/arch/powerpc/kernel/traps.c @@ -163,21 +163,9 @@ static void oops_end(unsigned long flags, struct pt_regs *regs, crash_fadump(regs, "die oops"); - /* - * A system reset (0x100) is a request to dump, so we always send - * it through the crashdump code. - */ - if (kexec_should_crash(current) || (TRAP(regs) == 0x100)) { + if (kexec_should_crash(current)) crash_kexec(regs); - /* - * We aren't the primary crash CPU. We need to send it - * to a holding pattern to avoid it ending up in the panic - * code. - */ - crash_kexec_secondary(regs); - } - if (!signr) return; @@ -295,17 +283,44 @@ void system_reset_exception(struct pt_regs *regs) goto out; } - die("System Reset", regs, SIGABRT); + if (debugger(regs)) + goto out; + + /* + * A system reset is a request to dump, so we always send + * it through the crashdump code (if fadump or kdump are + * registered). + */ + crash_fadump(regs, "System Reset"); + + crash_kexec(regs); + + /* + * We aren't the primary crash CPU. We need to send it + * to a holding pattern to avoid it ending up in the panic + * code. + */ + crash_kexec_secondary(regs); + + /* + * No debugger or crash dump registered, print logs then + * panic. + */ + __die("System Reset", regs, SIGABRT); + + mdelay(2*MSEC_PER_SEC); /* Wait a little while for others to print */ + add_taint(TAINT_DIE, LOCKDEP_NOW_UNRELIABLE); + nmi_panic(regs, "System Reset"); out: #ifdef CONFIG_PPC_BOOK3S_64 BUG_ON(get_paca()->in_nmi == 0); if (get_paca()->in_nmi > 1) - panic("Unrecoverable nested System Reset"); + nmi_panic(regs, "Unrecoverable nested System Reset"); #endif /* Must die if the interrupt is not recoverable */ if (!(regs->msr & MSR_RI)) - panic("Unrecoverable System Reset"); + nmi_panic(regs, "Unrecoverable System Reset"); if (!nested) nmi_exit(); From patchwork Wed Jun 13 15:54:30 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Joseph Salisbury X-Patchwork-Id: 928951 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=lists.ubuntu.com (client-ip=91.189.94.19; helo=huckleberry.canonical.com; envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=canonical.com Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 415WXg2Hzzz9s47; Thu, 14 Jun 2018 01:54:43 +1000 (AEST) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.86_2) (envelope-from ) id 1fT86U-0002SI-Ht; Wed, 13 Jun 2018 15:54:34 +0000 Received: from youngberry.canonical.com ([91.189.89.112]) by huckleberry.canonical.com with esmtps (TLS1.0:DHE_RSA_AES_128_CBC_SHA1:128) (Exim 4.86_2) (envelope-from ) id 1fT86T-0002Rm-5T for kernel-team@lists.ubuntu.com; Wed, 13 Jun 2018 15:54:33 +0000 Received: from 1.general.jsalisbury.us.vpn ([10.172.67.212] helo=salisbury) by youngberry.canonical.com with esmtpsa (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.76) (envelope-from ) id 1fT86S-0004cq-Qb for kernel-team@lists.ubuntu.com; Wed, 13 Jun 2018 15:54:32 +0000 Received: by salisbury (Postfix, from userid 1000) id A457D7E249D; Wed, 13 Jun 2018 11:54:31 -0400 (EDT) From: Joseph Salisbury To: kernel-team@lists.ubuntu.com Subject: [SRU][Artful][PATCH 2/3] powerpc/crash: Remove the test for cpu_online in the IPI callback Date: Wed, 13 Jun 2018 11:54:30 -0400 Message-Id: X-Mailer: git-send-email 2.17.0 In-Reply-To: References: In-Reply-To: References: X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.20 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: "kernel-team" From: Balbir Singh BugLink: http://bugs.launchpad.net/bugs/1776211 Our check was extra cautious, we've audited crash_send_ipi and it sends an IPI only to online CPU's. Removal of this check should have not functional impact on crash kdump. Signed-off-by: Balbir Singh Reviewed-by: Nicholas Piggin Signed-off-by: Michael Ellerman (cherry picked from commit 04b9c96eae72d862726f2f4bfcec2078240c33c5) Signed-off-by: Joseph Salisbury --- arch/powerpc/kernel/crash.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/arch/powerpc/kernel/crash.c b/arch/powerpc/kernel/crash.c index cbabb5a..29c56ca 100644 --- a/arch/powerpc/kernel/crash.c +++ b/arch/powerpc/kernel/crash.c @@ -69,9 +69,6 @@ static void crash_ipi_callback(struct pt_regs *regs) int cpu = smp_processor_id(); - if (!cpu_online(cpu)) - return; - hard_irq_disable(); if (!cpumask_test_cpu(cpu, &cpus_state_saved)) { crash_save_cpu(regs, cpu); From patchwork Wed Jun 13 15:54:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Joseph Salisbury X-Patchwork-Id: 928954 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=lists.ubuntu.com (client-ip=91.189.94.19; helo=huckleberry.canonical.com; envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=canonical.com Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 415WXk1h67z9s2g; Thu, 14 Jun 2018 01:54:46 +1000 (AEST) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.86_2) (envelope-from ) id 1fT86Y-0002TY-Rw; Wed, 13 Jun 2018 15:54:38 +0000 Received: from youngberry.canonical.com ([91.189.89.112]) by huckleberry.canonical.com with esmtps (TLS1.0:DHE_RSA_AES_128_CBC_SHA1:128) (Exim 4.86_2) (envelope-from ) id 1fT86T-0002Rn-5L for kernel-team@lists.ubuntu.com; Wed, 13 Jun 2018 15:54:33 +0000 Received: from 1.general.jsalisbury.us.vpn ([10.172.67.212] helo=salisbury) by youngberry.canonical.com with esmtpsa (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.76) (envelope-from ) id 1fT86S-0004ct-Qc for kernel-team@lists.ubuntu.com; Wed, 13 Jun 2018 15:54:32 +0000 Received: by salisbury (Postfix, from userid 1000) id A640B7E2511; Wed, 13 Jun 2018 11:54:31 -0400 (EDT) From: Joseph Salisbury To: kernel-team@lists.ubuntu.com Subject: [SRU][Artful][PATCH 3/3] powerpc: System reset avoid interleaving oops using die synchronisation Date: Wed, 13 Jun 2018 11:54:31 -0400 Message-Id: <810031bce268b18267a118a8e5a45f8ce7f62c01.1528729715.git.joseph.salisbury@canonical.com> X-Mailer: git-send-email 2.17.0 In-Reply-To: References: In-Reply-To: References: X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.20 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: "kernel-team" From: Nicholas Piggin BugLink: http://bugs.launchpad.net/bugs/1776211 The die() oops path contains a serializing lock to prevent oops messages from being interleaved. In the case of a system reset initiated oops (e.g., qemu nmi command), __die was being called which lacks that synchronisation and oops reports could be interleaved across CPUs. A recent patch 4388c9b3a6ee7 ("powerpc: Do not send system reset request through the oops path") changed this to __die to avoid the debugger() call, but there is no real harm to calling it twice if the first time fell through. So go back to using die() here. This was observed to fix the problem. Fixes: 4388c9b3a6ee7 ("powerpc: Do not send system reset request through the oops path") Signed-off-by: Nicholas Piggin Reviewed-by: David Gibson Signed-off-by: Michael Ellerman (cherry picked from commit 4552d128c26e0f0f27a5bd2fadc24092b8f6c1d7) Signed-off-by: Joseph Salisbury --- arch/powerpc/kernel/traps.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c index 8332c77e..174524b 100644 --- a/arch/powerpc/kernel/traps.c +++ b/arch/powerpc/kernel/traps.c @@ -306,7 +306,7 @@ void system_reset_exception(struct pt_regs *regs) * No debugger or crash dump registered, print logs then * panic. */ - __die("System Reset", regs, SIGABRT); + die("System Reset", regs, SIGABRT); mdelay(2*MSEC_PER_SEC); /* Wait a little while for others to print */ add_taint(TAINT_DIE, LOCKDEP_NOW_UNRELIABLE);