From patchwork Fri Oct 6 06:10:04 2017
X-Patchwork-Submitter: Nicholas Piggin
X-Patchwork-Id: 822236
From: Nicholas Piggin
To: linuxppc-dev@lists.ozlabs.org
Cc: Vasant Hegde, Stewart Smith, Nicholas Piggin
Subject: [PATCH 2/3] powerpc/powernv: Always stop secondaries before reboot/shutdown
Date: Fri, 6 Oct 2017 16:10:04 +1000
Message-Id: <20171006061005.29891-3-npiggin@gmail.com>
In-Reply-To: <20171006061005.29891-1-npiggin@gmail.com>
References: <20171006061005.29891-1-npiggin@gmail.com>

Currently powernv reboot and shutdown requests just leave secondaries
to do their own things. This is undesirable because they can trigger
any number of watchdogs while waiting for reboot, but also we don't
know what else they might be doing, or they might be stuck somewhere
causing trouble.

The opal scheduled flash update code already ran into watchdog problems
due to flashing taking a long time, but it's possible for regular
reboots to trigger problems too (this is with watchdog_thresh set to 1,
but I have seen it with watchdog_thresh at the default value once too):

  reboot: Restarting system
  [ 360.038896709,5] OPAL: Reboot request...
  Watchdog CPU:0 Hard LOCKUP
  Watchdog CPU:44 detected Hard LOCKUP other CPUS:16
  Watchdog CPU:16 Hard LOCKUP
  watchdog: BUG: soft lockup - CPU#16 stuck for 3s! [swapper/16:0]

So remove the special case for flash update, and unconditionally do
smp_send_stop before rebooting.

Return the CPUs to Linux stop loops rather than OPAL.
The reason for this is that in firmware, CPUs will check for jobs,
whereas smp_send_stop puts them into a simple infinite loop. If there
is some corruption, it is better to do the latter, to maximize the
chance of a successful reboot.

Signed-off-by: Nicholas Piggin
---
 arch/powerpc/include/asm/opal.h             |  2 +-
 arch/powerpc/platforms/powernv/opal-flash.c | 28 +---------------------------
 arch/powerpc/platforms/powernv/setup.c      | 15 +++++----------
 3 files changed, 7 insertions(+), 38 deletions(-)

diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 04c32b08ffa1..ce58f4139ff5 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -317,7 +317,7 @@ struct rtc_time;
 extern unsigned long opal_get_boot_time(void);
 extern void opal_nvram_init(void);
 extern void opal_flash_update_init(void);
-extern void opal_flash_term_callback(void);
+extern void opal_flash_update_print_message(void);
 extern int opal_elog_init(void);
 extern void opal_platform_dump_init(void);
 extern void opal_sys_param_init(void);
diff --git a/arch/powerpc/platforms/powernv/opal-flash.c b/arch/powerpc/platforms/powernv/opal-flash.c
index 2fa3ac80cb4e..632871d78576 100644
--- a/arch/powerpc/platforms/powernv/opal-flash.c
+++ b/arch/powerpc/platforms/powernv/opal-flash.c
@@ -303,26 +303,9 @@ static int opal_flash_update(int op)
 	return rc;
 }
 
-/* Return CPUs to OPAL before starting FW update */
-static void flash_return_cpu(void *info)
-{
-	int cpu = smp_processor_id();
-
-	if (!cpu_online(cpu))
-		return;
-
-	/* Disable IRQ */
-	hard_irq_disable();
-
-	/* Return the CPU to OPAL */
-	opal_return_cpu();
-}
-
 /* This gets called just before system reboots */
-void opal_flash_term_callback(void)
+void opal_flash_update_print_message(void)
 {
-	struct cpumask mask;
-
 	if (update_flash_data.status != FLASH_IMG_READY)
 		return;
 
@@ -333,15 +316,6 @@ void opal_flash_term_callback(void)
 
 	/* Small delay to help getting the above message out */
 	msleep(500);
-
-	/* Return secondary CPUs to firmware */
-	cpumask_copy(&mask, cpu_online_mask);
-	cpumask_clear_cpu(smp_processor_id(), &mask);
-	if (!cpumask_empty(&mask))
-		smp_call_function_many(&mask,
-				       flash_return_cpu, NULL, false);
-	/* Hard disable interrupts */
-	hard_irq_disable();
 }
 
 /*
diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c
index cf52d53da460..0d2f70d24747 100644
--- a/arch/powerpc/platforms/powernv/setup.c
+++ b/arch/powerpc/platforms/powernv/setup.c
@@ -112,17 +112,12 @@ static void pnv_prepare_going_down(void)
 	 */
 	opal_event_shutdown();
 
-	/* Soft disable interrupts */
-	local_irq_disable();
+	/* Print flash update message if one is scheduled. */
+	opal_flash_update_print_message();
 
-	/*
-	 * Return secondary CPUs to firwmare if a flash update
-	 * is pending otherwise we will get all sort of error
-	 * messages about CPU being stuck etc.. This will also
-	 * have the side effect of hard disabling interrupts so
-	 * past this point, the kernel is effectively dead.
-	 */
-	opal_flash_term_callback();
+	smp_send_stop();
+
+	hard_irq_disable();
 }
 
 static void __noreturn pnv_restart(char *cmd)
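
Illustrative note (not part of the patch): the ordering that the new
pnv_prepare_going_down() follows can be modelled in userspace as a rough
sketch. Everything below is an assumption made for illustration: the
names model_prepare_going_down(), secondary(), stop_requested, parked
and NR_SECONDARIES are hypothetical, POSIX threads stand in for
secondary CPUs, and a shared flag stands in for smp_send_stop(). In the
kernel a stopped CPU spins forever; here each thread acknowledges the
stop and exits so that the program can terminate.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_SECONDARIES 4	/* hypothetical count, illustration only */

static atomic_bool stop_requested;	/* stand-in for the stop request */
static atomic_int parked;		/* secondaries that acknowledged it */

/* Stand-in for a secondary CPU: keeps "working" until told to stop. */
static void *secondary(void *arg)
{
	(void)arg;
	while (!atomic_load(&stop_requested))
		sched_yield();
	/* A real stopped CPU would loop here forever; the model just acks. */
	atomic_fetch_add(&parked, 1);
	return NULL;
}

/*
 * Userspace model of the new ordering: print the message first, then
 * stop the secondaries unconditionally. Returns how many were parked.
 */
int model_prepare_going_down(void)
{
	pthread_t tid[NR_SECONDARIES];
	int created = 0;
	int i;

	atomic_store(&stop_requested, false);
	atomic_store(&parked, 0);

	for (i = 0; i < NR_SECONDARIES; i++) {
		if (pthread_create(&tid[i], NULL, secondary, NULL) != 0)
			break;
		created++;
	}
	assert(created <= NR_SECONDARIES);

	/* Step 1: analogue of opal_flash_update_print_message(). */
	puts("model: flash update message (if one is scheduled)");

	/* Step 2: analogue of smp_send_stop(), no flash-update special case. */
	atomic_store(&stop_requested, true);
	for (i = 0; i < created; i++)
		pthread_join(tid[i], NULL);

	/* Step 3: hard_irq_disable() has no userspace analogue; omitted. */
	return atomic_load(&parked);
}
```

Calling model_prepare_going_down() from a small main() should report
every created secondary as parked before the caller proceeds, which is
the property the patch relies on to keep secondaries from tripping
watchdogs during reboot. Older toolchains may need -pthread to link.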