From patchwork Wed Jan 16 15:55:40 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Herton Ronaldo Krzesinski X-Patchwork-Id: 212716 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from chlorine.canonical.com (chlorine.canonical.com [91.189.94.204]) by ozlabs.org (Postfix) with ESMTP id C106A2C007E for ; Thu, 17 Jan 2013 03:07:05 +1100 (EST) Received: from localhost ([127.0.0.1] helo=chlorine.canonical.com) by chlorine.canonical.com with esmtp (Exim 4.71) (envelope-from ) id 1TvVVt-0004hn-Vo; Wed, 16 Jan 2013 16:06:54 +0000 Received: from youngberry.canonical.com ([91.189.89.112]) by chlorine.canonical.com with esmtp (Exim 4.71) (envelope-from ) id 1TvVVP-0004Og-Be for kernel-team@lists.ubuntu.com; Wed, 16 Jan 2013 16:06:23 +0000 Received: from [177.132.109.150] (helo=canonical.com) by youngberry.canonical.com with esmtpsa (TLS1.0:DHE_RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1TvVVO-0006vH-DO; Wed, 16 Jan 2013 16:06:23 +0000 From: Herton Ronaldo Krzesinski To: linux-kernel@vger.kernel.org, stable@vger.kernel.org, kernel-team@lists.ubuntu.com Subject: [PATCH 140/222] SGI-XP: handle non-fatal traps Date: Wed, 16 Jan 2013 13:55:40 -0200 Message-Id: <1358351822-7675-141-git-send-email-herton.krzesinski@canonical.com> X-Mailer: git-send-email 1.7.9.5 In-Reply-To: <1358351822-7675-1-git-send-email-herton.krzesinski@canonical.com> References: <1358351822-7675-1-git-send-email-herton.krzesinski@canonical.com> X-Extended-Stable: 3.5 Cc: Andrew Morton , Robin Holt , Thomas Gleixner , Linus Torvalds , Ingo Molnar X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.13 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: kernel-team-bounces@lists.ubuntu.com Errors-To: kernel-team-bounces@lists.ubuntu.com 3.5.7.3 -stable review patch. If anyone has any objections, please let me know. ------------------ From: Robin Holt commit 891348ca0f66206f1dc0e30d63757e3df1ae2d15 upstream. We found a user code which was raising a divide-by-zero trap. That trap would lead to XPC connections between system-partitions being torn down due to the die_chain notifier callouts it received. This also revealed a different issue where multiple callers into xpc_die_deactivate() would all attempt to do the disconnect in parallel which would sometimes lock up but often overwhelm the console on very large machines as each would print at least one line of output at the end of the deactivate. I reviewed all the users of the die_chain notifier and changed the code to ignore the notifier callouts for reasons which will not actually lead to a system to continue on to call die(). [akpm@linux-foundation.org: fix ia64] Signed-off-by: Robin Holt Cc: Thomas Gleixner Cc: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Herton Ronaldo Krzesinski --- drivers/misc/sgi-xp/xpc_main.c | 34 ++++++++++++++++++++++++++++++++-- 1 file changed, 32 insertions(+), 2 deletions(-) diff --git a/drivers/misc/sgi-xp/xpc_main.c b/drivers/misc/sgi-xp/xpc_main.c index 8d082b4..d971817 100644 --- a/drivers/misc/sgi-xp/xpc_main.c +++ b/drivers/misc/sgi-xp/xpc_main.c @@ -53,6 +53,10 @@ #include #include "xpc.h" +#ifdef CONFIG_X86_64 +#include +#endif + /* define two XPC debug device structures to be used with dev_dbg() et al */ struct device_driver xpc_dbg_name = { @@ -1079,6 +1083,9 @@ xpc_system_reboot(struct notifier_block *nb, unsigned long event, void *unused) return NOTIFY_DONE; } +/* Used to only allow one cpu to complete disconnect */ +static unsigned int xpc_die_disconnecting; + /* * Notify other partitions to deactivate from us by first disengaging from all * references to our memory. @@ -1092,6 +1099,9 @@ xpc_die_deactivate(void) long keep_waiting; long wait_to_print; + if (cmpxchg(&xpc_die_disconnecting, 0, 1)) + return; + /* keep xpc_hb_checker thread from doing anything (just in case) */ xpc_exiting = 1; @@ -1159,7 +1169,7 @@ xpc_die_deactivate(void) * about the lack of a heartbeat. */ static int -xpc_system_die(struct notifier_block *nb, unsigned long event, void *unused) +xpc_system_die(struct notifier_block *nb, unsigned long event, void *_die_args) { #ifdef CONFIG_IA64 /* !!! temporary kludge */ switch (event) { @@ -1191,7 +1201,27 @@ xpc_system_die(struct notifier_block *nb, unsigned long event, void *unused) break; } #else - xpc_die_deactivate(); + struct die_args *die_args = _die_args; + + switch (event) { + case DIE_TRAP: + if (die_args->trapnr == X86_TRAP_DF) + xpc_die_deactivate(); + + if (((die_args->trapnr == X86_TRAP_MF) || + (die_args->trapnr == X86_TRAP_XF)) && + !user_mode_vm(die_args->regs)) + xpc_die_deactivate(); + + break; + case DIE_INT3: + case DIE_DEBUG: + break; + case DIE_OOPS: + case DIE_GPF: + default: + xpc_die_deactivate(); + } #endif return NOTIFY_DONE;