diff mbox series

[1/2] powerpc: System reset avoid interleaving oops using die synchronisation

Message ID 20171223164923.10587-2-npiggin@gmail.com (mailing list archive)
State Accepted
Commit 4552d128c26e0f0f27a5bd2fadc24092b8f6c1d7
Headers show
Series sreset driven panic/oops message printing fixes | expand

Commit Message

Nicholas Piggin Dec. 23, 2017, 4:49 p.m. UTC
The die() oops path contains a serializing lock to prevent oops
messages from being interleaved. In the case of a system reset
initiated oops (e.g., qemu nmi command), __die was being called
which lacks that synchronisation and oops reports could be
interleaved across CPUs.

A recent patch 4388c9b3a6ee7 ("powerpc: Do not send system reset
request through the oops path") changed this to __die to avoid
the debugger() call, but there is no real harm to calling it twice
if the first time fell through. So go back to using die() here.
This was observed to fix the problem.

Fixes: 4388c9b3a6ee7 ("powerpc: Do not send system reset request through the oops path")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/traps.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

David Gibson Jan. 3, 2018, 12:49 a.m. UTC | #1
On Sun, Dec 24, 2017 at 02:49:22AM +1000, Nicholas Piggin wrote:
> The die() oops path contains a serializing lock to prevent oops
> messages from being interleaved. In the case of a system reset
> initiated oops (e.g., qemu nmi command), __die was being called
> which lacks that synchronisation and oops reports could be
> interleaved across CPUs.
> 
> A recent patch 4388c9b3a6ee7 ("powerpc: Do not send system reset
> request through the oops path") changed this to __die to avoid
> the debugger() call, but there is no real harm to calling it twice
> if the first time fell through. So go back to using die() here.
> This was observed to fix the problem.
> 
> Fixes: 4388c9b3a6ee7 ("powerpc: Do not send system reset request through the oops path")
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> ---
>  arch/powerpc/kernel/traps.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> index f3eb61be0d30..109989676776 100644
> --- a/arch/powerpc/kernel/traps.c
> +++ b/arch/powerpc/kernel/traps.c
> @@ -337,7 +337,7 @@ void system_reset_exception(struct pt_regs *regs)
>  	 * No debugger or crash dump registered, print logs then
>  	 * panic.
>  	 */
> -	__die("System Reset", regs, SIGABRT);
> +	die("System Reset", regs, SIGABRT);
>  
>  	mdelay(2*MSEC_PER_SEC); /* Wait a little while for others to print */
>  	add_taint(TAINT_DIE, LOCKDEP_NOW_UNRELIABLE);
Michael Ellerman Jan. 22, 2018, 3:34 a.m. UTC | #2
On Sat, 2017-12-23 at 16:49:22 UTC, Nicholas Piggin wrote:
> The die() oops path contains a serializing lock to prevent oops
> messages from being interleaved. In the case of a system reset
> initiated oops (e.g., qemu nmi command), __die was being called
> which lacks that synchronisation and oops reports could be
> interleaved across CPUs.
> 
> A recent patch 4388c9b3a6ee7 ("powerpc: Do not send system reset
> request through the oops path") changed this to __die to avoid
> the debugger() call, but there is no real harm to calling it twice
> if the first time fell through. So go back to using die() here.
> This was observed to fix the problem.
> 
> Fixes: 4388c9b3a6ee7 ("powerpc: Do not send system reset request through the oops path")
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/4552d128c26e0f0f27a5bd2fadc240

cheers
diff mbox series

Patch

diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index f3eb61be0d30..109989676776 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -337,7 +337,7 @@  void system_reset_exception(struct pt_regs *regs)
 	 * No debugger or crash dump registered, print logs then
 	 * panic.
 	 */
-	__die("System Reset", regs, SIGABRT);
+	die("System Reset", regs, SIGABRT);
 
 	mdelay(2*MSEC_PER_SEC); /* Wait a little while for others to print */
 	add_taint(TAINT_DIE, LOCKDEP_NOW_UNRELIABLE);