Badness at drivers/char/tty_ldisc.c:210 during shutdown

Submitter Michael Ellerman
Date June 22, 2009, 7:23 a.m.
Message ID <1245655421.4400.78.camel@concordia>
Permalink /patch/28977/
State Not Applicable

Comments

Michael Ellerman - June 22, 2009, 7:23 a.m.
On Mon, 2009-06-22 at 12:13 +0530, Sachin Sant wrote:
> Sachin Sant wrote:
> > I came across the following badness message during shutdown on a 
> > Power6 box.
> > This was with 2.6.30-git12(3fe0344faf7fdcb158bd5c1a9aec960a8d70c8e8)
> >
> > ------------[ cut here ]------------
> > Badness at drivers/char/tty_ldisc.c:210
> The badness message is still present with git18.
> 
> ------------[ cut here ]------------
> Badness at drivers/char/tty_ldisc.c:210
> NIP: c00000000040a3e8 LR: c00000000040a3d0 CTR: 0000000000000000
> REGS: c00000003cf6b7f0 TRAP: 0700   Not tainted  (2.6.30-git18)
> MSR: 8000000000029032 <EE,ME,CE,IR,DR>  CR: 24000424  XER: 00000001
> TASK = c00000003e308660[3846] 'vhangup' THREAD: c00000003cf68000 CPU: 1
> <6>GPR00: 0000000000000001 c00000003cf6ba70 c000000000ef48c0 0000000000000001 
> <6>GPR04: 0000000000000001 c00000003819f000 c000000000407b60 0000000000000000 
> <6>GPR08: 0000000000000000 0000000000000000 0000000000000001 c000000000e1bce8 
> <6>GPR12: 0000000044000428 c000000001002600 00000000ffffffff ffffffffffffffff 
> <6>GPR16: 0000000021fd8a50 0000000000000002 0000000000000000 0000000021fc03b0 
> <6>GPR20: 0000000000000000 0000000000000000 c00000003d04c700 0000000000000001 
> <6>GPR24: 0000000000000000 0000000000000000 0000000000000001 c000000040007e20 
> <6>GPR28: 0000000000000000 c0000000013ffd38 c000000000e7e860 c00000003cf6ba70 
> NIP [c00000000040a3e8] .tty_ldisc_put+0xbc/0xf4
> LR [c00000000040a3d0] .tty_ldisc_put+0xa4/0xf4
> Call Trace:
> [c00000003cf6ba70] [c00000000040a3d0] .tty_ldisc_put+0xa4/0xf4 (unreliable)
> [c00000003cf6bb10] [c00000000040a7c8] .tty_ldisc_reinit+0x38/0x80
> [c00000003cf6bba0] [c00000000040b1d8] .tty_ldisc_hangup+0x190/0x260
> [c00000003cf6bc40] [c000000000401090] .do_tty_hangup+0x188/0x4c0
> [c00000003cf6bd20] [c000000000401440] .tty_vhangup_self+0x34/0x54
> [c00000003cf6bdb0] [c00000000019236c] .sys_vhangup+0x38/0x58
> [c00000003cf6be30] [c000000000008534] syscall_exit+0x0/0x40
> Instruction dump:
> 912b0088 4bcd17bd 60000000 e87e8008 7f44d378 481c04fd 60000000 801b0008 
> 7c09fe70 7d200278 7c004850 54000ffe <0b000000> 7f63db78 4bd7c98d 60000000 

Ah right, so this check has just gone in, and the code in question
has been rewritten somewhat just recently.

commit 677ca3060c474d7d89941948e32493d9c18c52d2
Author: Alan Cox <alan@linux.intel.com>
Date:   Tue Jun 16 17:00:53 2009 +0100

    ldisc: debug aids
    
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


I don't grok this code much, but is the WARN racing with something else
doing a get? i.e. what is the value of ld->refcount before we drop the
lock?
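
One way to look at that question is to record the count while the lock is
still held, so that a later get cannot change the value being checked. The
following is only a minimal userspace sketch under stated assumptions: it
uses a pthread mutex where the kernel uses the tty_ldisc_lock spinlock, and
struct demo_obj, demo_lock and demo_put() are hypothetical names invented
for the illustration, not anything from tty_ldisc.c.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the lock that protects the reference count. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical refcounted object; not the kernel's struct tty_ldisc. */
struct demo_obj {
	int refcount;
};

/*
 * Drop one reference and return the count as it was seen under the lock.
 * Because the snapshot is taken before the unlock, a check on the returned
 * value is not affected by another thread taking a reference afterwards.
 */
static int demo_put(struct demo_obj *obj)
{
	int seen;

	assert(obj != NULL);

	pthread_mutex_lock(&demo_lock);
	obj->refcount--;
	seen = obj->refcount;	/* sampled while still holding the lock */
	pthread_mutex_unlock(&demo_lock);

	return seen;
}
```

A caller would then warn on the returned snapshot instead of re-reading
obj->refcount after the unlock.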

> Let me know if i can provide any other information.

Try enabling TTY_DEBUG_HANGUP in drivers/char/tty_io.c ?

cheers
Alan Cox - June 22, 2009, 8:52 a.m.
> > [c00000003cf6ba70] [c00000000040a3d0] .tty_ldisc_put+0xa4/0xf4 (unreliable)
> > [c00000003cf6bb10] [c00000000040a7c8] .tty_ldisc_reinit+0x38/0x80
> > [c00000003cf6bba0] [c00000000040b1d8] .tty_ldisc_hangup+0x190/0x260
> > [c00000003cf6bc40] [c000000000401090] .do_tty_hangup+0x188/0x4c0
> > [c00000003cf6bd20] [c000000000401440] .tty_vhangup_self+0x34/0x54
> > [c00000003cf6bdb0] [c00000000019236c] .sys_vhangup+0x38/0x58
> > [c00000003cf6be30] [c000000000008534] syscall_exit+0x0/0x40
> > Instruction dump:
> > 912b0088 4bcd17bd 60000000 e87e8008 7f44d378 481c04fd 60000000 801b0008 
> > 7c09fe70 7d200278 7c004850 54000ffe <0b000000> 7f63db78 4bd7c98d 60000000 
> 
> Ah right, so this check has just gone in, and the code in question
> has been rewritten somewhat just recently.

The check is to catch any cases where a line discipline is being freed
but still has a non-zero refcount. I think I know what is going on here.
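
As a rough illustration of that kind of check, here is a minimal userspace
sketch in C. It is not the kernel code: struct demo_ldisc and the
demo_ldisc_* helpers are hypothetical names made up for this example, and a
plain fprintf() to stderr stands in for WARN_ON().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical refcounted object; not the kernel's struct tty_ldisc. */
struct demo_ldisc {
	int refcount;	/* number of outstanding references */
};

/* Allocate an object with no outstanding references. */
static struct demo_ldisc *demo_ldisc_alloc(void)
{
	return calloc(1, sizeof(struct demo_ldisc));
}

/* Take and drop a reference. */
static void demo_ldisc_ref(struct demo_ldisc *ld)   { ld->refcount++; }
static void demo_ldisc_deref(struct demo_ldisc *ld) { ld->refcount--; }

/*
 * Free the object, warning first if someone still holds a reference.
 * Returns 1 if the warning fired, 0 if the refcount was zero.
 */
static int demo_ldisc_free(struct demo_ldisc *ld)
{
	int leaked;

	assert(ld != NULL);

	leaked = (ld->refcount != 0);
	if (leaked)
		fprintf(stderr, "warning: freeing ldisc with refcount %d\n",
			ld->refcount);
	free(ld);

	return leaked;
}
```

With balanced demo_ldisc_ref()/demo_ldisc_deref() calls, demo_ldisc_free()
returns 0; freeing while a reference is still held prints the warning and
returns 1.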

Alan
Sachin P. Sant - July 10, 2009, 8:35 a.m.
Alan Cox wrote:
>>> [c00000003cf6ba70] [c00000000040a3d0] .tty_ldisc_put+0xa4/0xf4 (unreliable)
>>> [c00000003cf6bb10] [c00000000040a7c8] .tty_ldisc_reinit+0x38/0x80
>>> [c00000003cf6bba0] [c00000000040b1d8] .tty_ldisc_hangup+0x190/0x260
>>> [c00000003cf6bc40] [c000000000401090] .do_tty_hangup+0x188/0x4c0
>>> [c00000003cf6bd20] [c000000000401440] .tty_vhangup_self+0x34/0x54
>>> [c00000003cf6bdb0] [c00000000019236c] .sys_vhangup+0x38/0x58
>>> [c00000003cf6be30] [c000000000008534] syscall_exit+0x0/0x40
>>> Instruction dump:
>>> 912b0088 4bcd17bd 60000000 e87e8008 7f44d378 481c04fd 60000000 801b0008 
>>> 7c09fe70 7d200278 7c004850 54000ffe <0b000000> 7f63db78 4bd7c98d 60000000 
>>>       
>> Ah right, so this check has just gone in, and the code in question
>> has been rewritten somewhat just recently.
>>     
>
> The check is to catch any cases where a line discipline is being freed up
> but has a refcount that is non zero. I think I know what is going on here.
>   
This issue can still be reproduced with the 2.6.31-rc2-git4 kernel
(34f25476ace556263784ea2f8173e22b25557a13).

Thanks
-Sachin

Patch

diff --git a/drivers/char/tty_ldisc.c b/drivers/char/tty_ldisc.c
index 874c248..a19e935 100644
--- a/drivers/char/tty_ldisc.c
+++ b/drivers/char/tty_ldisc.c
@@ -207,6 +207,7 @@  static void tty_ldisc_put(struct tty_ldisc *ld)
        ldo->refcount--;
        module_put(ldo->owner);
        spin_unlock_irqrestore(&tty_ldisc_lock, flags);
+       WARN_ON(ld->refcount);
        kfree(ld);
 }