diff mbox

[6/8] netpoll: Allow netpoll_setup/cleanup recursion

Message ID 1277453336.22715.2154.camel@twins
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Peter Zijlstra June 25, 2010, 8:08 a.m. UTC
On Thu, 2010-06-24 at 21:42 -0700, Andrew Morton wrote:
> That being said, I wonder why Herbert didn't hit this in his testing. 
> I suspect that he'd enabled lockdep, which hid the bug.  I haven't
> worked out _why_ lockdep hides the double-mutex_unlock bug, but it's a
> pretty bad thing to do. 

Most weird indeed, lockdep is supposed so shout its lungs out when
someone wants to unlock a lock that isn't actually owned by him (and it
not being locked at all certainly implies you're not the owner).

In fact, the below patch results in the below splat -- its also
something that's tested by the locking self-test:

/*
 * Double unlock:
 */
#define E()                                     \
                                                \
        LOCK(A);                                \
        UNLOCK(A);                              \
        UNLOCK(A); /* fail */


---
 kernel/timer.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)




=====================================
[ BUG: bad unlock balance detected! ]
-------------------------------------
init/1 is trying to release lock (foo) at:
[<ffffffff815bf3b6>] mutex_unlock+0xe/0x10
but there are no more locks to release!

other info that might help us debug this:
no locks held by init/1.

stack backtrace:
Pid: 1, comm: init Not tainted 2.6.35-rc3-tip-01034-g5205803-dirty #399
Call Trace:
 [<ffffffff815bf3b6>] ? mutex_unlock+0xe/0x10
 [<ffffffff8106d718>] print_unlock_inbalance_bug+0xd6/0xe1
 [<ffffffff815bf3b6>] ? mutex_unlock+0xe/0x10
 [<ffffffff8106e8c6>] lock_release+0xdb/0x196
 [<ffffffff815bf32f>] __mutex_unlock_slowpath+0xc1/0x13a
 [<ffffffff815bf3b6>] mutex_unlock+0xe/0x10
 [<ffffffff8104f262>] sys_getppid+0x34/0xd8
 [<ffffffff81002cdb>] system_call_fastpath+0x16/0x1b

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Andrew Morton June 25, 2010, 8:42 a.m. UTC | #1
On Fri, 25 Jun 2010 10:08:56 +0200 Peter Zijlstra <peterz@infradead.org> wrote:

> On Thu, 2010-06-24 at 21:42 -0700, Andrew Morton wrote:
> > That being said, I wonder why Herbert didn't hit this in his testing. 
> > I suspect that he'd enabled lockdep, which hid the bug.  I haven't
> > worked out _why_ lockdep hides the double-mutex_unlock bug, but it's a
> > pretty bad thing to do. 
> 
> Most weird indeed, lockdep is supposed so shout its lungs out when
> someone wants to unlock a lock that isn't actually owned by him (and it
> not being locked at all certainly implies you're not the owner).
> 
> In fact, the below patch results in the below splat -- its also
> something that's tested by the locking self-test:

When I enabled lockdep, the bug actually went away.  Is it possible
that when lockdep detects this bug, it prevents mutex.count from going
from 1 to 2?

It could be that lockdep _did_ detect (and correct!) the bug.  But
because I had no usable console output at the time, I didn't see it.

I did notice that the taint output was "G W".  So something warned
about something, but I don't know what.  But that was happening with
lockdep disabled.


> @@ -1344,6 +1346,10 @@ SYSCALL_DEFINE0(getppid)
>  {
>  	int pid;
>  
> +	mutex_lock(&foo);
> +	mutex_unlock(&foo);
> +	mutex_unlock(&foo);
> +
>  	rcu_read_lock();
>  	pid = task_tgid_vnr(current->real_parent);
>  	rcu_read_unlock();

It'd be interesting to add

	printk("%d:%d\n", __LINE__, atomic_read(&foo.count));

after the mutex_unlock()s.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Zijlstra June 25, 2010, 9:45 a.m. UTC | #2
On Fri, 2010-06-25 at 01:42 -0700, Andrew Morton wrote:
> On Fri, 25 Jun 2010 10:08:56 +0200 Peter Zijlstra <peterz@infradead.org> wrote:
> 
> > On Thu, 2010-06-24 at 21:42 -0700, Andrew Morton wrote:
> > > That being said, I wonder why Herbert didn't hit this in his testing. 
> > > I suspect that he'd enabled lockdep, which hid the bug.  I haven't
> > > worked out _why_ lockdep hides the double-mutex_unlock bug, but it's a
> > > pretty bad thing to do. 
> > 
> > Most weird indeed, lockdep is supposed so shout its lungs out when
> > someone wants to unlock a lock that isn't actually owned by him (and it
> > not being locked at all certainly implies you're not the owner).
> > 
> > In fact, the below patch results in the below splat -- its also
> > something that's tested by the locking self-test:
> 
> When I enabled lockdep, the bug actually went away.  Is it possible
> that when lockdep detects this bug, it prevents mutex.count from going
> from 1 to 2?

Not lockdep itself but the DEBUG_MUTEXES code (forced by lockdep).

The difference between the normal and the debug code is that the debug
code disables all fast-path code.

The x86 fast-path code does:

 LOCK incl &lock->count
 jg done:
 call slowpath
done:

Since 1++ is >0 it will complete without calling the slow-path, would
do:

 if (__mutex_slowpath_needs_to_unlock()) /* 1 regardless of DEBUG_MUTEX */
   atomic_set(&lock->count, 1);

The question I guess is, do we want double unlocks to go silently
unnoticed? In that case we need to touch the fastpath asm.

> It could be that lockdep _did_ detect (and correct!) the bug.  But
> because I had no usable console output at the time, I didn't see it.
> 
> I did notice that the taint output was "G W".  So something warned
> about something, but I don't know what.  But that was happening with
> lockdep disabled.

Hrmm,. yeah without console output lockdep isn't going to help much,
should we maybe use the speaker to read out the dmesg :-)

> It'd be interesting to add
> 
> 	printk("%d:%d\n", __LINE__, atomic_read(&foo.count));
> 
> after the mutex_unlock()s.

1352:1


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/kernel/timer.c b/kernel/timer.c
index ee3f116..0496f71 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -1334,6 +1334,8 @@  SYSCALL_DEFINE0(getpid)
 	return task_tgid_vnr(current);
 }
 
+static DEFINE_MUTEX(foo);
+
 /*
  * Accessing ->real_parent is not SMP-safe, it could
  * change from under us. However, we can use a stale
@@ -1344,6 +1346,10 @@  SYSCALL_DEFINE0(getppid)
 {
 	int pid;
 
+	mutex_lock(&foo);
+	mutex_unlock(&foo);
+	mutex_unlock(&foo);
+
 	rcu_read_lock();
 	pid = task_tgid_vnr(current->real_parent);
 	rcu_read_unlock();