linux-next: spinlock lockup with next-20081118 on powerpc

Submitted by Jens Axboe on Nov. 19, 2008, 1:34 p.m.

Details

Message ID 20081119133408.GE26308@kernel.dk
State Not Applicable, archived

Commit Message

Jens Axboe Nov. 19, 2008, 1:34 p.m.
On Thu, Nov 20 2008, Stephen Rothwell wrote:
> Hi Jens,
> 
> On Wed, 19 Nov 2008 11:58:33 +0100 Jens Axboe <jens.axboe@oracle.com> wrote:
> >
> > ;-) I'm aware of that, I meant the 'timer' data argument. But you are
> > right, it's probably q->queue_lock being NULL here or we would have
> > oopsed earlier. There's no code line.
> > 
> > > address of the spinlock (though I need to check more to be sure) as it
> > > crashed inside _spin_lock_irqsave.
> > 
> > Do you know what device this might be? It still makes no sense, if the
> > timer was added, we went through the normal IO paths and we would have
> > crashed on NULL ->queue_lock much earlier.
> 
> I don't know much more, but I may find out tomorrow with Paul's help.
> However it bisects down to commit
> 279430a72bb6e83d335b4219e9af5557e2ff3350 "block: leave the request
> timeout timer running even on an empty list" and reverting that commit on
> next-20081118 makes the spinlock lockup go away.

Are you removing devices or modules? We have a bug there it seems, does
this help?

Comments

Stephen Rothwell Nov. 19, 2008, 2:35 p.m.
Hi Jens,

On Wed, 19 Nov 2008 14:34:09 +0100 Jens Axboe <jens.axboe@oracle.com> wrote:
>
> Are you removing devices or modules? We have a bug there it seems, does
> this help?

This is early in boot (we are waiting for the root device while running
on the initramfs) so there could well be modules being unloaded.

That patch makes the problem go away.

Jens Axboe Nov. 19, 2008, 2:37 p.m.
On Thu, Nov 20 2008, Stephen Rothwell wrote:
> Hi Jens,
> 
> On Wed, 19 Nov 2008 14:34:09 +0100 Jens Axboe <jens.axboe@oracle.com> wrote:
> >
> > Are you removing devices or modules? We have a bug there it seems, does
> > this help?
> 
> This is early in boot (we are waiting for the root device while running
> on the initramfs) so there could well be modules being unloaded.
> 
> That patch makes the problem go away.

Excellent. Since it was an apparent bug, I have already updated the
original patch with this hunk.

Thanks a lot for your bisection work, Stephen!

Patch

diff --git a/block/blk-core.c b/block/blk-core.c
index 04267d6..44f547c 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -391,6 +391,7 @@  EXPORT_SYMBOL(blk_stop_queue);
 void blk_sync_queue(struct request_queue *q)
 {
 	del_timer_sync(&q->unplug_timer);
+	del_timer_sync(&q->timeout);
 	kblockd_flush_work(&q->unplug_work);
 }
 EXPORT_SYMBOL(blk_sync_queue);
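
The hunk above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: every toy_* name and run_scenario() are hypothetical, a NULL queue_lock stands in for a queue that has been torn down (the thread only says the lock was "probably" NULL), and the ordering shown (sync, teardown, then timer expiry) is one reading of the thread rather than a confirmed trace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a kernel timer: an "armed" flag plus a callback. */
struct toy_timer {
    bool armed;
    void (*fn)(void *data);
    void *data;
};

/* Toy stand-in for struct request_queue; only fields the thread mentions. */
struct toy_queue {
    int *queue_lock;              /* NULL models a queue that was torn down */
    struct toy_timer unplug_timer;
    struct toy_timer timeout;
    int bad_lock_attempts;        /* callbacks that ran with no valid lock */
};

/* Models a timer handler that takes q->queue_lock first thing. */
static void toy_timer_fn(void *data)
{
    struct toy_queue *q = data;

    if (q->queue_lock == NULL)
        q->bad_lock_attempts++;   /* a real kernel would lock up or oops here */
}

/* Models del_timer_sync(): afterwards the callback can no longer run. */
static void toy_del_timer_sync(struct toy_timer *t)
{
    t->armed = false;
}

/* Models timer expiry: run the callback only if the timer is still armed. */
static void toy_expire(struct toy_timer *t)
{
    assert(t->fn != NULL);
    if (t->armed) {
        t->armed = false;
        t->fn(t->data);
    }
}

/* Models blk_sync_queue(), with or without the added hunk. */
static void toy_sync_queue(struct toy_queue *q, bool with_fix)
{
    toy_del_timer_sync(&q->unplug_timer);
    if (with_fix)
        toy_del_timer_sync(&q->timeout);
}

/* Returns how many timer callbacks ran against a torn-down queue. */
int run_scenario(bool with_fix)
{
    int lock = 0;
    struct toy_queue q = { .queue_lock = &lock };

    q.unplug_timer = (struct toy_timer){ true, toy_timer_fn, &q };
    q.timeout = (struct toy_timer){ true, toy_timer_fn, &q };

    toy_sync_queue(&q, with_fix);
    q.queue_lock = NULL;          /* teardown: the lock is gone */

    toy_expire(&q.unplug_timer);  /* already cancelled in both cases */
    toy_expire(&q.timeout);       /* still armed unless the fix is applied */

    return q.bad_lock_attempts;
}
```

In this model run_scenario(false) returns 1, because the timeout timer is still armed when the queue goes away, while run_scenario(true) returns 0, because both timers are cancelled before teardown.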