Patchwork linux-next: spinlock lockup with next-20081118 on powerpc

Submitter Jens Axboe
Date Nov. 19, 2008, 1:34 p.m.
Message ID <20081119133408.GE26308@kernel.dk>
Permalink /patch/9588/
State Not Applicable, archived

Comments

Jens Axboe - Nov. 19, 2008, 1:34 p.m.
On Thu, Nov 20 2008, Stephen Rothwell wrote:
> Hi Jens,
> 
> On Wed, 19 Nov 2008 11:58:33 +0100 Jens Axboe <jens.axboe@oracle.com> wrote:
> >
> > ;-) I'm aware of that, I meant the 'timer' data argument. But you are
> > right, it's probably q->queue_lock being NULL here or we would have
> > oopsed earlier. There's no code line.
> > 
> > > address of the spinlock (though I need to check more to be sure) as it
> > > crashed inside _spin_lock_irqsave.
> > 
> > Do you know what device this might be? It still makes no sense, if the
> > timer was added, we went through the normal IO paths and we would have
> > crashed on NULL ->queue_lock much earlier.
> 
> I don't know much more, but I may find out tomorrow with Paul's help.
> However it bisects down to commit
> 279430a72bb6e83d335b4219e9af5557e2ff3350 "block: leave the request
> timeout timer running even on an empty list" and reverting that commit on
> next-20081118 makes the spinlock lockup go away.

Are you removing devices or modules? We have a bug there it seems, does
this help?
Stephen Rothwell - Nov. 19, 2008, 2:35 p.m.
Hi Jens,

On Wed, 19 Nov 2008 14:34:09 +0100 Jens Axboe <jens.axboe@oracle.com> wrote:
>
> Are you removing devices or modules? We have a bug there it seems, does
> this help?

This is early in boot (we are waiting for the root device while running
on the initramfs) so there could well be modules being unloaded.

That patch makes the problem go away.
Jens Axboe - Nov. 19, 2008, 2:37 p.m.
On Thu, Nov 20 2008, Stephen Rothwell wrote:
> Hi Jens,
> 
> On Wed, 19 Nov 2008 14:34:09 +0100 Jens Axboe <jens.axboe@oracle.com> wrote:
> >
> > Are you removing devices or modules? We have a bug there it seems, does
> > this help?
> 
> This is early in boot (we are waiting for the root device while running
> on the initramfs) so there could well be modules being unloaded.
> 
> That patch makes the problem go away.

Excellent. Since it was an apparent bug, I already updated the original
patch with this hunk.

Thanks a lot for your bisection work, Stephen!

Patch

diff --git a/block/blk-core.c b/block/blk-core.c
index 04267d6..44f547c 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -391,6 +391,7 @@  EXPORT_SYMBOL(blk_stop_queue);
 void blk_sync_queue(struct request_queue *q)
 {
 	del_timer_sync(&q->unplug_timer);
+	del_timer_sync(&q->timeout);
 	kblockd_flush_work(&q->unplug_work);
 }
 EXPORT_SYMBOL(blk_sync_queue);
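
To illustrate why the extra del_timer_sync() matters, here is a minimal user-space sketch of the failure mode the thread points at: a timer left armed across teardown that later runs against a NULL lock. It is a toy model under stated assumptions, not kernel code. Every toy_* name is hypothetical, the "timer" is just a pending flag, and the NULL-lock case is reported as a return value instead of locking up.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for kernel objects; none of these are real kernel APIs. */
struct toy_timer {
    bool pending;            /* armed and waiting to fire */
};

struct toy_queue {
    int *queue_lock;         /* NULL once the owning driver has gone away */
    struct toy_timer unplug_timer;
    struct toy_timer timeout;
};

/* Analogue of del_timer_sync(): after this returns the timer cannot fire. */
static void toy_del_timer_sync(struct toy_timer *t)
{
    t->pending = false;
}

/* Teardown that only stops the unplug timer (behaviour before the patch). */
static void toy_sync_queue_old(struct toy_queue *q)
{
    toy_del_timer_sync(&q->unplug_timer);
}

/* Teardown that also stops the timeout timer (behaviour after the patch). */
static void toy_sync_queue_fixed(struct toy_queue *q)
{
    toy_del_timer_sync(&q->unplug_timer);
    toy_del_timer_sync(&q->timeout);
}

/*
 * Simulate the timeout timer expiring. Returns 0 if nothing fired,
 * 1 if the handler ran with a valid lock, and -1 if it would have
 * taken a NULL lock (the case that goes wrong in the real kernel).
 */
static int toy_fire_timeout(struct toy_queue *q)
{
    if (!q->timeout.pending)
        return 0;
    q->timeout.pending = false;
    if (q->queue_lock == NULL)
        return -1;
    *q->queue_lock += 1;     /* pretend to lock, do work, unlock */
    return 1;
}

/* Run one teardown scenario and report what a late timer expiry would do. */
static int toy_teardown_then_fire(bool use_fixed_sync)
{
    int lock = 0;
    struct toy_queue q = { &lock, { true }, { true } };

    if (use_fixed_sync)
        toy_sync_queue_fixed(&q);
    else
        toy_sync_queue_old(&q);
    q.queue_lock = NULL;     /* driver unloaded; lock no longer valid */
    return toy_fire_timeout(&q);
}
```

In this model, toy_teardown_then_fire(false) reports the NULL-lock case, while toy_teardown_then_fire(true) reports that nothing fired, which mirrors the effect Stephen observed with the hunk applied.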