Patchwork next-20090310: ext4 hangs

Submitter Jan Kara
Date March 31, 2009, 12:33 p.m.
Message ID <20090331123307.GG11808@duck.suse.cz>
Permalink /patch/25389/
State RFC
Delegated to: David Miller

Comments

Jan Kara - March 31, 2009, 12:33 p.m.
On Tue 31-03-09 14:07:30, Alexander Beregalov wrote:
> 2009/3/31 Jan Kara <jack@suse.cz>:
> > On Thu 26-03-09 01:38:32, Alexander Beregalov wrote:
> >> 2009/3/25 Jan Kara <jack@suse.cz>:
> >> > On Wed 25-03-09 20:07:46, Alexander Beregalov wrote:
> >> >> 2009/3/25 Jan Kara <jack@suse.cz>:
> >> >> > On Wed 25-03-09 18:29:10, Alexander Beregalov wrote:
> >> >> >> 2009/3/25 Jan Kara <jack@suse.cz>:
> >> >> >> > On Wed 25-03-09 18:18:43, Alexander Beregalov wrote:
> >> >> >> >> 2009/3/25 Jan Kara <jack@suse.cz>:
> >> >> >> >> >> > So, I think I need to try it on 2.6.29-rc7 again.
> >> >> >> >> >>   I've looked into this. Obviously, what's happening is that we delete
> >> >> >> >> >> an inode and jbd2_journal_release_jbd_inode() finds that the inode is under
> >> >> >> >> >> writeout in a transaction commit, and thus it waits. But it never gets woken
> >> >> >> >> >> up, and because it has a handle from the transaction, everyone eventually
> >> >> >> >> >> blocks waiting for the transaction to finish.
> >> >> >> >> >>   But I don't really see how that can happen. The code is really
> >> >> >> >> >> straightforward and everything happens under j_list_lock... Strange.
> >> >> >> >> >  BTW: Is the system SMP?
> >> >> >> >> No, it is a UP system.
> >> >> >> >  Even stranger. And do you have CONFIG_PREEMPT set?
> >> >> >> >
> >> >> >> >> The bug exists even in 2.6.29, I posted it with a new topic.
> >> >> >> >  OK, I sort of expected this.
> >> >> >>
> >> >> >> CONFIG_PREEMPT_RCU=y
> >> >> >> CONFIG_PREEMPT_RCU_TRACE=y
> >> >> >> # CONFIG_PREEMPT_NONE is not set
> >> >> >> # CONFIG_PREEMPT_VOLUNTARY is not set
> >> >> >> CONFIG_PREEMPT=y
> >> >> >> CONFIG_DEBUG_PREEMPT=y
> >> >> >> # CONFIG_PREEMPT_TRACER is not set
> >> >> >>
> >> >> >> config is attached.
> >> >> >  Thanks for the data. I still don't see how the wakeup can get lost. The
> >> >> > process cannot even be preempted while we are in the section protected by
> >> >> > j_list_lock... Can you send me a disassembly of the functions
> >> >> > jbd2_journal_release_jbd_inode() and journal_submit_data_buffers() so that
> >> >> > I can see whether the compiler has reordered something unexpectedly?
> >> >  Thanks for the disassembly...
> >> >
> >> >> By default, gcc inlines journal_submit_data_buffers().
> >> >> Here is the -fno-inline version. The default version is attached.
> >  <snip>
> >
> >  I'm helpless here. I don't see how we can miss a wakeup (plus you seem to
> > be the only one reporting the bug). Could you please compile and test the kernel
> > with the attached patch? It will print to the kernel log when we go to sleep
> > waiting for inode commit, when we send wakeups, etc. When you hit the
> > deadlock, please send me your kernel log. It should help with debugging why
> > we miss the wakeup. Thanks.
> 
> Which patch?
  Ups. Forgot to attach ;).

										Honza
Alexander Beregalov - April 2, 2009, 6:50 p.m.
>> >  I'm helpless here. I don't see how we can miss a wakeup (plus you seem to
>> > be the only one reporting the bug). Could you please compile and test the kernel
>> > with the attached patch? It will print to the kernel log when we go to sleep
>> > waiting for inode commit, when we send wakeups, etc. When you hit the
>> > deadlock, please send me your kernel log. It should help with debugging why
>> > we miss the wakeup. Thanks.
>>
>> Which patch?
>  Ups. Forgot to attach ;).

I cannot reproduce it on the current 2.6.29-git. Strange.
It should already have all the ext4/jbd2 patches from next-20090310,
but in any case it also happened with 2.6.29-rc8.
I ran dbench in a loop on two identical hosts for more than 24 hours
with no hung tasks.

I will try 2.6.29.
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Beregalov - April 4, 2009, 9:09 p.m.
2009/4/2 Alexander Beregalov <a.beregalov@gmail.com>:
>>> >  I'm helpless here. I don't see how we can miss a wakeup (plus you seem to
>>> > be the only one reporting the bug). Could you please compile and test the kernel
>>> > with the attached patch? It will print to the kernel log when we go to sleep
>>> > waiting for inode commit, when we send wakeups, etc. When you hit the
>>> > deadlock, please send me your kernel log. It should help with debugging why
>>> > we miss the wakeup. Thanks.
>>>
>>> Which patch?
>>  Ups. Forgot to attach ;).
>
> I cannot reproduce it on the current 2.6.29-git. Strange.
> It should already have all the ext4/jbd2 patches from next-20090310,
> but in any case it also happened with 2.6.29-rc8.
> I ran dbench in a loop on two identical hosts for more than 24 hours
> with no hung tasks.
>
> I will try 2.6.29.

I cannot reproduce it with vanilla v2.6.29.
It seems the problem has gone.
Thanks Jan.

The patch output:
[133886.375874] JBD2: Waiting for ino 1062
[133886.376372] JBD2: Waking up sleeper on ino 1062
[133886.376824] JBD2: Woken on ino 1062
[134611.108451] JBD2: Waiting for ino 1102
[134611.108903] JBD2: Waking up sleeper on ino 1102
[134611.109787] JBD2: Woken on ino 1102
[134611.132912] JBD2: Waiting for ino 1074
[134611.133311] JBD2: Waking up sleeper on ino 1074
[134611.133707] JBD2: Woken on ino 1074
Jan Kara - April 6, 2009, 9:20 a.m.
On Sun 05-04-09 01:09:31, Alexander Beregalov wrote:
> 2009/4/2 Alexander Beregalov <a.beregalov@gmail.com>:
> >>> >  I'm helpless here. I don't see how we can miss a wakeup (plus you seem to
> >>> > be the only one reporting the bug). Could you please compile and test the kernel
> >>> > with the attached patch? It will print to the kernel log when we go to sleep
> >>> > waiting for inode commit, when we send wakeups, etc. When you hit the
> >>> > deadlock, please send me your kernel log. It should help with debugging why
> >>> > we miss the wakeup. Thanks.
> >>>
> >>> Which patch?
> >>  Ups. Forgot to attach ;).
> >
> > I cannot reproduce it on the current 2.6.29-git. Strange.
> > It should already have all the ext4/jbd2 patches from next-20090310,
> > but in any case it also happened with 2.6.29-rc8.
> > I ran dbench in a loop on two identical hosts for more than 24 hours
> > with no hung tasks.
> >
> > I will try 2.6.29.
> 
> I cannot reproduce it with vanilla v2.6.29.
> It seems the problem has gone.
> Thanks Jan.
  Thanks for testing. I'm glad we have one less mystery ;).
 
> The patch output:
> [133886.375874] JBD2: Waiting for ino 1062
> [133886.376372] JBD2: Waking up sleeper on ino 1062
> [133886.376824] JBD2: Woken on ino 1062
> [134611.108451] JBD2: Waiting for ino 1102
> [134611.108903] JBD2: Waking up sleeper on ino 1102
> [134611.109787] JBD2: Woken on ino 1102
> [134611.132912] JBD2: Waiting for ino 1074
> [134611.133311] JBD2: Waking up sleeper on ino 1074
> [134611.133707] JBD2: Woken on ino 1074
  Yes, this is how it should always look...

									Honza
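
The healthy pattern in the log above (each "Waiting" followed by "Waking up
sleeper" and then "Woken" for the same inode) can be checked mechanically
when going through a long kernel log. The following is only a minimal sketch
and not part of the posted patch: the function name check_jbd2_log() and its
return convention are made up for illustration, and it assumes that only one
sleeper is outstanding at a time, as in the log shown here.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not from the patch: scan kernel log lines for the
 * three debug messages and verify they appear as Waiting -> Waking up ->
 * Woken for the same inode number.
 *
 * Returns 0 if every wait was completed, the 1-based index of the first
 * out-of-order line, or -1 if the log ends with a sleeper never woken.
 * Assumption: at most one sleeper is outstanding at any time.
 */
static int check_jbd2_log(const char *const *lines, size_t n)
{
    enum { IDLE, WAITING, WAKING } state = IDLE;
    unsigned long cur = 0, ino;

    for (size_t i = 0; i < n; i++) {
        const char *msg = strstr(lines[i], "JBD2: ");

        if (!msg)
            continue;               /* unrelated log line */
        msg += strlen("JBD2: ");

        if (sscanf(msg, "Waiting for ino %lu", &ino) == 1) {
            if (state != IDLE)
                return (int)(i + 1);
            cur = ino;
            state = WAITING;
        } else if (sscanf(msg, "Waking up sleeper on ino %lu", &ino) == 1) {
            if (state != WAITING || ino != cur)
                return (int)(i + 1);
            state = WAKING;
        } else if (sscanf(msg, "Woken on ino %lu", &ino) == 1) {
            if (state != WAKING || ino != cur)
                return (int)(i + 1);
            state = IDLE;
        }
    }
    /* A sleeper that was never woken is the lost-wakeup signature. */
    return state == IDLE ? 0 : -1;
}
```

A return value of -1 would correspond to the hang discussed in this thread: a
"Waiting" line with no matching wakeup before the log ends. Because the debug
patch never clears the extra flag bit, a stricter or looser policy for stray
"Waking up sleeper" lines may be more appropriate on a real log.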

Patch

From 123ab7510c04c698077e5756b4de6c66ce8ee71e Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Tue, 31 Mar 2009 11:57:10 +0200
Subject: [PATCH] ext4: Debug sleepers in iput()

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/jbd2/commit.c  |    4 ++++
 fs/jbd2/journal.c |    6 ++++++
 2 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
index 62804e5..f47b8a3 100644
--- a/fs/jbd2/commit.c
+++ b/fs/jbd2/commit.c
@@ -259,6 +259,8 @@  static int journal_submit_data_buffers(journal_t *journal,
 		spin_lock(&journal->j_list_lock);
 		J_ASSERT(jinode->i_transaction == commit_transaction);
 		jinode->i_flags &= ~JI_COMMIT_RUNNING;
+		if (jinode->i_flags & 4)
+			printk(KERN_INFO "JBD2: Waking up sleeper on ino %lu\n", jinode->i_vfs_inode->i_ino);
 		wake_up_bit(&jinode->i_flags, __JI_COMMIT_RUNNING);
 	}
 	spin_unlock(&journal->j_list_lock);
@@ -296,6 +298,8 @@  static int journal_finish_inode_data_buffers(journal_t *journal,
 		}
 		spin_lock(&journal->j_list_lock);
 		jinode->i_flags &= ~JI_COMMIT_RUNNING;
+		if (jinode->i_flags & 4)
+			printk(KERN_INFO "JBD2: Waking up sleeper on ino %lu\n", jinode->i_vfs_inode->i_ino);
 		wake_up_bit(&jinode->i_flags, __JI_COMMIT_RUNNING);
 	}
 
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index 5814410..5459fd9 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -2225,11 +2225,17 @@  restart:
 	if (jinode->i_flags & JI_COMMIT_RUNNING) {
 		wait_queue_head_t *wq;
 		DEFINE_WAIT_BIT(wait, &jinode->i_flags, __JI_COMMIT_RUNNING);
+		unsigned long ino = jinode->i_vfs_inode->i_ino;
+
+		jinode->i_flags |= 4;
+		printk(KERN_INFO "JBD2: Waiting for ino %lu\n", ino);
+
 		wq = bit_waitqueue(&jinode->i_flags, __JI_COMMIT_RUNNING);
 		prepare_to_wait(wq, &wait.wait, TASK_UNINTERRUPTIBLE);
 		spin_unlock(&journal->j_list_lock);
 		schedule();
 		finish_wait(wq, &wait.wait);
+		printk(KERN_INFO "JBD2: Woken on ino %lu\n", ino);
 		goto restart;
 	}
 
-- 
1.6.0.2
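
For readers unfamiliar with the pattern being instrumented: the journal.c
hunk checks JI_COMMIT_RUNNING under j_list_lock, queues itself on the bit
waitqueue before dropping the lock, sleeps, and then rechecks via "goto
restart", while the commit side clears the flag and calls wake_up_bit()
under the same lock. Below is a rough userspace analogue using pthreads. It
is a sketch under stated assumptions, not kernel code: the mutex stands in
for j_list_lock, the condition variable for the bit waitqueue, and all names
(demo_inode, COMMIT_RUNNING, run_demo, ...) are hypothetical.

```c
#include <pthread.h>

#define COMMIT_RUNNING 0x1u  /* hypothetical stand-in for JI_COMMIT_RUNNING */

struct demo_inode {
    pthread_mutex_t lock;    /* plays the role of j_list_lock */
    pthread_cond_t  wq;      /* plays the role of the bit waitqueue */
    unsigned int    flags;
};

/* Commit side: clear the flag and wake sleepers while holding the lock. */
static void *commit_thread(void *arg)
{
    struct demo_inode *di = arg;

    pthread_mutex_lock(&di->lock);
    di->flags &= ~COMMIT_RUNNING;
    pthread_cond_broadcast(&di->wq);
    pthread_mutex_unlock(&di->lock);
    return NULL;
}

/* Release side: sleep until the flag is clear, rechecking after each wakeup. */
static void wait_for_commit(struct demo_inode *di)
{
    pthread_mutex_lock(&di->lock);
    while (di->flags & COMMIT_RUNNING) {
        /* Atomically drops the lock and sleeps, so a wakeup sent after
         * the check above cannot slip in unnoticed. */
        pthread_cond_wait(&di->wq, &di->lock);
    }
    pthread_mutex_unlock(&di->lock);
}

/* Returns 0 if the waiter observed the flag cleared, -1 otherwise. */
static int run_demo(void)
{
    struct demo_inode di;
    pthread_t t;
    int ret;

    pthread_mutex_init(&di.lock, NULL);
    pthread_cond_init(&di.wq, NULL);
    di.flags = COMMIT_RUNNING;

    if (pthread_create(&t, NULL, commit_thread, &di) != 0)
        return -1;
    wait_for_commit(&di);
    pthread_join(t, NULL);

    ret = (di.flags & COMMIT_RUNNING) ? -1 : 0;
    pthread_cond_destroy(&di.wq);
    pthread_mutex_destroy(&di.lock);
    return ret;
}
```

As long as both the flag check and the flag clear happen under the same lock,
and the waiter is queued before that lock is released, the wakeup should not
be lost. That is why the hang reported in this thread looked so puzzling.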