next-20090206: deadlock on ext4

Message ID 4995F27E.3070401@redhat.com
State Superseded, archived

Commit Message

Eric Sandeen Feb. 13, 2009, 10:21 p.m. UTC
Alexander Beregalov wrote:

> I have reproduced it with 2.6.29-rc4.
> Loop file was on XFS.
> I can not reproduce it on ext4 on raw device.
> Let me know if I can do anything else to help resolving it.
>   
Can you try this patch from Aneesh?  Works for me...

Thanks,
-Eric

From: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Subject: [PATCH] ext4: Don't use the range_cyclic mode implemented by
write_cache_pages

With delayed allocation, we lock the page in write_cache_pages and
try to build an in-memory extent of contiguous blocks. This is
needed so that we can issue one large request for contiguous blocks.
In range_cyclic mode, however, write_cache_pages loops back to index 0
if it has not done any I/O and tries to write pages from there. That
means we would attempt to take the page lock of a lower-index page
while holding the page lock of a higher-index page. This can
deadlock against another writeback thread that is locking pages in
ascending index order.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Tested-by: Eric Sandeen <sandeen@redhat.com>
---
 fs/ext4/inode.c |   20 ++++++++++++++++++--
 1 files changed, 18 insertions(+), 2 deletions(-)
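
For readers following the description above, here is a minimal user-space
sketch of the two-pass idea: scan from the saved writeback index to the end,
and only if that pass did no I/O, start a second, separate pass over the
lower range. Each pass walks indices in ascending order, so a pass never
goes back to a lower index while it is still working on a higher one. All
names here (toy_mapping, toy_write_range, toy_cyclic_writeback) are
hypothetical stand-ins, not kernel APIs, and no real page locking is
modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an address space: dirty[i] marks page i dirty. */
struct toy_mapping {
	bool *dirty;
	size_t nr_pages;
	size_t writeback_index;	/* where the next cyclic scan starts */
};

/*
 * "Write back" dirty pages in [start, end], strictly in ascending order.
 * Returns the number of pages written and advances writeback_index past
 * the last page written.
 */
static size_t toy_write_range(struct toy_mapping *m, size_t start, size_t end)
{
	size_t written = 0;

	for (size_t i = start; i <= end && i < m->nr_pages; i++) {
		if (m->dirty[i]) {
			m->dirty[i] = false;
			m->writeback_index = i + 1;
			written++;
		}
	}
	return written;
}

/*
 * Cyclic writeback as two independent ascending passes:
 *   pass 1: [writeback_index, last page]
 *   pass 2: [0, writeback_index - 1], only if pass 1 did no I/O
 */
static size_t toy_cyclic_writeback(struct toy_mapping *m)
{
	size_t start = m->writeback_index;
	size_t written;

	assert(m->nr_pages > 0);

	written = toy_write_range(m, start, m->nr_pages - 1);
	if (written == 0 && start > 0)
		written = toy_write_range(m, 0, start - 1);
	return written;
}
```

With eight pages, writeback_index at 5 and only pages 1 and 2 dirty, the
first pass finds nothing and the second pass writes both pages. If page 6
is dirty instead, the first pass writes it and no wrap-around happens.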



--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Alexander Beregalov Feb. 18, 2009, 9:56 a.m. UTC | #1
2009/2/14 Eric Sandeen <sandeen@redhat.com>:
> Alexander Beregalov wrote:
>
>> I have reproduced it with 2.6.29-rc4.
>> Loop file was on XFS.
>> I can not reproduce it on ext4 on raw device.
>> Let me know if I can do anything else to help resolving it.
>>
> Can you try this patch from Aneesh?  Works for me...
>

Yes, it works, I can not reproduce the bug on
2.6.29-rc5-00122-g5955c7a which contains 2acf2c261b82
"ext4: Implement range_cyclic in ext4_da_writepages instead of
write_cache_pages"

Thanks
Ralf Hildebrandt Feb. 18, 2009, 10:12 a.m. UTC | #2
> Yes, it works, I can not reproduce the bug on
> 2.6.29-rc5-00122-g5955c7a which contains 2acf2c261b82
> "ext4: Implement range_cyclic in ext4_da_writepages instead of
> write_cache_pages"

Will there be a backport for 2.6.28.x ?
I'm suffering here.
Theodore Ts'o Feb. 18, 2009, 1:04 p.m. UTC | #3
On Wed, Feb 18, 2009 at 11:12:18AM +0100, Ralf Hildebrandt wrote:
> > Yes, it works, I can not reproduce the bug on
> > 2.6.29-rc5-00122-g5955c7a which contains 2acf2c261b82
> > "ext4: Implement range_cyclic in ext4_da_writepages instead of
> > write_cache_pages"
> 
> Will there be a backport for 2.6.28.x ?
> I'm suffering here.

The patch just hit mainline, so yes, I'll get a backport series set up.

                                        - Ted
Patch

Index: linux-2.6.28.x86_64/fs/ext4/inode.c
===================================================================
--- linux-2.6.28.x86_64.orig/fs/ext4/inode.c
+++ linux-2.6.28.x86_64/fs/ext4/inode.c
@@ -2437,6 +2437,7 @@  static int ext4_da_writepages(struct add
 	int no_nrwrite_index_update;
 	int pages_written = 0;
 	long pages_skipped;
+       int range_cyclic = 0, cycled = 1, io_done = 0;
 	int needed_blocks, ret = 0, nr_to_writebump = 0;
 	struct ext4_sb_info *sbi = EXT4_SB(mapping->host->i_sb);
 
@@ -2488,9 +2489,14 @@  static int ext4_da_writepages(struct add
 	if (wbc->range_start == 0 && wbc->range_end == LLONG_MAX)
 		range_whole = 1;
 
-	if (wbc->range_cyclic)
+       if (wbc->range_cyclic) {
 		index = mapping->writeback_index;
-	else
+	       wbc->range_start = index << PAGE_CACHE_SHIFT;
+	       wbc->range_end  = LLONG_MAX;
+	       wbc->range_cyclic = 0;
+	       range_cyclic = 1;
+	       cycled = 0;
+       } else
 		index = wbc->range_start >> PAGE_CACHE_SHIFT;
 
 	mpd.wbc = wbc;
@@ -2504,6 +2510,7 @@  static int ext4_da_writepages(struct add
 	wbc->no_nrwrite_index_update = 1;
 	pages_skipped = wbc->pages_skipped;
 
+retry:
 	while (!ret && wbc->nr_to_write > 0) {
 
 		/*
@@ -2546,6 +2553,7 @@  static int ext4_da_writepages(struct add
 			pages_written += mpd.pages_written;
 			wbc->pages_skipped = pages_skipped;
 			ret = 0;
+		       io_done = 1;
 		} else if (wbc->nr_to_write)
 			/*
 			 * There is no more writeout needed
@@ -2554,6 +2562,13 @@  static int ext4_da_writepages(struct add
 			 */
 			break;
 	}
+       if (!io_done && !cycled) {
+	       cycled = 1;
+	       index = 0;
+	       wbc->range_start = index << PAGE_CACHE_SHIFT;
+	       wbc->range_end  = mapping->writeback_index - 1;
+	       goto retry;
+       }
 	if (pages_skipped != wbc->pages_skipped)
 		printk(KERN_EMERG "This should not happen leaving %s "
 				"with nr_to_write = %ld ret = %d\n",
@@ -2561,6 +2576,7 @@  static int ext4_da_writepages(struct add
 
 	/* Update index */
 	index += pages_written;
+       wbc->range_cyclic = range_cyclic;
 	if (wbc->range_cyclic || (range_whole && wbc->nr_to_write > 0))
 		/*
 		 * set the writeback_index so that range_cyclic