Patchwork ext4: fix s_dirty_blocks_counter if block allocation failed with nodelalloc

Submitter Akira Fujita
Date Dec. 1, 2008, 10:21 a.m.
Message ID <4933BAC2.6080004@rs.jp.nec.com>
Permalink /patch/11545/
State New

Comments

Akira Fujita - Dec. 1, 2008, 10:21 a.m.
ext4: Fix s_dirty_blocks_counter if block allocation failed with nodelalloc

From: Akira Fujita <a-fujita@rs.jp.nec.com>

If block allocation fails after the claimed blocks have been marked as
dirty with nodelalloc, the error handling has to subtract those blocks
from s_dirtyblocks_counter.
Otherwise s_dirtyblocks_counter becomes wrong, and the filesystem's
free block count decreases incorrectly.
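
The rollback described above can be illustrated with a small userspace sketch. This is not kernel code: struct fs_counters, claim_blocks() and alloc_blocks() are hypothetical names invented for the illustration, plain longs stand in for the per-CPU counters, and the allocator_ok parameter simply simulates whether the allocator finds blocks.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the superblock counters. */
struct fs_counters {
	long free_blocks;	/* blocks not yet allocated on disk */
	long dirty_blocks;	/* blocks claimed but not yet allocated */
};

/* Claim nblocks if the counters say there is room for them. */
static int claim_blocks(struct fs_counters *c, long nblocks)
{
	if (c->free_blocks - c->dirty_blocks < nblocks)
		return -ENOSPC;
	c->dirty_blocks += nblocks;
	return 0;
}

/*
 * Toy allocation path. delalloc_reserved mirrors the flag check in the
 * patch: when it is set, the caller reserved the blocks earlier and this
 * function must not touch that reservation on failure.
 */
static int alloc_blocks(struct fs_counters *c, long nblocks,
			bool delalloc_reserved, bool allocator_ok)
{
	long reserved = 0;

	if (!delalloc_reserved) {
		int err = claim_blocks(c, nblocks);

		if (err)
			return err;
		reserved = nblocks;
	}

	if (!allocator_ok) {
		/* Error path: give back what was claimed above. */
		if (!delalloc_reserved)
			c->dirty_blocks -= reserved;
		return -ENOSPC;
	}

	/* Success: claimed blocks become allocated blocks. */
	c->free_blocks -= nblocks;
	c->dirty_blocks -= nblocks;
	return 0;
}
```

Without the subtraction in the error path, every failed call would leave nblocks permanently counted as dirty, so later claims would see less free space than really exists.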

This issue was reported as ext4 online defrag's bug by Li Zefan.
http://marc.info/?l=linux-ext4&m=122697235715170&w=2

Reported-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
---
 mballoc.c |    9 +++++++++
 1 file changed, 9 insertions(+)

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Aneesh Kumar K.V - Dec. 1, 2008, 10:36 a.m.
On Mon, Dec 01, 2008 at 07:21:54PM +0900, Akira Fujita wrote:
> ext4: Fix s_dirty_blocks_counter if block allocation failed with nodelalloc
> 
> From: Akira Fujita <a-fujita@rs.jp.nec.com>
> 
> If block allocation fails after the claimed blocks have been marked as
> dirty with nodelalloc, the error handling has to subtract those blocks
> from s_dirtyblocks_counter.
> Otherwise s_dirtyblocks_counter becomes wrong, and the filesystem's
> free block count decreases incorrectly.

Why did the block allocation fail? With delayed allocation, ENOSPC
should not happen during block allocation. That would mean we did
something wrong in block reservation.

> 
> This issue was reported as ext4 online defrag's bug by Li Zefan.
> http://marc.info/?l=linux-ext4&m=122697235715170&w=2

-aneesh
Akira Fujita - Dec. 4, 2008, 1:27 a.m.
Hi Aneesh,
Aneesh Kumar K.V wrote:
> On Mon, Dec 01, 2008 at 07:21:54PM +0900, Akira Fujita wrote:
>> ext4: Fix s_dirty_blocks_counter if block allocation failed with nodelalloc
>>
>> From: Akira Fujita <a-fujita@rs.jp.nec.com>
>>
>> If block allocation fails after the claimed blocks have been marked as
>> dirty with nodelalloc, the error handling has to subtract those blocks
>> from s_dirtyblocks_counter.
>> Otherwise s_dirtyblocks_counter becomes wrong, and the filesystem's
>> free block count decreases incorrectly.
> 
> Why did the block allocation fail? With delayed allocation, ENOSPC
> should not happen during block allocation. That would mean we did
> something wrong in block reservation.

My case was *nodelalloc*, and the filesystem was almost full.
The problem occurs when several defrag runs happen within a short time.
Defrag normally releases the temporary inode's blocks with iput, which
restores the filesystem's free block count, but the freed blocks cannot
be reused as contiguous blocks until the next journal commit.
As a result, the free block count is high enough that
ext4_claim_free_blocks marks the claimed blocks as dirty,
but ext4_mb_regular_allocator cannot find enough blocks,
so ext4_mb_new_blocks fails with ENOSPC without decreasing the
dirty block count.
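
To make this sequence concrete, here is a toy userspace model, not kernel code. All names (struct toy_fs, toy_claim, toy_alloc, toy_commit and so on) are hypothetical, and it assumes for simplicity that the allocator needs one contiguous run of blocks. Blocks released but not yet committed are counted as free by the claim step but are invisible to the search step.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define NBLOCKS 16

enum blk_state { BLK_USED, BLK_FREE, BLK_FREED_UNCOMMITTED };

struct toy_fs {
	enum blk_state blk[NBLOCKS];
	long dirty;		/* claimed but not yet allocated */
};

static void toy_init(struct toy_fs *fs)
{
	for (int i = 0; i < NBLOCKS; i++)
		fs->blk[i] = BLK_USED;
	fs->dirty = 0;
}

/* The free count includes blocks still waiting for a journal commit. */
static long count_free(const struct toy_fs *fs)
{
	long n = 0;

	for (int i = 0; i < NBLOCKS; i++)
		if (fs->blk[i] != BLK_USED)
			n++;
	return n;
}

static int toy_claim(struct toy_fs *fs, long n)
{
	if (count_free(fs) - fs->dirty < n)
		return -ENOSPC;
	fs->dirty += n;
	return 0;
}

/* Only committed free blocks are usable; return start index or -1. */
static int toy_find_run(const struct toy_fs *fs, int n)
{
	int run = 0;

	for (int i = 0; i < NBLOCKS; i++) {
		run = (fs->blk[i] == BLK_FREE) ? run + 1 : 0;
		if (run == n)
			return i - n + 1;
	}
	return -1;
}

/* Simulated journal commit: freed blocks become reusable. */
static void toy_commit(struct toy_fs *fs)
{
	for (int i = 0; i < NBLOCKS; i++)
		if (fs->blk[i] == BLK_FREED_UNCOMMITTED)
			fs->blk[i] = BLK_FREE;
}

/* Returns the start block, or -ENOSPC. undo_on_error models the fix. */
static int toy_alloc(struct toy_fs *fs, int n, bool undo_on_error)
{
	int start;

	if (toy_claim(fs, n))
		return -ENOSPC;
	start = toy_find_run(fs, n);
	if (start < 0) {
		if (undo_on_error)
			fs->dirty -= n;	/* roll back the claim */
		return -ENOSPC;
	}
	for (int i = start; i < start + n; i++)
		fs->blk[i] = BLK_USED;
	fs->dirty -= n;
	return start;
}
```

With two committed free blocks and four uncommitted ones, a request for four blocks passes the claim but fails the search. If undo_on_error is false, the four blocks stay counted as dirty after the failure, which is the leak described above.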

Regards,
Akira Fujita


Patch

diff -X linux-2.6.28-rc6-ext4/Documentation/dontdiff -upNr linux-2.6.28-rc6-ext4/fs/ext4/mballoc.c linux-2.6.28-rc6-mballoc-fix/fs/ext4/mballoc.c
--- linux-2.6.28-rc6-ext4/fs/ext4/mballoc.c	2008-12-01 11:44:28.000000000 +0900
+++ linux-2.6.28-rc6-mballoc-fix/fs/ext4/mballoc.c	2008-12-01 12:04:06.000000000 +0900
@@ -4495,12 +4495,18 @@  ext4_fsblk_t ext4_mb_new_blocks(handle_t
 	if (!ac) {
 		ar->len = 0;
 		*errp = -ENOMEM;
+		if (!(ar->flags & EXT4_MB_DELALLOC_RESERVED))
+			percpu_counter_sub(&sbi->s_dirtyblocks_counter,
+						reserv_blks);
 		goto out1;
 	}

 	*errp = ext4_mb_initialize_context(ac, ar);
 	if (*errp) {
 		ar->len = 0;
+		if (!(ac->ac_flags & EXT4_MB_DELALLOC_RESERVED))
+			percpu_counter_sub(&sbi->s_dirtyblocks_counter,
+						reserv_blks);
 		goto out2;
 	}

@@ -4541,6 +4547,9 @@  repeat:
 		if (freed)
 			goto repeat;
 		*errp = -ENOSPC;
+		if (!(ac->ac_flags & EXT4_MB_DELALLOC_RESERVED))
+			percpu_counter_sub(&sbi->s_dirtyblocks_counter,
+						reserv_blks);
 		ac->ac_b_ex.fe_len = 0;
 		ar->len = 0;
 		ext4_mb_show_ac(ac);