[-V2,3/5] ext4: Fix the race between read_block_bitmap and mark_diskspace_used

Message ID 20081124182132.GF8462@skywalker
State Superseded, archived

Commit Message

Aneesh Kumar K.V Nov. 24, 2008, 6:21 p.m. UTC
On Mon, Nov 24, 2008 at 09:17:53PM +0300, Alex Zhuravlev wrote:
> Aneesh Kumar K.V wrote:
>> With commit c806e68f we do a init_bitmap every time we do a
>> read_block_bitmap.
>
> can you explain why do we need to init it every time?
>

The commit message explains it well. The buffer_head can be marked
uptodate by a read from userspace, so we would skip doing an
init_bitmap on the uninit group during resize.

commit c806e68f5647109350ec546fee5b526962970fd2
Author: Frederic Bohe <frederic.bohe@bull.net>
Date:   Fri Oct 10 08:09:18 2008 -0400

    ext4: fix initialization of UNINIT bitmap blocks
    
    This fixes a bug which caused on-line resizing of filesystems with a
    1k blocksize to fail.  The root cause of this bug was the fact that if
    an uninitialized bitmap block gets read in by userspace (which
    e2fsprogs does try to avoid, but can happen when the blocksize is less
    than the pagesize and an adjacent block is read into memory)
    ext4_read_block_bitmap() was erroneously depending on the buffer
    uptodate flag to decide whether it needed to initialize the bitmap
    block in memory --- i.e., to set the standard set of blocks in use by
    a block group (superblock, bitmaps, inode table, etc.).  Essentially,
    ext4_read_block_bitmap() assumed it was the only routine that might
    try to read a block containing a block bitmap, which is simply not
    true.
    
    To fix this, ext4_read_block_bitmap() and ext4_read_inode_bitmap()
    must always initialize uninitialized bitmap blocks.  Once a block or
    inode is allocated out of that bitmap, it will be marked as
    initialized in the block group descriptor, so in general this won't
    result in any extra unnecessary work.
    
    Signed-off-by: Frederic Bohe <frederic.bohe@bull.net>
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
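
To make the failure mode in the commit message concrete, here is a small
user-space model of the two checks. It is only a sketch: every model_*
name and the MODEL_BG_BLOCK_UNINIT constant are hypothetical stand-ins
rather than kernel symbols, endianness conversion is ignored, and the
disk read path for initialized groups is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the group descriptor's BLOCK_UNINIT flag. */
#define MODEL_BG_BLOCK_UNINIT 0x0002

struct model_buffer {
    bool uptodate;            /* models the buffer_head uptodate state */
    unsigned char bitmap[8];  /* models the cached bitmap block */
};

struct model_desc {
    unsigned short bg_flags;  /* models the group descriptor flags */
};

/* Models bitmap initialization: mark the group's metadata blocks in use. */
static void model_init_bitmap(struct model_buffer *bh)
{
    assert(bh != NULL);
    memset(bh->bitmap, 0, sizeof(bh->bitmap));
    bh->bitmap[0] = 0x0f;     /* pretend the first four blocks are metadata */
    bh->uptodate = true;
}

/* Models a userspace read that caches the raw, never-initialized block. */
static void model_userspace_read(struct model_buffer *bh)
{
    memset(bh->bitmap, 0, sizeof(bh->bitmap));
    bh->uptodate = true;
}

/* Old check: the uptodate state alone decides whether to initialize. */
static void model_read_bitmap_old(struct model_buffer *bh,
                                  const struct model_desc *desc)
{
    if (bh->uptodate)
        return;               /* cached contents are trusted as-is */
    if (desc->bg_flags & MODEL_BG_BLOCK_UNINIT)
        model_init_bitmap(bh);
}

/* New check: an UNINIT group is initialized even if the buffer is uptodate. */
static void model_read_bitmap_new(struct model_buffer *bh,
                                  const struct model_desc *desc)
{
    if (bh->uptodate && !(desc->bg_flags & MODEL_BG_BLOCK_UNINIT))
        return;
    if (desc->bg_flags & MODEL_BG_BLOCK_UNINIT)
        model_init_bitmap(bh);
}
```

In this model, calling model_userspace_read() first and then
model_read_bitmap_old() leaves the metadata bits unset, while
model_read_bitmap_new() sets them.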

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Alex Zhuravlev Nov. 24, 2008, 6:28 p.m. UTC | #1
Strange... I'd expect the code to check "uptodate" once the buffer is locked. Don't you?

thanks, Alex
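
For readers following along, the "check, lock, then check again" shape
that Alex describes can be sketched in user space as below. This is a
single-threaded model under stated assumptions: struct model_bh and the
model_* helpers are hypothetical, the lock is just a flag rather than a
real sleeping lock, and whether the kernel helper removed by the patch
behaves exactly like this is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

struct model_bh {
    bool uptodate;  /* models the buffer's uptodate state */
    bool locked;    /* models the buffer lock as a plain flag */
};

static void model_lock(struct model_bh *bh)
{
    assert(!bh->locked);   /* single-threaded model: never contended */
    bh->locked = true;
}

static void model_unlock(struct model_bh *bh)
{
    assert(bh->locked);
    bh->locked = false;
}

/*
 * Returns true if the buffer is already up to date (lock not held on
 * return). Returns false if the caller now holds the lock and is
 * expected to fill the buffer, mark it uptodate, and unlock it.
 */
static bool model_uptodate_or_lock(struct model_bh *bh)
{
    if (!bh->uptodate) {
        model_lock(bh);
        /* Re-check under the lock: someone may have filled it meanwhile. */
        if (!bh->uptodate)
            return false;
        model_unlock(bh);
    }
    return true;
}
```

The re-check only protects the uptodate state itself. It says nothing
about the UNINIT flag in the group descriptor, which is the extra
condition the patch adds.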

Aneesh Kumar K.V wrote:
> @@ -319,9 +319,11 @@ ext4_read_block_bitmap(struct super_block *sb, ext4_group_t block_group)
>  			    block_group, bitmap_blk);
>  		return NULL;
>  	}
> -	if (bh_uptodate_or_lock(bh))
> +	if (buffer_uptodate(bh) &&
> +	    !(desc->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)))
>  		return bh;
>  
> +	lock_buffer(bh);
>  	spin_lock(sb_bgl_lock(EXT4_SB(sb), block_group));
>  	if (desc->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)) {
>  		ext4_init_block_bitmap(sb, bh, block_group, desc);

Alex Zhuravlev Nov. 24, 2008, 6:41 p.m. UTC | #2
This looks even stranger, IMHO. Do I understand correctly that two processes
doing allocation in the same group can each do an initialization? What if one
process has just allocated block(s) and has not cleared the UNINIT bit yet?

thanks, Alex
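
The interleaving Alex is asking about can be written out as a
deterministic user-space model. This is only an illustration of the
question, not a claim about what the kernel actually permits: locking
is omitted entirely, and struct model_group, MODEL_UNINIT and the
model_* functions are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_UNINIT 0x1u   /* hypothetical stand-in for the UNINIT flag */

struct model_group {
    unsigned flags;         /* models the group descriptor flags */
    unsigned char bitmap;   /* one bit per block, eight blocks in total */
};

/* Models bitmap initialization: only block 0 (metadata) is in use. */
static void model_init(struct model_group *g)
{
    g->bitmap = 0x01;
}

/* Models the patched read path: always initialize while UNINIT is set. */
static void model_read_bitmap(struct model_group *g)
{
    if (g->flags & MODEL_UNINIT)
        model_init(g);
}

static void model_alloc_block(struct model_group *g, int blk)
{
    assert(blk >= 0 && blk < 8);
    g->bitmap |= (unsigned char)(1u << blk);
}

static void model_clear_uninit(struct model_group *g)
{
    g->flags &= ~MODEL_UNINIT;
}

/* B reads the bitmap after A allocated but before A cleared UNINIT. */
static bool model_racy_order_keeps_block(void)
{
    struct model_group g = { MODEL_UNINIT, 0 };
    model_read_bitmap(&g);     /* A: read and initialize */
    model_alloc_block(&g, 3);  /* A: allocate block 3 */
    model_read_bitmap(&g);     /* B: sees UNINIT, initializes again */
    model_clear_uninit(&g);    /* A: clears the flag too late */
    return (g.bitmap & (1u << 3)) != 0;
}

/* A clears UNINIT before B gets to read the bitmap. */
static bool model_safe_order_keeps_block(void)
{
    struct model_group g = { MODEL_UNINIT, 0 };
    model_read_bitmap(&g);     /* A: read and initialize */
    model_alloc_block(&g, 3);  /* A: allocate block 3 */
    model_clear_uninit(&g);    /* A: clears the flag first */
    model_read_bitmap(&g);     /* B: flag is clear, no second init */
    return (g.bitmap & (1u << 3)) != 0;
}
```

In the first ordering the second initialization wipes the bit for block
3; in the second ordering the allocation survives. Any real fix has to
make setting the bit and clearing the flag appear atomic to the read
path.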



Aneesh Kumar K.V wrote:
> On Mon, Nov 24, 2008 at 09:17:53PM +0300, Alex Zhuravlev wrote:
>> Aneesh Kumar K.V wrote:
>>> With commit c806e68f we do a init_bitmap every time we do a
>>> read_block_bitmap.
>> can you explain why do we need to init it every time?
>>
> 
> The commit message explains it well. The buffer_head can be marked
> uptodate by a read from userspace, so we would skip doing an
> init_bitmap on the uninit group during resize.
> 
> commit c806e68f5647109350ec546fee5b526962970fd2
> Author: Frederic Bohe <frederic.bohe@bull.net>
> Date:   Fri Oct 10 08:09:18 2008 -0400
> 
>     ext4: fix initialization of UNINIT bitmap blocks
>     
>     This fixes a bug which caused on-line resizing of filesystems with a
>     1k blocksize to fail.  The root cause of this bug was the fact that if
>     an uninitialized bitmap block gets read in by userspace (which
>     e2fsprogs does try to avoid, but can happen when the blocksize is less
>     than the pagesize and an adjacent block is read into memory)
>     ext4_read_block_bitmap() was erroneously depending on the buffer
>     uptodate flag to decide whether it needed to initialize the bitmap
>     block in memory --- i.e., to set the standard set of blocks in use by
>     a block group (superblock, bitmaps, inode table, etc.).  Essentially,
>     ext4_read_block_bitmap() assumed it was the only routine that might
>     try to read a block containing a block bitmap, which is simply not
>     true.
>     
>     To fix this, ext4_read_block_bitmap() and ext4_read_inode_bitmap()
>     must always initialize uninitialized bitmap blocks.  Once a block or
>     inode is allocated out of that bitmap, it will be marked as
>     initialized in the block group descriptor, so in general this won't
>     result in any extra unnecessary work.
>     
>     Signed-off-by: Frederic Bohe <frederic.bohe@bull.net>
>     Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
> 
> diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
> index 59566c0..bd2ece2 100644
> --- a/fs/ext4/balloc.c
> +++ b/fs/ext4/balloc.c
> @@ -319,9 +319,11 @@ ext4_read_block_bitmap(struct super_block *sb, ext4_group_t block_group)
>  			    block_group, bitmap_blk);
>  		return NULL;
>  	}
> -	if (bh_uptodate_or_lock(bh))
> +	if (buffer_uptodate(bh) &&
> +	    !(desc->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)))
>  		return bh;
>  
> +	lock_buffer(bh);
>  	spin_lock(sb_bgl_lock(EXT4_SB(sb), block_group));
>  	if (desc->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)) {
>  		ext4_init_block_bitmap(sb, bh, block_group, desc);
> diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
> index 1343bf1..fe34d74 100644
> --- a/fs/ext4/ialloc.c
> +++ b/fs/ext4/ialloc.c
> @@ -115,9 +115,11 @@ ext4_read_inode_bitmap(struct super_block *sb, ext4_group_t block_group)
>  			    block_group, bitmap_blk);
>  		return NULL;
>  	}
> -	if (bh_uptodate_or_lock(bh))
> +	if (buffer_uptodate(bh) &&
> +	    !(desc->bg_flags & cpu_to_le16(EXT4_BG_INODE_UNINIT)))
>  		return bh;
>  
> +	lock_buffer(bh);
>  	spin_lock(sb_bgl_lock(EXT4_SB(sb), block_group));
>  	if (desc->bg_flags & cpu_to_le16(EXT4_BG_INODE_UNINIT)) {
>  		ext4_init_inode_bitmap(sb, bh, block_group, desc);
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 335faee..b580714 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -782,9 +782,11 @@ static int ext4_mb_init_cache(struct page *page, char *incore)
>  		if (bh[i] == NULL)
>  			goto out;
>  
> -		if (bh_uptodate_or_lock(bh[i]))
> +		if (buffer_uptodate(bh[i]) &&
> +		    !(desc->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)))
>  			continue;
>  
> +		lock_buffer(bh[i]);
>  		spin_lock(sb_bgl_lock(EXT4_SB(sb), first_group + i));
>  		if (desc->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)) {
>  			ext4_init_block_bitmap(sb, bh[i],

Patch

diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
index 59566c0..bd2ece2 100644
--- a/fs/ext4/balloc.c
+++ b/fs/ext4/balloc.c
@@ -319,9 +319,11 @@  ext4_read_block_bitmap(struct super_block *sb, ext4_group_t block_group)
 			    block_group, bitmap_blk);
 		return NULL;
 	}
-	if (bh_uptodate_or_lock(bh))
+	if (buffer_uptodate(bh) &&
+	    !(desc->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)))
 		return bh;
 
+	lock_buffer(bh);
 	spin_lock(sb_bgl_lock(EXT4_SB(sb), block_group));
 	if (desc->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)) {
 		ext4_init_block_bitmap(sb, bh, block_group, desc);
diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index 1343bf1..fe34d74 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -115,9 +115,11 @@  ext4_read_inode_bitmap(struct super_block *sb, ext4_group_t block_group)
 			    block_group, bitmap_blk);
 		return NULL;
 	}
-	if (bh_uptodate_or_lock(bh))
+	if (buffer_uptodate(bh) &&
+	    !(desc->bg_flags & cpu_to_le16(EXT4_BG_INODE_UNINIT)))
 		return bh;
 
+	lock_buffer(bh);
 	spin_lock(sb_bgl_lock(EXT4_SB(sb), block_group));
 	if (desc->bg_flags & cpu_to_le16(EXT4_BG_INODE_UNINIT)) {
 		ext4_init_inode_bitmap(sb, bh, block_group, desc);
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 335faee..b580714 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -782,9 +782,11 @@  static int ext4_mb_init_cache(struct page *page, char *incore)
 		if (bh[i] == NULL)
 			goto out;
 
-		if (bh_uptodate_or_lock(bh[i]))
+		if (buffer_uptodate(bh[i]) &&
+		    !(desc->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)))
 			continue;
 
+		lock_buffer(bh[i]);
 		spin_lock(sb_bgl_lock(EXT4_SB(sb), first_group + i));
 		if (desc->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)) {
 			ext4_init_block_bitmap(sb, bh[i],