Patchwork Free allocated and pre-allocated blocks when check_eofblocks_fl fails

login
register
mail settings
Submitter Jiaying Zhang
Date June 21, 2011, 3:28 a.m.
Message ID <20110621032816.6A0A942247@ruihe.smo.corp.google.com>
Download mbox | patch
Permalink /patch/101234/
State Superseded
Headers show

Comments

Jiaying Zhang - June 21, 2011, 3:28 a.m.
We have hit the same BUG_ON as described in
https://bugzilla.kernel.org/show_bug.cgi?id=31222
on some of our servers that have disk failures or corrupted inodes. After
looking at the code, I think the problem is that we are not freeing inode's
preallocation list when check_eofblocks_fl fails in ext4_ext_map_blocks(),
which leaves the inode's preallocation list in an inconsistent state.

Below is a proposed patch to fix the bug. I have tested it by manually
inserting a random failure in check_eofblocks_fl() and run a test that
creates and uses an inode's preallocated blocks. Without the fix, the kernel
crashes after a few runs. With the fix, no crash is observed.
    
ext4: free allocated and pre-allocated blocks when check_eofblocks_fl fails

Upon corrupted inode or disk failures, we may fail after we already allocate
some blocks from the inode or take some blocks from the inode's preallocation
list, but before we successfully insert the corresponding extent to the extent
tree. In this case, we should free any allocated blocks and discard the inode's
preallocated blocks because the entries in the inode's preallocation list may
be in an inconsistent state.

Signed-off-by: Jiaying Zhang <jiayingz@google.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lukas Czerner - June 21, 2011, 3:24 p.m.
On Mon, 20 Jun 2011, Jiaying Zhang wrote:

> We have hit the same BUG_ON as described in
> https://bugzilla.kernel.org/show_bug.cgi?id=31222
> on some of our servers that have disk failures or corrupted inodes. After
> looking at the code, I think the problem is that we are not freeing inode's
> preallocation list when check_eofblocks_fl fails in ext4_ext_map_blocks(),
> which leaves the inode's preallocation list in an inconsistent state.
> 
> Below is a proposed patch to fix the bug. I have tested it by manually
> inserting a random failure in check_eofblocks_fl() and run a test that
> creates and uses an inode's preallocated blocks. Without the fix, the kernel
> crashes after a few runs. With the fix, no crash is observed.
>     
> ext4: free allocated and pre-allocated blocks when check_eofblocks_fl fails
> 
> Upon corrupted inode or disk failures, we may fail after we already allocate
> some blocks from the inode or take some blocks from the inode's preallocation
> list, but before we successfully insert the corresponding extent to the extent
> tree. In this case, we should free any allocated blocks and discard the inode's
> preallocated blocks because the entries in the inode's preallocation list may
> be in an inconsistent state.
> 
> Signed-off-by: Jiaying Zhang <jiayingz@google.com>

Hi,

the change looks good to me (with one note), but the patch itself is not very
well formated, and Ted would have to copy it out. Please use this:

ext4: free allocated and pre-allocated blocks when check_eofblocks_fl fails

for Subject line and put any comments which should not appear in the
commit description after signed-off-by line starting with "--- " line.
Or just use git format-patch.

> 
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index 5199bac..8cf6ec9 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -3596,10 +3596,8 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
>  	}
>  
>  	err = check_eofblocks_fl(handle, inode, map->m_lblk, path, ar.len);
> -	if (err)
> -		goto out2;
> -
> -	err = ext4_ext_insert_extent(handle, inode, path, &newex, flags);
> +	if (!err)
> +		err = ext4_ext_insert_extent(handle, inode, path, &newex, flags);
>  	if (err) {

why not to use:
	else {


>  		/* free data blocks we just allocated */
>  		/* not a good idea to call discard here directly,
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

Thanks!
-Lukas
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jiaying Zhang - June 21, 2011, 7:40 p.m.
On Tue, Jun 21, 2011 at 8:24 AM, Lukas Czerner <lczerner@redhat.com> wrote:
>
> On Mon, 20 Jun 2011, Jiaying Zhang wrote:
>
> > We have hit the same BUG_ON as described in
> > https://bugzilla.kernel.org/show_bug.cgi?id=31222
> > on some of our servers that have disk failures or corrupted inodes. After
> > looking at the code, I think the problem is that we are not freeing inode's
> > preallocation list when check_eofblocks_fl fails in ext4_ext_map_blocks(),
> > which leaves the inode's preallocation list in an inconsistent state.
> >
> > Below is a proposed patch to fix the bug. I have tested it by manually
> > inserting a random failure in check_eofblocks_fl() and run a test that
> > creates and uses an inode's preallocated blocks. Without the fix, the kernel
> > crashes after a few runs. With the fix, no crash is observed.
> >
> > ext4: free allocated and pre-allocated blocks when check_eofblocks_fl fails
> >
> > Upon corrupted inode or disk failures, we may fail after we already allocate
> > some blocks from the inode or take some blocks from the inode's preallocation
> > list, but before we successfully insert the corresponding extent to the extent
> > tree. In this case, we should free any allocated blocks and discard the inode's
> > preallocated blocks because the entries in the inode's preallocation list may
> > be in an inconsistent state.
> >
> > Signed-off-by: Jiaying Zhang <jiayingz@google.com>
>
> Hi,
>
> the change looks good to me (with one note), but the patch itself is not very
> well formated, and Ted would have to copy it out. Please use this:
>
> ext4: free allocated and pre-allocated blocks when check_eofblocks_fl fails
>
> for Subject line and put any comments which should not appear in the
> commit description after signed-off-by line starting with "--- " line.
> Or just use git format-patch.
Thanks for the suggestion. I will re-send the patch with correct format.

>
> >
> > diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> > index 5199bac..8cf6ec9 100644
> > --- a/fs/ext4/extents.c
> > +++ b/fs/ext4/extents.c
> > @@ -3596,10 +3596,8 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
> >       }
> >
> >       err = check_eofblocks_fl(handle, inode, map->m_lblk, path, ar.len);
> > -     if (err)
> > -             goto out2;
> > -
> > -     err = ext4_ext_insert_extent(handle, inode, path, &newex, flags);
> > +     if (!err)
> > +             err = ext4_ext_insert_extent(handle, inode, path, &newex, flags);
> >       if (err) {
>
> why not to use:
>        else {
>
It is not the same. If there is an error from ext4_ext_insert_extent(),
we also want to free allocated blocks.

Jiaying
>
> >               /* free data blocks we just allocated */
> >               /* not a good idea to call discard here directly,
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >
>
> Thanks!
> -Lukas
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lukas Czerner - June 22, 2011, 8:28 a.m.
On Tue, 21 Jun 2011, Jiaying Zhang wrote:

> > >
> > > diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> > > index 5199bac..8cf6ec9 100644
> > > --- a/fs/ext4/extents.c
> > > +++ b/fs/ext4/extents.c
> > > @@ -3596,10 +3596,8 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
> > >       }
> > >
> > >       err = check_eofblocks_fl(handle, inode, map->m_lblk, path, ar.len);
> > > -     if (err)
> > > -             goto out2;
> > > -
> > > -     err = ext4_ext_insert_extent(handle, inode, path, &newex, flags);
> > > +     if (!err)
> > > +             err = ext4_ext_insert_extent(handle, inode, path, &newex, flags);
> > >       if (err) {
> >
> > why not to use:
> >        else {
> >
> It is not the same. If there is an error from ext4_ext_insert_extent(),
> we also want to free allocated blocks.

Oh, of course I have missed the assignment, sorry.

Thanks!
-Lukas

> 
> Jiaying
> >
> > >               /* free data blocks we just allocated */
> > >               /* not a good idea to call discard here directly,
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > >
> >
> > Thanks!
> > -Lukas
>

Patch

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 5199bac..8cf6ec9 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3596,10 +3596,8 @@  int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
 	}
 
 	err = check_eofblocks_fl(handle, inode, map->m_lblk, path, ar.len);
-	if (err)
-		goto out2;
-
-	err = ext4_ext_insert_extent(handle, inode, path, &newex, flags);
+	if (!err)
+		err = ext4_ext_insert_extent(handle, inode, path, &newex, flags);
 	if (err) {
 		/* free data blocks we just allocated */
 		/* not a good idea to call discard here directly,