Patchwork ext4: don't re-try to remove the entry from es tree when we encounter a ENOMEM in ext4_ext_truncate

Submitter Zheng Liu
Date July 25, 2013, 11:56 a.m.
Message ID <1374753397-26432-1-git-send-email-wenqing.lz@taobao.com>
Permalink /patch/261676/
State Rejected

Comments

Zheng Liu - July 25, 2013, 11:56 a.m.
From: Zheng Liu <wenqing.lz@taobao.com>

ext4_es_remove_extent() returns ENOMEM only if we need to split an entry
and insert a part of it into the es tree.  Since commit e15f742c, that
operation is already retried there, so we don't need to retry it again
in ext4_ext_truncate().

Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
---
 fs/ext4/extents.c |    6 ------
 1 file changed, 6 deletions(-)
Zheng Liu - July 29, 2013, 3:07 p.m.
On Thu, Jul 25, 2013 at 07:56:37PM +0800, Zheng Liu wrote:
> From: Zheng Liu <wenqing.lz@taobao.com>
> 
> ext4_es_remove_extent() returns ENOMEM only if we need to split an entry
> and insert a part of it into the es tree.  Since commit e15f742c, that
> operation is already retried there, so we don't need to retry it again
> in ext4_ext_truncate().
> 
> Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
> Cc: "Theodore Ts'o" <tytso@mit.edu>

Any comment?

Thanks
                                                - Zheng

> ---
>  fs/ext4/extents.c |    6 ------
>  1 file changed, 6 deletions(-)
> 
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index a618738..4404297 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -4409,14 +4409,8 @@ void ext4_ext_truncate(handle_t *handle, struct inode *inode)
>  
>  	last_block = (inode->i_size + sb->s_blocksize - 1)
>  			>> EXT4_BLOCK_SIZE_BITS(sb);
> -retry:
>  	err = ext4_es_remove_extent(inode, last_block,
>  				    EXT_MAX_BLOCKS - last_block);
> -	if (err == ENOMEM) {
> -		cond_resched();
> -		congestion_wait(BLK_RW_ASYNC, HZ/50);
> -		goto retry;
> -	}
>  	if (err) {
>  		ext4_std_error(inode->i_sb, err);
>  		return;
> -- 
> 1.7.9.7
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Theodore Ts'o - July 29, 2013, 3:42 p.m.
On Thu, Jul 25, 2013 at 07:56:37PM +0800, Zheng Liu wrote:
> From: Zheng Liu <wenqing.lz@taobao.com>
> 
> ext4_es_remove_extent() returns ENOMEM only if we need to split an entry
> and insert a part of it into the es tree.  Since commit e15f742c, that
> operation is already retried there, so we don't need to retry it again
> in ext4_ext_truncate().
> 
> Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
> Cc: "Theodore Ts'o" <tytso@mit.edu>

Actually, we still need to do this, since the retry loop in
__es_remove_extent() tries to shrink the extent status tree for the
inode in question, and only retries if we were able to free up some
memory.  (We only do it for the inode we're working on, since we have
it locked already.)  So __es_remove_extent() can still return ENOMEM,
and so callers of ext4_es_insert_extent() and ext4_es_remove_extent()
still need to check for ENOMEM and try to do something sane if
possible.
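
To illustrate the bounded retry described above, here is a minimal
user-space model.  It is not the kernel code: the struct, the helper
names, and the way memory pressure is simulated are all hypothetical,
and the only behaviour taken from the description is "shrink this
inode's tree, and retry only if that freed something".

```c
#include <errno.h>

/* Hypothetical model of one inode's extent status tree under memory pressure. */
struct es_model {
	int cached_entries;	/* reclaimable entries in this inode's tree */
	int mem_pressure;	/* allocations fail while this is > 0 */
	int attempts;		/* number of allocation attempts made */
};

/* Stand-in for the split-and-insert step that needs a fresh allocation. */
static int model_split_and_insert(struct es_model *m)
{
	m->attempts++;
	return m->mem_pressure > 0 ? -ENOMEM : 0;
}

/* Reclaim one entry from this inode's own tree; returns the number freed. */
static int model_shrink_own_tree(struct es_model *m)
{
	if (m->cached_entries == 0)
		return 0;
	m->cached_entries--;
	m->mem_pressure--;
	return 1;
}

/* Low-level removal: retry only while shrinking makes progress. */
static int model_es_remove_extent(struct es_model *m)
{
	for (;;) {
		int err = model_split_and_insert(m);

		if (err != -ENOMEM)
			return err;
		if (model_shrink_own_tree(m) == 0)
			return -ENOMEM;	/* nothing left to free: give up */
	}
}
```

With two reclaimable entries and a pressure of two, the model succeeds
on the third attempt; with one reclaimable entry and a pressure of
three, it still returns -ENOMEM, which is the case the caller has to
handle.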

The problem with truncate is that the VFS assumes truncate() will
always succeed (the method function returns void, so there isn't
even a way to propagate an error code back up to the VFS), so we really
do need to do a retry in ext4's truncate code.

For other code paths, like for example fallocate(), it's completely
fair game for it to return ENOMEM, although we need to make sure that
we've gotten the error handling correct.  

For the writeback paths, where the application which performed the
write may have exited already and we have dirty pages in the page
cache, retrying an ENOMEM after calling congestion_wait() is something
that *does* make sense.

This is why I didn't add an unconditional retry loop to the low-level
extent_status tree code: where we can return ENOMEM, it's better
to do that, since that way applications can start failing fast in OOM
conditions.  Whether or not we want to do that is going to depend on the
higher level code paths.

					- Ted
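
The split between "must succeed" callers and "fail fast" callers
described in the message above can be sketched in user space as
follows.  This is only a model under stated assumptions: the function
names are hypothetical, the failure counter stands in for real memory
pressure, and the backoff is an empty placeholder for whatever the
kernel path would do (the patch under discussion uses cond_resched()
and congestion_wait()).

```c
#include <errno.h>

/* Hypothetical low-level operation: fails with -ENOMEM until the counter hits 0. */
static int model_remove(int *failures_left)
{
	if (*failures_left > 0) {
		(*failures_left)--;
		return -ENOMEM;
	}
	return 0;
}

/* Placeholder for yielding the CPU and waiting before the next attempt. */
static void model_backoff(void)
{
}

/*
 * A caller that cannot report failure: keep retrying on -ENOMEM.
 * The retry count is returned only so the behaviour can be observed.
 */
static int model_truncate(int *failures_left)
{
	int retries = 0;

	while (model_remove(failures_left) == -ENOMEM) {
		model_backoff();
		retries++;
	}
	return retries;
}

/* A caller that can report failure: pass -ENOMEM straight up. */
static int model_fallocate(int *failures_left)
{
	return model_remove(failures_left);
}
```

The retry decision lives in the caller, not in model_remove(), so each
code path can choose between looping and failing fast.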
Zheng Liu - July 29, 2013, 11:50 p.m.
On Mon, Jul 29, 2013 at 11:42:39AM -0400, Theodore Ts'o wrote:
> On Thu, Jul 25, 2013 at 07:56:37PM +0800, Zheng Liu wrote:
> > From: Zheng Liu <wenqing.lz@taobao.com>
> > 
> > ext4_es_remove_extent() returns ENOMEM only if we need to split an entry
> > and insert a part of it into the es tree.  Since commit e15f742c, that
> > operation is already retried there, so we don't need to retry it again
> > in ext4_ext_truncate().
> > 
> > Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
> > Cc: "Theodore Ts'o" <tytso@mit.edu>
> 
> Actually, we still need to do this, since the retry loop in
> __es_remove_extent() tries to shrink the extent status tree for the
> inode in question, and only retries if we were able to free up some
> memory.  (We only do it for the inode we're working on, since we have
> it locked already.)  So __es_remove_extent() can still return ENOMEM,
> and so callers of ext4_es_insert_extent() and ext4_es_remove_extent()
> still need to check for ENOMEM and try to do something sane if
> possible.
> 
> The problem with truncate is that the VFS assumes truncate() will
> always succeed (the method function returns void, so there isn't
> even a way to propagate an error code back up to the VFS), so we really
> do need to do a retry in ext4's truncate code.
> 
> For other code paths, like for example fallocate(), it's completely
> fair game for it to return ENOMEM, although we need to make sure that
> we've gotten the error handling correct.  
> 
> For the writeback paths, where the application which performed the
> write may have exited already and we have dirty pages in the page
> cache, retrying an ENOMEM after calling congestion_wait() is something
> that *does* make sense.
> 
> This is why I didn't add an unconditional retry loop to the low-level
> extent_status tree code: where we can return ENOMEM, it's better
> to do that, since that way applications can start failing fast in OOM
> conditions.  Whether or not we want to do that is going to depend on the
> higher level code paths.

Got it, thanks for your explanation.

                                                - Zheng
Christoph Hellwig - Aug. 1, 2013, 8:45 a.m.
On Mon, Jul 29, 2013 at 11:42:39AM -0400, Theodore Ts'o wrote:
> The problem with truncate is that the VFS assumes truncate() will
> always succeed (the method function returns void, so there isn't
> even a way to propagate an error code back up to the VFS), so we really
> do need to do a retry in ext4's truncate code.

It hasn't for a long time.  The ill-suited truncate method has been gone
for a long time, and setattr can return errors just fine.


Patch

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index a618738..4404297 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4409,14 +4409,8 @@  void ext4_ext_truncate(handle_t *handle, struct inode *inode)
 
 	last_block = (inode->i_size + sb->s_blocksize - 1)
 			>> EXT4_BLOCK_SIZE_BITS(sb);
-retry:
 	err = ext4_es_remove_extent(inode, last_block,
 				    EXT_MAX_BLOCKS - last_block);
-	if (err == ENOMEM) {
-		cond_resched();
-		congestion_wait(BLK_RW_ASYNC, HZ/50);
-		goto retry;
-	}
 	if (err) {
 		ext4_std_error(inode->i_sb, err);
 		return;
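
A note on the removed check, with a small user-space sketch.  The
deleted hunk compares err against the positive constant ENOMEM.
Assuming ext4_es_remove_extent() follows the usual kernel convention of
returning negative errno values (an assumption here, not something the
diff itself shows), such a comparison would never match.  The helper
names below are hypothetical and only demonstrate the sign difference.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical callee that reports failure the way kernel helpers usually do. */
static int model_failing_call(void)
{
	return -ENOMEM;
}

/* The comparison as written in the removed hunk. */
static bool matches_positive(int err)
{
	return err == ENOMEM;
}

/* The comparison that matches a negative-errno return value. */
static bool matches_negative(int err)
{
	return err == -ENOMEM;
}
```

Under that assumption, only matches_negative() is true for the value
returned by model_failing_call().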