ext4: make sure enough credits are reserved for dioread_nolock writes

Message ID 20181225053326.7012-1-tytso@mit.edu
State New

Commit Message

Theodore Ts'o Dec. 25, 2018, 5:33 a.m. UTC
There are enough credits reserved for most dioread_nolock writes;
however, if the extent tree is sufficiently deep, and/or quota is
enabled, the code was not allowing for all eventualities when
reserving journal credits for the unwritten extent conversion.

This problem can be seen using xfstests ext4/034:

   WARNING: CPU: 1 PID: 257 at fs/ext4/ext4_jbd2.c:271 __ext4_handle_dirty_metadata+0x10c/0x180
   Workqueue: ext4-rsv-conversion ext4_end_io_rsv_work
   RIP: 0010:__ext4_handle_dirty_metadata+0x10c/0x180
   	...
   EXT4-fs: ext4_free_blocks:4938: aborting transaction: error 28 in __ext4_handle_dirty_metadata
   EXT4: jbd2_journal_dirty_metadata failed: handle type 11 started at line 4921, credits 4/0, errcode -28
   EXT4-fs error (device dm-1) in ext4_free_blocks:4950: error 28

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
---
 fs/ext4/inode.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
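
For illustration only, the reservation that this patch removes can be written as a standalone function. This is a minimal sketch, not kernel code: `SKETCH_PAGE_SHIFT`, `SKETCH_PAGE_SIZE`, and `old_rsv_blocks()` are hypothetical names, and a 4 KiB page is assumed. It shows that the old value depends only on the block size, so it cannot grow with extent tree depth or with quota being enabled. The replacement, ext4_chunk_trans_blocks(), is not modeled here.

```c
#include <assert.h>

/* Assumption: 4 KiB pages; the kernel's PAGE_SIZE depends on the architecture. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)

/*
 * Hypothetical stand-in for the pre-patch expression in ext4_writepages():
 * one credit for the inode plus one per filesystem block in the page.
 * blkbits plays the role of inode->i_blkbits (log2 of the block size).
 */
static unsigned long old_rsv_blocks(unsigned int blkbits)
{
	assert(blkbits <= SKETCH_PAGE_SHIFT);
	return 1 + (SKETCH_PAGE_SIZE >> blkbits);
}
```

Under these assumptions, a 4 KiB block size (blkbits = 12) gives 2 reserved blocks and a 1 KiB block size (blkbits = 10) gives 5, regardless of how deep the extent tree is.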

Comments

Liu Bo Dec. 26, 2018, 12:10 a.m. UTC | #1
On Mon, Dec 24, 2018 at 9:36 PM Theodore Ts'o <tytso@mit.edu> wrote:
>
> There are enough credits reserved for most dioread_nolock writes;
> however, if the extent tree is sufficiently deep, and/or quota is
> enabled, the code was not allowing for all eventualities when
> reserving journal credits for the unwritten extent conversion.
>
> This problem can be seen using xfstests ext4/034:
>
>    WARNING: CPU: 1 PID: 257 at fs/ext4/ext4_jbd2.c:271 __ext4_handle_dirty_metadata+0x10c/0x180
>    Workqueue: ext4-rsv-conversion ext4_end_io_rsv_work
>    RIP: 0010:__ext4_handle_dirty_metadata+0x10c/0x180
>         ...
>    EXT4-fs: ext4_free_blocks:4938: aborting transaction: error 28 in __ext4_handle_dirty_metadata
>    EXT4: jbd2_journal_dirty_metadata failed: handle type 11 started at line 4921, credits 4/0, errcode -28
>    EXT4-fs error (device dm-1) in ext4_free_blocks:4950: error 28
>

I had a patch[1] to address the problem but with adding reservation
only when we need to (in ext4_ext_try_to_merge_up).

I'm fine with either way.

Reviewed-by: Liu Bo <bo.liu@linux.alibaba.com>

BTW, ext4/034 may need to be updated with the correct patch title/commit.

thanks,
liubo

[1]:
https://patchwork.ozlabs.org/patch/991794/

> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
> Cc: stable@kernel.org
> ---
>  fs/ext4/inode.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 9affabd07682..165ff331d998 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -2778,7 +2778,8 @@ static int ext4_writepages(struct address_space *mapping,
>                  * We may need to convert up to one extent per block in
>                  * the page and we may dirty the inode.
>                  */
> -               rsv_blocks = 1 + (PAGE_SIZE >> inode->i_blkbits);
> +               rsv_blocks = 1 + ext4_chunk_trans_blocks(inode,
> +                                               PAGE_SIZE >> inode->i_blkbits);
>         }
>
>         /*
> --
> 2.19.1
>
Theodore Ts'o Dec. 26, 2018, 4:07 a.m. UTC | #2
On Tue, Dec 25, 2018 at 04:10:25PM -0800, Liu Bo wrote:
> 
> I had a patch[1] to address the problem but with adding reservation
> only when we need to (in ext4_ext_try_to_merge_up).

Sorry, I had forgotten that you had that patch pending.  I believe
that merging up isn't the only case we need to worry about, though.
For example, if we write into the middle of a larger unwritten region,
we might have to change a single region into three regions, which
might require a node split --- and in that case, it's not optional, as
it would be in the try_to_merge_up situation.  If we don't have enough
credits to do a node split our only choices would be to mark the file
system as corrupted, or to fail the write with ENOSPC (which would be
very confusing to the user/application).

Cheers,

					- Ted
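
The "one region into three" case described above can be sketched with a toy model. This is an illustrative sketch only: `struct sketch_extent` and `sketch_split_unwritten()` are hypothetical names and do not correspond to ext4's on-disk extent format or to its splitting code. It only shows why a write into the middle of an unwritten extent can add two extent entries, which is what may force a node split.

```c
#include <assert.h>

/* Hypothetical, simplified extent: a run of logical blocks with one flag. */
struct sketch_extent {
	unsigned long start;	/* first logical block */
	unsigned long len;	/* number of blocks */
	int unwritten;		/* 1 = allocated but not yet written */
};

/*
 * Split the unwritten extent `ext` around the written range
 * [wr_start, wr_start + wr_len). Fills `out` (room for 3 entries)
 * and returns the number of resulting extents: 1, 2, or 3.
 */
static int sketch_split_unwritten(struct sketch_extent ext,
				  unsigned long wr_start, unsigned long wr_len,
				  struct sketch_extent out[3])
{
	unsigned long ext_end = ext.start + ext.len;
	unsigned long wr_end = wr_start + wr_len;
	int n = 0;

	/* The write must be non-empty and lie inside the unwritten extent. */
	assert(ext.unwritten && wr_len > 0);
	assert(wr_start >= ext.start && wr_end <= ext_end);

	if (wr_start > ext.start)	/* leading part stays unwritten */
		out[n++] = (struct sketch_extent){ ext.start, wr_start - ext.start, 1 };
	out[n++] = (struct sketch_extent){ wr_start, wr_len, 0 };	/* converted part */
	if (wr_end < ext_end)		/* trailing part stays unwritten */
		out[n++] = (struct sketch_extent){ wr_end, ext_end - wr_end, 1 };

	return n;
}
```

In this model, writing blocks 40..49 of an unwritten extent covering blocks 0..99 yields three extents (0..39 unwritten, 40..49 written, 50..99 unwritten), while a write that covers the whole extent yields one.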
Liu Bo Jan. 1, 2019, 11:40 p.m. UTC | #3
On Tue, Dec 25, 2018 at 8:07 PM Theodore Y. Ts'o <tytso@mit.edu> wrote:
>
> On Tue, Dec 25, 2018 at 04:10:25PM -0800, Liu Bo wrote:
> >
> > I had a patch[1] to address the problem but with adding reservation
> > only when we need to (in ext4_ext_try_to_merge_up).
>
> Sorry, I had forgotten that you had that patch pending.  I believe
> that merging up isn't the only case we need to worry about, though.
> For example, if we write into the middle of a larger unwritten region,
> we might have to change a single region into three regions, which
> might require a node split --- and in that case, it's not optional, as
> it would be in the try_to_merge_up situation.  If we don't have enough
> credits to do a node split our only choices would be to mark the file
> system as corrupted, or to fail the write with ENOSPC (which would be
> very confusing to the user/application).
>

I see it now, thanks for the explanation.

thanks,
liubo

> Cheers,
>
>                                         - Ted
Patch

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 9affabd07682..165ff331d998 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2778,7 +2778,8 @@  static int ext4_writepages(struct address_space *mapping,
 		 * We may need to convert up to one extent per block in
 		 * the page and we may dirty the inode.
 		 */
-		rsv_blocks = 1 + (PAGE_SIZE >> inode->i_blkbits);
+		rsv_blocks = 1 + ext4_chunk_trans_blocks(inode,
+						PAGE_SIZE >> inode->i_blkbits);
 	}
 
 	/*