Message ID | 20181225053326.7012-1-tytso@mit.edu |
---|---|
State | New |
Series | ext4: make sure enough credits are reserved for dioread_nolock writes |
On Mon, Dec 24, 2018 at 9:36 PM Theodore Ts'o <tytso@mit.edu> wrote:
>
> There are enough credits reserved for most dioread_nolock writes;
> however, if the extent tree is sufficiently deep, and/or quota is
> enabled, the code was not allowing for all eventualities when
> reserving journal credits for the unwritten extent conversion.
>
> This problem can be seen using xfstests ext4/034:
>
>    WARNING: CPU: 1 PID: 257 at fs/ext4/ext4_jbd2.c:271 __ext4_handle_dirty_metadata+0x10c/0x180
>    Workqueue: ext4-rsv-conversion ext4_end_io_rsv_work
>    RIP: 0010:__ext4_handle_dirty_metadata+0x10c/0x180
>    ...
>    EXT4-fs: ext4_free_blocks:4938: aborting transaction: error 28 in __ext4_handle_dirty_metadata
>    EXT4: jbd2_journal_dirty_metadata failed: handle type 11 started at line 4921, credits 4/0, errcode -28
>    EXT4-fs error (device dm-1) in ext4_free_blocks:4950: error 28
>

I had a patch[1] to address the problem but with adding reservation
only when we need to (in ext4_ext_try_to_merge_up).

I'm fine with either way.

Reviewed-by: Liu Bo <bo.liu@linux.alibaba.com>

BTW, the ext4/034 may need to be updated with the correct patch
title/commit.

thanks,
liubo

[1]: https://patchwork.ozlabs.org/patch/991794/

> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
> Cc: stable@kernel.org
> ---
>  fs/ext4/inode.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 9affabd07682..165ff331d998 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -2778,7 +2778,8 @@ static int ext4_writepages(struct address_space *mapping,
>  		 * We may need to convert up to one extent per block in
>  		 * the page and we may dirty the inode.
>  		 */
> -		rsv_blocks = 1 + (PAGE_SIZE >> inode->i_blkbits);
> +		rsv_blocks = 1 + ext4_chunk_trans_blocks(inode,
> +						PAGE_SIZE >> inode->i_blkbits);
>  	}
>
>  	/*
> --
> 2.19.1
>
On Tue, Dec 25, 2018 at 04:10:25PM -0800, Liu Bo wrote:
>
> I had a patch[1] to address the problem but with adding reservation
> only when we need to (in ext4_ext_try_to_merge_up).

Sorry, I had forgotten that you had that patch pending.  I believe
that merging up isn't the only case we need to worry about, though.
For example, if we write into the middle of a larger unwritten region,
we might have to change a single region into three regions, which
might require a node split --- and in that case, it's not optional, as
it would be in the try_to_merge_up situation.  If we don't have enough
credits to do a node split, our only choices would be to mark the file
system as corrupted, or to fail the write with ENOSPC (which would be
very confusing to the user/application).

Cheers,

					- Ted
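
The "one region into three" case described above can be sketched in a few lines of C. This is only a toy model and not ext4 code: `struct toy_extent` and `toy_convert_range()` are hypothetical names invented for illustration, and the sketch ignores the on-disk extent tree entirely. It shows only why a write into the middle of an unwritten extent leaves three extents where there was one, which is what can force a node split when the leaf is already full.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified extent: a run of logical blocks in one state. */
struct toy_extent {
	unsigned int start;	/* first logical block */
	unsigned int len;	/* number of blocks */
	bool unwritten;		/* true until the data has been written */
};

/*
 * Mark blocks [wstart, wstart + wlen) of the unwritten extent ex as
 * written.  The resulting extents are stored in out[] (room for three)
 * and the number of extents now covering the original range is returned.
 */
static size_t toy_convert_range(struct toy_extent ex, unsigned int wstart,
				unsigned int wlen, struct toy_extent out[3])
{
	unsigned int ex_end = ex.start + ex.len;
	unsigned int wend = wstart + wlen;
	size_t n = 0;

	assert(ex.unwritten && wlen > 0);
	assert(wstart >= ex.start && wend <= ex_end);

	/* Unwritten head, if the write does not start at the extent start. */
	if (wstart > ex.start)
		out[n++] = (struct toy_extent){ ex.start, wstart - ex.start, true };

	/* The converted (now written) middle part. */
	out[n++] = (struct toy_extent){ wstart, wlen, false };

	/* Unwritten tail, if the write stops before the extent end. */
	if (wend < ex_end)
		out[n++] = (struct toy_extent){ wend, ex_end - wend, true };

	return n;
}
```

For a 100-block unwritten extent starting at block 0, converting blocks 40..49 yields three extents (0..39 unwritten, 40..49 written, 50..99 unwritten), while converting the whole range or a prefix yields one or two. The two extra entries in the middle case are the ones that need room in the extent tree.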
On Tue, Dec 25, 2018 at 8:07 PM Theodore Y. Ts'o <tytso@mit.edu> wrote:
>
> On Tue, Dec 25, 2018 at 04:10:25PM -0800, Liu Bo wrote:
> >
> > I had a patch[1] to address the problem but with adding reservation
> > only when we need to (in ext4_ext_try_to_merge_up).
>
> Sorry, I had forgotten that you had that patch pending.  I believe
> that merging up isn't the only case we need to worry about, though.
> For example, if we write into the middle of a larger unwritten region,
> we might have to change a single region into three regions, which
> might require a node split --- and in that case, it's not optional, as
> would be in the try_to_merge_up situation.  If we don't have enough
> credits to do a node split our only choices would be mark the file
> system as corrupted, or to fail the write with ENOSPC (which would be
> very confusing to the user/application).
>

I see it now, thanks for the explanation.

thanks,
liubo

> Cheers,
>
>                                         - Ted
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 9affabd07682..165ff331d998 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2778,7 +2778,8 @@ static int ext4_writepages(struct address_space *mapping,
 		 * We may need to convert up to one extent per block in
 		 * the page and we may dirty the inode.
 		 */
-		rsv_blocks = 1 + (PAGE_SIZE >> inode->i_blkbits);
+		rsv_blocks = 1 + ext4_chunk_trans_blocks(inode,
+						PAGE_SIZE >> inode->i_blkbits);
 	}
 
 	/*
There are enough credits reserved for most dioread_nolock writes;
however, if the extent tree is sufficiently deep, and/or quota is
enabled, the code was not allowing for all eventualities when
reserving journal credits for the unwritten extent conversion.

This problem can be seen using xfstests ext4/034:

   WARNING: CPU: 1 PID: 257 at fs/ext4/ext4_jbd2.c:271 __ext4_handle_dirty_metadata+0x10c/0x180
   Workqueue: ext4-rsv-conversion ext4_end_io_rsv_work
   RIP: 0010:__ext4_handle_dirty_metadata+0x10c/0x180
   ...
   EXT4-fs: ext4_free_blocks:4938: aborting transaction: error 28 in __ext4_handle_dirty_metadata
   EXT4: jbd2_journal_dirty_metadata failed: handle type 11 started at line 4921, credits 4/0, errcode -28
   EXT4-fs error (device dm-1) in ext4_free_blocks:4950: error 28

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
---
 fs/ext4/inode.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
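
To make the arithmetic of the removed line concrete, here is a minimal user-space sketch of the old reservation, `1 + (PAGE_SIZE >> inode->i_blkbits)`. The names `TOY_PAGE_SIZE` and `old_rsv_blocks()` are hypothetical, and the 4 KiB page size is an assumption. The sketch does not attempt to model `ext4_chunk_trans_blocks()`; it only shows that the old value depends on the block size alone.

```c
#include <assert.h>

/* Stand-in for the kernel's PAGE_SIZE; 4 KiB is an assumption. */
#define TOY_PAGE_SIZE 4096u

/*
 * The reservation removed by the patch: one credit per block in the
 * page, plus one for the inode.  Nothing here depends on the depth of
 * the extent tree or on whether quota is enabled.
 */
static unsigned int old_rsv_blocks(unsigned int blkbits)
{
	/* The block size must not exceed the assumed page size. */
	assert(blkbits <= 12);
	return 1 + (TOY_PAGE_SIZE >> blkbits);
}
```

Under these assumptions, a file system with 1 KiB blocks (`blkbits` of 10) reserves 5 credits and one with 4 KiB blocks (`blkbits` of 12) reserves 2, however deep the extent tree is. That missing dependence on tree depth and quota is the gap the commit message describes.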