Message ID | 20230412124126.2286716-2-libaokun1@huawei.com |
---|---|
State | Superseded |
Series | ext4: fix WARNING in ext4_da_update_reserve_space |
On Wed 12-04-23 20:41:19, Baokun Li wrote:
> In our fault injection test, we create an ext4 file, migrate it to
> non-extent based file, then punch a hole and finally trigger a WARN_ON
> in the ext4_da_update_reserve_space():
>
> EXT4-fs warning (device sda): ext4_da_update_reserve_space:369:
> 	ino 14, used 11 with only 10 reserved data blocks
>
> When writing back a non-extent based file, if we enable delalloc, the
> number of reserved blocks will be subtracted from the number of blocks
> mapped by ext4_ind_map_blocks(), and the extent status tree will be
> updated. We update the extent status tree by first removing the old
> extent_status and then inserting the new extent_status. If the block range
> we remove happens to be in an extent, then we need to allocate another
> extent_status with ext4_es_alloc_extent().
>
>        use old    to remove   to add new
>     |----------|------------|------------|
>               old extent_status
>
> The problem is that the allocation of a new extent_status failed due to a
> fault injection, and __es_shrink() did not get free memory, resulting in
> a return of -ENOMEM. Then do_writepages() retries after receiving -ENOMEM,
> we map to the same extent again, and the number of reserved blocks is again
> subtracted from the number of blocks in that extent. Since the blocks in
> the same extent are subtracted twice, we end up triggering WARN_ON at
> ext4_da_update_reserve_space() because used > ei->i_reserved_data_blocks.
>
> For non-extent based file, we update the number of reserved blocks after
> ext4_ind_map_blocks() is executed, which causes a problem that when we call
> ext4_ind_map_blocks() to create a block, it doesn't always create a block,
> but we always reduce the number of reserved blocks. So we move the logic
> for updating reserved blocks to ext4_ind_map_blocks() to ensure that the
> number of reserved blocks is updated only after we do succeed in allocating
> some new blocks.
>
> Fixes: 5f634d064c70 ("ext4: Fix quota accounting error with fallocate")
> Signed-off-by: Baokun Li <libaokun1@huawei.com>

Looks good to me. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
> V1->V2:
> 	Modify the patch description and add the Fixes tag.
> V2->V3:
> 	Remove the redundant judgment of count.
>
>  fs/ext4/indirect.c |  8 ++++++++
>  fs/ext4/inode.c    | 10 ----------
>  2 files changed, 8 insertions(+), 10 deletions(-)
>
> diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c
> index c68bebe7ff4b..a9f3716119d3 100644
> --- a/fs/ext4/indirect.c
> +++ b/fs/ext4/indirect.c
> @@ -651,6 +651,14 @@ int ext4_ind_map_blocks(handle_t *handle, struct inode *inode,
>
>  	ext4_update_inode_fsync_trans(handle, inode, 1);
>  	count = ar.len;
> +
> +	/*
> +	 * Update reserved blocks/metadata blocks after successful block
> +	 * allocation which had been deferred till now.
> +	 */
> +	if (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE)
> +		ext4_da_update_reserve_space(inode, count, 1);
> +
>  got_it:
>  	map->m_flags |= EXT4_MAP_MAPPED;
>  	map->m_pblk = le32_to_cpu(chain[depth-1].key);
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 97eb728cb958..33ae92f0ddfb 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -659,16 +659,6 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
>  			 */
>  			ext4_clear_inode_state(inode, EXT4_STATE_EXT_MIGRATE);
>  		}
> -
> -		/*
> -		 * Update reserved blocks/metadata blocks after successful
> -		 * block allocation which had been deferred till now. We don't
> -		 * support fallocate for non extent files. So we can update
> -		 * reserve space here.
> -		 */
> -		if ((retval > 0) &&
> -			(flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE))
> -			ext4_da_update_reserve_space(inode, retval, 1);
>  	}
>
>  	if (retval > 0) {
> --
> 2.31.1
>
diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c
index c68bebe7ff4b..a9f3716119d3 100644
--- a/fs/ext4/indirect.c
+++ b/fs/ext4/indirect.c
@@ -651,6 +651,14 @@ int ext4_ind_map_blocks(handle_t *handle, struct inode *inode,
 
 	ext4_update_inode_fsync_trans(handle, inode, 1);
 	count = ar.len;
+
+	/*
+	 * Update reserved blocks/metadata blocks after successful block
+	 * allocation which had been deferred till now.
+	 */
+	if (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE)
+		ext4_da_update_reserve_space(inode, count, 1);
+
 got_it:
 	map->m_flags |= EXT4_MAP_MAPPED;
 	map->m_pblk = le32_to_cpu(chain[depth-1].key);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 97eb728cb958..33ae92f0ddfb 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -659,16 +659,6 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
 			 */
 			ext4_clear_inode_state(inode, EXT4_STATE_EXT_MIGRATE);
 		}
-
-		/*
-		 * Update reserved blocks/metadata blocks after successful
-		 * block allocation which had been deferred till now. We don't
-		 * support fallocate for non extent files. So we can update
-		 * reserve space here.
-		 */
-		if ((retval > 0) &&
-			(flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE))
-			ext4_da_update_reserve_space(inode, retval, 1);
 	}
 
 	if (retval > 0) {
In our fault injection test, we create an ext4 file, migrate it to
non-extent based file, then punch a hole and finally trigger a WARN_ON
in the ext4_da_update_reserve_space():

EXT4-fs warning (device sda): ext4_da_update_reserve_space:369:
	ino 14, used 11 with only 10 reserved data blocks

When writing back a non-extent based file, if we enable delalloc, the
number of reserved blocks will be subtracted from the number of blocks
mapped by ext4_ind_map_blocks(), and the extent status tree will be
updated. We update the extent status tree by first removing the old
extent_status and then inserting the new extent_status. If the block range
we remove happens to be in an extent, then we need to allocate another
extent_status with ext4_es_alloc_extent().

       use old    to remove   to add new
    |----------|------------|------------|
              old extent_status

The problem is that the allocation of a new extent_status failed due to a
fault injection, and __es_shrink() did not get free memory, resulting in
a return of -ENOMEM. Then do_writepages() retries after receiving -ENOMEM,
we map to the same extent again, and the number of reserved blocks is again
subtracted from the number of blocks in that extent. Since the blocks in
the same extent are subtracted twice, we end up triggering WARN_ON at
ext4_da_update_reserve_space() because used > ei->i_reserved_data_blocks.

For non-extent based file, we update the number of reserved blocks after
ext4_ind_map_blocks() is executed, which causes a problem that when we call
ext4_ind_map_blocks() to create a block, it doesn't always create a block,
but we always reduce the number of reserved blocks. So we move the logic
for updating reserved blocks to ext4_ind_map_blocks() to ensure that the
number of reserved blocks is updated only after we do succeed in allocating
some new blocks.

Fixes: 5f634d064c70 ("ext4: Fix quota accounting error with fallocate")
Signed-off-by: Baokun Li <libaokun1@huawei.com>
---
V1->V2:
	Modify the patch description and add the Fixes tag.
V2->V3:
	Remove the redundant judgment of count.

 fs/ext4/indirect.c |  8 ++++++++
 fs/ext4/inode.c    | 10 ----------
 2 files changed, 8 insertions(+), 10 deletions(-)
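
The double subtraction described in the commit message can be modelled in a few lines of userspace C. This is only a sketch and not kernel code: every toy_* name, the struct, and the block counts are hypothetical stand-ins chosen for illustration, and the model assumes that the retry finds the range already mapped and that the injected extent_status allocation failure happens after the reserve update. The "fixed" flag switches between the pre-patch placement (in the caller, whenever the return value is positive) and the patched placement (inside the indirect mapper, only when blocks are newly allocated).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy model, not kernel code: all names and fields here are hypothetical. */
struct toy_inode {
	int reserved;    /* stands in for ei->i_reserved_data_blocks */
	bool mapped;     /* true once the block range has been allocated */
	int underflows;  /* counts events that would trigger the WARN_ON */
};

/* Stand-in for ext4_da_update_reserve_space(): consume 'used' reservations. */
static void toy_update_reserve(struct toy_inode *inode, int used)
{
	if (used > inode->reserved) {
		inode->underflows++;      /* the kernel warns at this point */
		used = inode->reserved;   /* clamp so the counter stays >= 0 */
	}
	inode->reserved -= used;
}

/*
 * Stand-in for ext4_ind_map_blocks(): returns the number of mapped blocks
 * whether they were allocated by this call or were already mapped.
 */
static int toy_ind_map_blocks(struct toy_inode *inode, int len, bool fixed)
{
	if (!inode->mapped) {
		inode->mapped = true;     /* a real allocation happens only once */
		if (fixed)
			toy_update_reserve(inode, len);  /* patched placement */
	}
	return len;
}

/* Stand-in for ext4_map_blocks(), with an optional injected -ENOMEM. */
static int toy_map_blocks(struct toy_inode *inode, int len, bool fixed,
			  bool fail_es_alloc)
{
	int retval = toy_ind_map_blocks(inode, len, fixed);

	if (!fixed && retval > 0)
		toy_update_reserve(inode, retval);   /* pre-patch placement */

	if (fail_es_alloc)
		return -ENOMEM;  /* extent_status allocation failed */
	return retval;
}

/* One failed writeback pass followed by a retry; returns underflow count. */
static int toy_writeback_with_retry(bool fixed)
{
	struct toy_inode inode = { .reserved = 10, .mapped = false, .underflows = 0 };
	int ret = toy_map_blocks(&inode, 10, fixed, true);

	if (ret == -ENOMEM)
		ret = toy_map_blocks(&inode, 10, fixed, false);  /* retry */
	assert(ret == 10);
	return inode.underflows;
}
```

Calling toy_writeback_with_retry(false) from a main() reports one underflow, because the same 10 blocks are subtracted on both passes; toy_writeback_with_retry(true) reports none, because the second pass allocates nothing and therefore leaves the reservation alone.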