diff mbox series

[v2,3/9] ext4: trim delalloc extent

Message ID 20240410034203.2188357-4-yi.zhang@huaweicloud.com
State New
Headers show
Series ext4: support adding multi-delalloc blocks | expand

Commit Message

Zhang Yi April 10, 2024, 3:41 a.m. UTC
From: Zhang Yi <yi.zhang@huawei.com>

The cached delalloc or hole extent should be trimed to the map->map_len
if we map delalloc blocks in ext4_da_map_blocks(). But it doesn't
trigger any issue now because the map->m_len is always set to one and we
always insert one delayed block once a time. Fix this by trim the extent
once we get one from the cached extent tree, prearing for mapping a
extent with multiple delalloc blocks.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
---
 fs/ext4/inode.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

Comments

Jan Kara April 25, 2024, 3:56 p.m. UTC | #1
On Wed 10-04-24 11:41:57, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@huawei.com>
> 
> The cached delalloc or hole extent should be trimed to the map->map_len
> if we map delalloc blocks in ext4_da_map_blocks(). But it doesn't
> trigger any issue now because the map->m_len is always set to one and we
> always insert one delayed block once a time. Fix this by trim the extent
> once we get one from the cached extent tree, prearing for mapping a
> extent with multiple delalloc blocks.
> 
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>

Well, but we already do the trimming in ext4_da_map_blocks(), don't we? You
just move it to a different place... Or do you mean that we actually didn't
set 'map' at all in some cases and now we do? In either case the 'map'
handling looks a bit sloppy in ext4_da_map_blocks() as e.g. the
'add_delayed' case doesn't seem to bother with properly setting 'map' based
on what it does. So maybe we should clean that up to always set 'map' just
before returning at the same place where we update the 'bh'? And maybe bh
update could be updated in some common helper because it's content is
determined by the 'map' content?

								Honza

> ---
>  fs/ext4/inode.c | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 118b0497a954..e4043ddb07a5 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -1734,6 +1734,11 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
>  
>  	/* Lookup extent status tree firstly */
>  	if (ext4_es_lookup_extent(inode, iblock, NULL, &es)) {
> +		retval = es.es_len - (iblock - es.es_lblk);
> +		if (retval > map->m_len)
> +			retval = map->m_len;
> +		map->m_len = retval;
> +
>  		if (ext4_es_is_hole(&es))
>  			goto add_delayed;
>  
> @@ -1750,10 +1755,6 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
>  		}
>  
>  		map->m_pblk = ext4_es_pblock(&es) + iblock - es.es_lblk;
> -		retval = es.es_len - (iblock - es.es_lblk);
> -		if (retval > map->m_len)
> -			retval = map->m_len;
> -		map->m_len = retval;
>  		if (ext4_es_is_written(&es))
>  			map->m_flags |= EXT4_MAP_MAPPED;
>  		else if (ext4_es_is_unwritten(&es))
> @@ -1788,6 +1789,11 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
>  	 * whitout holding i_rwsem and folio lock.
>  	 */
>  	if (ext4_es_lookup_extent(inode, iblock, NULL, &es)) {
> +		retval = es.es_len - (iblock - es.es_lblk);
> +		if (retval > map->m_len)
> +			retval = map->m_len;
> +		map->m_len = retval;
> +
>  		if (!ext4_es_is_hole(&es)) {
>  			up_write(&EXT4_I(inode)->i_data_sem);
>  			goto found;
> -- 
> 2.39.2
>
Zhang Yi April 26, 2024, 9:38 a.m. UTC | #2
On 2024/4/25 23:56, Jan Kara wrote:
> On Wed 10-04-24 11:41:57, Zhang Yi wrote:
>> From: Zhang Yi <yi.zhang@huawei.com>
>>
>> The cached delalloc or hole extent should be trimed to the map->map_len
>> if we map delalloc blocks in ext4_da_map_blocks(). But it doesn't
>> trigger any issue now because the map->m_len is always set to one and we
>> always insert one delayed block once a time. Fix this by trim the extent
>> once we get one from the cached extent tree, prearing for mapping a
>> extent with multiple delalloc blocks.
>>
>> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
> 
> Well, but we already do the trimming in ext4_da_map_blocks(), don't we? You
> just move it to a different place... Or do you mean that we actually didn't
> set 'map' at all in some cases and now we do? 

Yeah, now we only trim map len if we found an unwritten extent or written
extent in the cache, this isn't okay if we found a hole and
ext4_insert_delayed_block() and ext4_da_map_blocks() support inserting
map->len blocks. If we found a hole which es->es_len is shorter than the
length we want to write, we could delay more blocks than we expected.

Please assume we write data [A, C) to a file that contains a hole extent
[A, B) and a written extent [B, D) in cache.

                      A     B  C  D
before da write:   ...hhhhhh|wwwwww....

Then we will get extent [A, B), we should trim map->m_len to B-A before
inserting new delalloc blocks, if not, the range [B, C) is duplicated.

> In either case the 'map'
> handling looks a bit sloppy in ext4_da_map_blocks() as e.g. the
> 'add_delayed' case doesn't seem to bother with properly setting 'map' based
> on what it does. So maybe we should clean that up to always set 'map' just
> before returning at the same place where we update the 'bh'? And maybe bh
> update could be updated in some common helper because it's content is
> determined by the 'map' content?
> 

I agree with you, it looks that we should always revise the map->m_len
once we found an extent from the cache, and then do corresponding handling
according to the extent type. so it's hard to put it to a common place.
But we can merge the handling of written and unwritten extent, I've moved
the bh updating into ext4_da_get_block_prep() and do some cleanup in
patch 9, please look at that patch, does it looks fine to you?

Thanks,
Yi.

> 								Honza
> 
>> ---
>>  fs/ext4/inode.c | 14 ++++++++++----
>>  1 file changed, 10 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>> index 118b0497a954..e4043ddb07a5 100644
>> --- a/fs/ext4/inode.c
>> +++ b/fs/ext4/inode.c
>> @@ -1734,6 +1734,11 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
>>  
>>  	/* Lookup extent status tree firstly */
>>  	if (ext4_es_lookup_extent(inode, iblock, NULL, &es)) {
>> +		retval = es.es_len - (iblock - es.es_lblk);
>> +		if (retval > map->m_len)
>> +			retval = map->m_len;
>> +		map->m_len = retval;
>> +
>>  		if (ext4_es_is_hole(&es))
>>  			goto add_delayed;
>>  
>> @@ -1750,10 +1755,6 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
>>  		}
>>  
>>  		map->m_pblk = ext4_es_pblock(&es) + iblock - es.es_lblk;
>> -		retval = es.es_len - (iblock - es.es_lblk);
>> -		if (retval > map->m_len)
>> -			retval = map->m_len;
>> -		map->m_len = retval;
>>  		if (ext4_es_is_written(&es))
>>  			map->m_flags |= EXT4_MAP_MAPPED;
>>  		else if (ext4_es_is_unwritten(&es))
>> @@ -1788,6 +1789,11 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
>>  	 * whitout holding i_rwsem and folio lock.
>>  	 */
>>  	if (ext4_es_lookup_extent(inode, iblock, NULL, &es)) {
>> +		retval = es.es_len - (iblock - es.es_lblk);
>> +		if (retval > map->m_len)
>> +			retval = map->m_len;
>> +		map->m_len = retval;
>> +
>>  		if (!ext4_es_is_hole(&es)) {
>>  			up_write(&EXT4_I(inode)->i_data_sem);
>>  			goto found;
>> -- 
>> 2.39.2
>>
Jan Kara April 29, 2024, 10:25 a.m. UTC | #3
On Fri 26-04-24 17:38:23, Zhang Yi wrote:
> On 2024/4/25 23:56, Jan Kara wrote:
> > On Wed 10-04-24 11:41:57, Zhang Yi wrote:
> >> From: Zhang Yi <yi.zhang@huawei.com>
> >>
> >> The cached delalloc or hole extent should be trimed to the map->map_len
> >> if we map delalloc blocks in ext4_da_map_blocks(). But it doesn't
> >> trigger any issue now because the map->m_len is always set to one and we
> >> always insert one delayed block once a time. Fix this by trim the extent
> >> once we get one from the cached extent tree, prearing for mapping a
> >> extent with multiple delalloc blocks.
> >>
> >> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
> > 
> > Well, but we already do the trimming in ext4_da_map_blocks(), don't we? You
> > just move it to a different place... Or do you mean that we actually didn't
> > set 'map' at all in some cases and now we do? 
> 
> Yeah, now we only trim map len if we found an unwritten extent or written
> extent in the cache, this isn't okay if we found a hole and
> ext4_insert_delayed_block() and ext4_da_map_blocks() support inserting
> map->len blocks. If we found a hole which es->es_len is shorter than the
> length we want to write, we could delay more blocks than we expected.
> 
> Please assume we write data [A, C) to a file that contains a hole extent
> [A, B) and a written extent [B, D) in cache.
> 
>                       A     B  C  D
> before da write:   ...hhhhhh|wwwwww....
> 
> Then we will get extent [A, B), we should trim map->m_len to B-A before
> inserting new delalloc blocks, if not, the range [B, C) is duplicated.

Thanks for explanation!

> > In either case the 'map'
> > handling looks a bit sloppy in ext4_da_map_blocks() as e.g. the
> > 'add_delayed' case doesn't seem to bother with properly setting 'map' based
> > on what it does. So maybe we should clean that up to always set 'map' just
> > before returning at the same place where we update the 'bh'? And maybe bh
> > update could be updated in some common helper because it's content is
> > determined by the 'map' content?
> > 
> 
> I agree with you, it looks that we should always revise the map->m_len
> once we found an extent from the cache, and then do corresponding handling
> according to the extent type. so it's hard to put it to a common place.
> But we can merge the handling of written and unwritten extent, I've moved
> the bh updating into ext4_da_get_block_prep() and do some cleanup in
> patch 9, please look at that patch, does it looks fine to you?

Oh, yes, what patch 9 does improve things significantly and it addresses my
concern. So feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

Maybe in the changelog you can just mention that the remaining cases not
setting map->m_len will be handled in patch 9.

								Honza
Zhang Yi April 29, 2024, 11:28 a.m. UTC | #4
On 2024/4/29 18:25, Jan Kara wrote:
> On Fri 26-04-24 17:38:23, Zhang Yi wrote:
>> On 2024/4/25 23:56, Jan Kara wrote:
>>> On Wed 10-04-24 11:41:57, Zhang Yi wrote:
>>>> From: Zhang Yi <yi.zhang@huawei.com>
>>>>
>>>> The cached delalloc or hole extent should be trimed to the map->map_len
>>>> if we map delalloc blocks in ext4_da_map_blocks(). But it doesn't
>>>> trigger any issue now because the map->m_len is always set to one and we
>>>> always insert one delayed block once a time. Fix this by trim the extent
>>>> once we get one from the cached extent tree, prearing for mapping a
>>>> extent with multiple delalloc blocks.
>>>>
>>>> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
>>>
>>> Well, but we already do the trimming in ext4_da_map_blocks(), don't we? You
>>> just move it to a different place... Or do you mean that we actually didn't
>>> set 'map' at all in some cases and now we do? 
>>
>> Yeah, now we only trim map len if we found an unwritten extent or written
>> extent in the cache, this isn't okay if we found a hole and
>> ext4_insert_delayed_block() and ext4_da_map_blocks() support inserting
>> map->len blocks. If we found a hole which es->es_len is shorter than the
>> length we want to write, we could delay more blocks than we expected.
>>
>> Please assume we write data [A, C) to a file that contains a hole extent
>> [A, B) and a written extent [B, D) in cache.
>>
>>                       A     B  C  D
>> before da write:   ...hhhhhh|wwwwww....
>>
>> Then we will get extent [A, B), we should trim map->m_len to B-A before
>> inserting new delalloc blocks, if not, the range [B, C) is duplicated.
> 
> Thanks for explanation!
> 

Current change log is not clear enough now, I will put this explanation
into the changelog in my nextn iteratio, make it easier to understand.

>>> In either case the 'map'
>>> handling looks a bit sloppy in ext4_da_map_blocks() as e.g. the
>>> 'add_delayed' case doesn't seem to bother with properly setting 'map' based
>>> on what it does. So maybe we should clean that up to always set 'map' just
>>> before returning at the same place where we update the 'bh'? And maybe bh
>>> update could be updated in some common helper because it's content is
>>> determined by the 'map' content?
>>>
>>
>> I agree with you, it looks that we should always revise the map->m_len
>> once we found an extent from the cache, and then do corresponding handling
>> according to the extent type. so it's hard to put it to a common place.
>> But we can merge the handling of written and unwritten extent, I've moved
>> the bh updating into ext4_da_get_block_prep() and do some cleanup in
>> patch 9, please look at that patch, does it looks fine to you?
> 
> Oh, yes, what patch 9 does improve things significantly and it addresses my
> concern. So feel free to add:
> 
> Reviewed-by: Jan Kara <jack@suse.cz>
> 
> Maybe in the changelog you can just mention that the remaining cases not
> setting map->m_len will be handled in patch 9.
> 

Sure.

Thanks,
Yi.
diff mbox series

Patch

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 118b0497a954..e4043ddb07a5 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1734,6 +1734,11 @@  static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
 
 	/* Lookup extent status tree firstly */
 	if (ext4_es_lookup_extent(inode, iblock, NULL, &es)) {
+		retval = es.es_len - (iblock - es.es_lblk);
+		if (retval > map->m_len)
+			retval = map->m_len;
+		map->m_len = retval;
+
 		if (ext4_es_is_hole(&es))
 			goto add_delayed;
 
@@ -1750,10 +1755,6 @@  static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
 		}
 
 		map->m_pblk = ext4_es_pblock(&es) + iblock - es.es_lblk;
-		retval = es.es_len - (iblock - es.es_lblk);
-		if (retval > map->m_len)
-			retval = map->m_len;
-		map->m_len = retval;
 		if (ext4_es_is_written(&es))
 			map->m_flags |= EXT4_MAP_MAPPED;
 		else if (ext4_es_is_unwritten(&es))
@@ -1788,6 +1789,11 @@  static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
 	 * whitout holding i_rwsem and folio lock.
 	 */
 	if (ext4_es_lookup_extent(inode, iblock, NULL, &es)) {
+		retval = es.es_len - (iblock - es.es_lblk);
+		if (retval > map->m_len)
+			retval = map->m_len;
+		map->m_len = retval;
+
 		if (!ext4_es_is_hole(&es)) {
 			up_write(&EXT4_I(inode)->i_data_sem);
 			goto found;