diff mbox series

ext4: fix infinite loop when replaying fast_commit

Message ID 20240510115252.11850-1-luis.henriques@linux.dev
State New
Headers show
Series ext4: fix infinite loop when replaying fast_commit | expand

Commit Message

Luis Henriques (SUSE) May 10, 2024, 11:52 a.m. UTC
When doing fast_commit replay an infinite loop may occur due to an
uninitialized extent_status struct.  ext4_ext_determine_insert_hole() does
not detect the replay and calls ext4_es_find_extent_range(), which will
return immediately without initializing the 'es' variable.

Because 'es' contains garbage, an integer overflow may happen causing an
infinite loop in this function, easily reproducible using fstest generic/039.

This commit fixes this issue by detecting the replay in function
ext4_ext_determine_insert_hole().  It also adds initialization code to the
error path in function ext4_es_find_extent_range().

Thanks to Zhang Yi, for figuring out the real problem!

Fixes: 8016e29f4362 ("ext4: fast commit recovery path")
Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
---
Hi!

Two comments:
1) The change in ext4_ext_map_blocks() could probably use the min_not_zero
   macro instead.  I decided not to do so simply because I wasn't sure if
   that would be safe, but I'm fine changing that if you think it is.

2) I thought about returning 'EXT_MAX_BLOCKS' instead of '0' in
   ext4_lblk_t ext4_ext_determine_insert_hole(), which would then avoid
   the extra change to ext4_ext_map_blocks().  '0' sounds like the right
   value to return, but I'm also OK using 'EXT_MAX_BLOCKS' instead.

And again thanks to Zhang Yi for pointing me the *real* problem!

 fs/ext4/extents.c        | 6 +++++-
 fs/ext4/extents_status.c | 5 ++++-
 2 files changed, 9 insertions(+), 2 deletions(-)

Comments

Zhang Yi May 11, 2024, 6:24 a.m. UTC | #1
On 2024/5/10 19:52, Luis Henriques (SUSE) wrote:
> When doing fast_commit replay an infinite loop may occur due to an
> uninitialized extent_status struct.  ext4_ext_determine_insert_hole() does
> not detect the replay and calls ext4_es_find_extent_range(), which will
> return immediately without initializing the 'es' variable.
> 
> Because 'es' contains garbage, an integer overflow may happen causing an
> infinite loop in this function, easily reproducible using fstest generic/039.
> 
> This commit fixes this issue by detecting the replay in function
> ext4_ext_determine_insert_hole().  It also adds initialization code to the
> error path in function ext4_es_find_extent_range().
> 
> Thanks to Zhang Yi, for figuring out the real problem!
> 
> Fixes: 8016e29f4362 ("ext4: fast commit recovery path")
> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
> ---
> Hi!
> 
> Two comments:
> 1) The change in ext4_ext_map_blocks() could probably use the min_not_zero
>    macro instead.  I decided not to do so simply because I wasn't sure if
>    that would be safe, but I'm fine changing that if you think it is.
> 
> 2) I thought about returning 'EXT_MAX_BLOCKS' instead of '0' in
>    ext4_lblk_t ext4_ext_determine_insert_hole(), which would then avoid
>    the extra change to ext4_ext_map_blocks().  '0' sounds like the right
>    value to return, but I'm also OK using 'EXT_MAX_BLOCKS' instead.
> 
> And again thanks to Zhang Yi for pointing me the *real* problem!
> 
>  fs/ext4/extents.c        | 6 +++++-
>  fs/ext4/extents_status.c | 5 ++++-
>  2 files changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index e57054bdc5fd..b5bfcb6c18a0 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -4052,6 +4052,9 @@ static ext4_lblk_t ext4_ext_determine_insert_hole(struct inode *inode,
>  	ext4_lblk_t hole_start, len;
>  	struct extent_status es;
>  
> +	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
> +		return 0;
> +

Sorry, I think it's may not correct. When replaying the jouranl, although
we don't use the extent statue tree, we still need to query the accurate
hole length, e.g. please see skip_hole(). If you do this, the hole length
becomes incorrect, right?

Thanks,
Yi.

>  	hole_start = lblk;
>  	len = ext4_ext_find_hole(inode, path, &hole_start);
>  again:
> @@ -4226,7 +4229,8 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
>  		len = ext4_ext_determine_insert_hole(inode, path, map->m_lblk);
>  
>  		map->m_pblk = 0;
> -		map->m_len = min_t(unsigned int, map->m_len, len);
> +		if (len > 0)
> +			map->m_len = min_t(unsigned int, map->m_len, len);
>  		goto out;
>  	}
>  
> diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
> index 4a00e2f019d9..acb9616ca119 100644
> --- a/fs/ext4/extents_status.c
> +++ b/fs/ext4/extents_status.c
> @@ -310,8 +310,11 @@ void ext4_es_find_extent_range(struct inode *inode,
>  			       ext4_lblk_t lblk, ext4_lblk_t end,
>  			       struct extent_status *es)
>  {
> -	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
> +	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) {
> +		/* Initialize extent to zero */
> +		es->es_lblk = es->es_len = es->es_pblk = 0;
>  		return;
> +	}
>  
>  	trace_ext4_es_find_extent_range_enter(inode, lblk);
>  
>
Markus Elfring May 12, 2024, 4:44 p.m. UTC | #2
> This commit fixes this issue by detecting the replay …

Would corresponding imperative wordings be more desirable for such a change description?
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?h=v6.9-rc7#n94


> Thanks to Zhang Yi, for figuring out the real problem!
…

Will another tag become relevant here?
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?h=v6.9-rc7#n527

Regards,
Markus
Luis Henriques (SUSE) May 14, 2024, 1:04 p.m. UTC | #3
On Sat 11 May 2024 02:24:17 PM +08, Zhang Yi wrote;

> On 2024/5/10 19:52, Luis Henriques (SUSE) wrote:
>> When doing fast_commit replay an infinite loop may occur due to an
>> uninitialized extent_status struct.  ext4_ext_determine_insert_hole() does
>> not detect the replay and calls ext4_es_find_extent_range(), which will
>> return immediately without initializing the 'es' variable.
>> 
>> Because 'es' contains garbage, an integer overflow may happen causing an
>> infinite loop in this function, easily reproducible using fstest generic/039.
>> 
>> This commit fixes this issue by detecting the replay in function
>> ext4_ext_determine_insert_hole().  It also adds initialization code to the
>> error path in function ext4_es_find_extent_range().
>> 
>> Thanks to Zhang Yi, for figuring out the real problem!
>> 
>> Fixes: 8016e29f4362 ("ext4: fast commit recovery path")
>> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
>> ---
>> Hi!
>> 
>> Two comments:
>> 1) The change in ext4_ext_map_blocks() could probably use the min_not_zero
>>    macro instead.  I decided not to do so simply because I wasn't sure if
>>    that would be safe, but I'm fine changing that if you think it is.
>> 
>> 2) I thought about returning 'EXT_MAX_BLOCKS' instead of '0' in
>>    ext4_lblk_t ext4_ext_determine_insert_hole(), which would then avoid
>>    the extra change to ext4_ext_map_blocks().  '0' sounds like the right
>>    value to return, but I'm also OK using 'EXT_MAX_BLOCKS' instead.
>> 
>> And again thanks to Zhang Yi for pointing me the *real* problem!
>> 
>>  fs/ext4/extents.c        | 6 +++++-
>>  fs/ext4/extents_status.c | 5 ++++-
>>  2 files changed, 9 insertions(+), 2 deletions(-)
>> 
>> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
>> index e57054bdc5fd..b5bfcb6c18a0 100644
>> --- a/fs/ext4/extents.c
>> +++ b/fs/ext4/extents.c
>> @@ -4052,6 +4052,9 @@ static ext4_lblk_t ext4_ext_determine_insert_hole(struct inode *inode,
>>  	ext4_lblk_t hole_start, len;
>>  	struct extent_status es;
>>  
>> +	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
>> +		return 0;
>> +
>
> Sorry, I think it's may not correct. When replaying the jouranl, although
> we don't use the extent statue tree, we still need to query the accurate
> hole length, e.g. please see skip_hole(). If you do this, the hole length
> becomes incorrect, right?

Thank you for your review (and sorry for my delay replying).

So, I see three different options to follow your suggestion:

1) Initialize 'es' immediately when declaring it in function
   ext4_ext_determine_insert_hole():

	es.es_lblk = es.es_len = es.es_pblk = 0;

2) Initialize 'es' only in ext4_es_find_extent_range() when checking if an
   fc replay is in progress (my patch was already doing something like
   that):

	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) {
		/* Initialize extent to zero */
		es->es_lblk = es->es_len = es->es_pblk = 0;
		return;
	}

3) Remove the check for fc replay in function ext4_es_find_extent_range(),
   which will then unconditionally call __es_find_extent_range().  This
   will effectively also initialize the 'es' fields to '0' and, because
   __es_tree_search() will return NULL (at least in generic/039 test!),
   nothing else will be done.

Since all these 3 options seem to have the same result, I believe option
1) is probably the best as it initializes the structure shortly after it's
declaration.  Would you agree?  Or did I misunderstood you?

Cheers,
Zhang Yi May 15, 2024, 4:59 a.m. UTC | #4
On 2024/5/14 21:04, Luis Henriques wrote:
> On Sat 11 May 2024 02:24:17 PM +08, Zhang Yi wrote;
> 
>> On 2024/5/10 19:52, Luis Henriques (SUSE) wrote:
>>> When doing fast_commit replay an infinite loop may occur due to an
>>> uninitialized extent_status struct.  ext4_ext_determine_insert_hole() does
>>> not detect the replay and calls ext4_es_find_extent_range(), which will
>>> return immediately without initializing the 'es' variable.
>>>
>>> Because 'es' contains garbage, an integer overflow may happen causing an
>>> infinite loop in this function, easily reproducible using fstest generic/039.
>>>
>>> This commit fixes this issue by detecting the replay in function
>>> ext4_ext_determine_insert_hole().  It also adds initialization code to the
>>> error path in function ext4_es_find_extent_range().
>>>
>>> Thanks to Zhang Yi, for figuring out the real problem!
>>>
>>> Fixes: 8016e29f4362 ("ext4: fast commit recovery path")
>>> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
>>> ---
>>> Hi!
>>>
>>> Two comments:
>>> 1) The change in ext4_ext_map_blocks() could probably use the min_not_zero
>>>    macro instead.  I decided not to do so simply because I wasn't sure if
>>>    that would be safe, but I'm fine changing that if you think it is.
>>>
>>> 2) I thought about returning 'EXT_MAX_BLOCKS' instead of '0' in
>>>    ext4_lblk_t ext4_ext_determine_insert_hole(), which would then avoid
>>>    the extra change to ext4_ext_map_blocks().  '0' sounds like the right
>>>    value to return, but I'm also OK using 'EXT_MAX_BLOCKS' instead.
>>>
>>> And again thanks to Zhang Yi for pointing me the *real* problem!
>>>
>>>  fs/ext4/extents.c        | 6 +++++-
>>>  fs/ext4/extents_status.c | 5 ++++-
>>>  2 files changed, 9 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
>>> index e57054bdc5fd..b5bfcb6c18a0 100644
>>> --- a/fs/ext4/extents.c
>>> +++ b/fs/ext4/extents.c
>>> @@ -4052,6 +4052,9 @@ static ext4_lblk_t ext4_ext_determine_insert_hole(struct inode *inode,
>>>  	ext4_lblk_t hole_start, len;
>>>  	struct extent_status es;
>>>  
>>> +	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
>>> +		return 0;
>>> +
>>
>> Sorry, I think it's may not correct. When replaying the jouranl, although
>> we don't use the extent statue tree, we still need to query the accurate
>> hole length, e.g. please see skip_hole(). If you do this, the hole length
>> becomes incorrect, right?
> 
> Thank you for your review (and sorry for my delay replying).
> 
> So, I see three different options to follow your suggestion:
> 
> 1) Initialize 'es' immediately when declaring it in function
>    ext4_ext_determine_insert_hole():
> 
> 	es.es_lblk = es.es_len = es.es_pblk = 0;
> 
> 2) Initialize 'es' only in ext4_es_find_extent_range() when checking if an
>    fc replay is in progress (my patch was already doing something like
>    that):
> 
> 	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) {
> 		/* Initialize extent to zero */
> 		es->es_lblk = es->es_len = es->es_pblk = 0;
> 		return;
> 	}
> 
> 3) Remove the check for fc replay in function ext4_es_find_extent_range(),
>    which will then unconditionally call __es_find_extent_range().  This
>    will effectively also initialize the 'es' fields to '0' and, because
>    __es_tree_search() will return NULL (at least in generic/039 test!),
>    nothing else will be done.
> 
> Since all these 3 options seem to have the same result, I believe option
> 1) is probably the best as it initializes the structure shortly after it's
> declaration.  Would you agree?  Or did I misunderstood you?
> 

Both 1 and 2 are looks fine to me, but I would prefer to initialize it
unconditionally in ext4_es_find_extent_range().

@@ -310,6 +310,8 @@ void ext4_es_find_extent_range(struct inode *inode,
				ext4_lblk_t lblk, ext4_lblk_t end,
				struct extent_status *es)
 {
+	es->es_lblk = es->es_len = es->es_pblk = 0;
+
	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
		return;

Thanks,
Yi.
Luis Henriques (SUSE) May 15, 2024, 8:28 a.m. UTC | #5
On Wed 15 May 2024 12:59:26 PM +08, Zhang Yi wrote;

> On 2024/5/14 21:04, Luis Henriques wrote:
>> On Sat 11 May 2024 02:24:17 PM +08, Zhang Yi wrote;
>> 
>>> On 2024/5/10 19:52, Luis Henriques (SUSE) wrote:
>>>> When doing fast_commit replay an infinite loop may occur due to an
>>>> uninitialized extent_status struct.  ext4_ext_determine_insert_hole() does
>>>> not detect the replay and calls ext4_es_find_extent_range(), which will
>>>> return immediately without initializing the 'es' variable.
>>>>
>>>> Because 'es' contains garbage, an integer overflow may happen causing an
>>>> infinite loop in this function, easily reproducible using fstest generic/039.
>>>>
>>>> This commit fixes this issue by detecting the replay in function
>>>> ext4_ext_determine_insert_hole().  It also adds initialization code to the
>>>> error path in function ext4_es_find_extent_range().
>>>>
>>>> Thanks to Zhang Yi, for figuring out the real problem!
>>>>
>>>> Fixes: 8016e29f4362 ("ext4: fast commit recovery path")
>>>> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
>>>> ---
>>>> Hi!
>>>>
>>>> Two comments:
>>>> 1) The change in ext4_ext_map_blocks() could probably use the min_not_zero
>>>>    macro instead.  I decided not to do so simply because I wasn't sure if
>>>>    that would be safe, but I'm fine changing that if you think it is.
>>>>
>>>> 2) I thought about returning 'EXT_MAX_BLOCKS' instead of '0' in
>>>>    ext4_lblk_t ext4_ext_determine_insert_hole(), which would then avoid
>>>>    the extra change to ext4_ext_map_blocks().  '0' sounds like the right
>>>>    value to return, but I'm also OK using 'EXT_MAX_BLOCKS' instead.
>>>>
>>>> And again thanks to Zhang Yi for pointing me the *real* problem!
>>>>
>>>>  fs/ext4/extents.c        | 6 +++++-
>>>>  fs/ext4/extents_status.c | 5 ++++-
>>>>  2 files changed, 9 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
>>>> index e57054bdc5fd..b5bfcb6c18a0 100644
>>>> --- a/fs/ext4/extents.c
>>>> +++ b/fs/ext4/extents.c
>>>> @@ -4052,6 +4052,9 @@ static ext4_lblk_t ext4_ext_determine_insert_hole(struct inode *inode,
>>>>  	ext4_lblk_t hole_start, len;
>>>>  	struct extent_status es;
>>>>  
>>>> +	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
>>>> +		return 0;
>>>> +
>>>
>>> Sorry, I think it's may not correct. When replaying the jouranl, although
>>> we don't use the extent statue tree, we still need to query the accurate
>>> hole length, e.g. please see skip_hole(). If you do this, the hole length
>>> becomes incorrect, right?
>> 
>> Thank you for your review (and sorry for my delay replying).
>> 
>> So, I see three different options to follow your suggestion:
>> 
>> 1) Initialize 'es' immediately when declaring it in function
>>    ext4_ext_determine_insert_hole():
>> 
>> 	es.es_lblk = es.es_len = es.es_pblk = 0;
>> 
>> 2) Initialize 'es' only in ext4_es_find_extent_range() when checking if an
>>    fc replay is in progress (my patch was already doing something like
>>    that):
>> 
>> 	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) {
>> 		/* Initialize extent to zero */
>> 		es->es_lblk = es->es_len = es->es_pblk = 0;
>> 		return;
>> 	}
>> 
>> 3) Remove the check for fc replay in function ext4_es_find_extent_range(),
>>    which will then unconditionally call __es_find_extent_range().  This
>>    will effectively also initialize the 'es' fields to '0' and, because
>>    __es_tree_search() will return NULL (at least in generic/039 test!),
>>    nothing else will be done.
>> 
>> Since all these 3 options seem to have the same result, I believe option
>> 1) is probably the best as it initializes the structure shortly after it's
>> declaration.  Would you agree?  Or did I misunderstood you?
>> 
>
> Both 1 and 2 are looks fine to me, but I would prefer to initialize it
> unconditionally in ext4_es_find_extent_range().
>
> @@ -310,6 +310,8 @@ void ext4_es_find_extent_range(struct inode *inode,
> 				ext4_lblk_t lblk, ext4_lblk_t end,
> 				struct extent_status *es)
>  {
> +	es->es_lblk = es->es_len = es->es_pblk = 0;
> +
> 	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
> 		return;

Thank you, Yi.  I'll send out v2 shortly.  Although, to be fair, the real
patch author shouldn't be me. :-)

Cheers,
Zhang Yi May 15, 2024, 8:52 a.m. UTC | #6
On 2024/5/15 16:28, Luis Henriques wrote:
> On Wed 15 May 2024 12:59:26 PM +08, Zhang Yi wrote;
> 
>> On 2024/5/14 21:04, Luis Henriques wrote:
>>> On Sat 11 May 2024 02:24:17 PM +08, Zhang Yi wrote;
>>>
>>>> On 2024/5/10 19:52, Luis Henriques (SUSE) wrote:
>>>>> When doing fast_commit replay an infinite loop may occur due to an
>>>>> uninitialized extent_status struct.  ext4_ext_determine_insert_hole() does
>>>>> not detect the replay and calls ext4_es_find_extent_range(), which will
>>>>> return immediately without initializing the 'es' variable.
>>>>>
>>>>> Because 'es' contains garbage, an integer overflow may happen causing an
>>>>> infinite loop in this function, easily reproducible using fstest generic/039.
>>>>>
>>>>> This commit fixes this issue by detecting the replay in function
>>>>> ext4_ext_determine_insert_hole().  It also adds initialization code to the
>>>>> error path in function ext4_es_find_extent_range().
>>>>>
>>>>> Thanks to Zhang Yi, for figuring out the real problem!
>>>>>
>>>>> Fixes: 8016e29f4362 ("ext4: fast commit recovery path")
>>>>> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
>>>>> ---
>>>>> Hi!
>>>>>
>>>>> Two comments:
>>>>> 1) The change in ext4_ext_map_blocks() could probably use the min_not_zero
>>>>>    macro instead.  I decided not to do so simply because I wasn't sure if
>>>>>    that would be safe, but I'm fine changing that if you think it is.
>>>>>
>>>>> 2) I thought about returning 'EXT_MAX_BLOCKS' instead of '0' in
>>>>>    ext4_lblk_t ext4_ext_determine_insert_hole(), which would then avoid
>>>>>    the extra change to ext4_ext_map_blocks().  '0' sounds like the right
>>>>>    value to return, but I'm also OK using 'EXT_MAX_BLOCKS' instead.
>>>>>
>>>>> And again thanks to Zhang Yi for pointing me the *real* problem!
>>>>>
>>>>>  fs/ext4/extents.c        | 6 +++++-
>>>>>  fs/ext4/extents_status.c | 5 ++++-
>>>>>  2 files changed, 9 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
>>>>> index e57054bdc5fd..b5bfcb6c18a0 100644
>>>>> --- a/fs/ext4/extents.c
>>>>> +++ b/fs/ext4/extents.c
>>>>> @@ -4052,6 +4052,9 @@ static ext4_lblk_t ext4_ext_determine_insert_hole(struct inode *inode,
>>>>>  	ext4_lblk_t hole_start, len;
>>>>>  	struct extent_status es;
>>>>>  
>>>>> +	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
>>>>> +		return 0;
>>>>> +
>>>>
>>>> Sorry, I think it's may not correct. When replaying the jouranl, although
>>>> we don't use the extent statue tree, we still need to query the accurate
>>>> hole length, e.g. please see skip_hole(). If you do this, the hole length
>>>> becomes incorrect, right?
>>>
>>> Thank you for your review (and sorry for my delay replying).
>>>
>>> So, I see three different options to follow your suggestion:
>>>
>>> 1) Initialize 'es' immediately when declaring it in function
>>>    ext4_ext_determine_insert_hole():
>>>
>>> 	es.es_lblk = es.es_len = es.es_pblk = 0;
>>>
>>> 2) Initialize 'es' only in ext4_es_find_extent_range() when checking if an
>>>    fc replay is in progress (my patch was already doing something like
>>>    that):
>>>
>>> 	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) {
>>> 		/* Initialize extent to zero */
>>> 		es->es_lblk = es->es_len = es->es_pblk = 0;
>>> 		return;
>>> 	}
>>>
>>> 3) Remove the check for fc replay in function ext4_es_find_extent_range(),
>>>    which will then unconditionally call __es_find_extent_range().  This
>>>    will effectively also initialize the 'es' fields to '0' and, because
>>>    __es_tree_search() will return NULL (at least in generic/039 test!),
>>>    nothing else will be done.
>>>
>>> Since all these 3 options seem to have the same result, I believe option
>>> 1) is probably the best as it initializes the structure shortly after it's
>>> declaration.  Would you agree?  Or did I misunderstood you?
>>>
>>
>> Both 1 and 2 are looks fine to me, but I would prefer to initialize it
>> unconditionally in ext4_es_find_extent_range().
>>
>> @@ -310,6 +310,8 @@ void ext4_es_find_extent_range(struct inode *inode,
>> 				ext4_lblk_t lblk, ext4_lblk_t end,
>> 				struct extent_status *es)
>>  {
>> +	es->es_lblk = es->es_len = es->es_pblk = 0;
>> +
>> 	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
>> 		return;
> 
> Thank you, Yi.  I'll send out v2 shortly.  Although, to be fair, the real
> patch author shouldn't be me. :-)
> 

Never mind, I just give a suggestion and also I didn't do a full test on
this change.

Thanks,
Yi.
Luis Henriques (SUSE) May 15, 2024, 9:13 a.m. UTC | #7
On Wed 15 May 2024 04:52:54 PM +08, Zhang Yi wrote;

> On 2024/5/15 16:28, Luis Henriques wrote:
>> On Wed 15 May 2024 12:59:26 PM +08, Zhang Yi wrote;
>> 
>>> On 2024/5/14 21:04, Luis Henriques wrote:
>>>> On Sat 11 May 2024 02:24:17 PM +08, Zhang Yi wrote;
>>>>
>>>>> On 2024/5/10 19:52, Luis Henriques (SUSE) wrote:
>>>>>> When doing fast_commit replay an infinite loop may occur due to an
>>>>>> uninitialized extent_status struct.  ext4_ext_determine_insert_hole() does
>>>>>> not detect the replay and calls ext4_es_find_extent_range(), which will
>>>>>> return immediately without initializing the 'es' variable.
>>>>>>
>>>>>> Because 'es' contains garbage, an integer overflow may happen causing an
>>>>>> infinite loop in this function, easily reproducible using fstest generic/039.
>>>>>>
>>>>>> This commit fixes this issue by detecting the replay in function
>>>>>> ext4_ext_determine_insert_hole().  It also adds initialization code to the
>>>>>> error path in function ext4_es_find_extent_range().
>>>>>>
>>>>>> Thanks to Zhang Yi, for figuring out the real problem!
>>>>>>
>>>>>> Fixes: 8016e29f4362 ("ext4: fast commit recovery path")
>>>>>> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
>>>>>> ---
>>>>>> Hi!
>>>>>>
>>>>>> Two comments:
>>>>>> 1) The change in ext4_ext_map_blocks() could probably use the min_not_zero
>>>>>>    macro instead.  I decided not to do so simply because I wasn't sure if
>>>>>>    that would be safe, but I'm fine changing that if you think it is.
>>>>>>
>>>>>> 2) I thought about returning 'EXT_MAX_BLOCKS' instead of '0' in
>>>>>>    ext4_lblk_t ext4_ext_determine_insert_hole(), which would then avoid
>>>>>>    the extra change to ext4_ext_map_blocks().  '0' sounds like the right
>>>>>>    value to return, but I'm also OK using 'EXT_MAX_BLOCKS' instead.
>>>>>>
>>>>>> And again thanks to Zhang Yi for pointing me the *real* problem!
>>>>>>
>>>>>>  fs/ext4/extents.c        | 6 +++++-
>>>>>>  fs/ext4/extents_status.c | 5 ++++-
>>>>>>  2 files changed, 9 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
>>>>>> index e57054bdc5fd..b5bfcb6c18a0 100644
>>>>>> --- a/fs/ext4/extents.c
>>>>>> +++ b/fs/ext4/extents.c
>>>>>> @@ -4052,6 +4052,9 @@ static ext4_lblk_t ext4_ext_determine_insert_hole(struct inode *inode,
>>>>>>  	ext4_lblk_t hole_start, len;
>>>>>>  	struct extent_status es;
>>>>>>  
>>>>>> +	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
>>>>>> +		return 0;
>>>>>> +
>>>>>
>>>>> Sorry, I think it's may not correct. When replaying the jouranl, although
>>>>> we don't use the extent statue tree, we still need to query the accurate
>>>>> hole length, e.g. please see skip_hole(). If you do this, the hole length
>>>>> becomes incorrect, right?
>>>>
>>>> Thank you for your review (and sorry for my delay replying).
>>>>
>>>> So, I see three different options to follow your suggestion:
>>>>
>>>> 1) Initialize 'es' immediately when declaring it in function
>>>>    ext4_ext_determine_insert_hole():
>>>>
>>>> 	es.es_lblk = es.es_len = es.es_pblk = 0;
>>>>
>>>> 2) Initialize 'es' only in ext4_es_find_extent_range() when checking if an
>>>>    fc replay is in progress (my patch was already doing something like
>>>>    that):
>>>>
>>>> 	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) {
>>>> 		/* Initialize extent to zero */
>>>> 		es->es_lblk = es->es_len = es->es_pblk = 0;
>>>> 		return;
>>>> 	}
>>>>
>>>> 3) Remove the check for fc replay in function ext4_es_find_extent_range(),
>>>>    which will then unconditionally call __es_find_extent_range().  This
>>>>    will effectively also initialize the 'es' fields to '0' and, because
>>>>    __es_tree_search() will return NULL (at least in generic/039 test!),
>>>>    nothing else will be done.
>>>>
>>>> Since all these 3 options seem to have the same result, I believe option
>>>> 1) is probably the best as it initializes the structure shortly after it's
>>>> declaration.  Would you agree?  Or did I misunderstood you?
>>>>
>>>
>>> Both 1 and 2 are looks fine to me, but I would prefer to initialize it
>>> unconditionally in ext4_es_find_extent_range().
>>>
>>> @@ -310,6 +310,8 @@ void ext4_es_find_extent_range(struct inode *inode,
>>> 				ext4_lblk_t lblk, ext4_lblk_t end,
>>> 				struct extent_status *es)
>>>  {
>>> +	es->es_lblk = es->es_len = es->es_pblk = 0;
>>> +
>>> 	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
>>> 		return;
>> 
>> Thank you, Yi.  I'll send out v2 shortly.  Although, to be fair, the real
>> patch author shouldn't be me. :-)
>> 
>
> Never mind, I just give a suggestion and also I didn't do a full test on
> this change.

Oh, talking about testing, I forgot to mention that I see the same
behaviour with generic/311.  I.e. this test also enters an infinite loop,
but fixed with this patch.

Cheers,
Zhang Yi May 15, 2024, 12:24 p.m. UTC | #8
On 2024/5/15 17:13, Luis Henriques wrote:
> On Wed 15 May 2024 04:52:54 PM +08, Zhang Yi wrote;
> 
>> On 2024/5/15 16:28, Luis Henriques wrote:
>>> On Wed 15 May 2024 12:59:26 PM +08, Zhang Yi wrote;
>>>
>>>> On 2024/5/14 21:04, Luis Henriques wrote:
>>>>> On Sat 11 May 2024 02:24:17 PM +08, Zhang Yi wrote;
>>>>>
>>>>>> On 2024/5/10 19:52, Luis Henriques (SUSE) wrote:
>>>>>>> When doing fast_commit replay an infinite loop may occur due to an
>>>>>>> uninitialized extent_status struct.  ext4_ext_determine_insert_hole() does
>>>>>>> not detect the replay and calls ext4_es_find_extent_range(), which will
>>>>>>> return immediately without initializing the 'es' variable.
>>>>>>>
>>>>>>> Because 'es' contains garbage, an integer overflow may happen causing an
>>>>>>> infinite loop in this function, easily reproducible using fstest generic/039.
>>>>>>>
>>>>>>> This commit fixes this issue by detecting the replay in function
>>>>>>> ext4_ext_determine_insert_hole().  It also adds initialization code to the
>>>>>>> error path in function ext4_es_find_extent_range().
>>>>>>>
>>>>>>> Thanks to Zhang Yi, for figuring out the real problem!
>>>>>>>
>>>>>>> Fixes: 8016e29f4362 ("ext4: fast commit recovery path")
>>>>>>> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
>>>>>>> ---
>>>>>>> Hi!
>>>>>>>
>>>>>>> Two comments:
>>>>>>> 1) The change in ext4_ext_map_blocks() could probably use the min_not_zero
>>>>>>>    macro instead.  I decided not to do so simply because I wasn't sure if
>>>>>>>    that would be safe, but I'm fine changing that if you think it is.
>>>>>>>
>>>>>>> 2) I thought about returning 'EXT_MAX_BLOCKS' instead of '0' in
>>>>>>>    ext4_lblk_t ext4_ext_determine_insert_hole(), which would then avoid
>>>>>>>    the extra change to ext4_ext_map_blocks().  '0' sounds like the right
>>>>>>>    value to return, but I'm also OK using 'EXT_MAX_BLOCKS' instead.
>>>>>>>
>>>>>>> And again thanks to Zhang Yi for pointing me the *real* problem!
>>>>>>>
>>>>>>>  fs/ext4/extents.c        | 6 +++++-
>>>>>>>  fs/ext4/extents_status.c | 5 ++++-
>>>>>>>  2 files changed, 9 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
>>>>>>> index e57054bdc5fd..b5bfcb6c18a0 100644
>>>>>>> --- a/fs/ext4/extents.c
>>>>>>> +++ b/fs/ext4/extents.c
>>>>>>> @@ -4052,6 +4052,9 @@ static ext4_lblk_t ext4_ext_determine_insert_hole(struct inode *inode,
>>>>>>>  	ext4_lblk_t hole_start, len;
>>>>>>>  	struct extent_status es;
>>>>>>>  
>>>>>>> +	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
>>>>>>> +		return 0;
>>>>>>> +
>>>>>>
>>>>>> Sorry, I think it's may not correct. When replaying the jouranl, although
>>>>>> we don't use the extent statue tree, we still need to query the accurate
>>>>>> hole length, e.g. please see skip_hole(). If you do this, the hole length
>>>>>> becomes incorrect, right?
>>>>>
>>>>> Thank you for your review (and sorry for my delay replying).
>>>>>
>>>>> So, I see three different options to follow your suggestion:
>>>>>
>>>>> 1) Initialize 'es' immediately when declaring it in function
>>>>>    ext4_ext_determine_insert_hole():
>>>>>
>>>>> 	es.es_lblk = es.es_len = es.es_pblk = 0;
>>>>>
>>>>> 2) Initialize 'es' only in ext4_es_find_extent_range() when checking if an
>>>>>    fc replay is in progress (my patch was already doing something like
>>>>>    that):
>>>>>
>>>>> 	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) {
>>>>> 		/* Initialize extent to zero */
>>>>> 		es->es_lblk = es->es_len = es->es_pblk = 0;
>>>>> 		return;
>>>>> 	}
>>>>>
>>>>> 3) Remove the check for fc replay in function ext4_es_find_extent_range(),
>>>>>    which will then unconditionally call __es_find_extent_range().  This
>>>>>    will effectively also initialize the 'es' fields to '0' and, because
>>>>>    __es_tree_search() will return NULL (at least in generic/039 test!),
>>>>>    nothing else will be done.
>>>>>
>>>>> Since all these 3 options seem to have the same result, I believe option
>>>>> 1) is probably the best as it initializes the structure shortly after it's
>>>>> declaration.  Would you agree?  Or did I misunderstood you?
>>>>>
>>>>
>>>> Both 1 and 2 are looks fine to me, but I would prefer to initialize it
>>>> unconditionally in ext4_es_find_extent_range().
>>>>
>>>> @@ -310,6 +310,8 @@ void ext4_es_find_extent_range(struct inode *inode,
>>>> 				ext4_lblk_t lblk, ext4_lblk_t end,
>>>> 				struct extent_status *es)
>>>>  {
>>>> +	es->es_lblk = es->es_len = es->es_pblk = 0;
>>>> +
>>>> 	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
>>>> 		return;
>>>
>>> Thank you, Yi.  I'll send out v2 shortly.  Although, to be fair, the real
>>> patch author shouldn't be me. :-)
>>>
>>
>> Never mind, I just give a suggestion and also I didn't do a full test on
>> this change.
> 
> Oh, talking about testing, I forgot to mention that I see the same
> behaviour with generic/311.  I.e. this test also enters an infinite loop,
> but fixed with this patch.
> 

Yeah, generic/311 also does a lot of mount && journal recovery operations,
and there maybe some other fault injection tests could have the same
results, it's all right now. :)

Thanks,
Yi.
diff mbox series

Patch

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index e57054bdc5fd..b5bfcb6c18a0 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4052,6 +4052,9 @@  static ext4_lblk_t ext4_ext_determine_insert_hole(struct inode *inode,
 	ext4_lblk_t hole_start, len;
 	struct extent_status es;
 
+	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
+		return 0;
+
 	hole_start = lblk;
 	len = ext4_ext_find_hole(inode, path, &hole_start);
 again:
@@ -4226,7 +4229,8 @@  int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
 		len = ext4_ext_determine_insert_hole(inode, path, map->m_lblk);
 
 		map->m_pblk = 0;
-		map->m_len = min_t(unsigned int, map->m_len, len);
+		if (len > 0)
+			map->m_len = min_t(unsigned int, map->m_len, len);
 		goto out;
 	}
 
diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index 4a00e2f019d9..acb9616ca119 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -310,8 +310,11 @@  void ext4_es_find_extent_range(struct inode *inode,
 			       ext4_lblk_t lblk, ext4_lblk_t end,
 			       struct extent_status *es)
 {
-	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
+	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) {
+		/* Initialize extent to zero */
+		es->es_lblk = es->es_len = es->es_pblk = 0;
 		return;
+	}
 
 	trace_ext4_es_find_extent_range_enter(inode, lblk);