
[1/5] ext4: keep "prefetch_grp" and "nr" consistent

Message ID 20240326213823.528302-2-shikemeng@huaweicloud.com
State New
Series Minor improvements and cleanups to ext4 mballoc

Commit Message

Kemeng Shi March 26, 2024, 9:38 p.m. UTC
Keep "prefetch_grp" and "nr" consistent to avoid calling
ext4_mb_prefetch_fini with non-prefetched groups.
When we step into the next criteria, "prefetch_grp" is set to the
prefetch start of the new criteria while "nr" is still the number of
groups prefetched in the previous criteria. If the previous and next
criteria are both inexpensive (< CR_GOAL_LEN_SLOW) and prefetch_ios
reaches sbi->s_mb_prefetch_limit in the previous criteria,
"prefetch_grp" and "nr" become inconsistent and may introduce
unexpected cost by doing ext4_mb_init_group for non-prefetched groups.
Reset "nr" to 0 when we reset "prefetch_grp" to the prefetch start of
the next criteria to ensure "prefetch_grp" and "nr" stay consistent.

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
---
 fs/ext4/mballoc.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Ojaswin Mujoo March 29, 2024, 7:52 a.m. UTC | #1
On Wed, Mar 27, 2024 at 05:38:19AM +0800, Kemeng Shi wrote:
> Keep "prefetch_grp" and "nr" consistent to avoid calling
> ext4_mb_prefetch_fini with non-prefetched groups.
> When we step into the next criteria, "prefetch_grp" is set to the
> prefetch start of the new criteria while "nr" is still the number of
> groups prefetched in the previous criteria. If the previous and next
> criteria are both inexpensive (< CR_GOAL_LEN_SLOW) and prefetch_ios
> reaches sbi->s_mb_prefetch_limit in the previous criteria,
> "prefetch_grp" and "nr" become inconsistent and may introduce
> unexpected cost by doing ext4_mb_init_group for non-prefetched groups.
> Reset "nr" to 0 when we reset "prefetch_grp" to the prefetch start of
> the next criteria to ensure "prefetch_grp" and "nr" stay consistent.
> 
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> ---
>  fs/ext4/mballoc.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 12b3f196010b..a61fc52956b2 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -2856,6 +2856,7 @@ ext4_mb_regular_allocator(struct ext4_allocation_context *ac)
>  		group = ac->ac_g_ex.fe_group;
>  		ac->ac_groups_linear_remaining = sbi->s_mb_max_linear_groups;
>  		prefetch_grp = group;
> +		nr = 0;

Looks good, feel free to add:

Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>

To add some thoughts, I think the way we depend on group and nr being in
sync for prefetch and prefetch_fini to work is very fragile, and I've
seen some issues there in the past as well. I believe a better approach
would be to use some kind of list that is populated with group numbers
by prefetch() and then looked up by prefetch_fini() to initialize the
buddy and other data structures.

As the comment on ext4_mb_prefetch_fini() suggests, we could also look
into eliminating this function completely by kicking off a background
worker once prefetch of the bitmap is complete, although we'd also
need to take care of the other caller of these functions, i.e. the
lazy initialization thread that prefetches block bitmaps in the
background (ext4_lazyinit_thread()).

Regards,
ojaswin

>  
>  		for (i = 0, new_cr = cr; i < ngroups; i++,
>  		     ext4_mb_choose_next_group(ac, &new_cr, &group, ngroups)) {
> -- 
> 2.30.0
>
Jan Kara April 4, 2024, 1:22 p.m. UTC | #2
On Wed 27-03-24 05:38:19, Kemeng Shi wrote:
> Keep "prefetch_grp" and "nr" consistent to avoid calling
> ext4_mb_prefetch_fini with non-prefetched groups.
> When we step into the next criteria, "prefetch_grp" is set to the
> prefetch start of the new criteria while "nr" is still the number of
> groups prefetched in the previous criteria. If the previous and next
> criteria are both inexpensive (< CR_GOAL_LEN_SLOW) and prefetch_ios
> reaches sbi->s_mb_prefetch_limit in the previous criteria,
> "prefetch_grp" and "nr" become inconsistent and may introduce
> unexpected cost by doing ext4_mb_init_group for non-prefetched groups.
> Reset "nr" to 0 when we reset "prefetch_grp" to the prefetch start of
> the next criteria to ensure "prefetch_grp" and "nr" stay consistent.
> 
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/ext4/mballoc.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 12b3f196010b..a61fc52956b2 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -2856,6 +2856,7 @@ ext4_mb_regular_allocator(struct ext4_allocation_context *ac)
>  		group = ac->ac_g_ex.fe_group;
>  		ac->ac_groups_linear_remaining = sbi->s_mb_max_linear_groups;
>  		prefetch_grp = group;
> +		nr = 0;
>  
>  		for (i = 0, new_cr = cr; i < ngroups; i++,
>  		     ext4_mb_choose_next_group(ac, &new_cr, &group, ngroups)) {
> -- 
> 2.30.0
>

Patch

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 12b3f196010b..a61fc52956b2 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -2856,6 +2856,7 @@ ext4_mb_regular_allocator(struct ext4_allocation_context *ac)
 		group = ac->ac_g_ex.fe_group;
 		ac->ac_groups_linear_remaining = sbi->s_mb_max_linear_groups;
 		prefetch_grp = group;
+		nr = 0;
 
 		for (i = 0, new_cr = cr; i < ngroups; i++,
 		     ext4_mb_choose_next_group(ac, &new_cr, &group, ngroups)) {