diff mbox series

[v5,1/2] ext4: Discard preallocations before releasing group lock

Message ID 20200916113859.1556397-2-yebin10@huawei.com
State Superseded
Headers show
Series Fix dead loop in ext4_mb_new_blocks | expand

Commit Message

Ye Bin Sept. 16, 2020, 11:38 a.m. UTC
From: Jan Kara <jack@suse.cz>

ext4_mb_discard_group_preallocations() can be releasing group lock with
preallocations accumulated on its local list. Thus although
discard_pa_seq was incremented and concurrent allocating processes will
be retrying allocations, it can happen that premature ENOSPC error is
returned because blocks used for preallocations are not available for
reuse yet. Make sure we always free locally accumulated preallocations
before releasing group lock.

Fixes: 07b5b8e1ac40 ("ext4: mballoc: introduce pcpu seqcnt for freeing PA to improve ENOSPC handling")
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com>
---
 fs/ext4/mballoc.c | 28 +++++++++++-----------------
 1 file changed, 11 insertions(+), 17 deletions(-)

Comments

Ritesh Harjani Sept. 18, 2020, 9:07 a.m. UTC | #1
On 9/16/20 5:08 PM, Ye Bin wrote:
> From: Jan Kara <jack@suse.cz>
> 
> ext4_mb_discard_group_preallocations() can be releasing group lock with
> preallocations accumulated on its local list. Thus although
> discard_pa_seq was incremented and concurrent allocating processes will
> be retrying allocations, it can happen that premature ENOSPC error is
> returned because blocks used for preallocations are not available for
> reuse yet. Make sure we always free locally accumulated preallocations
> before releasing group lock.
> 
> Fixes: 07b5b8e1ac40 ("ext4: mballoc: introduce pcpu seqcnt for freeing PA to improve ENOSPC handling")
> Signed-off-by: Jan Kara <jack@suse.cz>
> Signed-off-by: Ye Bin <yebin10@huawei.com>
> Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com>
> ---
>   fs/ext4/mballoc.c | 28 +++++++++++-----------------
>   1 file changed, 11 insertions(+), 17 deletions(-)
> 
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 132c118d12e1..f736819a381b 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -4215,22 +4215,6 @@ ext4_mb_discard_group_preallocations(struct super_block *sb,
>   		list_add(&pa->u.pa_tmp_list, &list);
>   	}
>   
> -	/* if we still need more blocks and some PAs were used, try again */
> -	if (free < needed && busy) {
> -		busy = 0;
> -		ext4_unlock_group(sb, group);
> -		cond_resched();
> -		goto repeat;
> -	}
> -
> -	/* found anything to free? */
> -	if (list_empty(&list)) {
> -		BUG_ON(free != 0);
> -		mb_debug(sb, "Someone else may have freed PA for this group %u\n",
> -			 group);
> -		goto out;
> -	}
> -
>   	/* now free all selected PAs */
>   	list_for_each_entry_safe(pa, tmp, &list, u.pa_tmp_list) {
>   
> @@ -4248,7 +4232,17 @@ ext4_mb_discard_group_preallocations(struct super_block *sb,
>   		call_rcu(&(pa)->u.pa_rcu, ext4_mb_pa_callback);
>   	}
>   
> -out:
> +	/* if we still need more blocks and some PAs were used, try again */
> +	if (free < needed && busy) {
> +		ext4_unlock_group(sb, group);
> +		cond_resched();
> +		busy = 0;
> +		/* Make sure we increment discard_pa_seq again */
> +		needed -= free;
> +		free = 0;

Oops sorry about getting back to this.
But if we are making free 0 here so we may return a wrong free value
when we return from this function. We should fix that by also accounting
previous freed blocks at the time of final return from this function.


-ritesh

> +		goto repeat;
> +	}
> +
>   	ext4_unlock_group(sb, group);
>   	ext4_mb_unload_buddy(&e4b);
>   	put_bh(bitmap_bh);
>
Jan Kara Sept. 18, 2020, 9:56 a.m. UTC | #2
On Fri 18-09-20 14:37:15, Ritesh Harjani wrote:
> 
> 
> On 9/16/20 5:08 PM, Ye Bin wrote:
> > From: Jan Kara <jack@suse.cz>
> > 
> > ext4_mb_discard_group_preallocations() can be releasing group lock with
> > preallocations accumulated on its local list. Thus although
> > discard_pa_seq was incremented and concurrent allocating processes will
> > be retrying allocations, it can happen that premature ENOSPC error is
> > returned because blocks used for preallocations are not available for
> > reuse yet. Make sure we always free locally accumulated preallocations
> > before releasing group lock.
> > 
> > Fixes: 07b5b8e1ac40 ("ext4: mballoc: introduce pcpu seqcnt for freeing PA to improve ENOSPC handling")
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > Signed-off-by: Ye Bin <yebin10@huawei.com>
> > Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com>
...
> > +	/* if we still need more blocks and some PAs were used, try again */
> > +	if (free < needed && busy) {
> > +		ext4_unlock_group(sb, group);
> > +		cond_resched();
> > +		busy = 0;
> > +		/* Make sure we increment discard_pa_seq again */
> > +		needed -= free;
> > +		free = 0;
> 
> Oops sorry about getting back to this.
> But if we are making free 0 here so we may return a wrong free value
> when we return from this function. We should fix that by also accounting
> previous freed blocks at the time of final return from this function.

Ah, good catch! I'll send v2 with this fixed up.

								Honza
Theodore Ts'o Sept. 24, 2020, 2:58 p.m. UTC | #3
On Wed, Sep 16, 2020 at 07:38:58PM +0800, Ye Bin wrote:
> From: Jan Kara <jack@suse.cz>
> 
> ext4_mb_discard_group_preallocations() can be releasing group lock with
> preallocations accumulated on its local list. Thus although
> discard_pa_seq was incremented and concurrent allocating processes will
> be retrying allocations, it can happen that premature ENOSPC error is
> returned because blocks used for preallocations are not available for
> reuse yet. Make sure we always free locally accumulated preallocations
> before releasing group lock.
> 
> Fixes: 07b5b8e1ac40 ("ext4: mballoc: introduce pcpu seqcnt for freeing PA to improve ENOSPC handling")
> Signed-off-by: Jan Kara <jack@suse.cz>
> Signed-off-by: Ye Bin <yebin10@huawei.com>
> Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com>

Thanks, applied.

					- Ted
Theodore Ts'o Sept. 24, 2020, 3 p.m. UTC | #4
On Fri, Sep 18, 2020 at 11:56:53AM +0200, Jan Kara wrote:
> On Fri 18-09-20 14:37:15, Ritesh Harjani wrote:
> > 
> > 
> > On 9/16/20 5:08 PM, Ye Bin wrote:
> > > From: Jan Kara <jack@suse.cz>
> > > 
> > > ext4_mb_discard_group_preallocations() can be releasing group lock with
> > > preallocations accumulated on its local list. Thus although
> > > discard_pa_seq was incremented and concurrent allocating processes will
> > > be retrying allocations, it can happen that premature ENOSPC error is
> > > returned because blocks used for preallocations are not available for
> > > reuse yet. Make sure we always free locally accumulated preallocations
> > > before releasing group lock.
> > > 
> > > Fixes: 07b5b8e1ac40 ("ext4: mballoc: introduce pcpu seqcnt for freeing PA to improve ENOSPC handling")
> > > Signed-off-by: Jan Kara <jack@suse.cz>
> > > Signed-off-by: Ye Bin <yebin10@huawei.com>
> > > Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com>
> ...
> > > +	/* if we still need more blocks and some PAs were used, try again */
> > > +	if (free < needed && busy) {
> > > +		ext4_unlock_group(sb, group);
> > > +		cond_resched();
> > > +		busy = 0;
> > > +		/* Make sure we increment discard_pa_seq again */
> > > +		needed -= free;
> > > +		free = 0;
> > 
> > Oops sorry about getting back to this.
> > But if we are making free 0 here so we may return a wrong free value
> > when we return from this function. We should fix that by also accounting
> > previous freed blocks at the time of final return from this function.
> 
> Ah, good catch! I'll send v2 with this fixed up.

Did you send a v2 of this patch?  I can't find it in my inbox...

Thanks!

					- Ted
Jan Kara Sept. 24, 2020, 3:12 p.m. UTC | #5
On Thu 24-09-20 11:00:34, Theodore Y. Ts'o wrote:
> On Fri, Sep 18, 2020 at 11:56:53AM +0200, Jan Kara wrote:
> > On Fri 18-09-20 14:37:15, Ritesh Harjani wrote:
> > > 
> > > 
> > > On 9/16/20 5:08 PM, Ye Bin wrote:
> > > > From: Jan Kara <jack@suse.cz>
> > > > 
> > > > ext4_mb_discard_group_preallocations() can be releasing group lock with
> > > > preallocations accumulated on its local list. Thus although
> > > > discard_pa_seq was incremented and concurrent allocating processes will
> > > > be retrying allocations, it can happen that premature ENOSPC error is
> > > > returned because blocks used for preallocations are not available for
> > > > reuse yet. Make sure we always free locally accumulated preallocations
> > > > before releasing group lock.
> > > > 
> > > > Fixes: 07b5b8e1ac40 ("ext4: mballoc: introduce pcpu seqcnt for freeing PA to improve ENOSPC handling")
> > > > Signed-off-by: Jan Kara <jack@suse.cz>
> > > > Signed-off-by: Ye Bin <yebin10@huawei.com>
> > > > Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com>
> > ...
> > > > +	/* if we still need more blocks and some PAs were used, try again */
> > > > +	if (free < needed && busy) {
> > > > +		ext4_unlock_group(sb, group);
> > > > +		cond_resched();
> > > > +		busy = 0;
> > > > +		/* Make sure we increment discard_pa_seq again */
> > > > +		needed -= free;
> > > > +		free = 0;
> > > 
> > > Oops sorry about getting back to this.
> > > But if we are making free 0 here so we may return a wrong free value
> > > when we return from this function. We should fix that by also accounting
> > > previous freed blocks at the time of final return from this function.
> > 
> > Ah, good catch! I'll send v2 with this fixed up.
> 
> Did you send a v2 of this patch?  I can't find it in my inbox...

Yeah, somehow I forgot to send it. I've sent it now:

https://lore.kernel.org/linux-ext4/20200924150959.4335-1-jack@suse.cz

Note that Ye Bin's patch will need trivial context fixup after applying
this...

								Honza
diff mbox series

Patch

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 132c118d12e1..f736819a381b 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -4215,22 +4215,6 @@  ext4_mb_discard_group_preallocations(struct super_block *sb,
 		list_add(&pa->u.pa_tmp_list, &list);
 	}
 
-	/* if we still need more blocks and some PAs were used, try again */
-	if (free < needed && busy) {
-		busy = 0;
-		ext4_unlock_group(sb, group);
-		cond_resched();
-		goto repeat;
-	}
-
-	/* found anything to free? */
-	if (list_empty(&list)) {
-		BUG_ON(free != 0);
-		mb_debug(sb, "Someone else may have freed PA for this group %u\n",
-			 group);
-		goto out;
-	}
-
 	/* now free all selected PAs */
 	list_for_each_entry_safe(pa, tmp, &list, u.pa_tmp_list) {
 
@@ -4248,7 +4232,17 @@  ext4_mb_discard_group_preallocations(struct super_block *sb,
 		call_rcu(&(pa)->u.pa_rcu, ext4_mb_pa_callback);
 	}
 
-out:
+	/* if we still need more blocks and some PAs were used, try again */
+	if (free < needed && busy) {
+		ext4_unlock_group(sb, group);
+		cond_resched();
+		busy = 0;
+		/* Make sure we increment discard_pa_seq again */
+		needed -= free;
+		free = 0;
+		goto repeat;
+	}
+
 	ext4_unlock_group(sb, group);
 	ext4_mb_unload_buddy(&e4b);
 	put_bh(bitmap_bh);