[1/5] ext4: fix race with setting free_inode/clusters_counter

Message ID 1527594317-9214-1-git-send-email-wshilong1991@gmail.com
State Awaiting Upstream
Series
  • [1/5] ext4: fix race with setting free_inode/clusters_counter

Commit Message

Wang Shilong May 29, 2018, 11:45 a.m.
From: Wang Shilong <wshilong@ddn.com>

Whenever we hit a block or inode bitmap corruption, we set the
corresponding corrupt bit and then subtract that block group's free
inode/cluster count from the filesystem-wide counters, so that the
reported available space is correct.

However, some callers of ext4_mark_group_bitmap_corrupted() hold the
group spinlock and some do not, so two racing callers can both see the
bit clear and subtract the same block group's free count twice.

Always taking the group spinlock would fix this, but it looks a little
heavy. Use test_and_set_bit() instead, so that only the caller that
actually sets the bit adjusts the counter.

Signed-off-by: Wang Shilong <wshilong@ddn.com>
---
 fs/ext4/super.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)
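
The double subtraction described above can be replayed deterministically in userspace. The following is only a sketch under stated assumptions, not the kernel code: C11 `atomic_fetch_or()` stands in for the kernel's `test_and_set_bit()`, a plain `long` stands in for the per-CPU free-clusters counter, and every identifier (`group_sketch`, `CORRUPT_MASK`, `replay_old_interleaving()`, `replay_new_pattern()`) is hypothetical. The old check-then-set logic is split into two steps so that the bad interleaving of two callers can be written out by hand.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define CORRUPT_MASK 1UL	/* hypothetical stand-in for the "bitmap corrupt" bit */

struct group_sketch {
	atomic_ulong state;	/* stands in for grp->bb_state */
	long bb_free;		/* stands in for grp->bb_free */
};

/* Old pattern, step 1: test whether the group is already marked corrupt. */
static bool old_check(struct group_sketch *g)
{
	return !(atomic_load(&g->state) & CORRUPT_MASK);
}

/* Old pattern, step 2: subtract the group's free count, then set the bit. */
static void old_commit(struct group_sketch *g, long *free_total)
{
	*free_total -= g->bb_free;
	atomic_fetch_or(&g->state, CORRUPT_MASK);
}

/* New pattern: set the bit atomically; subtract only if it was clear before. */
static void new_mark(struct group_sketch *g, long *free_total)
{
	unsigned long old = atomic_fetch_or(&g->state, CORRUPT_MASK);

	if (!(old & CORRUPT_MASK))
		*free_total -= g->bb_free;
}

/* Two callers both pass the check before either one sets the bit. */
long replay_old_interleaving(long free_total, long bb_free)
{
	struct group_sketch g = { .bb_free = bb_free };
	bool a, b;

	assert(bb_free >= 0);
	atomic_init(&g.state, 0);
	a = old_check(&g);	/* caller A sees the bit clear */
	b = old_check(&g);	/* caller B also sees the bit clear */
	if (a)
		old_commit(&g, &free_total);
	if (b)
		old_commit(&g, &free_total);	/* second subtraction: the bug */
	return free_total;
}

/* The same two callers using the atomic test-and-set pattern. */
long replay_new_pattern(long free_total, long bb_free)
{
	struct group_sketch g = { .bb_free = bb_free };

	assert(bb_free >= 0);
	atomic_init(&g.state, 0);
	new_mark(&g, &free_total);	/* first caller wins and subtracts */
	new_mark(&g, &free_total);	/* second caller sees the bit set, does nothing */
	return free_total;
}
```

With a starting total of 1000 and a group free count of 100, the old interleaving ends at 800 while the test-and-set variant ends at 900. The test and the set happen as one atomic step, so no ordering of the two callers lets both of them observe the bit as clear.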

Comments

Andreas Dilger June 4, 2018, 6 p.m. | #1
On May 29, 2018, at 5:45 AM, Wang Shilong <wangshilong1991@gmail.com> wrote:
> 
> From: Wang Shilong <wshilong@ddn.com>
> 
> Whenever we hit a block or inode bitmap corruption, we set the
> corresponding corrupt bit and then subtract that block group's free
> inode/cluster count from the filesystem-wide counters, so that the
> reported available space is correct.
> 
> However, some callers of ext4_mark_group_bitmap_corrupted() hold the
> group spinlock and some do not, so two racing callers can both see the
> bit clear and subtract the same block group's free count twice.
> 
> Always taking the group spinlock would fix this, but it looks a little
> heavy. Use test_and_set_bit() instead, so that only the caller that
> actually sets the bit adjusts the counter.
> 
> Signed-off-by: Wang Shilong <wshilong@ddn.com>

Reviewed-by: Andreas Dilger <adilger@dilger.ca>

> ---
> fs/ext4/super.c | 22 +++++++++++-----------
> 1 file changed, 11 insertions(+), 11 deletions(-)
> 
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index c1c5c87..d6fa6cf 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -770,26 +770,26 @@ void ext4_mark_group_bitmap_corrupted(struct super_block *sb,
> 	struct ext4_sb_info *sbi = EXT4_SB(sb);
> 	struct ext4_group_info *grp = ext4_get_group_info(sb, group);
> 	struct ext4_group_desc *gdp = ext4_get_group_desc(sb, group, NULL);
> +	int ret;
> 
> -	if ((flags & EXT4_GROUP_INFO_BBITMAP_CORRUPT) &&
> -	    !EXT4_MB_GRP_BBITMAP_CORRUPT(grp)) {
> -		percpu_counter_sub(&sbi->s_freeclusters_counter,
> -					grp->bb_free);
> -		set_bit(EXT4_GROUP_INFO_BBITMAP_CORRUPT_BIT,
> -			&grp->bb_state);
> +	if (flags & EXT4_GROUP_INFO_BBITMAP_CORRUPT) {
> +		ret = ext4_test_and_set_bit(EXT4_GROUP_INFO_BBITMAP_CORRUPT_BIT,
> +					    &grp->bb_state);
> +		if (!ret)
> +			percpu_counter_sub(&sbi->s_freeclusters_counter,
> +					   grp->bb_free);
> 	}
> 
> -	if ((flags & EXT4_GROUP_INFO_IBITMAP_CORRUPT) &&
> -	    !EXT4_MB_GRP_IBITMAP_CORRUPT(grp)) {
> -		if (gdp) {
> +	if (flags & EXT4_GROUP_INFO_IBITMAP_CORRUPT) {
> +		ret = ext4_test_and_set_bit(EXT4_GROUP_INFO_IBITMAP_CORRUPT_BIT,
> +					    &grp->bb_state);
> +		if (!ret && gdp) {
> 			int count;
> 
> 			count = ext4_free_inodes_count(sb, gdp);
> 			percpu_counter_sub(&sbi->s_freeinodes_counter,
> 					   count);
> 		}
> -		set_bit(EXT4_GROUP_INFO_IBITMAP_CORRUPT_BIT,
> -			&grp->bb_state);
> 	}
> }
> 
> --
> 1.8.3.1
> 


Cheers, Andreas
Wang Shilong July 25, 2018, 12:38 a.m. | #2
Hi Ted,

Would you please consider this patchset for the next merge window?
The patches have all been reviewed, at least by Andreas.

Thanks,
Shilong

On Tue, Jun 5, 2018 at 2:00 AM, Andreas Dilger <adilger@dilger.ca> wrote:

> On May 29, 2018, at 5:45 AM, Wang Shilong <wangshilong1991@gmail.com> wrote:
> >
> > From: Wang Shilong <wshilong@ddn.com>
> >
> > Whenever we hit a block or inode bitmap corruption, we set the
> > corresponding corrupt bit and then subtract that block group's free
> > inode/cluster count from the filesystem-wide counters, so that the
> > reported available space is correct.
> >
> > However, some callers of ext4_mark_group_bitmap_corrupted() hold the
> > group spinlock and some do not, so two racing callers can both see the
> > bit clear and subtract the same block group's free count twice.
> >
> > Always taking the group spinlock would fix this, but it looks a little
> > heavy. Use test_and_set_bit() instead, so that only the caller that
> > actually sets the bit adjusts the counter.
> >
> > Signed-off-by: Wang Shilong <wshilong@ddn.com>
>
> Reviewed-by: Andreas Dilger <adilger@dilger.ca>
>
> > ---
> > fs/ext4/super.c | 22 +++++++++++-----------
> > 1 file changed, 11 insertions(+), 11 deletions(-)
> >
> > diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> > index c1c5c87..d6fa6cf 100644
> > --- a/fs/ext4/super.c
> > +++ b/fs/ext4/super.c
> > @@ -770,26 +770,26 @@ void ext4_mark_group_bitmap_corrupted(struct super_block *sb,
> >       struct ext4_sb_info *sbi = EXT4_SB(sb);
> >       struct ext4_group_info *grp = ext4_get_group_info(sb, group);
> >       struct ext4_group_desc *gdp = ext4_get_group_desc(sb, group, NULL);
> > +     int ret;
> >
> > -     if ((flags & EXT4_GROUP_INFO_BBITMAP_CORRUPT) &&
> > -         !EXT4_MB_GRP_BBITMAP_CORRUPT(grp)) {
> > -             percpu_counter_sub(&sbi->s_freeclusters_counter,
> > -                                     grp->bb_free);
> > -             set_bit(EXT4_GROUP_INFO_BBITMAP_CORRUPT_BIT,
> > -                     &grp->bb_state);
> > +     if (flags & EXT4_GROUP_INFO_BBITMAP_CORRUPT) {
> > +             ret = ext4_test_and_set_bit(EXT4_GROUP_INFO_BBITMAP_CORRUPT_BIT,
> > +                                         &grp->bb_state);
> > +             if (!ret)
> > +                     percpu_counter_sub(&sbi->s_freeclusters_counter,
> > +                                        grp->bb_free);
> >       }
> >
> > -     if ((flags & EXT4_GROUP_INFO_IBITMAP_CORRUPT) &&
> > -         !EXT4_MB_GRP_IBITMAP_CORRUPT(grp)) {
> > -             if (gdp) {
> > +     if (flags & EXT4_GROUP_INFO_IBITMAP_CORRUPT) {
> > +             ret = ext4_test_and_set_bit(EXT4_GROUP_INFO_IBITMAP_CORRUPT_BIT,
> > +                                         &grp->bb_state);
> > +             if (!ret && gdp) {
> >                       int count;
> >
> >                       count = ext4_free_inodes_count(sb, gdp);
> >                       percpu_counter_sub(&sbi->s_freeinodes_counter,
> >                                          count);
> >               }
> > -             set_bit(EXT4_GROUP_INFO_IBITMAP_CORRUPT_BIT,
> > -                     &grp->bb_state);
> >       }
> > }
> >
> > --
> > 1.8.3.1
> >
>
>
> Cheers, Andreas
Theodore Y. Ts'o July 30, 2018, 1:28 a.m. | #3
On Wed, Jul 25, 2018 at 08:38:19AM +0800, Wang Shilong wrote:
> Hi Ted,
> 
> Would you please consider this patchset for the next merge window?
> The patches have all been reviewed, at least by Andreas.

The commit descriptions don't explain *why* things are changing, and
in many cases you are doing lots of refactoring that is hard to
validate.  So I've been putting off this patch.  Andreas tried to tell
me more of the context of what is going on, so it's on my list.  But
patches that are easy to review and understand get processed first.

	     	      	 	    - Ted
Theodore Y. Ts'o July 30, 2018, 1:30 a.m. | #4
On Tue, May 29, 2018 at 08:45:13PM +0900, Wang Shilong wrote:
> From: Wang Shilong <wshilong@ddn.com>
> 
> Whenever we hit a block or inode bitmap corruption, we set the
> corresponding corrupt bit and then subtract that block group's free
> inode/cluster count from the filesystem-wide counters, so that the
> reported available space is correct.
> 
> However, some callers of ext4_mark_group_bitmap_corrupted() hold the
> group spinlock and some do not, so two racing callers can both see the
> bit clear and subtract the same block group's free count twice.
> 
> Always taking the group spinlock would fix this, but it looks a little
> heavy. Use test_and_set_bit() instead, so that only the caller that
> actually sets the bit adjusts the counter.
> 
> Signed-off-by: Wang Shilong <wshilong@ddn.com>

Applied, thanks.

						- Ted

Patch

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index c1c5c87..d6fa6cf 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -770,26 +770,26 @@  void ext4_mark_group_bitmap_corrupted(struct super_block *sb,
 	struct ext4_sb_info *sbi = EXT4_SB(sb);
 	struct ext4_group_info *grp = ext4_get_group_info(sb, group);
 	struct ext4_group_desc *gdp = ext4_get_group_desc(sb, group, NULL);
+	int ret;
 
-	if ((flags & EXT4_GROUP_INFO_BBITMAP_CORRUPT) &&
-	    !EXT4_MB_GRP_BBITMAP_CORRUPT(grp)) {
-		percpu_counter_sub(&sbi->s_freeclusters_counter,
-					grp->bb_free);
-		set_bit(EXT4_GROUP_INFO_BBITMAP_CORRUPT_BIT,
-			&grp->bb_state);
+	if (flags & EXT4_GROUP_INFO_BBITMAP_CORRUPT) {
+		ret = ext4_test_and_set_bit(EXT4_GROUP_INFO_BBITMAP_CORRUPT_BIT,
+					    &grp->bb_state);
+		if (!ret)
+			percpu_counter_sub(&sbi->s_freeclusters_counter,
+					   grp->bb_free);
 	}
 
-	if ((flags & EXT4_GROUP_INFO_IBITMAP_CORRUPT) &&
-	    !EXT4_MB_GRP_IBITMAP_CORRUPT(grp)) {
-		if (gdp) {
+	if (flags & EXT4_GROUP_INFO_IBITMAP_CORRUPT) {
+		ret = ext4_test_and_set_bit(EXT4_GROUP_INFO_IBITMAP_CORRUPT_BIT,
+					    &grp->bb_state);
+		if (!ret && gdp) {
 			int count;
 
 			count = ext4_free_inodes_count(sb, gdp);
 			percpu_counter_sub(&sbi->s_freeinodes_counter,
 					   count);
 		}
-		set_bit(EXT4_GROUP_INFO_IBITMAP_CORRUPT_BIT,
-			&grp->bb_state);
 	}
 }
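
Read as ordinary C rather than as a diff, the patched control flow might look roughly like the userspace model below. It is a sketch with several assumptions: the bit numbers are made up (the real ones are defined in the ext4 headers), plain `long` fields replace the kernel's per-CPU counters, a nullable pointer to an inode count replaces the group descriptor lookup, and all names ending in `_model` are hypothetical. What it tries to show is that both corrupt bits share one state word, that repeated calls are idempotent, and that the inode bit is still set when no descriptor is available, only without a counter adjustment.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical bit numbers; the real values are defined by ext4 itself. */
enum { BBITMAP_CORRUPT_BIT = 0, IBITMAP_CORRUPT_BIT = 1 };
#define BBITMAP_CORRUPT (1U << BBITMAP_CORRUPT_BIT)
#define IBITMAP_CORRUPT (1U << IBITMAP_CORRUPT_BIT)

/* Plain longs here; the kernel uses per-CPU counters for these totals. */
struct sb_counters {
	long free_clusters;
	long free_inodes;
};

struct group_info {
	atomic_ulong bb_state;	/* both corrupt bits live in this one word */
	long bb_free;
};

/* Atomically set bit nr in *word and return its previous value (0 or 1). */
static int test_and_set_bit_model(int nr, atomic_ulong *word)
{
	unsigned long mask = 1UL << nr;

	return (atomic_fetch_or(word, mask) & mask) != 0;
}

/*
 * Model of the patched function. gd_free_inodes may be NULL, mirroring
 * the case where no group descriptor could be obtained.
 */
void mark_group_corrupted_model(struct sb_counters *sbi,
				struct group_info *grp,
				const long *gd_free_inodes,
				unsigned int flags)
{
	int was_set;

	assert(sbi != NULL && grp != NULL);

	if (flags & BBITMAP_CORRUPT) {
		was_set = test_and_set_bit_model(BBITMAP_CORRUPT_BIT,
						 &grp->bb_state);
		if (!was_set)
			sbi->free_clusters -= grp->bb_free;
	}

	if (flags & IBITMAP_CORRUPT) {
		was_set = test_and_set_bit_model(IBITMAP_CORRUPT_BIT,
						 &grp->bb_state);
		/* The bit is set either way; the counter needs a descriptor. */
		if (!was_set && gd_free_inodes)
			sbi->free_inodes -= *gd_free_inodes;
	}
}
```

Calling the model twice with both flags on a group holding 100 free clusters and 40 free inodes lowers each total exactly once. As in the patch, a first call made without a descriptor still sets the inode bit, so a later call with a descriptor does not adjust the inode counter.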