
fix bogus BUG_ONs in mballoc code

Message ID 49B958A1.6060805@redhat.com
State Accepted, archived

Commit Message

Eric Sandeen March 12, 2009, 6:46 p.m. UTC
Thiemo Nagel reported that:

# dd if=/dev/zero of=image.ext4 bs=1M count=2
# mkfs.ext4 -v -F -b 1024 -m 0 -g 512 -G 4 -I 128 -N 1 \
  -O large_file,dir_index,flex_bg,extent,sparse_super image.ext4
# mount -o loop image.ext4 mnt/
# dd if=/dev/zero of=mnt/file

oopsed, with a BUG_ON in ext4_mb_normalize_request because
size == EXT4_BLOCKS_PER_GROUP

It appears to me (esp. after talking to Andreas) that the BUG_ON
is bogus; a request of exactly EXT4_BLOCKS_PER_GROUP should
be allowed, though larger sizes do indicate a problem.

Fix that and another (apparently rare) codepath with a similar check.

Reported-by: Thiemo Nagel <thiemo.nagel@ph.tum.de>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
--


--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Theodore Ts'o March 13, 2009, 12:38 a.m. UTC | #1
On Thu, Mar 12, 2009 at 01:46:57PM -0500, Eric Sandeen wrote:
> Thiemo Nagel reported that:
> 
> # dd if=/dev/zero of=image.ext4 bs=1M count=2
> # mkfs.ext4 -v -F -b 1024 -m 0 -g 512 -G 4 -I 128 -N 1 \
>   -O large_file,dir_index,flex_bg,extent,sparse_super image.ext4
> # mount -o loop image.ext4 mnt/
> # dd if=/dev/zero of=mnt/file
> 
> oopsed, with a BUG_ON in ext4_mb_normalize_request because
> size == EXT4_BLOCKS_PER_GROUP
> 
> It appears to me (esp. after talking to Andreas) that the BUG_ON
> is bogus; a request of exactly EXT4_BLOCKS_PER_GROUP should
> be allowed, though larger sizes do indicate a problem.

Clearly we should make this change to avoid the BUG_ON; but stupid
question, why shouldn't we allow sizes larger than
EXT4_BLOCKS_PER_GROUP?  

Especially with flex_bg, it is possible for an allocation size >
EXT4_BLOCKS_PER_GROUP to be satisfied, especially if the filesystem
isn't that full yet, and it might even make sense to request a larger
allocation for video files that are getting preallocated, for
example....

						- Ted
Theodore Ts'o March 13, 2009, 1:09 a.m. UTC | #2
On Thu, Mar 12, 2009 at 01:46:57PM -0500, Eric Sandeen wrote:
> Thiemo Nagel reported that:
> 
> # dd if=/dev/zero of=image.ext4 bs=1M count=2
> # mkfs.ext4 -v -F -b 1024 -m 0 -g 512 -G 4 -I 128 -N 1 \
>   -O large_file,dir_index,flex_bg,extent,sparse_super image.ext4
> # mount -o loop image.ext4 mnt/
> # dd if=/dev/zero of=mnt/file
> 
> oopsed, with a BUG_ON in ext4_mb_normalize_request because
> size == EXT4_BLOCKS_PER_GROUP
> 
> It appears to me (esp. after talking to Andreas) that the BUG_ON
> is bogus; a request of exactly EXT4_BLOCKS_PER_GROUP should
> be allowed, though larger sizes do indicate a problem.
> 
> Fix that and another (apparently rare) codepath with a similar check.

Hmm.... is this at all likely to happen with standard ext4
filesystem parameters?  Or was this triggered because of the
artificially set -g 512 parameter?  The question is whether we should
try pushing this to Linus at this point, or let this wait until the
merge window opens.

Opinions?

						- Ted
Eric Sandeen March 13, 2009, 2:08 a.m. UTC | #3
Theodore Tso wrote:
> On Thu, Mar 12, 2009 at 01:46:57PM -0500, Eric Sandeen wrote:
>> Thiemo Nagel reported that:
>>
>> # dd if=/dev/zero of=image.ext4 bs=1M count=2
>> # mkfs.ext4 -v -F -b 1024 -m 0 -g 512 -G 4 -I 128 -N 1 \
>>   -O large_file,dir_index,flex_bg,extent,sparse_super image.ext4
>> # mount -o loop image.ext4 mnt/
>> # dd if=/dev/zero of=mnt/file
>>
>> oopsed, with a BUG_ON in ext4_mb_normalize_request because
>> size == EXT4_BLOCKS_PER_GROUP
>>
>> It appears to me (esp. after talking to Andreas) that the BUG_ON
>> is bogus; a request of exactly EXT4_BLOCKS_PER_GROUP should
>> be allowed, though larger sizes do indicate a problem.
>>
>> Fix that and another (apparently rare) codepath with a similar check.
> 
> Hmm.... is this at all likely to happen with standard ext4
> filesystem parameters?  Or was this triggered because of the
> artificially set -g 512 parameter?  The question is whether we should
> try pushing this to Linus at this point, or let this wait until the
> merge window opens.
> 
> Opinions?
> 
> 						- Ted

I wondered the same thing, and will admit to probably not digging deep
enough on this one.  I think the fix is ok as is but you are asking the
right questions.  Maybe a clusterfs mballoc expert can chime in and save
us some time? :)

-=Eric
Andreas Dilger March 13, 2009, 11:09 a.m. UTC | #4
On Mar 12, 2009  20:38 -0400, Theodore Ts'o wrote:
> On Thu, Mar 12, 2009 at 01:46:57PM -0500, Eric Sandeen wrote:
> > Thiemo Nagel reported that:
> > 
> > # dd if=/dev/zero of=image.ext4 bs=1M count=2
> > # mkfs.ext4 -v -F -b 1024 -m 0 -g 512 -G 4 -I 128 -N 1 \
> >   -O large_file,dir_index,flex_bg,extent,sparse_super image.ext4
> > # mount -o loop image.ext4 mnt/
> > # dd if=/dev/zero of=mnt/file
> > 
> > oopsed, with a BUG_ON in ext4_mb_normalize_request because
> > size == EXT4_BLOCKS_PER_GROUP
> > 
> > It appears to me (esp. after talking to Andreas) that the BUG_ON
> > is bogus; a request of exactly EXT4_BLOCKS_PER_GROUP should
> > be allowed, though larger sizes do indicate a problem.
> 
> Clearly we should make this change to avoid the BUG_ON; but stupid
> question, why shouldn't we allow sizes larger than
> EXT4_BLOCKS_PER_GROUP?  
> 
> Especially with flex_bg, it is possible for an allocation size >
> EXT4_BLOCKS_PER_GROUP to be satisfied, especially if the filesystem
> isn't that full yet, and it might even make sense to request a larger
> allocation for video files that are getting preallocated, for
> example....

There are two reasons that we can't have too-large mballoc allocations:
- mballoc works on a per-group basis, so the most blocks that it can
  allocate at a time is BLOCKS_PER_GROUP.
- the on-disk extent format cannot map more than 128MB at a time, which
  is equal to the group size at 4kB blocksize.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
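
The arithmetic behind the two limits above can be sketched in user space.
This is an illustration only, not kernel code. It assumes that a group's
block bitmap occupies exactly one block (so a group holds at most
8 * blocksize blocks) and that an initialized extent can cover at most
2^15 blocks. The helper names and the SKETCH_EXT_MAX_LEN macro are
hypothetical and do not appear in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one bitmap block per group, one bit per block in the group. */
static uint64_t max_blocks_per_group(uint32_t block_size)
{
	assert(block_size > 0);
	return 8ULL * block_size;
}

/* Assumption: an initialized extent maps at most 2^15 blocks. */
#define SKETCH_EXT_MAX_LEN (1ULL << 15)

/* Size in bytes of one group with the given geometry. */
static uint64_t group_bytes(uint32_t block_size, uint64_t blocks_per_group)
{
	return (uint64_t)block_size * blocks_per_group;
}

/* Largest byte range a single extent can map at this block size. */
static uint64_t max_extent_bytes(uint32_t block_size)
{
	return SKETCH_EXT_MAX_LEN * block_size;
}
```

Under these assumptions, a 4kB blocksize gives 32768 blocks per group, and
both group_bytes(4096, 32768) and max_extent_bytes(4096) come out at 128MB,
which matches the figure quoted above. The reproducer's geometry
(-b 1024 -g 512) gives a group of only 512kB, so a request of a full group
is far easier to reach there than with default parameters.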


Patch

Index: linux-2.6/fs/ext4/mballoc.c
===================================================================
--- linux-2.6.orig/fs/ext4/mballoc.c
+++ linux-2.6/fs/ext4/mballoc.c
@@ -1447,7 +1447,7 @@  static void ext4_mb_measure_extent(struc
 	struct ext4_free_extent *gex = &ac->ac_g_ex;
 
 	BUG_ON(ex->fe_len <= 0);
-	BUG_ON(ex->fe_len >= EXT4_BLOCKS_PER_GROUP(ac->ac_sb));
+	BUG_ON(ex->fe_len > EXT4_BLOCKS_PER_GROUP(ac->ac_sb));
 	BUG_ON(ex->fe_start >= EXT4_BLOCKS_PER_GROUP(ac->ac_sb));
 	BUG_ON(ac->ac_status != AC_STATUS_CONTINUE);
 
@@ -3292,7 +3292,7 @@  ext4_mb_normalize_request(struct ext4_al
 	}
 	BUG_ON(start + size <= ac->ac_o_ex.fe_logical &&
 			start > ac->ac_o_ex.fe_logical);
-	BUG_ON(size <= 0 || size >= EXT4_BLOCKS_PER_GROUP(ac->ac_sb));
+	BUG_ON(size <= 0 || size > EXT4_BLOCKS_PER_GROUP(ac->ac_sb));
 
 	/* now prepare goal request */
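
The effect of changing >= to > in both hunks can be checked in isolation.
The following is a minimal user-space sketch, not the kernel code: the two
predicates are hypothetical stand-ins for the old and new BUG_ON conditions
on the request size, and blocks_per_group stands in for
EXT4_BLOCKS_PER_GROUP(ac->ac_sb). The value 512 is taken from the -g 512
option in the reproducer.

```c
#include <assert.h>
#include <stdbool.h>

/* Old condition: a request of exactly one full group trips the check. */
static bool old_check_trips(long size, long blocks_per_group)
{
	return size <= 0 || size >= blocks_per_group;
}

/* New condition: only requests larger than a group trip the check. */
static bool new_check_trips(long size, long blocks_per_group)
{
	return size <= 0 || size > blocks_per_group;
}

/* Walk the boundary cases for the reproducer's group size. */
static void boundary_self_check(void)
{
	const long bpg = 512;	/* from mkfs.ext4 -g 512 */

	assert(old_check_trips(bpg, bpg));	/* the reported oops */
	assert(!new_check_trips(bpg, bpg));	/* full-group request allowed */
	assert(new_check_trips(bpg + 1, bpg));	/* oversized still rejected */
	assert(new_check_trips(0, bpg));	/* empty still rejected */
	assert(!new_check_trips(bpg - 1, bpg));	/* ordinary request unaffected */
}
```

Calling boundary_self_check() from a test program exercises the cases. The
only input on which the two predicates disagree is
size == blocks_per_group, which is the case reported in the commit message.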