Message ID | 20120607042756.GA30776@thunk.org |
---|---|
State | Superseded, archived |
Headers | show |
On Thu, Jun 07, 2012 at 12:27:56AM -0400, Ted Ts'o wrote: > I think this patch should fix things. Could you give it a try (after > re-enabling uninit_bg and then running e2fsck on the file systems)? > > Many thanks, > > - Ted > > commit b6b218940d47f20fc14e4d203930c575be0c65bf > Author: Theodore Ts'o <tytso@mit.edu> > Date: Thu Jun 7 00:26:56 2012 -0400 > > ext4: fix the free blocks calculation for ext3 file systems w/ uninit_bg > > Ext3 filesystems that are converted to use as many ext4 file systems > as possible Just to clarify the description a bit: Did you mean "..as many ext4 filesystem _features_ as possible" ? > will enable uninit_bg to speed up e2fsck times. File > systems with an native ext3 layout of inode tables and block Perhaps "On file systems with a native ext3 layout..." ? --D > allocation bitmaps (as opposed to ext4's flex_bg layout), ext4 would > sometimes incorrectly calculate the number of free blocks in an > uninitialized block group. As a result, when first allocating a block > in block group marked uninitalized, ext4 would incorrectly report that > the file system was corrupt: > > EXT4-fs error (device dm-42): ext4_mb_generate_buddy:741: group 1699, 32254 clusters in bitmap, 32258 in gd > > Fix the calculation in ext4_num_overhead_clusters() to avoid this > problem. > > Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> > > diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c > index 99b6324..cee7812 100644 > --- a/fs/ext4/balloc.c > +++ b/fs/ext4/balloc.c > @@ -90,8 +90,8 @@ unsigned ext4_num_overhead_clusters(struct super_block *sb, > * unusual file system layouts. > */ > if (ext4_block_in_group(sb, ext4_block_bitmap(sb, gdp), block_group)) { > - block_cluster = EXT4_B2C(sbi, (start - > - ext4_block_bitmap(sb, gdp))); > + block_cluster = EXT4_B2C(sbi, > + ext4_block_bitmap(sb, gdp) - start); > if (block_cluster < num_clusters) > block_cluster = -1; > else if (block_cluster == num_clusters) { > @@ -102,7 +102,7 @@ unsigned ext4_num_overhead_clusters(struct super_block *sb, > > if (ext4_block_in_group(sb, ext4_inode_bitmap(sb, gdp), block_group)) { > inode_cluster = EXT4_B2C(sbi, > - start - ext4_inode_bitmap(sb, gdp)); > + ext4_inode_bitmap(sb, gdp) - start); > if (inode_cluster < num_clusters) > inode_cluster = -1; > else if (inode_cluster == num_clusters) { > @@ -114,7 +114,7 @@ unsigned ext4_num_overhead_clusters(struct super_block *sb, > itbl_blk = ext4_inode_table(sb, gdp); > for (i = 0; i < sbi->s_itb_per_group; i++) { > if (ext4_block_in_group(sb, itbl_blk + i, block_group)) { > - c = EXT4_B2C(sbi, start - itbl_blk + i); > + c = EXT4_B2C(sbi, itbl_blk + i - start); > if ((c < num_clusters) || (c == inode_cluster) || > (c == block_cluster) || (c == itbl_cluster)) > continue; > > -- > dm-devel mailing list > dm-devel@redhat.com > https://www.redhat.com/mailman/listinfo/dm-devel > -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Jun 7, 2012 at 3:27 PM, djwong <djwong@us.ibm.com> wrote: > On Thu, Jun 07, 2012 at 12:27:56AM -0400, Ted Ts'o wrote: >> I think this patch should fix things. Could you give it a try (after >> re-enabling uninit_bg and then running e2fsck on the file systems)? >> >> Many thanks, >> >> - Ted >> >> commit b6b218940d47f20fc14e4d203930c575be0c65bf >> Author: Theodore Ts'o <tytso@mit.edu> >> Date: Thu Jun 7 00:26:56 2012 -0400 >> >> ext4: fix the free blocks calculation for ext3 file systems w/ uninit_bg >> >> Ext3 filesystems that are converted to use as many ext4 file systems >> as possible > > Just to clarify the description a bit: > > Did you mean "..as many ext4 filesystem _features_ as possible" ? > >> will enable uninit_bg to speed up e2fsck times. File >> systems with an native ext3 layout of inode tables and block > > Perhaps "On file systems with a native ext3 layout..." ? FWIW, I was building the filesystem that triggered this as ext4: mkfs.ext4 -T default -m 0 -O ^huge_file,^flex_bg -E discard,lazy_itable_init /dev/mapper/... -Kees
On Thu, Jun 07, 2012 at 03:40:40PM -0700, Kees Cook wrote: > > FWIW, I was building the filesystem that triggered this as ext4: > > mkfs.ext4 -T default -m 0 -O ^huge_file,^flex_bg -E > discard,lazy_itable_init /dev/mapper/... Ouy of curiosity, was there a reason you chose those particular file system parameters? It's a surprising set, because if you're starting with a fresh file system, enabling flex_bg produces a more optimal file system layout. The reason why I wrote the commit description way I did was because it seemed to be the most likely way that someone would trip across this bug. We didn't notice this for as long as we did since most of the users of ext4 (especially those who use community distributions) have been using freshly mkfs'ed ext4 file systems. - Ted -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Jun 7, 2012 at 3:54 PM, Ted Ts'o <tytso@mit.edu> wrote: > On Thu, Jun 07, 2012 at 03:40:40PM -0700, Kees Cook wrote: >> >> FWIW, I was building the filesystem that triggered this as ext4: >> >> mkfs.ext4 -T default -m 0 -O ^huge_file,^flex_bg -E >> discard,lazy_itable_init /dev/mapper/... > > Ouy of curiosity, was there a reason you chose those particular file > system parameters? It's a surprising set, because if you're starting > with a fresh file system, enabling flex_bg produces a more optimal > file system layout. The intent of this was to create a small (64M) initial filesystem as fast as possible that could be resized (to 3G) on the fly later. (The performance of the filesystem did not need to be optimized, just the speed of creation so that the create+mount would happen as fast as possible, to unblock things waiting for this filesystem to exist.) In the more common situation, it was built also with: -b 4096 ... -E ...,resize=MMM ... [device] NNN where MMM > NNN. And an online resize would be started a bit after it was created, growing it up to MMM. -Kees
On 2012-06-07, at 5:38 PM, Kees Cook wrote: > On Thu, Jun 7, 2012 at 3:54 PM, Ted Ts'o <tytso@mit.edu> wrote: >> On Thu, Jun 07, 2012 at 03:40:40PM -0700, Kees Cook wrote: >>> >>> FWIW, I was building the filesystem that triggered this as ext4: >>> >>> mkfs.ext4 -T default -m 0 -O ^huge_file,^flex_bg -E >>> discard,lazy_itable_init /dev/mapper/... >> >> Ouy of curiosity, was there a reason you chose those particular file >> system parameters? It's a surprising set, because if you're starting >> with a fresh file system, enabling flex_bg produces a more optimal >> file system layout. > > The intent of this was to create a small (64M) initial filesystem as > fast as possible that could be resized (to 3G) on the fly later. (The > performance of the filesystem did not need to be optimized, just the > speed of creation so that the create+mount would happen as fast as > possible, to unblock things waiting for this filesystem to exist.) Actually, flex_bg should improve mke2fs times, because the inode and block bitmaps are allocated contiguously on disk. Cheers, Andreas -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c index 99b6324..cee7812 100644 --- a/fs/ext4/balloc.c +++ b/fs/ext4/balloc.c @@ -90,8 +90,8 @@ unsigned ext4_num_overhead_clusters(struct super_block *sb, * unusual file system layouts. */ if (ext4_block_in_group(sb, ext4_block_bitmap(sb, gdp), block_group)) { - block_cluster = EXT4_B2C(sbi, (start - - ext4_block_bitmap(sb, gdp))); + block_cluster = EXT4_B2C(sbi, + ext4_block_bitmap(sb, gdp) - start); if (block_cluster < num_clusters) block_cluster = -1; else if (block_cluster == num_clusters) { @@ -102,7 +102,7 @@ unsigned ext4_num_overhead_clusters(struct super_block *sb, if (ext4_block_in_group(sb, ext4_inode_bitmap(sb, gdp), block_group)) { inode_cluster = EXT4_B2C(sbi, - start - ext4_inode_bitmap(sb, gdp)); + ext4_inode_bitmap(sb, gdp) - start); if (inode_cluster < num_clusters) inode_cluster = -1; else if (inode_cluster == num_clusters) { @@ -114,7 +114,7 @@ unsigned ext4_num_overhead_clusters(struct super_block *sb, itbl_blk = ext4_inode_table(sb, gdp); for (i = 0; i < sbi->s_itb_per_group; i++) { if (ext4_block_in_group(sb, itbl_blk + i, block_group)) { - c = EXT4_B2C(sbi, start - itbl_blk + i); + c = EXT4_B2C(sbi, itbl_blk + i - start); if ((c < num_clusters) || (c == inode_cluster) || (c == block_cluster) || (c == itbl_cluster)) continue;
I think this patch should fix things. Could you give it a try (after re-enabling uninit_bg and then running e2fsck on the file systems)? Many thanks, - Ted commit b6b218940d47f20fc14e4d203930c575be0c65bf Author: Theodore Ts'o <tytso@mit.edu> Date: Thu Jun 7 00:26:56 2012 -0400 ext4: fix the free blocks calculation for ext3 file systems w/ uninit_bg Ext3 filesystems that are converted to use as many ext4 file systems as possible will enable uninit_bg to speed up e2fsck times. File systems with an native ext3 layout of inode tables and block allocation bitmaps (as opposed to ext4's flex_bg layout), ext4 would sometimes incorrectly calculate the number of free blocks in an uninitialized block group. As a result, when first allocating a block in block group marked uninitalized, ext4 would incorrectly report that the file system was corrupt: EXT4-fs error (device dm-42): ext4_mb_generate_buddy:741: group 1699, 32254 clusters in bitmap, 32258 in gd Fix the calculation in ext4_num_overhead_clusters() to avoid this problem. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html