Patchwork [13/54] e2fsck: Verify inode bitmap checksum

login
register
mail settings
Submitter Darrick J. Wong
Date March 6, 2012, 11:58 p.m.
Message ID <20120306235844.11945.9126.stgit@elm3b70.beaverton.ibm.com>
Download mbox | patch
Permalink /patch/145071/
State Deferred
Delegated to: Theodore Ts'o
Headers show

Comments

Darrick J. Wong - March 6, 2012, 11:58 p.m.
Rewrite the block bitmap when the checksum doesn't match.  This is ok since
e2fsck will have already computed the correct inode bitmap.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
---
 e2fsck/pass5.c   |   64 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 e2fsck/problem.c |    5 ++++
 e2fsck/problem.h |    3 +++
 3 files changed, 72 insertions(+), 0 deletions(-)



--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Theodore Ts'o - March 10, 2012, 5:46 a.m.
On Tue, Mar 06, 2012 at 03:58:44PM -0800, Darrick J. Wong wrote:
> Rewrite the block bitmap when the checksum doesn't match.  This is ok since
> e2fsck will have already computed the correct inode bitmap.
> 
> Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>

> @@ -74,6 +77,67 @@ void e2fsck_pass5(e2fsck_t ctx)
>  	print_resource_track(ctx, _("Pass 5"), &rtrack, ctx->fs->io);
>  }
>  
> +static void check_inode_bitmap_checksum(e2fsck_t ctx)
> +{
> +	struct problem_context	pctx;
> +	struct ext4_group_desc	*gdp;
> +	char		*buf;
> +	dgrp_t		i;
> +	int		nbytes;
> +	ext2_ino_t	ino_itr;
> +	errcode_t	retval;
> +	int		csum_flag = 0;
> +
> +	/* If bitmap is dirty from being fixed, checksum will be corrected */
> +	if (ext2fs_test_ib_dirty(ctx->fs))
> +		return;
> +
> +	nbytes = (size_t)(EXT2_INODES_PER_GROUP(ctx->fs->super) / 8);
> +	retval = ext2fs_get_memalign(ctx->fs->blocksize, ctx->fs->blocksize,
> +				     &buf);
> +	if (retval) {
> +		com_err(ctx->program_name, 0,
> +		    _("check_inode_bitmap_checksum: Memory allocation error"));
> +		fatal_error(ctx, 0);
> +	}
> +
> +	if (EXT2_HAS_RO_COMPAT_FEATURE(ctx->fs->super,
> +				       EXT4_FEATURE_RO_COMPAT_GDT_CSUM))
> +		csum_flag = 1;

This patch just looks wrong.  RO_COMPAT_GDT_CSUM is the old feature,
and doesn't imply that there will be a checksum in the inode bitmap
block.  Shouldn't this be RO_COMPAT_METADATA_CSUM?

By the way, the bugs which I'm finding in this patch series are
seriously degrading my faith about how well both they and the kernel
side patches have been tested.....

At this point I will probably work on other kernel patches, and return
to this later.  

						- Ted

P.S.  Note that if I apply the full patch series, I am still finding
45 (out of 118) test failures from the regression test suite.  While I
was bisecting to find errors, I found this one by quick inspection.
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Theodore Ts'o - March 11, 2012, 2:11 a.m.
On Sat, Mar 10, 2012 at 12:46:57AM -0500, Ted Ts'o wrote:
> 
> This patch just looks wrong.  RO_COMPAT_GDT_CSUM is the old feature,
> and doesn't imply that there will be a checksum in the inode bitmap
> block.  Shouldn't this be RO_COMPAT_METADATA_CSUM?

Correction; the big problem here is that check_inode_bitmap_checksum()
(and in the next e2fsck page, check_block_bitmap_checksum()) are are
doing their thing without actually checking for
RO_COMPAT_METADATA_CSUM).  These code paths shouldn't be getting
activated at all for non-METADATA_CSUM file systems.  But they are.

(And the check_block_bitmap_checksum() function is using
EXT2_BLOCKS_PER_GROUP where it needs to use EXT2_CLUSTERS_PER_GROUP or
it will end up corrupting memory which will cause the bigalloc-related
test f_dup_ba blow up.  But again, this code path shouldn't have been
getting activated in the first place.)

					- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Darrick J. Wong - March 12, 2012, 9:25 p.m.
On Sat, Mar 10, 2012 at 09:11:50PM -0500, Ted Ts'o wrote:
> On Sat, Mar 10, 2012 at 12:46:57AM -0500, Ted Ts'o wrote:
> > 
> > This patch just looks wrong.  RO_COMPAT_GDT_CSUM is the old feature,
> > and doesn't imply that there will be a checksum in the inode bitmap
> > block.  Shouldn't this be RO_COMPAT_METADATA_CSUM?
> 
> Correction; the big problem here is that check_inode_bitmap_checksum()
> (and in the next e2fsck page, check_block_bitmap_checksum()) are are
> doing their thing without actually checking for
> RO_COMPAT_METADATA_CSUM).  These code paths shouldn't be getting
> activated at all for non-METADATA_CSUM file systems.  But they are.

You're right, the function could be optimized to exit early if metadata_csum
isn't set.  The bitmap verify function won't flag errors when metadata_csum is
unset, at least, but I suppose it is unecessary looping and IO.

--D

> (And the check_block_bitmap_checksum() function is using
> EXT2_BLOCKS_PER_GROUP where it needs to use EXT2_CLUSTERS_PER_GROUP or
> it will end up corrupting memory which will cause the bigalloc-related
> test f_dup_ba blow up.  But again, this code path shouldn't have been
> getting activated in the first place.)
> 
> 					- Ted
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Darrick J. Wong - March 12, 2012, 10:30 p.m.
On Mon, Mar 12, 2012 at 02:25:37PM -0700, Darrick J. Wong wrote:
> On Sat, Mar 10, 2012 at 09:11:50PM -0500, Ted Ts'o wrote:
> > On Sat, Mar 10, 2012 at 12:46:57AM -0500, Ted Ts'o wrote:
> > > 
> > > This patch just looks wrong.  RO_COMPAT_GDT_CSUM is the old feature,
> > > and doesn't imply that there will be a checksum in the inode bitmap
> > > block.  Shouldn't this be RO_COMPAT_METADATA_CSUM?
> > 
> > Correction; the big problem here is that check_inode_bitmap_checksum()
> > (and in the next e2fsck page, check_block_bitmap_checksum()) are are
> > doing their thing without actually checking for
> > RO_COMPAT_METADATA_CSUM).  These code paths shouldn't be getting
> > activated at all for non-METADATA_CSUM file systems.  But they are.
> 
> You're right, the function could be optimized to exit early if metadata_csum
> isn't set.  The bitmap verify function won't flag errors when metadata_csum is
> unset, at least, but I suppose it is unecessary looping and IO.
> 
> --D
> 
> > (And the check_block_bitmap_checksum() function is using
> > EXT2_BLOCKS_PER_GROUP where it needs to use EXT2_CLUSTERS_PER_GROUP or
> > it will end up corrupting memory which will cause the bigalloc-related
> > test f_dup_ba blow up.  But again, this code path shouldn't have been
> > getting activated in the first place.)

As for this bit -- I should have paid closer attention when bigalloc went in.
I'll send out a corrected patch.

--D
> > 
> > 					- Ted
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/e2fsck/pass5.c b/e2fsck/pass5.c
index 1e836e3..72c4936 100644
--- a/e2fsck/pass5.c
+++ b/e2fsck/pass5.c
@@ -27,6 +27,7 @@  static void check_block_bitmaps(e2fsck_t ctx);
 static void check_inode_bitmaps(e2fsck_t ctx);
 static void check_inode_end(e2fsck_t ctx);
 static void check_block_end(e2fsck_t ctx);
+static void check_inode_bitmap_checksum(e2fsck_t ctx);
 
 void e2fsck_pass5(e2fsck_t ctx)
 {
@@ -64,6 +65,8 @@  void e2fsck_pass5(e2fsck_t ctx)
 	if (ctx->flags & E2F_FLAG_SIGNAL_MASK)
 		return;
 
+	check_inode_bitmap_checksum(ctx);
+
 	ext2fs_free_inode_bitmap(ctx->inode_used_map);
 	ctx->inode_used_map = 0;
 	ext2fs_free_inode_bitmap(ctx->inode_dir_map);
@@ -74,6 +77,67 @@  void e2fsck_pass5(e2fsck_t ctx)
 	print_resource_track(ctx, _("Pass 5"), &rtrack, ctx->fs->io);
 }
 
+static void check_inode_bitmap_checksum(e2fsck_t ctx)
+{
+	struct problem_context	pctx;
+	struct ext4_group_desc	*gdp;
+	char		*buf;
+	dgrp_t		i;
+	int		nbytes;
+	ext2_ino_t	ino_itr;
+	errcode_t	retval;
+	int		csum_flag = 0;
+
+	/* If bitmap is dirty from being fixed, checksum will be corrected */
+	if (ext2fs_test_ib_dirty(ctx->fs))
+		return;
+
+	nbytes = (size_t)(EXT2_INODES_PER_GROUP(ctx->fs->super) / 8);
+	retval = ext2fs_get_memalign(ctx->fs->blocksize, ctx->fs->blocksize,
+				     &buf);
+	if (retval) {
+		com_err(ctx->program_name, 0,
+		    _("check_inode_bitmap_checksum: Memory allocation error"));
+		fatal_error(ctx, 0);
+	}
+
+	if (EXT2_HAS_RO_COMPAT_FEATURE(ctx->fs->super,
+				       EXT4_FEATURE_RO_COMPAT_GDT_CSUM))
+		csum_flag = 1;
+
+	clear_problem_context(&pctx);
+	for (i = 0; i < ctx->fs->group_desc_count; i++) {
+		if (csum_flag && ext2fs_bg_flags_test(ctx->fs, i,
+						      EXT2_BG_INODE_UNINIT))
+			continue;
+
+		ino_itr = 1 + (i * (nbytes << 3));
+		gdp = (struct ext4_group_desc *)ext2fs_group_desc(ctx->fs,
+				ctx->fs->group_desc, i);
+		retval = ext2fs_get_inode_bitmap_range2(ctx->fs->inode_map,
+							ino_itr, nbytes << 3,
+							buf);
+		if (retval)
+			break;
+
+		if (ext2fs_inode_bitmap_csum_verify(ctx->fs, i, buf, nbytes))
+			continue;
+		pctx.group = i;
+		if (!fix_problem(ctx, PR_5_INODE_BITMAP_CSUM_INVALID, &pctx))
+			continue;
+
+		/*
+		 * Fixing one checksum will rewrite all of them.  The bitmap
+		 * will be checked against the one we made during pass1 for
+		 * discrepancies, and fixed if need be.
+		 */
+		ext2fs_mark_ib_dirty(ctx->fs);
+		break;
+	}
+
+	ext2fs_free_mem(&buf);
+}
+
 static void e2fsck_discard_blocks(e2fsck_t ctx, io_manager manager,
 				  blk64_t start, blk64_t count)
 {
diff --git a/e2fsck/problem.c b/e2fsck/problem.c
index a6dd88a..2954d73 100644
--- a/e2fsck/problem.c
+++ b/e2fsck/problem.c
@@ -1684,6 +1684,11 @@  static struct e2fsck_problem problem_table[] = {
 	  N_("@g %g @i(s) in use but @g is marked INODE_UNINIT\n"),
 	  PROMPT_FIX, PR_PREEN_OK },
 
+	/* Group N inode bitmap does not match checksum */
+	{ PR_5_INODE_BITMAP_CSUM_INVALID,
+	  N_("@g %g @i bitmap does not match checksum\n"),
+	  PROMPT_FIX, PR_LATCH_IBITMAP | PR_PREEN_OK },
+
 	/* Post-Pass 5 errors */
 
 	/* Recreate journal if E2F_FLAG_JOURNAL_INODE flag is set */
diff --git a/e2fsck/problem.h b/e2fsck/problem.h
index 5797fe2..0adc7ea 100644
--- a/e2fsck/problem.h
+++ b/e2fsck/problem.h
@@ -1019,6 +1019,9 @@  struct problem_context {
 /* Inode in use but group is marked INODE_UNINIT */
 #define PR_5_INODE_UNINIT		0x050019
 
+/* Inode bitmap checksum does not match */
+#define PR_5_INODE_BITMAP_CSUM_INVALID	0x05001A
+
 /*
  * Post-Pass 5 errors
  */