diff mbox

[07/22] ext4: Create bitmap checksum helper functions

Message ID 20111128232703.19194.55084.stgit@elm3c44.beaverton.ibm.com
State Superseded, archived
Headers show

Commit Message

Darrick J. Wong Nov. 28, 2011, 11:27 p.m. UTC
These helper functions will be used to calculate and verify the block and inode
bitmap checksums.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
---
 fs/ext4/bitmap.c |   13 +++++++++++++
 1 files changed, 13 insertions(+), 0 deletions(-)



--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Theodore Ts'o Dec. 5, 2011, 4:33 p.m. UTC | #1
On Mon, Nov 28, 2011 at 03:27:03PM -0800, Darrick J. Wong wrote:
> +static __u32 ext4_bitmap_csum(struct super_block *sb, ext4_group_t group,
> +			      struct buffer_head *bh, int sz)
> +{
> +	__u32 csum;
> +	struct ext4_sb_info *sbi = EXT4_SB(sb);
> +
> +	group = cpu_to_le32(group);
> +	csum = ext4_chksum(sbi, sbi->s_uuid_csum, (__u8 *)&group,
> +			   sizeof(group));
> +	csum = ext4_chksum(sbi, csum, (__u8 *)bh->b_data, sz);
> +
> +	return csum;
> +}

Note: it's strictly speaking not necessary to mix in the group and
s_csum_seed here.  It's useful for the inode table blocks (ITB's)
because the checksum for a particular ITB is located *in* the ITB
itself.  So if an ITB gets written to the wrong place, and in
particular, on top of another ITB, we want to be able to know which
cloned copy was written to the wrong place on disk.

But in the case of the inode and block allocation bitmaps, the
checksums are stored in the block group descriptors; so if the bitmap
is written to the wrong place (and on top of another bitmap), the
checksum will fail to verify, independent of whether we've mixed in
the fs-specific csum seed and the group number.

So I'd suggest dropping this, which will shave a few cycles off of the
checksum calculation, and it will also simplify the code since we
won't need this particular function.

						- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Darrick J. Wong Dec. 5, 2011, 8:31 p.m. UTC | #2
On Mon, Dec 05, 2011 at 11:33:29AM -0500, Ted Ts'o wrote:
> On Mon, Nov 28, 2011 at 03:27:03PM -0800, Darrick J. Wong wrote:
> > +static __u32 ext4_bitmap_csum(struct super_block *sb, ext4_group_t group,
> > +			      struct buffer_head *bh, int sz)
> > +{
> > +	__u32 csum;
> > +	struct ext4_sb_info *sbi = EXT4_SB(sb);
> > +
> > +	group = cpu_to_le32(group);
> > +	csum = ext4_chksum(sbi, sbi->s_uuid_csum, (__u8 *)&group,
> > +			   sizeof(group));
> > +	csum = ext4_chksum(sbi, csum, (__u8 *)bh->b_data, sz);
> > +
> > +	return csum;
> > +}
> 
> Note: it's strictly speaking not necessary to mix in the group and
> s_csum_seed here.  It's useful for the inode table blocks (ITB's)
> because the checksum for a particular ITB is located *in* the ITB
> itself.  So if an ITB gets written to the wrong place, and in
> particular, on top of another ITB, we want to be able to know which
> cloned copy was written to the wrong place on disk.
> 
> But in the case of the inode and block allocation bitmaps, the
> checksums are stored in the block group descriptors; so if the bitmap
> is written to the wrong place (and on top of another bitmap), the
> checksum will fail to verify, independent of whether we've mixed in
> the fs-specific csum seed and the group number.
>
> So I'd suggest dropping this, which will shave a few cycles off of the
> checksum calculation, and it will also simplify the code since we
> won't need this particular function.

Ok, I can get rid of the function, but I'll keep using sbi->s_uuid_csum as the
crc32c seed since it's basically free.

--D
> 
> 						- Ted
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Darrick J. Wong Dec. 5, 2011, 11:54 p.m. UTC | #3
On Mon, Dec 05, 2011 at 12:31:41PM -0800, Darrick J. Wong wrote:
> On Mon, Dec 05, 2011 at 11:33:29AM -0500, Ted Ts'o wrote:
> > On Mon, Nov 28, 2011 at 03:27:03PM -0800, Darrick J. Wong wrote:
> > > +static __u32 ext4_bitmap_csum(struct super_block *sb, ext4_group_t group,
> > > +			      struct buffer_head *bh, int sz)
> > > +{
> > > +	__u32 csum;
> > > +	struct ext4_sb_info *sbi = EXT4_SB(sb);
> > > +
> > > +	group = cpu_to_le32(group);
> > > +	csum = ext4_chksum(sbi, sbi->s_uuid_csum, (__u8 *)&group,
> > > +			   sizeof(group));
> > > +	csum = ext4_chksum(sbi, csum, (__u8 *)bh->b_data, sz);
> > > +
> > > +	return csum;
> > > +}
> > 
> > Note: it's strictly speaking not necessary to mix in the group and
> > s_csum_seed here.  It's useful for the inode table blocks (ITB's)
> > because the checksum for a particular ITB is located *in* the ITB
> > itself.  So if an ITB gets written to the wrong place, and in
> > particular, on top of another ITB, we want to be able to know which
> > cloned copy was written to the wrong place on disk.
> > 
> > But in the case of the inode and block allocation bitmaps, the
> > checksums are stored in the block group descriptors; so if the bitmap
> > is written to the wrong place (and on top of another bitmap), the
> > checksum will fail to verify, independent of whether we've mixed in
> > the fs-specific csum seed and the group number.
> >
> > So I'd suggest dropping this, which will shave a few cycles off of the
> > checksum calculation, and it will also simplify the code since we
> > won't need this particular function.
> 
> Ok, I can get rid of the function, but I'll keep using sbi->s_uuid_csum as the
> crc32c seed since it's basically free.

On second thought, I purged it out of the kernel and e2fsprogs.  The bitmap
checksum calculations seed with ~0 now.

--D
> 
> --D
> > 
> > 						- Ted
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andreas Dilger Dec. 6, 2011, 5:19 p.m. UTC | #4
On 2011-12-05, at 9:33, Ted Ts'o <tytso@mit.edu> wrote:
> Note: it's strictly speaking not necessary to mix in the group and
> s_csum_seed here.  It's useful for the inode table blocks (ITB's)
> because the checksum for a particular ITB is located *in* the ITB
> itself.  So if an ITB gets written to the wrong place, and in
> particular, on top of another ITB, we want to be able to know which
> cloned copy was written to the wrong place on disk.
> 
> But in the case of the inode and block allocation bitmaps, the
> checksums are stored in the block group descriptors; so if the bitmap
> is written to the wrong place (and on top of another bitmap), the
> checksum will fail to verify, independent of whether we've mixed in
> the fs-specific csum seed and the group number.
> 
> So I'd suggest dropping this, which will shave a few cycles off of the
> checksum calculation, and it will also simplify the code since we
> won't need this particular function.

I wouldn't mind keeping the group just to be consistent with all of the other checksums that are used in the filesystem, which are largely inside the structure being checked.

The s_uuid is definitely useful to keep as the seed because the block and inode bitmaps are not initialized at mke2fs time with uninit_bg, and it is possible to read a stale bitmap from disk that might belong to an earlier instance of the filesystem in case of a failed or misplaced write of the correct bitmap. 

That isn't important for e2fsck, since it doesn't really use the bitmaps, but it is important for the kernel not to use bad bitmaps and corrupt the filesystem further. 

Cheers, Andreas--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Darrick J. Wong Dec. 6, 2011, 8:59 p.m. UTC | #5
On Tue, Dec 06, 2011 at 10:19:12AM -0700, Andreas Dilger wrote:
> On 2011-12-05, at 9:33, Ted Ts'o <tytso@mit.edu> wrote:
> > Note: it's strictly speaking not necessary to mix in the group and
> > s_csum_seed here.  It's useful for the inode table blocks (ITB's)
> > because the checksum for a particular ITB is located *in* the ITB
> > itself.  So if an ITB gets written to the wrong place, and in
> > particular, on top of another ITB, we want to be able to know which
> > cloned copy was written to the wrong place on disk.
> > 
> > But in the case of the inode and block allocation bitmaps, the
> > checksums are stored in the block group descriptors; so if the bitmap
> > is written to the wrong place (and on top of another bitmap), the
> > checksum will fail to verify, independent of whether we've mixed in
> > the fs-specific csum seed and the group number.
> > 
> > So I'd suggest dropping this, which will shave a few cycles off of the
> > checksum calculation, and it will also simplify the code since we
> > won't need this particular function.
> 
> I wouldn't mind keeping the group just to be consistent with all of the other
> checksums that are used in the filesystem, which are largely inside the
> structure being checked.
> 
> The s_uuid is definitely useful to keep as the seed because the block and
> inode bitmaps are not initialized at mke2fs time with uninit_bg, and it is
> possible to read a stale bitmap from disk that might belong to an earlier
> instance of the filesystem in case of a failed or misplaced write of the
> correct bitmap. 

Hmm... let's say you have bitmap B before mkfs and bitmap B' after mkfs + some
file writes.  B' is lost during write.  It would be bad if B != B' and
crc32c(B) == crc32c(B'), in which case you'd use the wrong bitmap.

I suppose having the fsuuid + groupnum would probably help to make the inputs
to crc32c() more distinct, which would be useful since iirc P(collision)
decreases as the Hamming distance increases.  I'm running a simulation to check
that claim.

--D

> That isn't important for e2fsck, since it doesn't really use the bitmaps, but
> it is important for the kernel not to use bad bitmaps and corrupt the
> filesystem further. 
> 
> Cheers, Andreas--
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/ext4/bitmap.c b/fs/ext4/bitmap.c
index fa3af81..285d23a 100644
--- a/fs/ext4/bitmap.c
+++ b/fs/ext4/bitmap.c
@@ -29,3 +29,16 @@  unsigned int ext4_count_free(struct buffer_head *map, unsigned int numchars)
 
 #endif  /*  EXT4FS_DEBUG  */
 
+static __u32 ext4_bitmap_csum(struct super_block *sb, ext4_group_t group,
+			      struct buffer_head *bh, int sz)
+{
+	__u32 csum;
+	struct ext4_sb_info *sbi = EXT4_SB(sb);
+
+	group = cpu_to_le32(group);
+	csum = ext4_chksum(sbi, sbi->s_uuid_csum, (__u8 *)&group,
+			   sizeof(group));
+	csum = ext4_chksum(sbi, csum, (__u8 *)bh->b_data, sz);
+
+	return csum;
+}