Message ID | 20121008024126.GC468@thunk.org |
---|---|
State | Accepted, archived |
Headers | show |
I'm currently running with two ext4 patches: Author: Theodore Ts'o <tytso@mit.edu> Date: Sun Oct 7 22:18:56 2012 -0400 Subject: ext4: fix metadata checksum calculation for the superblock Author: Darrick J. Wong <darrick.wong@oracle.com> Date: Wed Oct 17 12:51:30 2012 -0700 Subject: ext4: Don't verify checksums of dx non-leaf nodes during fallback linear scan They appear to fix real problems. I notice, that neither of these have made it into 2.6.5. Should they be sent to -stable at some point? I'm not trying to overrule your judgements on the matter, just ensure that the omission is actually a conscious decision rather than an oversight. -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Oct 31, 2012 at 09:05:21PM -0400, George Spelvin wrote: > I'm currently running with two ext4 patches: > > Author: Theodore Ts'o <tytso@mit.edu> > Date: Sun Oct 7 22:18:56 2012 -0400 > Subject: ext4: fix metadata checksum calculation for the superblock > > Author: Darrick J. Wong <darrick.wong@oracle.com> > Date: Wed Oct 17 12:51:30 2012 -0700 > Subject: ext4: Don't verify checksums of dx non-leaf nodes during fallback linear scan > > They appear to fix real problems. I notice, that neither of these have > made it into 2.6.5. Should they be sent to -stable at some point? > > I'm not trying to overrule your judgements on the matter, just ensure that > the omission is actually a conscious decision rather than an oversight. <shrug> I was wondering too, but I figured Ted was probably busy dealing with the corruption bug and such. (Which itself doesn't seem to be in 3.6.x yet) I suppose it's a good sign that it's been more than a week and you haven't hit anything else... --D > -- > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Oct 31, 2012 at 06:13:12PM -0700, Darrick J. Wong wrote: > > Author: Theodore Ts'o <tytso@mit.edu> > > Date: Sun Oct 7 22:18:56 2012 -0400 > > Subject: ext4: fix metadata checksum calculation for the superblock This one was cc'ed to stable@vger.kernel.org. But when you said "I notice, that neither of thse have made it into 2.6.5", I assume you meant 3.5? The last 3.5 kernel is 3.5.7, and Greg K-H isn't backporting fixes to 3.5.x any more. (See http://www.kernel.org to see which kernels are marked "EOL"; those are the ones which are no longer getting updates.) So that means it should eventually make it to the 3.4.x and 3.6.x kernels. > > Author: Darrick J. Wong <darrick.wong@oracle.com> > > Date: Wed Oct 17 12:51:30 2012 -0700 > > Subject: ext4: Don't verify checksums of dx non-leaf nodes during fallback linear scan I missed this one because the subject line didn't have [PATCH] in it. (Darrick, it really helps if you use git format-patch / git send-email; you can use a message-id of the message you're replying to in the mail thread to chain the message to the thread.) I would have eventually found it in patchwork, but even in patchwork the listing would have had a potentially misleading subject line, since it grabs the patch title from the subject line of the e-mail. > <shrug> I was wondering too, but I figured Ted was probably busy dealing with > the corruption bug and such. > > (Which itself doesn't seem to be in 3.6.x yet) It isn't in 3.7-rc3 because I didn't see it before I sent the pull request to Linus.... At this point I'll just include it in the patches to be sent to Linus at the next merge window, mainly because I don't have the time to run a separate regression test run just for this patch, and it's only a cosmetic issue, right? - Ted -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Oct 31, 2012 at 09:50:15PM -0400, Theodore Ts'o wrote: > On Wed, Oct 31, 2012 at 06:13:12PM -0700, Darrick J. Wong wrote: > > > Author: Theodore Ts'o <tytso@mit.edu> > > > Date: Sun Oct 7 22:18:56 2012 -0400 > > > Subject: ext4: fix metadata checksum calculation for the superblock > > This one was cc'ed to stable@vger.kernel.org. But when you said "I > notice, that neither of thse have made it into 2.6.5", I assume you > meant 3.5? The last 3.5 kernel is 3.5.7, and Greg K-H isn't > backporting fixes to 3.5.x any more. (See http://www.kernel.org to > see which kernels are marked "EOL"; those are the ones which are no > longer getting updates.) > > So that means it should eventually make it to the 3.4.x and 3.6.x > kernels. I thought he meant 3.6.5, but I haven't really been paying 3.6.x much attention. > > > Author: Darrick J. Wong <darrick.wong@oracle.com> > > > Date: Wed Oct 17 12:51:30 2012 -0700 > > > Subject: ext4: Don't verify checksums of dx non-leaf nodes during fallback linear scan > > I missed this one because the subject line didn't have [PATCH] in it. > (Darrick, it really helps if you use git format-patch / git > send-email; you can use a message-id of the message you're replying to > in the mail thread to chain the message to the thread.) > > I would have eventually found it in patchwork, but even in patchwork > the listing would have had a potentially misleading subject line, > since it grabs the patch title from the subject line of the e-mail. Oops, I guess I did forget the magic "[PATCH]". Sorry about that. > > <shrug> I was wondering too, but I figured Ted was probably busy dealing with > > the corruption bug and such. > > > > (Which itself doesn't seem to be in 3.6.x yet) > > It isn't in 3.7-rc3 because I didn't see it before I sent the pull > request to Linus.... > > At this point I'll just include it in the patches to be sent to Linus > at the next merge window, mainly because I don't have the time to run > a separate regression test run just for this patch, and it's only a > cosmetic issue, right? Yep. --D > > - Ted > -- > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> This one was cc'ed to stable@vger.kernel.org. But when you said "I > notice, that neither of thse have made it into 2.6.5", I assume you > meant 3.5? Whoops, typo! I meant 3.6.5, the very latest just-out-today stable kernel. Quite a few 3.6.x kernels have come out since that patch was Cc'ed, and it keeps not being included. So I wondered. > So that means it should eventually make it to the 3.4.x and 3.6.x > kernels. That's what I thought, but I didn't want to pester Greg until I was sure of your intentions. > At this point I'll just include it in the patches to be sent to Linus > at the next merge window, mainly because I don't have the time to run > a separate regression test run just for this patch, and it's only a > cosmetic issue, right? Well, it causes the file system to be marked dirty and unnecessarily checked on reboot, which I contend is a bug, but it's not a data-loss bug. I do worry that it could cause file lookup to fail when it shouldn't, which *is* effectively a data-loss bug, even if the data reappears on reboot. But I'd have to understand the problem and fix better to know if that actually happens; I haven't observed it. -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Nov 01, 2012 at 02:12:12AM -0400, George Spelvin wrote: > > This one was cc'ed to stable@vger.kernel.org. But when you said "I > > notice, that neither of thse have made it into 2.6.5", I assume you > > meant 3.5? > > Whoops, typo! I meant 3.6.5, the very latest just-out-today stable > kernel. > > Quite a few 3.6.x kernels have come out since that patch was Cc'ed, > and it keeps not being included. So I wondered. > > > So that means it should eventually make it to the 3.4.x and 3.6.x > > kernels. > > That's what I thought, but I didn't want to pester Greg until I was sure > of your intentions. > > > At this point I'll just include it in the patches to be sent to Linus > > at the next merge window, mainly because I don't have the time to run > > a separate regression test run just for this patch, and it's only a > > cosmetic issue, right? > > Well, it causes the file system to be marked dirty and unnecessarily > checked on reboot, which I contend is a bug, but it's not a data-loss > bug. > > I do worry that it could cause file lookup to fail when it shouldn't, > which *is* effectively a data-loss bug, even if the data reappears > on reboot. But I'd have to understand the problem and fix better to > know if that actually happens; I haven't observed it. Yes, it would be useful to know what's going on with this directory file, since it seems to fallback to linear scan, yet e2fsck -D doesn't fix it. What I was /going/ for was that the kernel would notice a bad directory and flag it for fsck on reboot. Upon reboot, fsck would be run, notice the bad dir, and feed it to the directory rebuilder to get it fixed for good. However, there doesn't seem to be any real checksum mismatch, so the rebuild doesn't happen. Also ... refresh my memory -- some files have disappeared as a result of this happening? --D -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> Yes, it would be useful to know what's going on with this directory file, > since it seems to fallback to linear scan, yet e2fsck -D doesn't fix it. > What I was /going/ for was that the kernel would notice a bad directory > and flag it for fsck on reboot. Upon reboot, fsck would be run, notice > the bad dir, and feed it to the directory rebuilder to get it fixed > for good. However, there doesn't seem to be any real checksum mismatch, > so the rebuild doesn't happen. That's what confuses me. I had already run e2fsck -D (which I assume rebuilds all directories, even if unnecessary) before observing the problem. The other odd clue is that it's always nfsd that chokes; other accesses to the directory (ls -U, ls -lU, grep -r) don't produce the message. > Also ... refresh my memory -- some files have disappeared as a result of this > happening? I haven't observed it, no. But the nature of the symptoms suggests it might be happening. -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Nov 01, 2012 at 03:07:31AM -0400, George Spelvin wrote: > > Yes, it would be useful to know what's going on with this directory file, > > since it seems to fallback to linear scan, yet e2fsck -D doesn't fix it. > > What I was /going/ for was that the kernel would notice a bad directory > > and flag it for fsck on reboot. Upon reboot, fsck would be run, notice > > the bad dir, and feed it to the directory rebuilder to get it fixed > > for good. However, there doesn't seem to be any real checksum mismatch, > > so the rebuild doesn't happen. > > That's what confuses me. I had already run e2fsck -D (which I assume > rebuilds all directories, even if unnecessary) before observing the > problem. The other odd clue is that it's always nfsd that chokes; > other accesses to the directory (ls -U, ls -lU, grep -r) don't produce > the message. Oh, so ... it's just nfsd that causes the linear fallback? Regular (i.e. non-nfs) users can see everything in the dir, no error messages? Now *that* is odd. :) You know, I was starting to wonder what on earth would even cause the fallback in the first place. It even looked like most of the "your dir is corrupt" exits from that function would spit out an error or be somehow obviously broken. > > Also ... refresh my memory -- some files have disappeared as a result of this > > happening? > > I haven't observed it, no. But the nature of the symptoms suggests it > might be happening. Hum. When linear scan happens on a hashed dir, it's scanning the same blocks that the hash scan sees. The htree block looks like a regular directory block with one huge "unused" dirent that wraps all the htree data. So, the linear scan should find the exact same files as a htree scan would. If it doesn't, something's wrong. But you say it isn't, so I imagine it's fine. <shrug> Another thing for me to ponder tomorrow. :) --D -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> Oh, so ... it's just nfsd that causes the linear fallback? Regular (i.e. > non-nfs) users can see everything in the dir, no error messages? Yup. After it survived one e2fsck -D, I poked at the directory a bit to see if I could cause the error. No success from local access. It's also probably an NFSv2 client. I wonder if it's doing something odd with directory seeks that's causing problems; perhaps htree and the 32-bit seek cookie limit are not friends? >> I haven't observed it, no. But the nature of the symptoms suggests it >> might be happening. > Hum. When linear scan happens on a hashed dir, it's scanning the same > blocks that the hash scan sees. The htree block looks like a regular > directory block with one huge "unused" dirent that wraps all the htree > data. So, the linear scan should find the exact same files as a htree > scan would. If it doesn't, something's wrong. But you say it isn't, > so I imagine it's fine. Maybe I was wrong. I was worried that it was aborting the directory scan due to the error and thus files would disappear. If that doesn't happen, no worries. -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Nov 01, 2012 at 03:28:47AM -0400, George Spelvin wrote: > > Oh, so ... it's just nfsd that causes the linear fallback? Regular (i.e. > > non-nfs) users can see everything in the dir, no error messages? > > Yup. After it survived one e2fsck -D, I poked at the directory a bit > to see if I could cause the error. No success from local access. > > It's also probably an NFSv2 client. I wonder if it's doing something > odd with directory seeks that's causing problems; perhaps htree and the > 32-bit seek cookie limit are not friends? <shrug> I'm not nfs-wise, sadly. I _am_ wondering if an ftrace of this might be useful... or a gigantic glut of data that I'll never finish processing. Just from a quick read of ext4_find_entry() it looks like the only thing that results in fallback mode without a kernel message is ext4_bread() failing in dx_probe()? > >> I haven't observed it, no. But the nature of the symptoms suggests it > >> might be happening. > > > Hum. When linear scan happens on a hashed dir, it's scanning the same > > blocks that the hash scan sees. The htree block looks like a regular > > directory block with one huge "unused" dirent that wraps all the htree > > data. So, the linear scan should find the exact same files as a htree > > scan would. If it doesn't, something's wrong. But you say it isn't, > > so I imagine it's fine. > > Maybe I was wrong. I was worried that it was aborting the directory > scan due to the error and thus files would disappear. If that doesn't > happen, no worries. Oh well, it'll run slowly but at least it won't be throwing up errors. --D -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index 3ab2539..78971cf 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -2063,8 +2063,7 @@ extern int ext4_resize_fs(struct super_block *sb, ext4_fsblk_t n_blocks_count); extern int ext4_calculate_overhead(struct super_block *sb); extern int ext4_superblock_csum_verify(struct super_block *sb, struct ext4_super_block *es); -extern void ext4_superblock_csum_set(struct super_block *sb, - struct ext4_super_block *es); +extern void ext4_superblock_csum_set(struct super_block *sb); extern void *ext4_kvmalloc(size_t size, gfp_t flags); extern void *ext4_kvzalloc(size_t size, gfp_t flags); extern void ext4_kvfree(void *ptr); diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c index bfa65b4..b4323ba 100644 --- a/fs/ext4/ext4_jbd2.c +++ b/fs/ext4/ext4_jbd2.c @@ -143,17 +143,13 @@ int __ext4_handle_dirty_super(const char *where, unsigned int line, struct buffer_head *bh = EXT4_SB(sb)->s_sbh; int err = 0; + ext4_superblock_csum_set(sb); if (ext4_handle_valid(handle)) { - ext4_superblock_csum_set(sb, - (struct ext4_super_block *)bh->b_data); err = jbd2_journal_dirty_metadata(handle, bh); if (err) ext4_journal_abort_handle(where, line, __func__, bh, handle, err); - } else { - ext4_superblock_csum_set(sb, - (struct ext4_super_block *)bh->b_data); + } else mark_buffer_dirty(bh); - } return err; } diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 982f6fc..5ededf1 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -143,9 +143,10 @@ int ext4_superblock_csum_verify(struct super_block *sb, return es->s_checksum == ext4_superblock_csum(sb, es); } -void ext4_superblock_csum_set(struct super_block *sb, - struct ext4_super_block *es) +void ext4_superblock_csum_set(struct super_block *sb) { + struct ext4_super_block *es = EXT4_SB(sb)->s_es; + if (!EXT4_HAS_RO_COMPAT_FEATURE(sb, EXT4_FEATURE_RO_COMPAT_METADATA_CSUM)) return; @@ -4387,7 +4388,7 @@ static int ext4_commit_super(struct super_block *sb, int sync) cpu_to_le32(percpu_counter_sum_positive( &EXT4_SB(sb)->s_freeinodes_counter)); BUFFER_TRACE(sbh, "marking dirty"); - ext4_superblock_csum_set(sb, es); + ext4_superblock_csum_set(sb); mark_buffer_dirty(sbh); if (sync) { error = sync_dirty_buffer(sbh);