Message ID | 20120801071935.GA12929@gmail.com |
---|---|
State | Accepted, archived |
On 08/01/2012 02:19 PM, Zheng Liu wrote:
> Hi Nick and Tomasz,
>
> Could you please try this patch? It seems that the problem is because
> error code doesn't be clear.

Hi,

didn't try the patch yet, but I've noticed the following in dmesg since 3.5:

[69004.637293] EXT4-fs warning (device sda1): dx_probe:732: Corrupt dir inode 524293, running e2fsck is recommended.
[69004.637330] EXT4-fs warning (device sda1): dx_probe:647: dx entry: limit != root limit
[69004.637335] EXT4-fs warning (device sda1): dx_probe:732: Corrupt dir inode 524293, running e2fsck is recommended.
[69004.637365] EXT4-fs warning (device sda1): dx_probe:647: dx entry: limit != root limit
[69004.637370] EXT4-fs warning (device sda1): dx_probe:732: Corrupt dir inode 524293, running e2fsck is recommended.

Could this be related?
On Wed, Aug 01, 2012 at 02:16:41PM +0700, Tomasz Chmielewski wrote:
> On 08/01/2012 02:19 PM, Zheng Liu wrote:
> > Hi Nick and Tomasz,
> >
> > Could you please try this patch? It seems that the problem is because
> > error code doesn't be clear.
>
> Hi,
>
> didn't try the patch yet, but I've noticed the following in dmesg since 3.5:
>
> [69004.637293] EXT4-fs warning (device sda1): dx_probe:732: Corrupt dir inode 524293, running e2fsck is recommended.
> [69004.637330] EXT4-fs warning (device sda1): dx_probe:647: dx entry: limit != root limit
> [69004.637335] EXT4-fs warning (device sda1): dx_probe:732: Corrupt dir inode 524293, running e2fsck is recommended.
> [69004.637365] EXT4-fs warning (device sda1): dx_probe:647: dx entry: limit != root limit
> [69004.637370] EXT4-fs warning (device sda1): dx_probe:732: Corrupt dir inode 524293, running e2fsck is recommended.
>
>
> Could this be related?

Are these messages printed before you enable metadata_csum feature?

Regards,
Zheng
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On 08/01/2012 02:48 PM, Zheng Liu wrote:
>> didn't try the patch yet, but I've noticed the following in dmesg since 3.5:
>>
>> [69004.637293] EXT4-fs warning (device sda1): dx_probe:732: Corrupt dir inode 524293, running e2fsck is recommended.
>> [69004.637330] EXT4-fs warning (device sda1): dx_probe:647: dx entry: limit != root limit
>> [69004.637335] EXT4-fs warning (device sda1): dx_probe:732: Corrupt dir inode 524293, running e2fsck is recommended.
>> [69004.637365] EXT4-fs warning (device sda1): dx_probe:647: dx entry: limit != root limit
>> [69004.637370] EXT4-fs warning (device sda1): dx_probe:732: Corrupt dir inode 524293, running e2fsck is recommended.
>>
>>
>> Could this be related?
>
> Are these messages printed before you enable metadata_csum feature?

I didn't notice them before trying to enable metadata_csum feature.

On the other hand, enabling metadata_csum feature was pretty much the first thing I've made after booting to 3.5 kernel on this system, so it could be it changed something.

Also, when I do:

dumpe2fs -h /dev/sda1|grep metadata_csum

I don't see metadata_csum feature anywhere.
On Wed, Aug 01, 2012 at 02:51:43PM +0700, Tomasz Chmielewski wrote:
> On 08/01/2012 02:48 PM, Zheng Liu wrote:
>
> >>didn't try the patch yet, but I've noticed the following in dmesg since 3.5:
> >>
> >>[69004.637293] EXT4-fs warning (device sda1): dx_probe:732: Corrupt dir inode 524293, running e2fsck is recommended.
> >>[69004.637330] EXT4-fs warning (device sda1): dx_probe:647: dx entry: limit != root limit
> >>[69004.637335] EXT4-fs warning (device sda1): dx_probe:732: Corrupt dir inode 524293, running e2fsck is recommended.
> >>[69004.637365] EXT4-fs warning (device sda1): dx_probe:647: dx entry: limit != root limit
> >>[69004.637370] EXT4-fs warning (device sda1): dx_probe:732: Corrupt dir inode 524293, running e2fsck is recommended.
> >>
> >>
> >>Could this be related?
> >
> >Are these messages printed before you enable metadata_csum feature?
>
> I didn't notice them before trying to enable metadata_csum feature.
>
> On the other hand, enabling metadata_csum feature was pretty much
> the first thing I've made after booting to 3.5 kernel on this
> system, so it could be it changed something.

Yes, it will change something when you try to enable metadata_csum feature in tune2fs. So you'd better to run e2fsck to check your filesystem IMHO.

>
>
> Also, when I do:
>
> dumpe2fs -h /dev/sda1|grep metadata_csum
>
> I don't see metadata_csum feature anywhere.

You won't see this feature until you can enable this feature successful.

Regards,
Zheng
On Wed, Aug 1, 2012 at 3:17 AM, Zheng Liu <gnehzuil.liu@gmail.com> wrote:
> On Wed, Aug 01, 2012 at 02:51:43PM +0700, Tomasz Chmielewski wrote:
>> On 08/01/2012 02:48 PM, Zheng Liu wrote:
>>
>> >>didn't try the patch yet, but I've noticed the following in dmesg since 3.5:
>> >>
>> >>[69004.637293] EXT4-fs warning (device sda1): dx_probe:732: Corrupt dir inode 524293, running e2fsck is recommended.
>> >>[69004.637330] EXT4-fs warning (device sda1): dx_probe:647: dx entry: limit != root limit
>> >>[69004.637335] EXT4-fs warning (device sda1): dx_probe:732: Corrupt dir inode 524293, running e2fsck is recommended.
>> >>[69004.637365] EXT4-fs warning (device sda1): dx_probe:647: dx entry: limit != root limit
>> >>[69004.637370] EXT4-fs warning (device sda1): dx_probe:732: Corrupt dir inode 524293, running e2fsck is recommended.
>> >>
>> >>
>> >>Could this be related?
>> >
>> >Are these messages printed before you enable metadata_csum feature?
>>
>> I didn't notice them before trying to enable metadata_csum feature.
>>
>> On the other hand, enabling metadata_csum feature was pretty much
>> the first thing I've made after booting to 3.5 kernel on this
>> system, so it could be it changed something.
>
> Yes, it will change something when you try to enable metadata_csum
> feature in tune2fs. So you'd better to run e2fsck to check your
> filesystem IMHO.
>

Sorry for the slow reply --

I hadn't seen any "Corrupt dir inode" errors until now.

Before running the one-line patch above, I resynced the MD array and ran a quick fsck (via "touch /forcefsck" & reboot).

Then,
$ sudo misc/tune2fs -O metadata_csum /dev/md1

[says something about running e2fsck -D]

Then I got a few dmesg errors like:

[128700.816091] JBD2: Spotted dirty metadata buffer (dev = md1, blocknr = 5243385). There's a risk of filesystem corruption in case of system crash.
[128700.816106] JBD2: Spotted dirty metadata buffer (dev = md1, blocknr = 1057). There's a risk of filesystem corruption in case of system crash.

then a lot of

[128711.000677] EXT4-fs warning (device md1): dx_probe:647: dx entry: limit != root limit
[128711.000679] EXT4-fs warning (device md1): dx_probe:732: Corrupt dir inode 7733251, running e2fsck is recommended.

On my next command (sudo -s), I got an immediate kernel panic:

[128713.776475] EXT4-fs warning (device md1): dx_probe:732: Corrupt dir inode 7733251, running e2fsck is recommended.
[128761.137143] BUG: unable to handle kernel NULL pointer dereference at (null)
[128761.137195] IP: [<ffffffff8121d448>] ext4_iget+0x498/0xa50
[128761.137231] PGD 106651067 PUD 11cf41067 PMD 0
[128761.137258] Oops: 0000 [#1] SMP
[128761.137279] CPU 0
[snip...]

Full panic @ http://web.mit.edu/semenko/Public/panic.txt
On Wed, Aug 01, 2012 at 10:43:05PM -0500, Nick Semenkovich wrote:
[-- snip --]
> Sorry for the slow reply --
>
>
> I hadn't seen any "Corrupt dir inode" errors until now.
>
> Before running the one-line patch above, I resynced the MD array and
> ran a quick fsck (via "touch /forcefsck" & reboot).
>
>
> Then,
> $ sudo misc/tune2fs -O metadata_csum /dev/md1
>
> [says something about running e2fsck -D]
>
>
> Then I got a few dmesg errors like:
>
> [128700.816091] JBD2: Spotted dirty metadata buffer (dev = md1,
> blocknr = 5243385). There's a risk of filesystem corruption in case of
> system crash.
> [128700.816106] JBD2: Spotted dirty metadata buffer (dev = md1,
> blocknr = 1057). There's a risk of filesystem corruption in case of
> system crash.
>
> then a lot of
>
> [128711.000677] EXT4-fs warning (device md1): dx_probe:647: dx entry:
> limit != root limit
> [128711.000679] EXT4-fs warning (device md1): dx_probe:732: Corrupt
> dir inode 7733251, running e2fsck is recommended.
>
>
> On my next command (sudo -s), I got an immediate kernel panic:
>
> [128713.776475] EXT4-fs warning (device md1): dx_probe:732: Corrupt
> dir inode 7733251, running e2fsck is recommended.
> [128761.137143] BUG: unable to handle kernel NULL pointer dereference
> at (null)
> [128761.137195] IP: [<ffffffff8121d448>] ext4_iget+0x498/0xa50
> [128761.137231] PGD 106651067 PUD 11cf41067 PMD 0
> [128761.137258] Oops: 0000 [#1] SMP
> [128761.137279] CPU 0
> [snip...]
>
> Full panic @ http://web.mit.edu/semenko/Public/panic.txt

Hi Nick,

Thanks for testing my patch. As you described above, it seems that there still has some bugs when metadata_csum feature enabled. I tried to reproduce this bug, but I couldn't reproduce it in my sandbox.

I see the full panic file, and it seems that the kernel is running on Ubuntu distribution and it doesn't use a generic mainline kernel. So IMHO would you like to try a latest upstream kernel? At least when the problem happens again, it is easy for me to find out where goes wrong.

Thanks for your patient.

Regards,
Zheng
On Wed, Aug 01, 2012 at 03:19:35PM +0800, Zheng Liu wrote:
> Subject: [PATCH] tune2fs: clear error code before rewriting directory when metadata_csum enabled
>
> From: Zheng Liu <wenqing.lz@taobao.com>
>
> When we enable metadata_csum feature in tune2fs, all inodes need to be rewrited
> to calculate checksum. In this process, the inode that has been removed also
> needs to calculate checksum, but the extent tree in these inodes has been clear.
> Thus, we cannot read any extents, and an 'EXT2_ET_EXTENT_NO_NEXT' error is
> returned back. But in this condition error code in rewrite_dir_context doesn't
> be initialized, and it causes an unknown error.

Thanks, I've merged this into my e2fsprogs checksum branch.

I've promoted all of the metadata checksum patches in e2fsprogs into the next branch. At that point I'll strongly suggest that people use the development branch (currently the next branch, but in the next or two, the master branch) of e2fsprogs.

For the kernel, for now I suggest using the v3.5 kernel with the ext4_for_linus (commit 03179fe92318) from the ext4.git tree merged in. Hopefully the necessary bug fix commits will be in the v3.5.1 kernel, but the 3.5.y series hasn't been released yet.

- Ted
diff --git a/misc/tune2fs.c b/misc/tune2fs.c
index 6a48009..41a5529 100644
--- a/misc/tune2fs.c
+++ b/misc/tune2fs.c
@@ -592,6 +592,7 @@ errcode_t rewrite_directory(ext2_filsys fs, ext2_ino_t dir,
 	ctx.is_htree = (inode->i_flags & EXT2_INDEX_FL);
 	ctx.dir = dir;
+	ctx.errcode = 0;
 	retval = ext2fs_block_iterate3(fs, dir, BLOCK_FLAG_READ_ONLY | BLOCK_FLAG_DATA_ONLY, 0, rewrite_dir_block, &ctx);
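
The pattern behind this one-line fix can be sketched in isolation. The following is a minimal illustration, not the tune2fs code: `struct dir_ctx`, `visit_block`, `iterate_blocks` and `rewrite_dir` are hypothetical stand-ins for `rewrite_dir_context`, `rewrite_dir_block`, `ext2fs_block_iterate3` and `rewrite_directory`. It assumes the callback writes the error field only when it fails, so a directory with no readable blocks leaves whatever was on the stack in that field. The stale stack contents are simulated with `memset` to keep the behaviour deterministic.

```c
#include <string.h>

/* Hypothetical stand-in for rewrite_dir_context; only errcode matters here. */
struct dir_ctx {
    int errcode;  /* written by the callback only on failure */
    int visited;  /* number of blocks the callback was called for */
};

/* Hypothetical per-block callback: a negative block number means failure. */
static int visit_block(int blk, struct dir_ctx *ctx)
{
    ctx->visited++;
    if (blk < 0) {
        ctx->errcode = -1;
        return 1;  /* ask the iterator to stop */
    }
    return 0;
}

/* Hypothetical iterator: with n == 0 the callback is never invoked. */
static int iterate_blocks(const int *blocks, int n, struct dir_ctx *ctx)
{
    for (int i = 0; i < n; i++) {
        if (visit_block(blocks[i], ctx))
            break;
    }
    return 0;
}

/*
 * Returns the context's error code after iterating.
 * clear_first == 1 mirrors the patch (errcode reset before iterating);
 * clear_first == 0 mirrors the old behaviour.
 */
static int rewrite_dir(const int *blocks, int n, int clear_first)
{
    struct dir_ctx ctx;

    /* Simulate leftover stack contents instead of reading real garbage. */
    memset(&ctx, 0x5a, sizeof(ctx));
    ctx.visited = 0;
    if (clear_first)
        ctx.errcode = 0;

    iterate_blocks(blocks, n, &ctx);
    return ctx.errcode;
}
```

With no blocks to visit, `rewrite_dir(NULL, 0, 0)` returns the simulated garbage value, while `rewrite_dir(NULL, 0, 1)` returns 0. A genuine callback failure is still reported in both cases.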