Message ID | 20140311065449.30585.54020.stgit@birch.djwong.org |
---|---|
State | Accepted, archived |
Headers | show |
On Mon, Mar 10, 2014 at 11:54:49PM -0700, Darrick J. Wong wrote: > If a directory's contents are stored entirely inside the inode, > there's no index to rebuild and no dirblock checksum to recompute. > As far as I know these are the only two reasons to call dir rehash. Well, actually, there is a third reason to rehash directories, and that is to reorganize a directory to optimize out deleted entries that are scattered in the middle of the directory. That being said, it's more critical for inline directories, since we very much want to keep them from spilling over to an external block, this process of compressing out deleted space is something that should be done in real time as we operate on the directory, by the kernel, and not just at fsck time. The only reason why we don't do this today is because if the directory is open for scanning using opendir/readdir, if we reorganize a directory block, it could end up corrupting the readdir --- and for non-inline directories, it's much less important. What I think would might make sense is to have the kernel track whether the directory has been opened for reading, and if it hasn't, then it would be safe to try compressing all of the directory entries in the block so that the free space is in a single unused directory entry at the end of the block. We could try doing this "dynamic compression" of directory free space both at unlink(2) time, and also when we try inserting a directory entry into the block and there is apparently no space in the directory block. So I'm fine with skipping the rehashing of inline directories now, but this is a future, relatively small, kernel project we might want to think about for ext4. Cheers, - Ted -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Mar 12, 2014 at 11:52:48PM -0400, Theodore Ts'o wrote: > On Mon, Mar 10, 2014 at 11:54:49PM -0700, Darrick J. Wong wrote: > > If a directory's contents are stored entirely inside the inode, > > there's no index to rebuild and no dirblock checksum to recompute. > > As far as I know these are the only two reasons to call dir rehash. > > Well, actually, there is a third reason to rehash directories, and > that is to reorganize a directory to optimize out deleted entries that > are scattered in the middle of the directory. Ooh, I forgot about that. :/ > That being said, it's more critical for inline directories, since we > very much want to keep them from spilling over to an external block, > this process of compressing out deleted space is something that should > be done in real time as we operate on the directory, by the kernel, > and not just at fsck time. > > The only reason why we don't do this today is because if the directory > is open for scanning using opendir/readdir, if we reorganize a > directory block, it could end up corrupting the readdir --- and for > non-inline directories, it's much less important. > > What I think would might make sense is to have the kernel track > whether the directory has been opened for reading, and if it hasn't, > then it would be safe to try compressing all of the directory entries > in the block so that the free space is in a single unused directory > entry at the end of the block. We could try doing this "dynamic > compression" of directory free space both at unlink(2) time, and also > when we try inserting a directory entry into the block and there is > apparently no space in the directory block. > > So I'm fine with skipping the rehashing of inline directories now, but > this is a future, relatively small, kernel project we might want to > think about for ext4. Probably we ought to fix up rehash.c to be able to compress directory entries too. The only reason I kicked them here was that somehow an inline data dir would end up on the rehash list, causing the block iteration to fail and e2fsck stops cold. --D > > Cheers, > > - Ted > -- > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Mar 12, 2014 at 10:38:48PM -0700, Darrick J. Wong wrote: > Probably we ought to fix up rehash.c to be able to compress directory entries > too. The only reason I kicked them here was that somehow an inline data dir > would end up on the rehash list, causing the block iteration to fail and e2fsck > stops cold. Sure, this would be a nice to have, but I'll take this patch for now so at least we don't crash out. Cheers, - Ted -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/e2fsck/rehash.c b/e2fsck/rehash.c index 8a99453..3b05715 100644 --- a/e2fsck/rehash.c +++ b/e2fsck/rehash.c @@ -794,6 +794,11 @@ errcode_t e2fsck_rehash_dir(e2fsck_t ctx, ext2_ino_t ino) outdir.hashes = 0; e2fsck_read_inode(ctx, ino, &inode, "rehash_dir"); + if (EXT2_HAS_INCOMPAT_FEATURE(fs->super, + EXT4_FEATURE_INCOMPAT_INLINE_DATA) && + (inode.i_flags & EXT4_INLINE_DATA_FL)) + return 0; + retval = ENOMEM; fd.harray = 0; dir_buf = malloc(inode.i_size); @@ -822,8 +827,6 @@ retry_nohash: /* Read in the entire directory into memory */ retval = ext2fs_block_iterate3(fs, ino, 0, 0, fill_dir_block, &fd); - if (retval == EXT2_ET_INLINE_DATA_CANT_ITERATE) - goto errout; if (fd.err) { retval = fd.err; goto errout;
If a directory's contents are stored entirely inside the inode, there's no index to rebuild and no dirblock checksum to recompute. As far as I know these are the only two reasons to call dir rehash. Therefore, we can move on to the next dir instead of what we do right now, which is try to iterate the dir blocks (which of course fails due to the inline_data iflag being set) and then flood stdout with useless messages that aren't even failures. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> --- e2fsck/rehash.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html