diff mbox

[RFC] ext4: ignore i_size_high for directories

Message ID 1232125148-13785-1-git-send-email-tytso@mit.edu
State Accepted, archived
Headers show

Commit Message

Theodore Y. Ts'o Jan. 16, 2009, 4:59 p.m. UTC
Directories are not allowed to be bigger than 2TB, so don't use
i_size_high when reading the inode.  E2fsck should complain about
these inodes, but the simplest thing to do for the kernel is simply to
not consult i_size_high for directory inodes.

This prevents an intentially corrupted filesystem from causing the
kernel to burn a huge amount of CPU and issuing error messages such
as:

EXT4-fs warning (device loop0): ext4_block_to_path: block 135090028 > max

Thanks to David Maciejak from Fortinet's FortiGuard Global Security
Research Team for reporting this issue.

http://bugzilla.kernel.org/show_bug.cgi?id=12375

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org
---
 fs/ext4/ext4.h  |    7 +++++--
 fs/ext4/inode.c |    4 ++--
 2 files changed, 7 insertions(+), 4 deletions(-)

Comments

Andreas Dilger Jan. 16, 2009, 11:57 p.m. UTC | #1
On Jan 16, 2009  11:59 -0500, Theodore Ts'o wrote:
> Directories are not allowed to be bigger than 2TB, so don't use

... bigger than 2GB?

> i_size_high when reading the inode.  E2fsck should complain about
> these inodes, but the simplest thing to do for the kernel is simply to
> not consult i_size_high for directory inodes.

Actually, it would be preferable to allow directories to grow beyond
the 2GB limit.  2GB only allows about 30M files.  While the htree code
is currently limited to only 2 levels deep, if you have 8kB+ block
size it is possible to have directories larger than 2GB with only
2 levels of htree.

>  static inline loff_t ext4_isize(struct ext4_inode *raw_inode)
>  {
> +	if (S_ISDIR(le16_to_cpu(raw_inode->i_mode)))
> +		return (loff_t) le32_to_cpu(raw_inode->i_size_lo);
> +	else
> +		return ((loff_t)le32_to_cpu(raw_inode->i_size_high) << 32) |
> +			le32_to_cpu(raw_inode->i_size_lo);

If you are going to limit this it should be "if (!S_ISREG(...))"
instead of "if (S_ISDIR(...))" because none of the special files
should use i_size_high - e2fsck will clear a special inode if
it has a non-zero size.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Theodore Y. Ts'o Jan. 17, 2009, 11:36 p.m. UTC | #2
On Sat, Jan 17, 2009 at 07:57:06AM +0800, Andreas Dilger wrote:
> 
> Actually, it would be preferable to allow directories to grow beyond
> the 2GB limit.  2GB only allows about 30M files.  While the htree code
> is currently limited to only 2 levels deep, if you have 8kB+ block
> size it is possible to have directories larger than 2GB with only
> 2 levels of htree.

At some point, maybe, although e2fsprogs doesn't have any support for
directories > 2GB, and I strongly suspect there are other places in
the kernel where we assume directories are < 2GB.  So this is
something that I'd prefer we implement with a feature flag,
explicitly, probably after we get a better b-tree implementation into
ext4 (say, like the one you gave me a few months ago, if one of us
actually has time to try to get it integrated into the kernel and into
e2fsprogs).

> >  static inline loff_t ext4_isize(struct ext4_inode *raw_inode)
> >  {
> > +	if (S_ISDIR(le16_to_cpu(raw_inode->i_mode)))
> > +		return (loff_t) le32_to_cpu(raw_inode->i_size_lo);
> > +	else
> > +		return ((loff_t)le32_to_cpu(raw_inode->i_size_high) << 32) |
> > +			le32_to_cpu(raw_inode->i_size_lo);
> 
> If you are going to limit this it should be "if (!S_ISREG(...))"
> instead of "if (S_ISDIR(...))" because none of the special files
> should use i_size_high - e2fsck will clear a special inode if
> it has a non-zero size.

Hmm, yes, that's probably a better check; although for most special
inodes we ignore i_size so having an absurd i_size is harmless.
Looking at things, we should probably add a check to make sure that
i_size for symlinks is sane.

						- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index c668e43..42396a6 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1206,8 +1206,11 @@  static inline void ext4_r_blocks_count_set(struct ext4_super_block *es,
 
 static inline loff_t ext4_isize(struct ext4_inode *raw_inode)
 {
-	return ((loff_t)le32_to_cpu(raw_inode->i_size_high) << 32) |
-		le32_to_cpu(raw_inode->i_size_lo);
+	if (S_ISDIR(le16_to_cpu(raw_inode->i_mode)))
+		return (loff_t) le32_to_cpu(raw_inode->i_size_lo);
+	else
+		return ((loff_t)le32_to_cpu(raw_inode->i_size_high) << 32) |
+			le32_to_cpu(raw_inode->i_size_lo);
 }
 
 static inline void ext4_isize_set(struct ext4_inode *raw_inode, loff_t i_size)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index a6444ce..49484ba 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -360,9 +360,9 @@  static int ext4_block_to_path(struct inode *inode,
 		final = ptrs;
 	} else {
 		ext4_warning(inode->i_sb, "ext4_block_to_path",
-				"block %lu > max",
+				"block %lu > max in inode %lu",
 				i_block + direct_blocks +
-				indirect_blocks + double_blocks);
+				indirect_blocks + double_blocks, inode->i_ino);
 	}
 	if (boundary)
 		*boundary = final - 1 - (i_block & (ptrs - 1));