diff mbox series

[v2] ext4: Fix rec_len verify error

Message ID 20230731010104.1781335-1-zhangshida@kylinos.cn
State Superseded
Series [v2] ext4: Fix rec_len verify error

Commit Message

Stephen Zhang July 31, 2023, 1:01 a.m. UTC
From: Shida Zhang <zhangshida@kylinos.cn>

With PAGE_SIZE set to 64k and a filesystem block size of 64k, the
following errors occurred once more than 13 million files had been
created directly under a single directory:

EXT4-fs error (device xx): ext4_dx_csum_set:492: inode #xxxx: comm xxxxx: dir seems corrupt?  Run e2fsck -D.
EXT4-fs error (device xx): ext4_dx_csum_verify:463: inode #xxxx: comm xxxxx: dir seems corrupt?  Run e2fsck -D.
EXT4-fs error (device xx): dx_probe:856: inode #xxxx: block 8188: comm xxxxx: Directory index failed checksum

Once enough files have been created, fake_dirent->rec_len is stored as
0xffff, which does not equal the block size of 65536 (0x10000).

With a 4k block size the problem does not occur: once enough files have
been created, fake_dirent->rec_len is 0x1000, which equals the block
size of 4k (0x1000).

The problem appears to stem from the 16-bit width of the rec_len field,
which cannot hold 0x10000 when the block size is 64k. To address this,
decode rec_len with ext4_rec_len_from_disk() before comparing it with
the block size.

Signed-off-by: Shida Zhang <zhangshida@kylinos.cn>
---
v1->v2:
  Use a better way to check the condition, as suggested by Andreas.

 fs/ext4/namei.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Zhang Yi July 31, 2023, 12:30 p.m. UTC | #1
On 2023/7/31 9:01, zhangshida wrote:
> From: Shida Zhang <zhangshida@kylinos.cn>
> 
> With the configuration PAGE_SIZE 64k and filesystem blocksize 64k,
> a problem occurred when more than 13 million files were directly created
> under a directory:
> 
> EXT4-fs error (device xx): ext4_dx_csum_set:492: inode #xxxx: comm xxxxx: dir seems corrupt?  Run e2fsck -D.
> EXT4-fs error (device xx): ext4_dx_csum_verify:463: inode #xxxx: comm xxxxx: dir seems corrupt?  Run e2fsck -D.
> EXT4-fs error (device xx): dx_probe:856: inode #xxxx: block 8188: comm xxxxx: Directory index failed checksum
> 
> When enough files are created, the fake_dirent->reclen will be 0xffff.
> it doesn't equal to the blocksize 65536, i.e. 0x10000.
> 
> But it is not the same condition when blocksize equals to 4k.
> when enough file are created, the fake_dirent->reclen will be 0x1000.
> it equals to the blocksize 4k, i.e. 0x1000.
> 
> The problem seems to be related to the limitation of the 16-bit field
> when the blocksize is set to 64k. To address this, Modify the check so
> as to handle it properly.
> 

Thanks for the patch. It works and looks good to me.

Reviewed-by: Zhang Yi <yi.zhang@huawei.com>

> Signed-off-by: Shida Zhang <zhangshida@kylinos.cn>
> ---
> v1->v2:
>   Use a better way to check the condition, as suggested by Andreas.
> 
>  fs/ext4/namei.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> index 0caf6c730ce3..fffed95f8531 100644
> --- a/fs/ext4/namei.c
> +++ b/fs/ext4/namei.c
> @@ -445,8 +445,9 @@ static struct dx_countlimit *get_dx_countlimit(struct inode *inode,
>  	struct ext4_dir_entry *dp;
>  	struct dx_root_info *root;
>  	int count_offset;
> +	int blocksize = EXT4_BLOCK_SIZE(inode->i_sb);
>  
> -	if (le16_to_cpu(dirent->rec_len) == EXT4_BLOCK_SIZE(inode->i_sb))
> +	if (ext4_rec_len_from_disk(dirent->rec_len, blocksize) == blocksize)
>  		count_offset = 8;
>  	else if (le16_to_cpu(dirent->rec_len) == 12) {
>  		dp = (struct ext4_dir_entry *)(((void *)dirent) + 12);
>
Darrick J. Wong July 31, 2023, 3:41 p.m. UTC | #2
On Mon, Jul 31, 2023 at 09:01:04AM +0800, zhangshida wrote:
> From: Shida Zhang <zhangshida@kylinos.cn>
> 
> With the configuration PAGE_SIZE 64k and filesystem blocksize 64k,
> a problem occurred when more than 13 million files were directly created
> under a directory:
> 
> EXT4-fs error (device xx): ext4_dx_csum_set:492: inode #xxxx: comm xxxxx: dir seems corrupt?  Run e2fsck -D.
> EXT4-fs error (device xx): ext4_dx_csum_verify:463: inode #xxxx: comm xxxxx: dir seems corrupt?  Run e2fsck -D.
> EXT4-fs error (device xx): dx_probe:856: inode #xxxx: block 8188: comm xxxxx: Directory index failed checksum
> 
> When enough files are created, the fake_dirent->reclen will be 0xffff.
> it doesn't equal to the blocksize 65536, i.e. 0x10000.
> 
> But it is not the same condition when blocksize equals to 4k.
> when enough file are created, the fake_dirent->reclen will be 0x1000.
> it equals to the blocksize 4k, i.e. 0x1000.
> 
> The problem seems to be related to the limitation of the 16-bit field
> when the blocksize is set to 64k. To address this, Modify the check so
> as to handle it properly.

urughghahrhrhr<shudder>

Sorry that I missed that rec_len is an encoded number, not a plain le16
integer...

> Signed-off-by: Shida Zhang <zhangshida@kylinos.cn>
> ---
> v1->v2:
>   Use a better way to check the condition, as suggested by Andreas.
> 
>  fs/ext4/namei.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> index 0caf6c730ce3..fffed95f8531 100644
> --- a/fs/ext4/namei.c
> +++ b/fs/ext4/namei.c
> @@ -445,8 +445,9 @@ static struct dx_countlimit *get_dx_countlimit(struct inode *inode,
>  	struct ext4_dir_entry *dp;
>  	struct dx_root_info *root;
>  	int count_offset;
> +	int blocksize = EXT4_BLOCK_SIZE(inode->i_sb);
>  
> -	if (le16_to_cpu(dirent->rec_len) == EXT4_BLOCK_SIZE(inode->i_sb))
> +	if (ext4_rec_len_from_disk(dirent->rec_len, blocksize) == blocksize)
>  		count_offset = 8;
>  	else if (le16_to_cpu(dirent->rec_len) == 12) {

...but what about all the other le16_to_cpu(ext4_dir_entry{,_2}.rec_len)
accesses in this file?  Don't those also need to be converted to
ext4_rec_len_from_disk calls?

Also,
Fixes: dbe89444042ab ("ext4: Calculate and verify checksums for htree nodes")

--D

>  		dp = (struct ext4_dir_entry *)(((void *)dirent) + 12);
> -- 
> 2.27.0
>
Stephen Zhang Aug. 1, 2023, 2:26 a.m. UTC | #3
Darrick J. Wong <djwong@kernel.org> wrote on Mon, Jul 31, 2023 at 23:41:
>
> On Mon, Jul 31, 2023 at 09:01:04AM +0800, zhangshida wrote:
> > From: Shida Zhang <zhangshida@kylinos.cn>
> >
> > With the configuration PAGE_SIZE 64k and filesystem blocksize 64k,
> > a problem occurred when more than 13 million files were directly created
> > under a directory:
> >
> > EXT4-fs error (device xx): ext4_dx_csum_set:492: inode #xxxx: comm xxxxx: dir seems corrupt?  Run e2fsck -D.
> > EXT4-fs error (device xx): ext4_dx_csum_verify:463: inode #xxxx: comm xxxxx: dir seems corrupt?  Run e2fsck -D.
> > EXT4-fs error (device xx): dx_probe:856: inode #xxxx: block 8188: comm xxxxx: Directory index failed checksum
> >
> > When enough files are created, the fake_dirent->reclen will be 0xffff.
> > it doesn't equal to the blocksize 65536, i.e. 0x10000.
> >
> > But it is not the same condition when blocksize equals to 4k.
> > when enough file are created, the fake_dirent->reclen will be 0x1000.
> > it equals to the blocksize 4k, i.e. 0x1000.
> >
> > The problem seems to be related to the limitation of the 16-bit field
> > when the blocksize is set to 64k. To address this, Modify the check so
> > as to handle it properly.
>
> urughghahrhrhr<shudder>
>
> Sorry that I missed that rec_len is an encoded number, not a plain le16
> integer...
>

Yep, that's really a point that is easy to forget...

> > Signed-off-by: Shida Zhang <zhangshida@kylinos.cn>
> > ---
> > v1->v2:
> >   Use a better way to check the condition, as suggested by Andreas.
> >
> >  fs/ext4/namei.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> > index 0caf6c730ce3..fffed95f8531 100644
> > --- a/fs/ext4/namei.c
> > +++ b/fs/ext4/namei.c
> > @@ -445,8 +445,9 @@ static struct dx_countlimit *get_dx_countlimit(struct inode *inode,
> >       struct ext4_dir_entry *dp;
> >       struct dx_root_info *root;
> >       int count_offset;
> > +     int blocksize = EXT4_BLOCK_SIZE(inode->i_sb);
> >
> > -     if (le16_to_cpu(dirent->rec_len) == EXT4_BLOCK_SIZE(inode->i_sb))
> > +     if (ext4_rec_len_from_disk(dirent->rec_len, blocksize) == blocksize)
> >               count_offset = 8;
> >       else if (le16_to_cpu(dirent->rec_len) == 12) {
>
> ...but what about all the other le16_to_cpu(ext4_dir_entry{,_2}.rec_len)
> accesses in this file?  Don't those also need to be converted to
> ext4_rec_len_from_disk calls?
>
> Also,
> Fixes: dbe89444042ab ("ext4: Calculate and verify checksums for htree nodes")
>

Thanks for the suggestion. I will try to convert all the other rec_len
accesses in this file in the next version, v3.

Cheers,
Shida

> --D
>
> >               dp = (struct ext4_dir_entry *)(((void *)dirent) + 12);
> > --
> > 2.27.0
> >
Patch

diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 0caf6c730ce3..fffed95f8531 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -445,8 +445,9 @@  static struct dx_countlimit *get_dx_countlimit(struct inode *inode,
 	struct ext4_dir_entry *dp;
 	struct dx_root_info *root;
 	int count_offset;
+	int blocksize = EXT4_BLOCK_SIZE(inode->i_sb);
 
-	if (le16_to_cpu(dirent->rec_len) == EXT4_BLOCK_SIZE(inode->i_sb))
+	if (ext4_rec_len_from_disk(dirent->rec_len, blocksize) == blocksize)
 		count_offset = 8;
 	else if (le16_to_cpu(dirent->rec_len) == 12) {
 		dp = (struct ext4_dir_entry *)(((void *)dirent) + 12);
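
The effect of the one-line change can be checked outside the kernel.
What follows is only a minimal user-space sketch, not the kernel
implementation: it assumes the large-page (PAGE_SIZE >= 64k) decoding
branch, takes the raw value as already converted to CPU byte order, and
uses hypothetical names (MODEL_MAX_REC_LEN, rec_len_from_disk_model,
old_check, new_check, check_examples) that do not exist in fs/ext4.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the largest value a 16-bit rec_len can hold. */
#define MODEL_MAX_REC_LEN 0xffffu

/*
 * Simplified model of decoding an on-disk 16-bit rec_len, assuming the
 * large-page branch. "raw" is already in CPU byte order.
 */
static unsigned int rec_len_from_disk_model(uint16_t raw, unsigned int blocksize)
{
	unsigned int len = raw;

	/* 0xffff and 0 are sentinels: the entry spans the whole block. */
	if (len == MODEL_MAX_REC_LEN || len == 0)
		return blocksize;
	/* Otherwise the low two bits carry bits 16-17 of the real length. */
	return (len & 65532u) | ((len & 3u) << 16);
}

/* Comparison as done before the patch: raw field against block size. */
static int old_check(uint16_t raw, unsigned int blocksize)
{
	return raw == blocksize;
}

/* Comparison as done after the patch: decode first, then compare. */
static int new_check(uint16_t raw, unsigned int blocksize)
{
	return rec_len_from_disk_model(raw, blocksize) == blocksize;
}

/* Walks through the two cases described in the commit message. */
static void check_examples(void)
{
	assert(!old_check(0xffff, 65536));	/* 64k: raw compare misses */
	assert(new_check(0xffff, 65536));	/* 64k: decoded compare matches */
	assert(old_check(0x1000, 4096));	/* 4k: raw compare matches */
	assert(new_check(0x1000, 4096));	/* 4k: decoded compare matches */
}
```

Calling check_examples() passes silently: with 64k blocks only the
decoded comparison recognizes the whole-block entry, while with 4k
blocks both comparisons agree, which may be why the raw comparison went
unnoticed on smaller block sizes.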