diff mbox series

ext4: add reserved GDT blocks check

Message ID 20220526073222.380259-1-yi.zhang@huawei.com
State Superseded
Headers show
Series ext4: add reserved GDT blocks check | expand

Commit Message

Zhang Yi May 26, 2022, 7:32 a.m. UTC
We capture a NULL pointer issue when resizing a corrupt ext4 image which
is freshly clear resize_inode feature (not run e2fsck). It could be
simply reproduced by following steps. The problem is because of the
resize_inode feature was cleared, and it will convert the filesystem to
meta_bg mode in ext4_resize_fs(), but the es->s_reserved_gdt_blocks was
not reduced to zero, so could we mistakenly call reserve_backup_gdb()
and passing an uninitialized resize_inode to it when adding new group
descriptors.

 mkfs.ext4 /dev/sda 3G
 tune2fs -O ^resize_inode /dev/sda #forget to run requested e2fsck
 mount /dev/sda /mnt
 resize2fs /dev/sda 8G

 ========
 BUG: kernel NULL pointer dereference, address: 0000000000000028
 CPU: 19 PID: 3243 Comm: resize2fs Not tainted 5.18.0-rc7-00001-gfde086c5ebfd #748
 ...
 RIP: 0010:ext4_flex_group_add+0xe08/0x2570
 ...
 Call Trace:
  <TASK>
  ext4_resize_fs+0xbec/0x1660
  __ext4_ioctl+0x1749/0x24e0
  ext4_ioctl+0x12/0x20
  __x64_sys_ioctl+0xa6/0x110
  do_syscall_64+0x3b/0x90
  entry_SYSCALL_64_after_hwframe+0x44/0xae
 RIP: 0033:0x7f2dd739617b
 ========

The fix is simple, add a check in ext4_resize_fs() to make sure that the
es->s_reserved_gdt_blocks is zero when the resize_inode feature is
disabled.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
---
 fs/ext4/resize.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Ritesh Harjani (IBM) May 28, 2022, 3:01 p.m. UTC | #1
On 22/05/26 03:32PM, Zhang Yi wrote:
> We capture a NULL pointer issue when resizing a corrupt ext4 image which
> is freshly clear resize_inode feature (not run e2fsck). It could be
> simply reproduced by following steps. The problem is because of the
> resize_inode feature was cleared, and it will convert the filesystem to
> meta_bg mode in ext4_resize_fs(), but the es->s_reserved_gdt_blocks was
> not reduced to zero, so could we mistakenly call reserve_backup_gdb()
> and passing an uninitialized resize_inode to it when adding new group
> descriptors.
>
>  mkfs.ext4 /dev/sda 3G
>  tune2fs -O ^resize_inode /dev/sda #forget to run requested e2fsck
>  mount /dev/sda /mnt
>  resize2fs /dev/sda 8G
>
>  ========
>  BUG: kernel NULL pointer dereference, address: 0000000000000028
>  CPU: 19 PID: 3243 Comm: resize2fs Not tainted 5.18.0-rc7-00001-gfde086c5ebfd #748
>  ...
>  RIP: 0010:ext4_flex_group_add+0xe08/0x2570
>  ...
>  Call Trace:
>   <TASK>
>   ext4_resize_fs+0xbec/0x1660
>   __ext4_ioctl+0x1749/0x24e0
>   ext4_ioctl+0x12/0x20
>   __x64_sys_ioctl+0xa6/0x110
>   do_syscall_64+0x3b/0x90
>   entry_SYSCALL_64_after_hwframe+0x44/0xae
>  RIP: 0033:0x7f2dd739617b
>  ========
>
> The fix is simple, add a check in ext4_resize_fs() to make sure that the
> es->s_reserved_gdt_blocks is zero when the resize_inode feature is
> disabled.

Your reasoning looks correct to me.

>
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
> ---
>  fs/ext4/resize.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
> index 90a941d20dff..5791eb7c0761 100644
> --- a/fs/ext4/resize.c
> +++ b/fs/ext4/resize.c
> @@ -2031,6 +2031,9 @@ int ext4_resize_fs(struct super_block *sb, ext4_fsblk_t n_blocks_count)
>  			ext4_warning(sb, "Error opening resize inode");
>  			return PTR_ERR(resize_inode);
>  		}
> +	} else if (es->s_reserved_gdt_blocks) {
> +		ext4_error(sb, "resize_inode disabled but reserved GDT blocks non-zero");
> +		return -EFSCORRUPTED;
>  	}

I think we should do this check in ext4_resize_begin(), i.e.
if ext4_has_feature_resize_inode() is false and es->s_reserved_gdt_blocks is
non-zero, then we should straight away mark and return error.

Later (not as part of this patch/fix) maybe if we detect this problem, we could
use helpers like ext4_update_super() to fix this mismatch problem in kernel
during mount itself. But I think this is not absolutely necessary,
as kernel already during mount outputs a warning and ask for running e2fsck.

Thougts?

-ritesh

>
>  	if ((!resize_inode && !meta_bg) || n_blocks_count == o_blocks_count) {
> --
> 2.31.1
>
Zhang Yi May 30, 2022, 4:55 a.m. UTC | #2
On 2022/5/28 23:01, Ritesh Harjani wrote:
> On 22/05/26 03:32PM, Zhang Yi wrote:
>> We capture a NULL pointer issue when resizing a corrupt ext4 image which
>> is freshly clear resize_inode feature (not run e2fsck). It could be
>> simply reproduced by following steps. The problem is because of the
>> resize_inode feature was cleared, and it will convert the filesystem to
>> meta_bg mode in ext4_resize_fs(), but the es->s_reserved_gdt_blocks was
>> not reduced to zero, so could we mistakenly call reserve_backup_gdb()
>> and passing an uninitialized resize_inode to it when adding new group
>> descriptors.
>>
>>  mkfs.ext4 /dev/sda 3G
>>  tune2fs -O ^resize_inode /dev/sda #forget to run requested e2fsck
>>  mount /dev/sda /mnt
>>  resize2fs /dev/sda 8G
>>
>>  ========
>>  BUG: kernel NULL pointer dereference, address: 0000000000000028
>>  CPU: 19 PID: 3243 Comm: resize2fs Not tainted 5.18.0-rc7-00001-gfde086c5ebfd #748
>>  ...
>>  RIP: 0010:ext4_flex_group_add+0xe08/0x2570
>>  ...
>>  Call Trace:
>>   <TASK>
>>   ext4_resize_fs+0xbec/0x1660
>>   __ext4_ioctl+0x1749/0x24e0
>>   ext4_ioctl+0x12/0x20
>>   __x64_sys_ioctl+0xa6/0x110
>>   do_syscall_64+0x3b/0x90
>>   entry_SYSCALL_64_after_hwframe+0x44/0xae
>>  RIP: 0033:0x7f2dd739617b
>>  ========
>>
>> The fix is simple, add a check in ext4_resize_fs() to make sure that the
>> es->s_reserved_gdt_blocks is zero when the resize_inode feature is
>> disabled.
> 
> Your reasoning looks correct to me.
> 
>>
>> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
>> ---
>>  fs/ext4/resize.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
>> index 90a941d20dff..5791eb7c0761 100644
>> --- a/fs/ext4/resize.c
>> +++ b/fs/ext4/resize.c
>> @@ -2031,6 +2031,9 @@ int ext4_resize_fs(struct super_block *sb, ext4_fsblk_t n_blocks_count)
>>  			ext4_warning(sb, "Error opening resize inode");
>>  			return PTR_ERR(resize_inode);
>>  		}
>> +	} else if (es->s_reserved_gdt_blocks) {
>> +		ext4_error(sb, "resize_inode disabled but reserved GDT blocks non-zero");
>> +		return -EFSCORRUPTED;
>>  	}
> 
> I think we should do this check in ext4_resize_begin(), i.e.
> if ext4_has_feature_resize_inode() is false and es->s_reserved_gdt_blocks is
> non-zero, then we should straight away mark and return error.
> 

Thanks for your suggestion. Yes, put this check in ext4_resize_begin() looks
better, it is also useful for EXT4_IOC_GROUP_ADD (although we have a check in
ext4_group_add() already, it is still worth). I will put this check into
ext4_resize_begin() in my next iteration.

> Later (not as part of this patch/fix) maybe if we detect this problem, we could
> use helpers like ext4_update_super() to fix this mismatch problem in kernel
> during mount itself. But I think this is not absolutely necessary,
> as kernel already during mount outputs a warning and ask for running e2fsck.
> 

I think the warning from mount outputs is enough, sysadmins should run e2fsck
after getting this note. It's hard and costly if we want to fix this inconsistent
problem in kernel because from the kernel's point of view, if it detect above
inconsistency, it means that both of the es->s_reserved_gdt_blocks and resize_inode
feature are not trusted anymore, we have to do in depth check as e2fsck does, and
it's hard to make a fix decision automatically(not even e2fsck).

Thanks,
Yi.
diff mbox series

Patch

diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
index 90a941d20dff..5791eb7c0761 100644
--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -2031,6 +2031,9 @@  int ext4_resize_fs(struct super_block *sb, ext4_fsblk_t n_blocks_count)
 			ext4_warning(sb, "Error opening resize inode");
 			return PTR_ERR(resize_inode);
 		}
+	} else if (es->s_reserved_gdt_blocks) {
+		ext4_error(sb, "resize_inode disabled but reserved GDT blocks non-zero");
+		return -EFSCORRUPTED;
 	}
 
 	if ((!resize_inode && !meta_bg) || n_blocks_count == o_blocks_count) {