Message ID | 20220713185904.64138-1-tadeusz.struk@linaro.org |
---|---|
State | Rejected |
Headers | show |
Series | ext4: fix kernel BUG in ext4_free_blocks | expand |
On Wed, Jul 13, 2022 at 11:59:04AM -0700, Tadeusz Struk wrote: > Syzbot reported a BUG in ext4_free_blocks. > The issue is triggered from ext4_mb_clear_bb(). What happens is the > block number passed to ext4_get_group_no_and_offset() is 0 and the > es->s_first_data_block is 1. This makes block group number returned > from ext4_get_group_no_and_offset equal to -1. This is then passed to > ext4_get_group_info() and hits a BUG: > BUG_ON(group >= EXT4_SB(sb)->s_groups_count), > what can be seen in the trace below. > This patch adds an assertion to ext4_get_group_no_and_offset() that > checks if block number is not smaller than es->s_first_data_block. > > kernel BUG at fs/ext4/ext4.h:3319! > invalid opcode: 0000 [#1] PREEMPT SMP KASAN > CPU: 0 PID: 337 Comm: repro Not tainted 5.19.0-rc6-00105-g4e8e898e4107-dirty #14 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014 > RIP: 0010:ext4_mb_clear_bb+0x1bd6/0x1be0 > Call Trace: > <TASK> > ext4_free_blocks+0x9b3/0xc90 > ext4_clear_blocks+0x344/0x3b0 > ext4_ind_truncate+0x967/0x1050 > ext4_truncate+0xb1b/0x1210 > ext4_evict_inode+0xf06/0x16f0 > evict+0x2a3/0x630 > iput+0x618/0x850 > ext4_enable_quotas+0x578/0x920 > ext4_orphan_cleanup+0x539/0x1200 > ext4_fill_super+0x94d8/0x9bc0 > get_tree_bdev+0x40c/0x630 > ext4_get_tree+0x1c/0x20 > vfs_get_tree+0x88/0x290 > do_new_mount+0x289/0xac0 > path_mount+0x607/0xfd0 > __se_sys_mount+0x2c4/0x3b0 > __x64_sys_mount+0xbf/0xd0 > do_syscall_64+0x3d/0x90 > entry_SYSCALL_64_after_hwframe+0x63/0xcd > </TASK> > > Link: https://syzkaller.appspot.com/bug?id=5266d464285a03cee9dbfda7d2452a72c3c2ae7c > Reported-by: syzbot+15cd994e273307bf5cfa@syzkaller.appspotmail.com > Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org> > --- > fs/ext4/balloc.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c > index 78ee3ef795ae..1175750ad05f 100644 > --- a/fs/ext4/balloc.c > +++ b/fs/ext4/balloc.c > @@ -56,6 +56,9 @@ void ext4_get_group_no_and_offset(struct super_block *sb, ext4_fsblk_t blocknr, > struct ext4_super_block *es = EXT4_SB(sb)->s_es; > ext4_grpblk_t offset; > > + if (blocknr < le32_to_cpu(es->s_first_data_block)) > + blocknr = le32_to_cpu(es->s_first_data_block); > + This does not seem right. we should never work with block number smaller than s_first_data_block. The first 1024 bytes of the file system are unused and in case we have 1k block size, the entire first block is unused. I guess the image we work here with is corrupted, from the log it seems that it was noticed correctly so the question is why did we still ended up calling ext4_free_blocks() ? Seems like this should have been stopped earlier by ext4_clear_blocks() ? I did notice that in ext4_mb_clear_bb() we call ext4_get_group_no_and_offset() before ext4_inode_block_valid() but again we should have caught this problem earlier. Can you link me the file system image that generated this problem? Thanks! -Lukas > blocknr = blocknr - le32_to_cpu(es->s_first_data_block); > offset = do_div(blocknr, EXT4_BLOCKS_PER_GROUP(sb)) >> > EXT4_SB(sb)->s_cluster_bits; > -- > 2.36.1 >
On Thu, Jul 14, 2022 at 11:53:00AM +0200, Lukas Czerner wrote: > On Wed, Jul 13, 2022 at 11:59:04AM -0700, Tadeusz Struk wrote: > > Syzbot reported a BUG in ext4_free_blocks. > > The issue is triggered from ext4_mb_clear_bb(). What happens is the > > block number passed to ext4_get_group_no_and_offset() is 0 and the > > es->s_first_data_block is 1. This makes block group number returned > > from ext4_get_group_no_and_offset equal to -1. This is then passed to > > ext4_get_group_info() and hits a BUG: > > BUG_ON(group >= EXT4_SB(sb)->s_groups_count), > > what can be seen in the trace below. > > This patch adds an assertion to ext4_get_group_no_and_offset() that > > checks if block number is not smaller than es->s_first_data_block. > > > > kernel BUG at fs/ext4/ext4.h:3319! > > invalid opcode: 0000 [#1] PREEMPT SMP KASAN > > CPU: 0 PID: 337 Comm: repro Not tainted 5.19.0-rc6-00105-g4e8e898e4107-dirty #14 > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014 > > RIP: 0010:ext4_mb_clear_bb+0x1bd6/0x1be0 > > Call Trace: > > <TASK> > > ext4_free_blocks+0x9b3/0xc90 > > ext4_clear_blocks+0x344/0x3b0 > > ext4_ind_truncate+0x967/0x1050 > > ext4_truncate+0xb1b/0x1210 > > ext4_evict_inode+0xf06/0x16f0 > > evict+0x2a3/0x630 > > iput+0x618/0x850 > > ext4_enable_quotas+0x578/0x920 > > ext4_orphan_cleanup+0x539/0x1200 > > ext4_fill_super+0x94d8/0x9bc0 > > get_tree_bdev+0x40c/0x630 > > ext4_get_tree+0x1c/0x20 > > vfs_get_tree+0x88/0x290 > > do_new_mount+0x289/0xac0 > > path_mount+0x607/0xfd0 > > __se_sys_mount+0x2c4/0x3b0 > > __x64_sys_mount+0xbf/0xd0 > > do_syscall_64+0x3d/0x90 > > entry_SYSCALL_64_after_hwframe+0x63/0xcd > > </TASK> > > > > Link: https://syzkaller.appspot.com/bug?id=5266d464285a03cee9dbfda7d2452a72c3c2ae7c > > Reported-by: syzbot+15cd994e273307bf5cfa@syzkaller.appspotmail.com > > Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org> > > --- > > fs/ext4/balloc.c | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c > > index 78ee3ef795ae..1175750ad05f 100644 > > --- a/fs/ext4/balloc.c > > +++ b/fs/ext4/balloc.c > > @@ -56,6 +56,9 @@ void ext4_get_group_no_and_offset(struct super_block *sb, ext4_fsblk_t blocknr, > > struct ext4_super_block *es = EXT4_SB(sb)->s_es; > > ext4_grpblk_t offset; > > > > + if (blocknr < le32_to_cpu(es->s_first_data_block)) > > + blocknr = le32_to_cpu(es->s_first_data_block); > > + > > This does not seem right. we should never work with block number smaller > than s_first_data_block. The first 1024 bytes of the file system are > unused and in case we have 1k block size, the entire first block is > unused. > > I guess the image we work here with is corrupted, from the log it seems > that it was noticed correctly so the question is why did we still ended > up calling ext4_free_blocks() ? Seems like this should have been stopped > earlier by ext4_clear_blocks() ? > > I did notice that in ext4_mb_clear_bb() we call > ext4_get_group_no_and_offset() before ext4_inode_block_valid() but > again we should have caught this problem earlier. > > Can you link me the file system image that generated this problem? ok, I got the syzkaller C repro to work. The problem is that it's bigalloc file system and the 'block' and 'count' to free in ext4_free_blocks will get adjusted after the ext4_inode_block_valid(). We should make sure that if this happens we also clear the EXT4_FREE_BLOCKS_VALIDATED. Additonally the ext4_inode_block_valid() in ext4_mb_clear_bb() should be called *before* the values are taken for granted. I'll prepare a patch to fix this. -Lukas > > Thanks! > -Lukas > > > > blocknr = blocknr - le32_to_cpu(es->s_first_data_block); > > offset = do_div(blocknr, EXT4_BLOCKS_PER_GROUP(sb)) >> > > EXT4_SB(sb)->s_cluster_bits; > > -- > > 2.36.1 > > >
On 7/14/22 05:23, Lukas Czerner wrote: >> This does not seem right. we should never work with block number smaller >> than s_first_data_block. The first 1024 bytes of the file system are >> unused and in case we have 1k block size, the entire first block is >> unused. >> >> I guess the image we work here with is corrupted, from the log it seems >> that it was noticed correctly so the question is why did we still ended >> up calling ext4_free_blocks() ? Seems like this should have been stopped >> earlier by ext4_clear_blocks() ? >> >> I did notice that in ext4_mb_clear_bb() we call >> ext4_get_group_no_and_offset() before ext4_inode_block_valid() but >> again we should have caught this problem earlier. >> >> Can you link me the file system image that generated this problem? > ok, I got the syzkaller C repro to work. The problem is that it's > bigalloc file system and the 'block' and 'count' to free in > ext4_free_blocks will get adjusted after the ext4_inode_block_valid(). > > We should make sure that if this happens we also clear the > EXT4_FREE_BLOCKS_VALIDATED. Additonally the ext4_inode_block_valid() > in ext4_mb_clear_bb() should be called*before* the values are taken for > granted. I'll prepare a patch to fix this. Thank you for feedback Lukas. Please CC me on your patch so I could test it.
diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c index 78ee3ef795ae..1175750ad05f 100644 --- a/fs/ext4/balloc.c +++ b/fs/ext4/balloc.c @@ -56,6 +56,9 @@ void ext4_get_group_no_and_offset(struct super_block *sb, ext4_fsblk_t blocknr, struct ext4_super_block *es = EXT4_SB(sb)->s_es; ext4_grpblk_t offset; + if (blocknr < le32_to_cpu(es->s_first_data_block)) + blocknr = le32_to_cpu(es->s_first_data_block); + blocknr = blocknr - le32_to_cpu(es->s_first_data_block); offset = do_div(blocknr, EXT4_BLOCKS_PER_GROUP(sb)) >> EXT4_SB(sb)->s_cluster_bits;
Syzbot reported a BUG in ext4_free_blocks. The issue is triggered from ext4_mb_clear_bb(). What happens is the block number passed to ext4_get_group_no_and_offset() is 0 and the es->s_first_data_block is 1. This makes block group number returned from ext4_get_group_no_and_offset equal to -1. This is then passed to ext4_get_group_info() and hits a BUG: BUG_ON(group >= EXT4_SB(sb)->s_groups_count), what can be seen in the trace below. This patch adds an assertion to ext4_get_group_no_and_offset() that checks if block number is not smaller than es->s_first_data_block. kernel BUG at fs/ext4/ext4.h:3319! invalid opcode: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 337 Comm: repro Not tainted 5.19.0-rc6-00105-g4e8e898e4107-dirty #14 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014 RIP: 0010:ext4_mb_clear_bb+0x1bd6/0x1be0 Call Trace: <TASK> ext4_free_blocks+0x9b3/0xc90 ext4_clear_blocks+0x344/0x3b0 ext4_ind_truncate+0x967/0x1050 ext4_truncate+0xb1b/0x1210 ext4_evict_inode+0xf06/0x16f0 evict+0x2a3/0x630 iput+0x618/0x850 ext4_enable_quotas+0x578/0x920 ext4_orphan_cleanup+0x539/0x1200 ext4_fill_super+0x94d8/0x9bc0 get_tree_bdev+0x40c/0x630 ext4_get_tree+0x1c/0x20 vfs_get_tree+0x88/0x290 do_new_mount+0x289/0xac0 path_mount+0x607/0xfd0 __se_sys_mount+0x2c4/0x3b0 __x64_sys_mount+0xbf/0xd0 do_syscall_64+0x3d/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd </TASK> Link: https://syzkaller.appspot.com/bug?id=5266d464285a03cee9dbfda7d2452a72c3c2ae7c Reported-by: syzbot+15cd994e273307bf5cfa@syzkaller.appspotmail.com Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org> --- fs/ext4/balloc.c | 3 +++ 1 file changed, 3 insertions(+)