Message ID | 20230110042709.2136336-1-libaokun1@huawei.com |
---|---|
State | Superseded |
Headers | show |
Series | ext4: fix task hung in ext4_xattr_delete_inode | expand |
Hi Baokun, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on tytso-ext4/dev] [also build test WARNING on linus/master v6.2-rc3 next-20230110] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Baokun-Li/ext4-fix-task-hung-in-ext4_xattr_delete_inode/20230110-120433 base: https://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git dev patch link: https://lore.kernel.org/r/20230110042709.2136336-1-libaokun1%40huawei.com patch subject: [PATCH] ext4: fix task hung in ext4_xattr_delete_inode config: i386-randconfig-a004-20230109 compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/intel-lab-lkp/linux/commit/2b538f0fbc0861c2073325963ff3532bf80b20c0 git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Baokun-Li/ext4-fix-task-hung-in-ext4_xattr_delete_inode/20230110-120433 git checkout 2b538f0fbc0861c2073325963ff3532bf80b20c0 # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 olddefconfig COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 SHELL=/bin/bash fs/ext4/ If you fix the issue, kindly add following tag where applicable | Reported-by: kernel test robot <lkp@intel.com> All warnings (new ones prefixed by >>): >> fs/ext4/xattr.c:389:6: warning: variable 'inode' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized] if (parent->i_ino == ea_ino) { ^~~~~~~~~~~~~~~~~~~~~~~ fs/ext4/xattr.c:442:7: note: uninitialized use occurs here iput(inode); ^~~~~ fs/ext4/xattr.c:389:2: note: remove the 'if' if its condition is always false if (parent->i_ino == ea_ino) { ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ fs/ext4/xattr.c:386:21: note: initialize the variable 'inode' to silence this warning struct inode *inode; ^ = NULL 1 warning generated. vim +389 fs/ext4/xattr.c 382 383 static int ext4_xattr_inode_iget(struct inode *parent, unsigned long ea_ino, 384 u32 ea_inode_hash, struct inode **ea_inode) 385 { 386 struct inode *inode; 387 int err; 388 > 389 if (parent->i_ino == ea_ino) { 390 ext4_error(parent->i_sb, 391 "Parent and EA inode have the same ino %lu", ea_ino); 392 err = -EUCLEAN; 393 goto error; 394 } 395 396 inode = ext4_iget(parent->i_sb, ea_ino, EXT4_IGET_NORMAL); 397 if (IS_ERR(inode)) { 398 err = PTR_ERR(inode); 399 ext4_error(parent->i_sb, 400 "error while reading EA inode %lu err=%d", ea_ino, 401 err); 402 return err; 403 } 404 405 if (is_bad_inode(inode)) { 406 ext4_error(parent->i_sb, 407 "error while reading EA inode %lu is_bad_inode", 408 ea_ino); 409 err = -EIO; 410 goto error; 411 } 412 413 if (!(EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL)) { 414 ext4_error(parent->i_sb, 415 "EA inode %lu does not have EXT4_EA_INODE_FL flag", 416 ea_ino); 417 err = -EINVAL; 418 goto error; 419 } 420 421 ext4_xattr_inode_set_class(inode); 422 423 /* 424 * Check whether this is an old Lustre-style xattr inode. Lustre 425 * implementation does not have hash validation, rather it has a 426 * backpointer from ea_inode to the parent inode. 427 */ 428 if (ea_inode_hash != ext4_xattr_inode_get_hash(inode) && 429 EXT4_XATTR_INODE_GET_PARENT(inode) == parent->i_ino && 430 inode->i_generation == parent->i_generation) { 431 ext4_set_inode_state(inode, EXT4_STATE_LUSTRE_EA_INODE); 432 ext4_xattr_inode_set_ref(inode, 1); 433 } else { 434 inode_lock(inode); 435 inode->i_flags |= S_NOQUOTA; 436 inode_unlock(inode); 437 } 438 439 *ea_inode = inode; 440 return 0; 441 error: 442 iput(inode); 443 return err; 444 } 445
On Tue 10-01-23 12:27:09, Baokun Li wrote: > Syzbot reported a hung task problem: > ================================================================== > INFO: task syz-executor232:5073 blocked for more than 143 seconds. > Not tainted 6.2.0-rc2-syzkaller-00024-g512dee0c00ad #0 > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > task:syz-exec232 state:D stack:21024 pid:5073 ppid:5072 flags:0x00004004 > Call Trace: > <TASK> > context_switch kernel/sched/core.c:5244 [inline] > __schedule+0x995/0xe20 kernel/sched/core.c:6555 > schedule+0xcb/0x190 kernel/sched/core.c:6631 > __wait_on_freeing_inode fs/inode.c:2196 [inline] > find_inode_fast+0x35a/0x4c0 fs/inode.c:950 > iget_locked+0xb1/0x830 fs/inode.c:1273 > __ext4_iget+0x22e/0x3ed0 fs/ext4/inode.c:4861 > ext4_xattr_inode_iget+0x68/0x4e0 fs/ext4/xattr.c:389 > ext4_xattr_inode_dec_ref_all+0x1a7/0xe50 fs/ext4/xattr.c:1148 > ext4_xattr_delete_inode+0xb04/0xcd0 fs/ext4/xattr.c:2880 > ext4_evict_inode+0xd7c/0x10b0 fs/ext4/inode.c:296 > evict+0x2a4/0x620 fs/inode.c:664 > ext4_orphan_cleanup+0xb60/0x1340 fs/ext4/orphan.c:474 > __ext4_fill_super fs/ext4/super.c:5516 [inline] > ext4_fill_super+0x81cd/0x8700 fs/ext4/super.c:5644 > get_tree_bdev+0x400/0x620 fs/super.c:1282 > vfs_get_tree+0x88/0x270 fs/super.c:1489 > do_new_mount+0x289/0xad0 fs/namespace.c:3145 > do_mount fs/namespace.c:3488 [inline] > __do_sys_mount fs/namespace.c:3697 [inline] > __se_sys_mount+0x2d3/0x3c0 fs/namespace.c:3674 > do_syscall_x64 arch/x86/entry/common.c:50 [inline] > do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 > entry_SYSCALL_64_after_hwframe+0x63/0xcd > RIP: 0033:0x7fa5406fd5ea > RSP: 002b:00007ffc7232f968 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5 > RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fa5406fd5ea > RDX: 0000000020000440 RSI: 0000000020000000 RDI: 00007ffc7232f970 > RBP: 00007ffc7232f970 R08: 00007ffc7232f9b0 R09: 0000000000000432 > R10: 0000000000804a03 R11: 0000000000000202 R12: 0000000000000004 > R13: 0000555556a7a2c0 R14: 00007ffc7232f9b0 R15: 0000000000000000 > </TASK> > ================================================================== > > The problem is that the inode contains an xattr entry with ea_inum of 15 > when cleaning up an orphan inode <15>. When evict inode <15>, the reference > counting of the corresponding EA inode is decreased. When EA inode <15> is > found by find_inode_fast() in __ext4_iget(), it is found that the EA inode > holds the I_FREEING flag and waits for the EA inode to complete deletion. > As a result, when inode <15> is being deleted, we wait for inode <15> to > complete the deletion, resulting in an infinite loop and triggering Hung > Task. To solve this problem, we only need to check whether the ino of EA > inode and parent is the same before getting EA inode. > > Link: https://syzkaller.appspot.com/bug?extid=77d6fcc37bbb92f26048 > Reported-by: syzbot+77d6fcc37bbb92f26048@syzkaller.appspotmail.com > Signed-off-by: Baokun Li <libaokun1@huawei.com> > --- > fs/ext4/xattr.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c > index 7decaaf27e82..9ff8fcf78bb8 100644 > --- a/fs/ext4/xattr.c > +++ b/fs/ext4/xattr.c > @@ -386,6 +386,13 @@ static int ext4_xattr_inode_iget(struct inode *parent, unsigned long ea_ino, > struct inode *inode; > int err; > Perhaps add a comment here like: /* * We have to check for this corruption early as otherwise * iget_locked() could wait indefinitely for the state of our * parent inode. */ > + if (parent->i_ino == ea_ino) { > + ext4_error(parent->i_sb, > + "Parent and EA inode have the same ino %lu", ea_ino); > + err = -EUCLEAN; ^^ I prefer -EFSCORRUPTED here. It is the same value but more descriptive name :). > + goto error; ^^ Just "return err" here. This will try to iput() uninitialized pointer. Honza
On 2023/1/10 19:34, Jan Kara wrote: > On Tue 10-01-23 12:27:09, Baokun Li wrote: >> Syzbot reported a hung task problem: >> ================================================================== >> INFO: task syz-executor232:5073 blocked for more than 143 seconds. >> Not tainted 6.2.0-rc2-syzkaller-00024-g512dee0c00ad #0 >> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. >> task:syz-exec232 state:D stack:21024 pid:5073 ppid:5072 flags:0x00004004 >> Call Trace: >> <TASK> >> context_switch kernel/sched/core.c:5244 [inline] >> __schedule+0x995/0xe20 kernel/sched/core.c:6555 >> schedule+0xcb/0x190 kernel/sched/core.c:6631 >> __wait_on_freeing_inode fs/inode.c:2196 [inline] >> find_inode_fast+0x35a/0x4c0 fs/inode.c:950 >> iget_locked+0xb1/0x830 fs/inode.c:1273 >> __ext4_iget+0x22e/0x3ed0 fs/ext4/inode.c:4861 >> ext4_xattr_inode_iget+0x68/0x4e0 fs/ext4/xattr.c:389 >> ext4_xattr_inode_dec_ref_all+0x1a7/0xe50 fs/ext4/xattr.c:1148 >> ext4_xattr_delete_inode+0xb04/0xcd0 fs/ext4/xattr.c:2880 >> ext4_evict_inode+0xd7c/0x10b0 fs/ext4/inode.c:296 >> evict+0x2a4/0x620 fs/inode.c:664 >> ext4_orphan_cleanup+0xb60/0x1340 fs/ext4/orphan.c:474 >> __ext4_fill_super fs/ext4/super.c:5516 [inline] >> ext4_fill_super+0x81cd/0x8700 fs/ext4/super.c:5644 >> get_tree_bdev+0x400/0x620 fs/super.c:1282 >> vfs_get_tree+0x88/0x270 fs/super.c:1489 >> do_new_mount+0x289/0xad0 fs/namespace.c:3145 >> do_mount fs/namespace.c:3488 [inline] >> __do_sys_mount fs/namespace.c:3697 [inline] >> __se_sys_mount+0x2d3/0x3c0 fs/namespace.c:3674 >> do_syscall_x64 arch/x86/entry/common.c:50 [inline] >> do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 >> entry_SYSCALL_64_after_hwframe+0x63/0xcd >> RIP: 0033:0x7fa5406fd5ea >> RSP: 002b:00007ffc7232f968 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5 >> RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fa5406fd5ea >> RDX: 0000000020000440 RSI: 0000000020000000 RDI: 00007ffc7232f970 >> RBP: 00007ffc7232f970 R08: 00007ffc7232f9b0 R09: 0000000000000432 >> R10: 0000000000804a03 R11: 0000000000000202 R12: 0000000000000004 >> R13: 0000555556a7a2c0 R14: 00007ffc7232f9b0 R15: 0000000000000000 >> </TASK> >> ================================================================== >> >> The problem is that the inode contains an xattr entry with ea_inum of 15 >> when cleaning up an orphan inode <15>. When evict inode <15>, the reference >> counting of the corresponding EA inode is decreased. When EA inode <15> is >> found by find_inode_fast() in __ext4_iget(), it is found that the EA inode >> holds the I_FREEING flag and waits for the EA inode to complete deletion. >> As a result, when inode <15> is being deleted, we wait for inode <15> to >> complete the deletion, resulting in an infinite loop and triggering Hung >> Task. To solve this problem, we only need to check whether the ino of EA >> inode and parent is the same before getting EA inode. >> >> Link: https://syzkaller.appspot.com/bug?extid=77d6fcc37bbb92f26048 >> Reported-by: syzbot+77d6fcc37bbb92f26048@syzkaller.appspotmail.com >> Signed-off-by: Baokun Li <libaokun1@huawei.com> >> --- >> fs/ext4/xattr.c | 7 +++++++ >> 1 file changed, 7 insertions(+) >> >> diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c >> index 7decaaf27e82..9ff8fcf78bb8 100644 >> --- a/fs/ext4/xattr.c >> +++ b/fs/ext4/xattr.c >> @@ -386,6 +386,13 @@ static int ext4_xattr_inode_iget(struct inode *parent, unsigned long ea_ino, >> struct inode *inode; >> int err; >> > Perhaps add a comment here like: > > /* > * We have to check for this corruption early as otherwise > * iget_locked() could wait indefinitely for the state of our > * parent inode. > */ Totally agree! Very accurate comment! >> + if (parent->i_ino == ea_ino) { >> + ext4_error(parent->i_sb, >> + "Parent and EA inode have the same ino %lu", ea_ino); >> + err = -EUCLEAN; > ^^ I prefer -EFSCORRUPTED here. It is the same > value but more descriptive name :). OK! I will replace EUCLEAN with EFSCORRUPTED in v2. >> + goto error; > ^^ Just "return err" here. This will try to iput() > uninitialized pointer. > > Honza > Indeed! I'm sorry I didn't notice this uninitialized variable. Thank you very much for your review! I will send a patch V2 with the changes suggested by you.
diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c index 7decaaf27e82..9ff8fcf78bb8 100644 --- a/fs/ext4/xattr.c +++ b/fs/ext4/xattr.c @@ -386,6 +386,13 @@ static int ext4_xattr_inode_iget(struct inode *parent, unsigned long ea_ino, struct inode *inode; int err; + if (parent->i_ino == ea_ino) { + ext4_error(parent->i_sb, + "Parent and EA inode have the same ino %lu", ea_ino); + err = -EUCLEAN; + goto error; + } + inode = ext4_iget(parent->i_sb, ea_ino, EXT4_IGET_NORMAL); if (IS_ERR(inode)) { err = PTR_ERR(inode);
Syzbot reported a hung task problem: ================================================================== INFO: task syz-executor232:5073 blocked for more than 143 seconds. Not tainted 6.2.0-rc2-syzkaller-00024-g512dee0c00ad #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:syz-exec232 state:D stack:21024 pid:5073 ppid:5072 flags:0x00004004 Call Trace: <TASK> context_switch kernel/sched/core.c:5244 [inline] __schedule+0x995/0xe20 kernel/sched/core.c:6555 schedule+0xcb/0x190 kernel/sched/core.c:6631 __wait_on_freeing_inode fs/inode.c:2196 [inline] find_inode_fast+0x35a/0x4c0 fs/inode.c:950 iget_locked+0xb1/0x830 fs/inode.c:1273 __ext4_iget+0x22e/0x3ed0 fs/ext4/inode.c:4861 ext4_xattr_inode_iget+0x68/0x4e0 fs/ext4/xattr.c:389 ext4_xattr_inode_dec_ref_all+0x1a7/0xe50 fs/ext4/xattr.c:1148 ext4_xattr_delete_inode+0xb04/0xcd0 fs/ext4/xattr.c:2880 ext4_evict_inode+0xd7c/0x10b0 fs/ext4/inode.c:296 evict+0x2a4/0x620 fs/inode.c:664 ext4_orphan_cleanup+0xb60/0x1340 fs/ext4/orphan.c:474 __ext4_fill_super fs/ext4/super.c:5516 [inline] ext4_fill_super+0x81cd/0x8700 fs/ext4/super.c:5644 get_tree_bdev+0x400/0x620 fs/super.c:1282 vfs_get_tree+0x88/0x270 fs/super.c:1489 do_new_mount+0x289/0xad0 fs/namespace.c:3145 do_mount fs/namespace.c:3488 [inline] __do_sys_mount fs/namespace.c:3697 [inline] __se_sys_mount+0x2d3/0x3c0 fs/namespace.c:3674 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7fa5406fd5ea RSP: 002b:00007ffc7232f968 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fa5406fd5ea RDX: 0000000020000440 RSI: 0000000020000000 RDI: 00007ffc7232f970 RBP: 00007ffc7232f970 R08: 00007ffc7232f9b0 R09: 0000000000000432 R10: 0000000000804a03 R11: 0000000000000202 R12: 0000000000000004 R13: 0000555556a7a2c0 R14: 00007ffc7232f9b0 R15: 0000000000000000 </TASK> ================================================================== The problem is that the inode contains an xattr entry with ea_inum of 15 when cleaning up an orphan inode <15>. When evict inode <15>, the reference counting of the corresponding EA inode is decreased. When EA inode <15> is found by find_inode_fast() in __ext4_iget(), it is found that the EA inode holds the I_FREEING flag and waits for the EA inode to complete deletion. As a result, when inode <15> is being deleted, we wait for inode <15> to complete the deletion, resulting in an infinite loop and triggering Hung Task. To solve this problem, we only need to check whether the ino of EA inode and parent is the same before getting EA inode. Link: https://syzkaller.appspot.com/bug?extid=77d6fcc37bbb92f26048 Reported-by: syzbot+77d6fcc37bbb92f26048@syzkaller.appspotmail.com Signed-off-by: Baokun Li <libaokun1@huawei.com> --- fs/ext4/xattr.c | 7 +++++++ 1 file changed, 7 insertions(+)