diff mbox series

ext4: flush s_error_work before journal destroy in ext4_fill_super

Message ID 20210720062409.960734-1-yangerkun@huawei.com
State Superseded
Headers show
Series ext4: flush s_error_work before journal destroy in ext4_fill_super | expand

Commit Message

yangerkun July 20, 2021, 6:24 a.m. UTC
'commit c92dc856848f ("ext4: defer saving error info from atomic
context")' and '2d01ddc86606 ("ext4: save error info to sb through journal
if available")' add s_error_work to fix checksum error problem. But the
error path in ext4_fill_super can lead the follow BUG_ON.

Our testcase got follow BUG:
[32031.759805] ------------[ cut here ]------------
[32031.759807] kernel BUG at fs/jbd2/transaction.c:373!
[32031.760075] invalid opcode: 0000 [#1] SMP PTI
[32031.760336] CPU: 5 PID: 1029268 Comm: kworker/5:1 Kdump: loaded
Tainted: G           OE    --------- -  -
4.18.0
...
[32031.766665]  jbd2__journal_start+0xf1/0x1f0 [jbd2]
[32031.766934]  jbd2_journal_start+0x19/0x20 [jbd2]
[32031.767218]  flush_stashed_error_work+0x30/0x90 [ext4]
[32031.767487]  process_one_work+0x195/0x390
[32031.767747]  worker_thread+0x30/0x390
[32031.768007]  ? process_one_work+0x390/0x390
[32031.768265]  kthread+0x10d/0x130
[32031.768521]  ? kthread_flush_work_fn+0x10/0x10
[32031.768778]  ret_from_fork+0x35/0x40

static int start_this_handle(...)
    BUG_ON(journal->j_flags & JBD2_UNMOUNT); <---- Trigger this

flush_stashed_error_work will try to access journal. We cannot flush
s_error_work after journal destroy.

Fixes: c92dc856848f ("ext4: defer saving error info from atomic context")
Fixes: 2d01ddc86606 ("ext4: save error info to sb through journal if available")
Signed-off-by: yangerkun <yangerkun@huawei.com>
---
 fs/ext4/super.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

Theodore Ts'o July 23, 2021, 12:11 p.m. UTC | #1
On Tue, Jul 20, 2021 at 02:24:09PM +0800, yangerkun wrote:
> 'commit c92dc856848f ("ext4: defer saving error info from atomic
> context")' and '2d01ddc86606 ("ext4: save error info to sb through journal
> if available")' add s_error_work to fix checksum error problem. But the
> error path in ext4_fill_super can lead the follow BUG_ON.

Can you share with me your test case?  Your patch will result in the
shrinker potentially not getting released in some error paths (which
will cause other kernel panics), and in any case, once the journal is
destroyed here:

> @@ -5173,15 +5173,15 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
>  
>  	ext4_xattr_destroy_cache(sbi->s_ea_block_cache);
>  	sbi->s_ea_block_cache = NULL;
> +failed_mount3a:
> +	ext4_es_unregister_shrinker(sbi);
> +failed_mount3:
> +	flush_work(&sbi->s_error_work);
>  
>  	if (sbi->s_journal) {
>  		jbd2_journal_destroy(sbi->s_journal);
>  		sbi->s_journal = NULL;
>  	}
> -failed_mount3a:
> -	ext4_es_unregister_shrinker(sbi);
> -failed_mount3:
> -	flush_work(&sbi->s_error_work);

sbi->s_journal is set to NULL, which means that in
flush_stashed_error_work(), journal will be NULL, which means we won't
call start_this_handle and so this change will not make a difference
given the kernel stack trace in the commit description.

Thanks,

						- Ted
yangerkun July 23, 2021, 1:11 p.m. UTC | #2
在 2021/7/23 20:11, Theodore Ts'o 写道:
> On Tue, Jul 20, 2021 at 02:24:09PM +0800, yangerkun wrote:
>> 'commit c92dc856848f ("ext4: defer saving error info from atomic
>> context")' and '2d01ddc86606 ("ext4: save error info to sb through journal
>> if available")' add s_error_work to fix checksum error problem. But the
>> error path in ext4_fill_super can lead the follow BUG_ON.
> 
> Can you share with me your test case?  Your patch will result in the
> shrinker potentially not getting released in some error paths (which
> will cause other kernel panics), and in any case, once the journal is
> destroyed here:
> 
>> @@ -5173,15 +5173,15 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
>>   
>>   	ext4_xattr_destroy_cache(sbi->s_ea_block_cache);
>>   	sbi->s_ea_block_cache = NULL;
>> +failed_mount3a:
>> +	ext4_es_unregister_shrinker(sbi);
>> +failed_mount3:
>> +	flush_work(&sbi->s_error_work);
>>   
>>   	if (sbi->s_journal) {
>>   		jbd2_journal_destroy(sbi->s_journal);
>>   		sbi->s_journal = NULL;
>>   	}
>> -failed_mount3a:
>> -	ext4_es_unregister_shrinker(sbi);
>> -failed_mount3:
>> -	flush_work(&sbi->s_error_work);
> 
> sbi->s_journal is set to NULL, which means that in
> flush_stashed_error_work(), journal will be NULL, which means we won't
> call start_this_handle and so this change will not make a difference
> given the kernel stack trace in the commit description.


For example, before wo goto failed_mount_wq, we may meet some error and 
will goto ext4_handle_error which can call 
schedule_work(&EXT4_SB(sb)->s_error_work). So the work may start 
concurrent with ext4_fill_super goto failed_mount_wq. There does not 
have any lock to protect the concurrent read and modifies for 
sbi->s_journal.


Injection fault  some delay between jbd2_journal_destory and 
sbi->s_journal and

We can injection fault while we do mount and add some delay like follow. 
We will get some panic report easily...


  failed_mount_wq:
          ext4_xattr_destroy_cache(sbi->s_ea_inode_cache);
          sbi->s_ea_inode_cache = NULL;

          ext4_xattr_destroy_cache(sbi->s_ea_block_cache);
          sbi->s_ea_block_cache = NULL;

          if (sbi->s_journal) {
                  jbd2_journal_destroy(sbi->s_journal);
                  msleep(300); <---- add some delay
                  sbi->s_journal = NULL;
          }


So we need to make sure to work will exists while we destroy the 
journal. It maybe better to fix it by move the flush_work before destory 
journal.


> 
> Thanks,
> 
> 						- Ted
> .
>
yangerkun July 23, 2021, 1:25 p.m. UTC | #3
在 2021/7/23 20:11, Theodore Ts'o 写道:
> On Tue, Jul 20, 2021 at 02:24:09PM +0800, yangerkun wrote:
>> 'commit c92dc856848f ("ext4: defer saving error info from atomic
>> context")' and '2d01ddc86606 ("ext4: save error info to sb through journal
>> if available")' add s_error_work to fix checksum error problem. But the
>> error path in ext4_fill_super can lead the follow BUG_ON.
> 
> Can you share with me your test case?  Your patch will result in the
> shrinker potentially not getting released in some error paths (which
> will cause other kernel panics), and in any case, once the journal is

Hi Ted,

The only logic we have changed is that we move the flush_work before we 
call jbd2_journal_destory. I have not seen the problem you describe... 
Can you help to explain more...

Thanks,
Kun.

> destroyed here:
> 
>> @@ -5173,15 +5173,15 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
>>   
>>   	ext4_xattr_destroy_cache(sbi->s_ea_block_cache);
>>   	sbi->s_ea_block_cache = NULL;
>> +failed_mount3a:
>> +	ext4_es_unregister_shrinker(sbi);
>> +failed_mount3:
>> +	flush_work(&sbi->s_error_work);
>>   
>>   	if (sbi->s_journal) {
>>   		jbd2_journal_destroy(sbi->s_journal);
>>   		sbi->s_journal = NULL;
>>   	}
>> -failed_mount3a:
>> -	ext4_es_unregister_shrinker(sbi);
>> -failed_mount3:
>> -	flush_work(&sbi->s_error_work);
> 
> sbi->s_journal is set to NULL, which means that in
> flush_stashed_error_work(), journal will be NULL, which means we won't
> call start_this_handle and so this change will not make a difference
> given the kernel stack trace in the commit description.
> 
> Thanks,
> 
> 						- Ted
> .
>
yangerkun Aug. 24, 2021, 10:11 a.m. UTC | #4
Sorry for that this patch will lead a bug when mount failed on arm64(I 
have verify this patch on x86, and it seems ok...).

I will try to fix the problem with another way....

[2021-08-24 17:43:29 INFO] [   45.808127] Internal error: Oops: 96000004 
[#1] SMP
[2021-08-24 17:43:29 INFO] [   45.812986] Modules linked in: realtek 
hclge hns3 megaraid_sas hisi_sas_v3_hw hibmc_drm hisi_sas_main 
host_edma_drv hnae3
[2021-08-24 17:43:29 INFO] [   45.823896] Process mount (pid: 1187, 
stack limit = 0x00000000e0bd7181)
[2021-08-24 17:43:29 INFO] [   45.830481] CPU: 9 PID: 1187 Comm: mount 
Not tainted 4.19.90_4.19.90-2108.7.0+ #2
[2021-08-24 17:43:29 INFO] [   45.837929] Hardware name: Huawei TaiShan 
2280 V2/BC82AMDC, BIOS 2280-V2 CS V3.26.01 06/14/2019
[2021-08-24 17:43:29 INFO] [   45.846587] pstate: 10400009 (nzcV daif 
+PAN -UAO)
[2021-08-24 17:43:29 INFO] [   45.851359] pc : 
percpu_counter_add_batch+0x5c/0x358
[2021-08-24 17:43:29 INFO] [   45.856303] lr : 
ext4_es_free_extent+0xd4/0x4a8
[2021-08-24 17:43:29 INFO] [   45.860812] sp : ffff80262a60f170
[2021-08-24 17:43:29 INFO] [   45.864112] x29: ffff80262a60f170 x28: 
dfff200000000000
[2021-08-24 17:43:29 INFO] [   45.869399] x27: ffff8026403e65cc x26: 
0000000000000000
[2021-08-24 17:43:29 INFO] [   45.874686] x25: ffff80266c429e20 x24: 
0000000000000100
[2021-08-24 17:43:29 INFO] [   45.879973] x23: 1ffff004cd8853c4 x22: 
ffffffffffffffff
[2021-08-24 17:43:29 INFO] [   45.885260] x21: 0000000000000000 x20: 
00006026b7a31000
[2021-08-24 17:43:29 INFO] [   45.890547] x19: ffff80266c429e00 x18: 
0000000000000000
[2021-08-24 17:43:29 INFO] [   45.895833] x17: 0000000000000000 x16: 
0000000000000000
[2021-08-24 17:43:29 INFO] [   45.901120] x15: 1fffe4000504bd02 x14: 
ffff200023ba1adc
[2021-08-24 17:43:29 INFO] [   45.906406] x13: ffff200024332530 x12: 
ffff20002433240c
[2021-08-24 17:43:29 INFO] [   45.911693] x11: ffff2000243304f4 x10: 
ffff200024328f6c
[2021-08-24 17:43:29 INFO] [   45.916980] x9 : 1ffff004c807ccb9 x8 : 
ffff200023b75950
[2021-08-24 17:43:29 INFO] [   45.922266] x7 : ffff200023ba1fe0 x6 : 
0000000000000004
[2021-08-24 17:43:29 INFO] [   45.927553] x5 : 0000000000000000 x4 : 
0000000000000000
[2021-08-24 17:43:29 INFO] [   45.932839] x3 : 00000c04d6f46200 x2 : 
dfff200000000000
[2021-08-24 17:43:29 INFO] [   45.938126] x1 : 0000000000000003 x0 : 
00006026b7a31000
[2021-08-24 17:43:29 INFO] [   45.943413] Call trace:
[2021-08-24 17:43:29 INFO] [   45.945851] 
percpu_counter_add_batch+0x5c/0x358
[2021-08-24 17:43:29 INFO] [   45.950446]  ext4_es_free_extent+0xd4/0x4a8
[2021-08-24 17:43:29 INFO] [   45.954610]  __es_remove_extent+0x37c/0x598
[2021-08-24 17:43:29 INFO] [   45.958773]  ext4_es_remove_extent+0x6c/0x270
[2021-08-24 17:43:29 INFO] [   45.963110]  ext4_clear_inode+0x50/0x188
[2021-08-24 17:43:29 INFO] [   45.967016]  ext4_evict_inode+0x314/0x1450
[2021-08-24 17:43:29 INFO] [   45.971094]  evict+0x238/0x5a0
[2021-08-24 17:43:29 INFO] [   45.974136]  iput+0x2bc/0x6b8
[2021-08-24 17:43:29 INFO] [   45.977090]  jbd2_journal_destroy+0x408/0x800
[2021-08-24 17:43:29 INFO] [   45.981428]  ext4_fill_super+0x5a04/0x8cc0
[2021-08-24 17:43:29 INFO] [   45.985505]  mount_bdev+0x268/0x328
[2021-08-24 17:43:29 INFO] [   45.988978]  ext4_mount+0x44/0x58
[2021-08-24 17:43:29 INFO] [   45.992277]  mount_fs+0x68/0x390
[2021-08-24 17:43:29 INFO] [   45.995492]  vfs_kern_mount.part.2+0x54/0x388
[2021-08-24 17:43:29 INFO] [   45.999830]  do_mount+0xa5c/0x20d0
[2021-08-24 17:43:29 INFO] [   46.003216]  ksys_mount+0x9c/0x118
[2021-08-24 17:43:29 INFO] [   46.006603]  __arm64_sys_mount+0xa8/0x108
[2021-08-24 17:43:29 INFO] [   46.010595]  el0_svc_common+0x10c/0x4a0
[2021-08-24 17:43:29 INFO] [   46.014415]  el0_svc_handler+0x170/0x248
[2021-08-24 17:43:29 INFO] [   46.018320]  el0_svc+0x10/0x218
[2021-08-24 17:43:29 INFO] [   46.021448] Code: f2fbffe2 d343fc03 
92400801 11000c21 (38e26862)
[2021-08-24 17:43:29 INFO] [   46.027513] ---[ end trace 
316fdb8833588599 ]---
[2021-08-24 17:43:29 INFO] [   46.037646] Kernel panic - not syncing: 
Fatal exception
[2021-08-24 17:43:29 INFO] [   46.042848] SMP: stopping secondary CPUs
[2021-08-24 17:43:29 INFO] [   46.046779] Kernel Offset: 0x1baf0000 from 
0xffff200008000000
[2021-08-24 17:43:29 INFO] [   46.052500] CPU features: 0x12,a2a00a38
[2021-08-24 17:43:29 INFO] [   46.056318] Memory Limit: none
[2021-08-24 17:43:29 INFO] [   46.064887] Rebooting in 1 seconds..
[2021-08-24 17:43:30 INFO] [   47.068489] SMP: stopping secondary CPUs
[2021-08-24 17:43:32 INFO] [   48.158471] SMP: failed to stop secondary 
CPUs 0-127

在 2021/7/23 21:25, yangerkun 写道:
> 
> 
> 在 2021/7/23 20:11, Theodore Ts'o 写道:
>> On Tue, Jul 20, 2021 at 02:24:09PM +0800, yangerkun wrote:
>>> 'commit c92dc856848f ("ext4: defer saving error info from atomic
>>> context")' and '2d01ddc86606 ("ext4: save error info to sb through 
>>> journal
>>> if available")' add s_error_work to fix checksum error problem. But the
>>> error path in ext4_fill_super can lead the follow BUG_ON.
>>
>> Can you share with me your test case?  Your patch will result in the
>> shrinker potentially not getting released in some error paths (which
>> will cause other kernel panics), and in any case, once the journal is
> 
> Hi Ted,
> 
> The only logic we have changed is that we move the flush_work before we 
> call jbd2_journal_destory. I have not seen the problem you describe... 
> Can you help to explain more...
> 
> Thanks,
> Kun.
> 
>> destroyed here:
>>
>>> @@ -5173,15 +5173,15 @@ static int ext4_fill_super(struct super_block 
>>> *sb, void *data, int silent)
>>>       ext4_xattr_destroy_cache(sbi->s_ea_block_cache);
>>>       sbi->s_ea_block_cache = NULL;
>>> +failed_mount3a:
>>> +    ext4_es_unregister_shrinker(sbi);
>>> +failed_mount3:
>>> +    flush_work(&sbi->s_error_work);
>>>       if (sbi->s_journal) {
>>>           jbd2_journal_destroy(sbi->s_journal);
>>>           sbi->s_journal = NULL;
>>>       }
>>> -failed_mount3a:
>>> -    ext4_es_unregister_shrinker(sbi);
>>> -failed_mount3:
>>> -    flush_work(&sbi->s_error_work);
>>
>> sbi->s_journal is set to NULL, which means that in
>> flush_stashed_error_work(), journal will be NULL, which means we won't
>> call start_this_handle and so this change will not make a difference
>> given the kernel stack trace in the commit description.
>>
>> Thanks,
>>
>>                         - Ted
>> .
>>
> .
diff mbox series

Patch

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index dfa09a277b56..7db2be03848f 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5173,15 +5173,15 @@  static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 
 	ext4_xattr_destroy_cache(sbi->s_ea_block_cache);
 	sbi->s_ea_block_cache = NULL;
+failed_mount3a:
+	ext4_es_unregister_shrinker(sbi);
+failed_mount3:
+	flush_work(&sbi->s_error_work);
 
 	if (sbi->s_journal) {
 		jbd2_journal_destroy(sbi->s_journal);
 		sbi->s_journal = NULL;
 	}
-failed_mount3a:
-	ext4_es_unregister_shrinker(sbi);
-failed_mount3:
-	flush_work(&sbi->s_error_work);
 	del_timer_sync(&sbi->s_err_report);
 	ext4_stop_mmpd(sbi);
 failed_mount2: