Message ID | 1523673237-112405-1-git-send-email-bo.liu@linux.alibaba.com |
---|---|
State | Superseded, archived |
Series | Ext4: Set BBITMAP_CORRUPT_BIT when failing to read the allocation bitmap |
On Mon, Apr 16, 2018 at 01:37:23AM +0000, Wang Shilong wrote:
> Yup, that patch have been used in Lustre deployment from RHEL6 and RHEL7 for years.
> I think what Liu Bo tried to fix is a BUG.
> If you agreed, I could send our improved patch for bitmap corruptions handling to upstream.

Yes, please send your patch out to the list so we can look at the two and take the best one. (For one thing, we probably want to do the same thing for the inode allocation bitmap.)

Thanks,

- Ted
On Mon, Apr 16, 2018 at 01:37:23AM +0000, Wang Shilong wrote:
> Yup, that patch have been used in Lustre deployment from RHEL6 and RHEL7 for years.
> I think what Liu Bo tried to fix is a BUG.
> If you agreed, I could send our improved patch for bitmap corruptions handling to upstream.

The patch looks good to me, but it also needs to cover ext4_wait_block_bitmap with ext4_corrupted_block_group().

thanks,

-liubo
Hi,

I just sent some of my patches out; they are cleanups and some fixes I found that should have been sent earlier. I am working on another one, which will take a bit of time.

Thanks,

Shilong
diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
index f9b3e0a83526..f0a5ed3df5fa 100644
--- a/fs/ext4/balloc.c
+++ b/fs/ext4/balloc.c
@@ -499,9 +499,19 @@ int ext4_wait_block_bitmap(struct super_block *sb, ext4_group_t block_group,
 		return -EFSCORRUPTED;
 	wait_on_buffer(bh);
 	if (!buffer_uptodate(bh)) {
+		struct ext4_group_info *grp =
+			ext4_get_group_info(sb, block_group);
+		struct ext4_sb_info *sbi = EXT4_SB(sb);
+
 		ext4_error(sb, "Cannot read block bitmap - "
 			   "block_group = %u, block_bitmap = %llu",
 			   block_group, (unsigned long long) bh->b_blocknr);
+
+		if (!EXT4_MB_GRP_BBITMAP_CORRUPT(grp))
+			percpu_counter_sub(&sbi->s_freeclusters_counter,
+					   grp->bb_free);
+		set_bit(EXT4_GROUP_INFO_BBITMAP_CORRUPT_BIT, &grp->bb_state);
+
 		return -EIO;
 	}
 	clear_buffer_new(bh);
When reading the block allocation bitmap fails due to disk read errors, ext4 might end up with an in-memory buddy bitmap that is inconsistent with the on-disk block bitmap. The resulting behavior is unpredictable; one of the situations I hit was:

[943102.751298] ------------[ cut here ]------------
[943102.751299] kernel BUG at fs/ext4/mballoc.c:1911!
[943102.751300] invalid opcode: 0000 [#1] SMP
...
[943102.751353] Call Trace:
[943102.751363] [<ffffffffa05340a6>] ext4_mb_regular_allocator+0x356/0x460 [ext4]
[943102.751371] [<ffffffffa0535b9c>] ext4_mb_new_blocks+0x5ec/0xaf0 [ext4]
[943102.751379] [<ffffffffa052434d>] ? __read_extent_tree_block+0x5d/0x1f0 [ext4]
[943102.751386] [<ffffffffa0525633>] ? ext4_find_extent+0x143/0x2d0 [ext4]
[943102.751394] [<ffffffffa052a7de>] ext4_ext_map_blocks+0xb5e/0xf30 [ext4]
[943102.751397] [<ffffffff811bb75c>] ? node_dirty_ok+0x12c/0x170
[943102.751403] [<ffffffffa04f7802>] ext4_map_blocks+0x172/0x600 [ext4]
[943102.751406] [<ffffffff8127a8c1>] ? alloc_buffer_head+0x21/0x60
[943102.751407] [<ffffffff81233601>] ? mem_cgroup_commit_charge+0x91/0x530
[943102.751413] [<ffffffffa04f7d22>] _ext4_get_block+0x92/0x100 [ext4]
[943102.751419] [<ffffffffa04f7da6>] ext4_get_block+0x16/0x20 [ext4]
[943102.751420] [<ffffffff8127d357>] __block_write_begin_int+0x197/0x5e0
[943102.751425] [<ffffffffa04f7d90>] ? _ext4_get_block+0x100/0x100 [ext4]
[943102.751432] [<ffffffffa04fcb56>] ? ext4_write_begin+0x126/0x5b0 [ext4]
[943102.751433] [<ffffffff8127d7b1>] __block_write_begin+0x11/0x20
[943102.751439] [<ffffffffa04fcbdc>] ext4_write_begin+0x1ac/0x5b0 [ext4]
[943102.751446] [<ffffffffa052cfdd>] ? __ext4_journal_stop+0x3d/0xa0 [ext4]
[943102.751449] [<ffffffff811ab578>] generic_perform_write+0xc8/0x1c0
[943102.751451] [<ffffffff8125f37e>] ? file_update_time+0x5e/0x110
[943102.751452] [<ffffffff811adbb5>] __generic_file_write_iter+0x185/0x1d0
[943102.751458] [<ffffffffa04f1a6b>] ext4_file_write_iter+0x8b/0x380 [ext4]
[943102.751460] [<ffffffff81247429>] ? vfs_getattr_nosec+0x29/0x40
[943102.751462] [<ffffffff81247c6f>] ? cp_new_stat+0x14f/0x180
[943102.751463] [<ffffffff81241115>] __vfs_write+0xe5/0x160
[943102.751464] [<ffffffff812423b5>] vfs_write+0xb5/0x1a0
[943102.751465] [<ffffffff81243875>] SyS_write+0x55/0xc0
[943102.751468] [<ffffffff8171a6da>] entry_SYSCALL_64_fastpath+0x1a/0xc5
[943102.751476] Code: 39 44 24 3c 75 27 49 8b 85 60 <0f> 0b 0f 0b e8 3b 57 b5 e0 90 66
[943102.751484] RIP [<ffffffffa05336dc>] ext4_mb_simple_scan_group+0x14c/0x160 [ext4]
[943102.751484] RSP <ffffc90037997820>

To avoid the above, EXT4_GROUP_INFO_BBITMAP_CORRUPT_BIT should be set in order to prevent any further allocations from this block group.

Suggested-by: Theodore Y. Ts'o <tytso@mit.edu>
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
---
 fs/ext4/balloc.c | 10 ++++++++++
 1 file changed, 10 insertions(+)
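As a rough illustration of how a corrupt flag can keep a group out of allocation, the sketch below scans an array of groups and skips any that are flagged. It is a simplified userspace model under stated assumptions, not the mballoc scan: `sketch_group`, `SKETCH_BBITMAP_CORRUPT`, and `sketch_pick_group()` are hypothetical names, and the real allocator applies many more criteria.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag and group type for illustration only. */
#define SKETCH_BBITMAP_CORRUPT (1UL << 0)

struct sketch_group {
	long free;            /* free clusters believed to be in the group */
	unsigned long state;  /* state flags, including the corrupt flag */
};

/*
 * Return the index of the first usable group with at least `want` free
 * clusters, or -1 if none qualifies. Flagged groups are never chosen,
 * no matter how much free space their (untrusted) counters claim.
 */
static int sketch_pick_group(const struct sketch_group *groups, size_t n,
			     long want)
{
	assert(groups != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (groups[i].state & SKETCH_BBITMAP_CORRUPT)
			continue;  /* bitmap could not be read: skip */
		if (groups[i].free >= want)
			return (int)i;
	}
	return -1;
}
```

With the flag set, a group whose bitmap could not be read is passed over even if its free count looks attractive, so a request either lands in a healthy group or fails cleanly.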