From patchwork Thu Aug 19 06:57:01 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhang Yi X-Patchwork-Id: 1518464 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=linux-ext4-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4GqwKW0KGXz9sxS for ; Thu, 19 Aug 2021 16:46:39 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231411AbhHSGrN (ORCPT ); Thu, 19 Aug 2021 02:47:13 -0400 Received: from szxga03-in.huawei.com ([45.249.212.189]:14275 "EHLO szxga03-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231234AbhHSGrN (ORCPT ); Thu, 19 Aug 2021 02:47:13 -0400 Received: from dggeme752-chm.china.huawei.com (unknown [172.30.72.56]) by szxga03-in.huawei.com (SkyGuard) with ESMTP id 4GqwKG38fNz84wg; Thu, 19 Aug 2021 14:46:26 +0800 (CST) Received: from huawei.com (10.175.127.227) by dggeme752-chm.china.huawei.com (10.3.19.98) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.2176.2; Thu, 19 Aug 2021 14:46:34 +0800 From: Zhang Yi To: CC: , , , , Subject: [PATCH v2 1/4] ext4: move inode eio simulation behind io completeion Date: Thu, 19 Aug 2021 14:57:01 +0800 Message-ID: <20210819065704.1248402-2-yi.zhang@huawei.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210819065704.1248402-1-yi.zhang@huawei.com> References: <20210819065704.1248402-1-yi.zhang@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.127.227] X-ClientProxiedBy: dggems704-chm.china.huawei.com (10.3.19.181) To dggeme752-chm.china.huawei.com (10.3.19.98) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org No EIO simulation is required if the buffer is uptodate, so move the simulation behind read bio completeion just like inode/block bitmap simulation does. Signed-off-by: Zhang Yi Reviewed-by: Jan Kara --- fs/ext4/inode.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index d8de607849df..eb2526a35254 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -4330,8 +4330,6 @@ static int __ext4_get_inode_loc(struct super_block *sb, unsigned long ino, bh = sb_getblk(sb, block); if (unlikely(!bh)) return -ENOMEM; - if (ext4_simulate_fail(sb, EXT4_SIM_INODE_EIO)) - goto simulate_eio; if (!buffer_uptodate(bh)) { lock_buffer(bh); @@ -4418,8 +4416,8 @@ static int __ext4_get_inode_loc(struct super_block *sb, unsigned long ino, ext4_read_bh_nowait(bh, REQ_META | REQ_PRIO, NULL); blk_finish_plug(&plug); wait_on_buffer(bh); + ext4_simulate_fail_bh(sb, bh, EXT4_SIM_INODE_EIO); if (!buffer_uptodate(bh)) { - simulate_eio: if (ret_block) *ret_block = block; brelse(bh); From patchwork Thu Aug 19 06:57:02 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhang Yi X-Patchwork-Id: 1518463 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=linux-ext4-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4GqwKV2S2jz9t0Y for ; Thu, 19 Aug 2021 16:46:38 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231404AbhHSGrN (ORCPT ); Thu, 19 Aug 2021 02:47:13 -0400 Received: from szxga02-in.huawei.com ([45.249.212.188]:8882 "EHLO szxga02-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231220AbhHSGrM (ORCPT ); Thu, 19 Aug 2021 02:47:12 -0400 Received: from dggeme752-chm.china.huawei.com (unknown [172.30.72.53]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4GqwDl4Xzfz8sXD; Thu, 19 Aug 2021 14:42:31 +0800 (CST) Received: from huawei.com (10.175.127.227) by dggeme752-chm.china.huawei.com (10.3.19.98) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.2176.2; Thu, 19 Aug 2021 14:46:34 +0800 From: Zhang Yi To: CC: , , , , Subject: [PATCH v2 2/4] ext4: remove an unnecessary if statement in __ext4_get_inode_loc() Date: Thu, 19 Aug 2021 14:57:02 +0800 Message-ID: <20210819065704.1248402-3-yi.zhang@huawei.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210819065704.1248402-1-yi.zhang@huawei.com> References: <20210819065704.1248402-1-yi.zhang@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.127.227] X-ClientProxiedBy: dggems704-chm.china.huawei.com (10.3.19.181) To dggeme752-chm.china.huawei.com (10.3.19.98) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org The "if (!buffer_uptodate(bh))" hunk covered almost the whole code after getting buffer in __ext4_get_inode_loc() which seems unnecessary, remove it and switch to check ext4_buffer_uptodate(), it simplify code and make it more readable. Signed-off-by: Zhang Yi Reviewed-by: Jan Kara --- fs/ext4/inode.c | 162 +++++++++++++++++++++++------------------------- 1 file changed, 78 insertions(+), 84 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index eb2526a35254..eae1b2d0b550 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -4330,99 +4330,93 @@ static int __ext4_get_inode_loc(struct super_block *sb, unsigned long ino, bh = sb_getblk(sb, block); if (unlikely(!bh)) return -ENOMEM; - if (!buffer_uptodate(bh)) { - lock_buffer(bh); - - if (ext4_buffer_uptodate(bh)) { - /* someone brought it uptodate while we waited */ - unlock_buffer(bh); - goto has_buffer; - } + if (ext4_buffer_uptodate(bh)) + goto has_buffer; - /* - * If we have all information of the inode in memory and this - * is the only valid inode in the block, we need not read the - * block. - */ - if (in_mem) { - struct buffer_head *bitmap_bh; - int i, start; + lock_buffer(bh); + /* + * If we have all information of the inode in memory and this + * is the only valid inode in the block, we need not read the + * block. + */ + if (in_mem) { + struct buffer_head *bitmap_bh; + int i, start; - start = inode_offset & ~(inodes_per_block - 1); + start = inode_offset & ~(inodes_per_block - 1); - /* Is the inode bitmap in cache? */ - bitmap_bh = sb_getblk(sb, ext4_inode_bitmap(sb, gdp)); - if (unlikely(!bitmap_bh)) - goto make_io; + /* Is the inode bitmap in cache? */ + bitmap_bh = sb_getblk(sb, ext4_inode_bitmap(sb, gdp)); + if (unlikely(!bitmap_bh)) + goto make_io; - /* - * If the inode bitmap isn't in cache then the - * optimisation may end up performing two reads instead - * of one, so skip it. - */ - if (!buffer_uptodate(bitmap_bh)) { - brelse(bitmap_bh); - goto make_io; - } - for (i = start; i < start + inodes_per_block; i++) { - if (i == inode_offset) - continue; - if (ext4_test_bit(i, bitmap_bh->b_data)) - break; - } + /* + * If the inode bitmap isn't in cache then the + * optimisation may end up performing two reads instead + * of one, so skip it. + */ + if (!buffer_uptodate(bitmap_bh)) { brelse(bitmap_bh); - if (i == start + inodes_per_block) { - /* all other inodes are free, so skip I/O */ - memset(bh->b_data, 0, bh->b_size); - set_buffer_uptodate(bh); - unlock_buffer(bh); - goto has_buffer; - } + goto make_io; + } + for (i = start; i < start + inodes_per_block; i++) { + if (i == inode_offset) + continue; + if (ext4_test_bit(i, bitmap_bh->b_data)) + break; } + brelse(bitmap_bh); + if (i == start + inodes_per_block) { + /* all other inodes are free, so skip I/O */ + memset(bh->b_data, 0, bh->b_size); + set_buffer_uptodate(bh); + unlock_buffer(bh); + goto has_buffer; + } + } make_io: - /* - * If we need to do any I/O, try to pre-readahead extra - * blocks from the inode table. - */ - blk_start_plug(&plug); - if (EXT4_SB(sb)->s_inode_readahead_blks) { - ext4_fsblk_t b, end, table; - unsigned num; - __u32 ra_blks = EXT4_SB(sb)->s_inode_readahead_blks; - - table = ext4_inode_table(sb, gdp); - /* s_inode_readahead_blks is always a power of 2 */ - b = block & ~((ext4_fsblk_t) ra_blks - 1); - if (table > b) - b = table; - end = b + ra_blks; - num = EXT4_INODES_PER_GROUP(sb); - if (ext4_has_group_desc_csum(sb)) - num -= ext4_itable_unused_count(sb, gdp); - table += num / inodes_per_block; - if (end > table) - end = table; - while (b <= end) - ext4_sb_breadahead_unmovable(sb, b++); - } + /* + * If we need to do any I/O, try to pre-readahead extra + * blocks from the inode table. + */ + blk_start_plug(&plug); + if (EXT4_SB(sb)->s_inode_readahead_blks) { + ext4_fsblk_t b, end, table; + unsigned num; + __u32 ra_blks = EXT4_SB(sb)->s_inode_readahead_blks; + + table = ext4_inode_table(sb, gdp); + /* s_inode_readahead_blks is always a power of 2 */ + b = block & ~((ext4_fsblk_t) ra_blks - 1); + if (table > b) + b = table; + end = b + ra_blks; + num = EXT4_INODES_PER_GROUP(sb); + if (ext4_has_group_desc_csum(sb)) + num -= ext4_itable_unused_count(sb, gdp); + table += num / inodes_per_block; + if (end > table) + end = table; + while (b <= end) + ext4_sb_breadahead_unmovable(sb, b++); + } - /* - * There are other valid inodes in the buffer, this inode - * has in-inode xattrs, or we don't have this inode in memory. - * Read the block from disk. - */ - trace_ext4_load_inode(sb, ino); - ext4_read_bh_nowait(bh, REQ_META | REQ_PRIO, NULL); - blk_finish_plug(&plug); - wait_on_buffer(bh); - ext4_simulate_fail_bh(sb, bh, EXT4_SIM_INODE_EIO); - if (!buffer_uptodate(bh)) { - if (ret_block) - *ret_block = block; - brelse(bh); - return -EIO; - } + /* + * There are other valid inodes in the buffer, this inode + * has in-inode xattrs, or we don't have this inode in memory. + * Read the block from disk. + */ + trace_ext4_load_inode(sb, ino); + ext4_read_bh_nowait(bh, REQ_META | REQ_PRIO, NULL); + blk_finish_plug(&plug); + wait_on_buffer(bh); + ext4_simulate_fail_bh(sb, bh, EXT4_SIM_INODE_EIO); + if (!buffer_uptodate(bh)) { + if (ret_block) + *ret_block = block; + brelse(bh); + return -EIO; } has_buffer: iloc->bh = bh; From patchwork Thu Aug 19 06:57:03 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhang Yi X-Patchwork-Id: 1518465 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=linux-ext4-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4GqwKX2x6bz9t18 for ; Thu, 19 Aug 2021 16:46:40 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231361AbhHSGrO (ORCPT ); Thu, 19 Aug 2021 02:47:14 -0400 Received: from szxga01-in.huawei.com ([45.249.212.187]:8043 "EHLO szxga01-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231294AbhHSGrN (ORCPT ); Thu, 19 Aug 2021 02:47:13 -0400 Received: from dggeme752-chm.china.huawei.com (unknown [172.30.72.57]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4GqwJy2fSdzYqdQ; Thu, 19 Aug 2021 14:46:10 +0800 (CST) Received: from huawei.com (10.175.127.227) by dggeme752-chm.china.huawei.com (10.3.19.98) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.2176.2; Thu, 19 Aug 2021 14:46:35 +0800 From: Zhang Yi To: CC: , , , , Subject: [PATCH v2 3/4] ext4: don't return error if huge_file feature mismatch Date: Thu, 19 Aug 2021 14:57:03 +0800 Message-ID: <20210819065704.1248402-4-yi.zhang@huawei.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210819065704.1248402-1-yi.zhang@huawei.com> References: <20210819065704.1248402-1-yi.zhang@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.127.227] X-ClientProxiedBy: dggems704-chm.china.huawei.com (10.3.19.181) To dggeme752-chm.china.huawei.com (10.3.19.98) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org In ext4_inode_blocks_set(), huge_file feature should exist when setting i_blocks beyond a 32 bit variable could be represented, return EFBIG if not. This error should never happen in theory since sb->s_maxbytes should not have allowed this, and we have already init sb->s_maxbytes according to this feature in ext4_fill_super(). So switch to use WARN_ON_ONCE instead. Signed-off-by: Zhang Yi --- fs/ext4/inode.c | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index eae1b2d0b550..d0d7a5295bf9 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -4902,9 +4902,9 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino, return ERR_PTR(ret); } -static int ext4_inode_blocks_set(handle_t *handle, - struct ext4_inode *raw_inode, - struct ext4_inode_info *ei) +static void ext4_inode_blocks_set(handle_t *handle, + struct ext4_inode *raw_inode, + struct ext4_inode_info *ei) { struct inode *inode = &(ei->vfs_inode); u64 i_blocks = READ_ONCE(inode->i_blocks); @@ -4918,10 +4918,15 @@ static int ext4_inode_blocks_set(handle_t *handle, raw_inode->i_blocks_lo = cpu_to_le32(i_blocks); raw_inode->i_blocks_high = 0; ext4_clear_inode_flag(inode, EXT4_INODE_HUGE_FILE); - return 0; + return; } - if (!ext4_has_feature_huge_file(sb)) - return -EFBIG; + + /* + * This should never happen since sb->s_maxbytes should not have + * allowed this, which was set according to the huge_file feature + * in ext4_fill_super(). + */ + WARN_ON_ONCE(!ext4_has_feature_huge_file(sb)); if (i_blocks <= 0xffffffffffffULL) { /* @@ -4938,7 +4943,6 @@ static int ext4_inode_blocks_set(handle_t *handle, raw_inode->i_blocks_lo = cpu_to_le32(i_blocks); raw_inode->i_blocks_high = cpu_to_le16(i_blocks >> 32); } - return 0; } static void __ext4_update_other_inode_time(struct super_block *sb, @@ -5029,11 +5033,7 @@ static int ext4_do_update_inode(handle_t *handle, if (ext4_test_inode_state(inode, EXT4_STATE_NEW)) memset(raw_inode, 0, EXT4_SB(inode->i_sb)->s_inode_size); - err = ext4_inode_blocks_set(handle, raw_inode, ei); - if (err) { - spin_unlock(&ei->i_raw_lock); - goto out_brelse; - } + ext4_inode_blocks_set(handle, raw_inode, ei); raw_inode->i_mode = cpu_to_le16(inode->i_mode); i_uid = i_uid_read(inode); From patchwork Thu Aug 19 06:57:04 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhang Yi X-Patchwork-Id: 1518466 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=linux-ext4-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4GqwKX5NzGz9t1C for ; Thu, 19 Aug 2021 16:46:40 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231294AbhHSGrP (ORCPT ); Thu, 19 Aug 2021 02:47:15 -0400 Received: from szxga01-in.huawei.com ([45.249.212.187]:17048 "EHLO szxga01-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231347AbhHSGrN (ORCPT ); Thu, 19 Aug 2021 02:47:13 -0400 Received: from dggeme752-chm.china.huawei.com (unknown [172.30.72.56]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4GqwF66vCdzbfMl; Thu, 19 Aug 2021 14:42:50 +0800 (CST) Received: from huawei.com (10.175.127.227) by dggeme752-chm.china.huawei.com (10.3.19.98) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.2176.2; Thu, 19 Aug 2021 14:46:35 +0800 From: Zhang Yi To: CC: , , , , Subject: [PATCH v2 4/4] ext4: prevent getting empty inode buffer Date: Thu, 19 Aug 2021 14:57:04 +0800 Message-ID: <20210819065704.1248402-5-yi.zhang@huawei.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210819065704.1248402-1-yi.zhang@huawei.com> References: <20210819065704.1248402-1-yi.zhang@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.127.227] X-ClientProxiedBy: dggems704-chm.china.huawei.com (10.3.19.181) To dggeme752-chm.china.huawei.com (10.3.19.98) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org In ext4_get_inode_loc(), we may skip IO and get an zero && uptodate inode buffer when the inode monopolize an inode block for performance reason. For most cases, ext4_mark_iloc_dirty() will fill the inode buffer to make it fine, but we could miss this call if something bad happened. Finally, __ext4_get_inode_loc_noinmem() may probably get an empty inode buffer and trigger ext4 error. For example, if we remove a nonexistent xattr on inode A, ext4_xattr_set_handle() will return ENODATA before invoking ext4_mark_iloc_dirty(), it will left an uptodate but zero buffer. We will get checksum error message in ext4_iget() when getting inode again. EXT4-fs error (device sda): ext4_lookup:1784: inode #131074: comm cat: iget: checksum invalid Even worse, if we allocate another inode B at the same inode block, it will corrupt the inode A on disk when write back inode B. So this patch postpone the init and mark buffer uptodate logic until before filling correct inode data in ext4_do_update_inode() if skip read I/O, ensure the buffer is real uptodate. Signed-off-by: Zhang Yi Reviewed-by: Jan Kara --- fs/ext4/inode.c | 26 +++++++++++++++++++++++--- 1 file changed, 23 insertions(+), 3 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index d0d7a5295bf9..02d910c9d8f1 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -4367,9 +4367,11 @@ static int __ext4_get_inode_loc(struct super_block *sb, unsigned long ino, } brelse(bitmap_bh); if (i == start + inodes_per_block) { - /* all other inodes are free, so skip I/O */ - memset(bh->b_data, 0, bh->b_size); - set_buffer_uptodate(bh); + /* + * All other inodes are free, skip I/O. Return + * un-inited buffer (which is postponed until + * before filling inode data) immediately. + */ unlock_buffer(bh); goto has_buffer; } @@ -5026,6 +5028,24 @@ static int ext4_do_update_inode(handle_t *handle, gid_t i_gid; projid_t i_projid; + /* + * If the buffer is not uptodate, it means all information of inode + * in memory and we got this buffer without reading the block. We + * must be cautious that once we mark the buffer as uptodate, we + * rely on filling in the correct inode data later in this function. + * Otherwise if we getting the left falsepositive buffer when + * creating other inode on the same block, it could corrupt the + * inode data on disk. + */ + if (!buffer_uptodate(bh)) { + lock_buffer(bh); + if (!buffer_uptodate(bh)) { + memset(bh->b_data, 0, bh->b_size); + set_buffer_uptodate(bh); + } + unlock_buffer(bh); + } + spin_lock(&ei->i_raw_lock); /* For fields not tracked in the in-memory inode,