From patchwork Thu Aug 24 09:26:04 2023
X-Patchwork-Submitter: Zhang Yi
X-Patchwork-Id: 1825269
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com,
    yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com
Subject: [RFC PATCH 01/16] ext4: correct the start block of counting reserved clusters
Date: Thu, 24 Aug 2023 17:26:04 +0800
Message-Id: <20230824092619.1327976-2-yi.zhang@huaweicloud.com>
In-Reply-To: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>
References: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>
X-Mailing-List: linux-ext4@vger.kernel.org

From: Zhang Yi

When the bigalloc feature is enabled, we need to count and update reserved
clusters before removing a delayed-only extent_status entry.
{init|count|get}_rsvd() already do this, but the start block number of this
counting isn't correct in the following case.

        lblk          end
         |             |
         v             v
           -------------------------
           |        orig_es        |
           -------------------------
                       ^           ^
          len1 is 0    |   len2    |

If the start block of the found orig_es entry is bigger than lblk, we passed
lblk as the start block to count_rsvd(), but the length is correct, so the
range being counted ends up shifted. Fix this by passing the start block as
'orig_es->lblk + len1'.
Signed-off-by: Zhang Yi
Reviewed-by: Jan Kara
---
 fs/ext4/extents_status.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index 6f7de14c0fa8..5e625ea4545d 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -1405,8 +1405,8 @@ static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
 			}
 		}
 		if (count_reserved)
-			count_rsvd(inode, lblk, orig_es.es_len - len1 - len2,
-				   &orig_es, &rc);
+			count_rsvd(inode, orig_es.es_lblk + len1,
+				   orig_es.es_len - len1 - len2, &orig_es, &rc);
 		goto out_get_reserved;
 	}

From patchwork Thu Aug 24 09:26:05 2023
X-Patchwork-Submitter: Zhang Yi
X-Patchwork-Id: 1825273
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com,
    yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com
Subject: [RFC PATCH 02/16] ext4: make sure allocate pending entry not fail
Date: Thu, 24 Aug 2023 17:26:05 +0800
Message-Id: <20230824092619.1327976-3-yi.zhang@huaweicloud.com>
In-Reply-To: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>
References: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>

From: Zhang Yi

__insert_pending() allocates memory in atomic context, so the allocation can
fail, but we are not handling that failure now. A failure could lead
ext4_es_remove_extent() to get a wrong count of reserved clusters, leaving the
global data block reservation count incorrect. As with extents_status entry
preallocation, preallocate the pending entry outside of the i_es_lock with
__GFP_NOFAIL, to make sure __insert_pending() and __revise_pending() always
succeed.
Signed-off-by: Zhang Yi
Reviewed-by: Jan Kara
---
 fs/ext4/extents_status.c | 123 ++++++++++++++++++++++++++++-----------
 1 file changed, 89 insertions(+), 34 deletions(-)

diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index 5e625ea4545d..f4b50652f0cc 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -152,8 +152,9 @@ static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
 static int es_reclaim_extents(struct ext4_inode_info *ei, int *nr_to_scan);
 static int __es_shrink(struct ext4_sb_info *sbi, int nr_to_scan,
 		       struct ext4_inode_info *locked_ei);
-static void __revise_pending(struct inode *inode, ext4_lblk_t lblk,
-			     ext4_lblk_t len);
+static int __revise_pending(struct inode *inode, ext4_lblk_t lblk,
+			    ext4_lblk_t len,
+			    struct pending_reservation **prealloc);
 
 int __init ext4_init_es(void)
 {
@@ -448,6 +449,19 @@ static void ext4_es_list_del(struct inode *inode)
 	spin_unlock(&sbi->s_es_lock);
 }
 
+static inline struct pending_reservation *__alloc_pending(bool nofail)
+{
+	if (!nofail)
+		return kmem_cache_alloc(ext4_pending_cachep, GFP_ATOMIC);
+
+	return kmem_cache_zalloc(ext4_pending_cachep, GFP_KERNEL | __GFP_NOFAIL);
+}
+
+static inline void __free_pending(struct pending_reservation *pr)
+{
+	kmem_cache_free(ext4_pending_cachep, pr);
+}
+
 /*
  * Returns true if we cannot fail to allocate memory for this extent_status
  * entry and cannot reclaim it until its status changes.
@@ -836,11 +850,12 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
 {
 	struct extent_status newes;
 	ext4_lblk_t end = lblk + len - 1;
-	int err1 = 0;
-	int err2 = 0;
+	int err1 = 0, err2 = 0, err3 = 0;
 	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 	struct extent_status *es1 = NULL;
 	struct extent_status *es2 = NULL;
+	struct pending_reservation *pr = NULL;
+	bool revise_pending = false;
 
 	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
 		return;
@@ -868,11 +883,17 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
 
 	ext4_es_insert_extent_check(inode, &newes);
 
+	revise_pending = sbi->s_cluster_ratio > 1 &&
+			 test_opt(inode->i_sb, DELALLOC) &&
+			 (status & (EXTENT_STATUS_WRITTEN |
+				    EXTENT_STATUS_UNWRITTEN));
 retry:
 	if (err1 && !es1)
 		es1 = __es_alloc_extent(true);
 	if ((err1 || err2) && !es2)
 		es2 = __es_alloc_extent(true);
+	if ((err1 || err2 || err3) && revise_pending && !pr)
+		pr = __alloc_pending(true);
 	write_lock(&EXT4_I(inode)->i_es_lock);
 
 	err1 = __es_remove_extent(inode, lblk, end, NULL, es1);
@@ -897,13 +918,18 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
 		es2 = NULL;
 	}
 
-	if (sbi->s_cluster_ratio > 1 && test_opt(inode->i_sb, DELALLOC) &&
-	    (status & EXTENT_STATUS_WRITTEN ||
-	     status & EXTENT_STATUS_UNWRITTEN))
-		__revise_pending(inode, lblk, len);
+	if (revise_pending) {
+		err3 = __revise_pending(inode, lblk, len, &pr);
+		if (err3 != 0)
+			goto error;
+		if (pr) {
+			__free_pending(pr);
+			pr = NULL;
+		}
+	}
 error:
 	write_unlock(&EXT4_I(inode)->i_es_lock);
-	if (err1 || err2)
+	if (err1 || err2 || err3)
 		goto retry;
 
 	ext4_es_print_tree(inode);
@@ -1311,7 +1337,7 @@ static unsigned int get_rsvd(struct inode *inode, ext4_lblk_t end,
 				rc->ndelonly--;
 				node = rb_next(&pr->rb_node);
 				rb_erase(&pr->rb_node, &tree->root);
-				kmem_cache_free(ext4_pending_cachep, pr);
+				__free_pending(pr);
 				if (!node)
 					break;
 				pr = rb_entry(node, struct pending_reservation,
@@ -1907,11 +1933,13 @@ static struct pending_reservation *__get_pending(struct inode *inode,
  *
  * @inode - file containing the cluster
  * @lblk - logical block in the cluster to be added
+ * @prealloc - preallocated pending entry
  *
  * Returns 0 on successful insertion and -ENOMEM on failure.  If the
  * pending reservation is already in the set, returns successfully.
  */
-static int __insert_pending(struct inode *inode, ext4_lblk_t lblk)
+static int __insert_pending(struct inode *inode, ext4_lblk_t lblk,
+			    struct pending_reservation **prealloc)
 {
 	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 	struct ext4_pending_tree *tree = &EXT4_I(inode)->i_pending_tree;
@@ -1937,10 +1965,15 @@ static int __insert_pending(struct inode *inode, ext4_lblk_t lblk)
 		}
 	}
 
-	pr = kmem_cache_alloc(ext4_pending_cachep, GFP_ATOMIC);
-	if (pr == NULL) {
-		ret = -ENOMEM;
-		goto out;
+	if (likely(*prealloc == NULL)) {
+		pr = __alloc_pending(false);
+		if (!pr) {
+			ret = -ENOMEM;
+			goto out;
+		}
+	} else {
+		pr = *prealloc;
+		*prealloc = NULL;
 	}
 	pr->lclu = lclu;
 
@@ -1970,7 +2003,7 @@ static void __remove_pending(struct inode *inode, ext4_lblk_t lblk)
 	if (pr != NULL) {
 		tree = &EXT4_I(inode)->i_pending_tree;
 		rb_erase(&pr->rb_node, &tree->root);
-		kmem_cache_free(ext4_pending_cachep, pr);
+		__free_pending(pr);
 	}
 }
 
@@ -2029,10 +2062,10 @@ void ext4_es_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk,
 				  bool allocated)
 {
 	struct extent_status newes;
-	int err1 = 0;
-	int err2 = 0;
+	int err1 = 0, err2 = 0, err3 = 0;
 	struct extent_status *es1 = NULL;
 	struct extent_status *es2 = NULL;
+	struct pending_reservation *pr = NULL;
 
 	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
 		return;
@@ -2052,6 +2085,8 @@ void ext4_es_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk,
 		es1 = __es_alloc_extent(true);
 	if ((err1 || err2) && !es2)
 		es2 = __es_alloc_extent(true);
+	if ((err1 || err2 || err3) && allocated && !pr)
+		pr = __alloc_pending(true);
 	write_lock(&EXT4_I(inode)->i_es_lock);
 
 	err1 = __es_remove_extent(inode, lblk, lblk, NULL, es1);
@@ -2074,11 +2109,18 @@ void ext4_es_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk,
 		es2 = NULL;
 	}
 
-	if (allocated)
-		__insert_pending(inode, lblk);
+	if (allocated) {
+		err3 = __insert_pending(inode, lblk, &pr);
+		if (err3 != 0)
+			goto error;
+		if (pr) {
+			__free_pending(pr);
+			pr = NULL;
+		}
+	}
 error:
 	write_unlock(&EXT4_I(inode)->i_es_lock);
-	if (err1 || err2)
+	if (err1 || err2 || err3)
 		goto retry;
 
 	ext4_es_print_tree(inode);
@@ -2184,21 +2226,24 @@ unsigned int ext4_es_delayed_clu(struct inode *inode, ext4_lblk_t lblk,
  *
  * @inode - file containing the range
  * @lblk - logical block defining the start of range
  * @len - length of range in blocks
+ * @prealloc - preallocated pending entry
  *
 * Used after a newly allocated extent is added to the extents status tree.
 * Requires that the extents in the range have either written or unwritten
 * status.  Must be called while holding i_es_lock.
 */
-static void __revise_pending(struct inode *inode, ext4_lblk_t lblk,
-			     ext4_lblk_t len)
+static int __revise_pending(struct inode *inode, ext4_lblk_t lblk,
+			    ext4_lblk_t len,
+			    struct pending_reservation **prealloc)
 {
 	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 	ext4_lblk_t end = lblk + len - 1;
 	ext4_lblk_t first, last;
 	bool f_del = false, l_del = false;
+	int ret = 0;
 
 	if (len == 0)
-		return;
+		return 0;
 
 	/*
 	 * Two cases - block range within single cluster and block range
@@ -2219,7 +2264,9 @@ static void __revise_pending(struct inode *inode, ext4_lblk_t lblk,
 			f_del = __es_scan_range(inode, &ext4_es_is_delonly,
 						first, lblk - 1);
 		if (f_del) {
-			__insert_pending(inode, first);
+			ret = __insert_pending(inode, first, prealloc);
+			if (ret < 0)
+				goto out;
 		} else {
 			last = EXT4_LBLK_CMASK(sbi, end) +
 			       sbi->s_cluster_ratio - 1;
@@ -2227,9 +2274,11 @@ static void __revise_pending(struct inode *inode, ext4_lblk_t lblk,
 			l_del = __es_scan_range(inode, &ext4_es_is_delonly,
 						end + 1, last);
-			if (l_del)
-				__insert_pending(inode, last);
-			else
+			if (l_del) {
+				ret = __insert_pending(inode, last, prealloc);
+				if (ret < 0)
+					goto out;
+			} else
 				__remove_pending(inode, last);
 		}
 	} else {
@@ -2237,18 +2286,24 @@ static void __revise_pending(struct inode *inode, ext4_lblk_t lblk,
 		if (first != lblk)
 			f_del = __es_scan_range(inode, &ext4_es_is_delonly,
 						first, lblk - 1);
-		if (f_del)
-			__insert_pending(inode, first);
-		else
+		if (f_del) {
+			ret = __insert_pending(inode, first, prealloc);
+			if (ret < 0)
+				goto out;
+		} else
 			__remove_pending(inode, first);
 
 		last = EXT4_LBLK_CMASK(sbi, end) +
 		       sbi->s_cluster_ratio - 1;
 		if (last != end)
 			l_del = __es_scan_range(inode, &ext4_es_is_delonly,
 						end + 1, last);
-		if (l_del)
-			__insert_pending(inode, last);
-		else
+		if (l_del) {
+			ret = __insert_pending(inode, last, prealloc);
+			if (ret < 0)
+				goto out;
+		} else
 			__remove_pending(inode, last);
 	}
+out:
+	return ret;
 }

From patchwork Thu Aug 24 09:26:06 2023
X-Patchwork-Submitter: Zhang Yi
X-Patchwork-Id: 1825265
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com,
    yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com
Subject: [RFC PATCH 03/16] ext4: let __revise_pending() return the number of newly inserted pendings
Date: Thu, 24 Aug 2023 17:26:06 +0800
Message-Id: <20230824092619.1327976-4-yi.zhang@huaweicloud.com>
In-Reply-To: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>
References: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>

From: Zhang Yi

Change __insert_pending() to return 1 on successful insertion of a new pending
cluster, and then change __revise_pending() to return the number of newly
inserted pendings.
Signed-off-by: Zhang Yi --- fs/ext4/extents_status.c | 26 +++++++++++++++++--------- 1 file changed, 17 insertions(+), 9 deletions(-) diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c index f4b50652f0cc..67ac09930541 100644 --- a/fs/ext4/extents_status.c +++ b/fs/ext4/extents_status.c @@ -892,7 +892,7 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk, es1 = __es_alloc_extent(true); if ((err1 || err2) && !es2) es2 = __es_alloc_extent(true); - if ((err1 || err2 || err3) && revise_pending && !pr) + if ((err1 || err2 || err3 < 0) && revise_pending && !pr) pr = __alloc_pending(true); write_lock(&EXT4_I(inode)->i_es_lock); @@ -920,7 +920,7 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk, if (revise_pending) { err3 = __revise_pending(inode, lblk, len, &pr); - if (err3 != 0) + if (err3 < 0) goto error; if (pr) { __free_pending(pr); @@ -929,7 +929,7 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk, } error: write_unlock(&EXT4_I(inode)->i_es_lock); - if (err1 || err2 || err3) + if (err1 || err2 || err3 < 0) goto retry; ext4_es_print_tree(inode); @@ -1935,7 +1935,7 @@ static struct pending_reservation *__get_pending(struct inode *inode, * @lblk - logical block in the cluster to be added * @prealloc - preallocated pending entry * - * Returns 0 on successful insertion and -ENOMEM on failure. If the + * Returns 1 on successful insertion and -ENOMEM on failure. If the * pending reservation is already in the set, returns successfully. 
*/ static int __insert_pending(struct inode *inode, ext4_lblk_t lblk, @@ -1979,6 +1979,7 @@ static int __insert_pending(struct inode *inode, ext4_lblk_t lblk, rb_link_node(&pr->rb_node, parent, p); rb_insert_color(&pr->rb_node, &tree->root); + ret = 1; out: return ret; @@ -2085,7 +2086,7 @@ void ext4_es_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk, es1 = __es_alloc_extent(true); if ((err1 || err2) && !es2) es2 = __es_alloc_extent(true); - if ((err1 || err2 || err3) && allocated && !pr) + if ((err1 || err2 || err3 < 0) && allocated && !pr) pr = __alloc_pending(true); write_lock(&EXT4_I(inode)->i_es_lock); @@ -2111,7 +2112,7 @@ void ext4_es_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk, if (allocated) { err3 = __insert_pending(inode, lblk, &pr); - if (err3 != 0) + if (err3 < 0) goto error; if (pr) { __free_pending(pr); @@ -2120,7 +2121,7 @@ void ext4_es_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk, } error: write_unlock(&EXT4_I(inode)->i_es_lock); - if (err1 || err2 || err3) + if (err1 || err2 || err3 < 0) goto retry; ext4_es_print_tree(inode); @@ -2230,7 +2231,9 @@ unsigned int ext4_es_delayed_clu(struct inode *inode, ext4_lblk_t lblk, * * Used after a newly allocated extent is added to the extents status tree. * Requires that the extents in the range have either written or unwritten - * status. Must be called while holding i_es_lock. + * status. Must be called while holding i_es_lock. Returns number of new + * inserts pending cluster on insert pendings, returns 0 on remove pendings, + * return -ENOMEM on failure. 
*/ static int __revise_pending(struct inode *inode, ext4_lblk_t lblk, ext4_lblk_t len, @@ -2240,6 +2243,7 @@ static int __revise_pending(struct inode *inode, ext4_lblk_t lblk, ext4_lblk_t end = lblk + len - 1; ext4_lblk_t first, last; bool f_del = false, l_del = false; + int pendings = 0; int ret = 0; if (len == 0) @@ -2267,6 +2271,7 @@ static int __revise_pending(struct inode *inode, ext4_lblk_t lblk, ret = __insert_pending(inode, first, prealloc); if (ret < 0) goto out; + pendings += ret; } else { last = EXT4_LBLK_CMASK(sbi, end) + sbi->s_cluster_ratio - 1; @@ -2278,6 +2283,7 @@ static int __revise_pending(struct inode *inode, ext4_lblk_t lblk, ret = __insert_pending(inode, last, prealloc); if (ret < 0) goto out; + pendings += ret; } else __remove_pending(inode, last); } @@ -2290,6 +2296,7 @@ static int __revise_pending(struct inode *inode, ext4_lblk_t lblk, ret = __insert_pending(inode, first, prealloc); if (ret < 0) goto out; + pendings += ret; } else __remove_pending(inode, first); @@ -2301,9 +2308,10 @@ static int __revise_pending(struct inode *inode, ext4_lblk_t lblk, ret = __insert_pending(inode, last, prealloc); if (ret < 0) goto out; + pendings += ret; } else __remove_pending(inode, last); } out: - return ret; + return (ret < 0) ? 
ret : pendings; }

From patchwork Thu Aug 24 09:26:07 2023
X-Patchwork-Submitter: Zhang Yi
X-Patchwork-Id: 1825271
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com,
    yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com
Subject: [RFC PATCH 04/16] ext4: count removed reserved blocks for delalloc only es entry
Date: Thu, 24 Aug 2023 17:26:07 +0800
Message-Id: <20230824092619.1327976-5-yi.zhang@huaweicloud.com>
In-Reply-To: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>
References: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>

From: Zhang Yi

Currently, __es_remove_extent() counts reserved clusters when the removed
es entry is delalloc only. That count equals the number of blocks when the
bigalloc feature is disabled, but the number of blocks cannot be derived
from it when bigalloc is enabled. So add a parameter that also counts the
number of reserved blocks. It is not used in this patch yet; it will be
used later to calculate reserved metadata blocks.
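The cluster/block distinction the commit message relies on can be shown with a small standalone model. This is illustrative only (the `toy_` names are not kernel code), and it assumes the whole removed range is delalloc-only; the real `__es_remove_extent()` only counts delayed-only clusters.

```c
#include <assert.h>

/*
 * Toy model of the bookkeeping described above; the field names mirror
 * the patch (ndelonly_clu/ndelonly_blk) but this is not the kernel
 * implementation.
 */
struct toy_rsvd_info {
	int ndelonly_clu;	/* reserved clusters in the removed range */
	int ndelonly_blk;	/* reserved blocks in the removed range */
};

/* Block-to-cluster conversion, in the spirit of EXT4_B2C(): a cluster
 * is a group of 2^cluster_bits blocks. */
static unsigned int toy_b2c(unsigned int block, unsigned int cluster_bits)
{
	return block >> cluster_bits;
}

/*
 * Count a removed delalloc-only range [lblk, lblk + len - 1].  With a
 * cluster ratio of 1 the two counters are equal; with bigalloc the
 * cluster count alone cannot be converted back into a block count,
 * which is why the patch has to carry both.
 */
static void toy_count_removed(struct toy_rsvd_info *ri, unsigned int lblk,
			      unsigned int len, unsigned int cluster_bits)
{
	ri->ndelonly_blk = (int)len;
	ri->ndelonly_clu = (int)(toy_b2c(lblk + len - 1, cluster_bits) -
				 toy_b2c(lblk, cluster_bits) + 1);
}
```

With `cluster_bits == 0` the counts coincide; with a 16-block cluster a 5-block range straddling a cluster boundary yields 5 blocks but only 2 clusters.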
Signed-off-by: Zhang Yi
---
 fs/ext4/extents_status.c | 40 ++++++++++++++++++++++++++++------------
 1 file changed, 28 insertions(+), 12 deletions(-)

diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index 67ac09930541..3a004ed04570 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -141,13 +141,18 @@
  *   -- Extent-level locking
  */
 
+struct rsvd_info {
+	int ndelonly_clu;	/* reserved clusters for delalloc es entry */
+	int ndelonly_blk;	/* reserved blocks for delalloc es entry */
+};
+
 static struct kmem_cache *ext4_es_cachep;
 static struct kmem_cache *ext4_pending_cachep;
 
 static int __es_insert_extent(struct inode *inode, struct extent_status *newes,
 			      struct extent_status *prealloc);
 static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
-			      ext4_lblk_t end, int *reserved,
+			      ext4_lblk_t end, struct rsvd_info *rinfo,
 			      struct extent_status *prealloc);
 static int es_reclaim_extents(struct ext4_inode_info *ei, int *nr_to_scan);
 static int __es_shrink(struct ext4_sb_info *sbi, int nr_to_scan,
@@ -1050,6 +1055,7 @@ int ext4_es_lookup_extent(struct inode *inode, ext4_lblk_t lblk,
 
 struct rsvd_count {
 	int ndelonly;
+	int ndelonly_blk;
 	bool first_do_lblk_found;
 	ext4_lblk_t first_do_lblk;
 	ext4_lblk_t last_do_lblk;
@@ -1076,6 +1082,7 @@ static void init_rsvd(struct inode *inode, ext4_lblk_t lblk,
 	struct rb_node *node;
 
 	rc->ndelonly = 0;
+	rc->ndelonly_blk = 0;
 
 	/*
 	 * for bigalloc, note the first delonly block in the range has not
@@ -1124,10 +1131,12 @@ static void count_rsvd(struct inode *inode, ext4_lblk_t lblk, long len,
 
 	if (sbi->s_cluster_ratio == 1) {
 		rc->ndelonly += (int) len;
+		rc->ndelonly_blk = rc->ndelonly;
 		return;
 	}
 
 	/* bigalloc */
+	rc->ndelonly_blk += (int)len;
 
 	i = (lblk < es->es_lblk) ? es->es_lblk : lblk;
 	end = lblk + (ext4_lblk_t) len - 1;
@@ -1355,16 +1364,17 @@ static unsigned int get_rsvd(struct inode *inode, ext4_lblk_t end,
  * @inode - file containing range
  * @lblk - first block in range
  * @end - last block in range
- * @reserved - number of cluster reservations released
+ * @rinfo - reserved information collected, includes number of
+ *          block/cluster reservations released
  * @prealloc - pre-allocated es to avoid memory allocation failures
  *
- * If @reserved is not NULL and delayed allocation is enabled, counts
+ * If @rinfo is not NULL and delayed allocation is enabled, counts
 * block/cluster reservations freed by removing range and if bigalloc
 * enabled cancels pending reservations as needed. Returns 0 on success,
 * error code on failure.
 */
 static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
-			      ext4_lblk_t end, int *reserved,
+			      ext4_lblk_t end, struct rsvd_info *rinfo,
 			      struct extent_status *prealloc)
 {
 	struct ext4_es_tree *tree = &EXT4_I(inode)->i_es_tree;
@@ -1374,11 +1384,15 @@ static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
 	ext4_lblk_t len1, len2;
 	ext4_fsblk_t block;
 	int err = 0;
-	bool count_reserved = true;
+	bool count_reserved = false;
 	struct rsvd_count rc;
 
-	if (reserved == NULL || !test_opt(inode->i_sb, DELALLOC))
-		count_reserved = false;
+	if (rinfo) {
+		rinfo->ndelonly_clu = 0;
+		rinfo->ndelonly_blk = 0;
+		if (test_opt(inode->i_sb, DELALLOC))
+			count_reserved = true;
+	}
 
 	es = __es_tree_search(&tree->root, lblk);
 	if (!es)
@@ -1476,8 +1490,10 @@ static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
 	}
 
 out_get_reserved:
-	if (count_reserved)
-		*reserved = get_rsvd(inode, end, es, &rc);
+	if (count_reserved) {
+		rinfo->ndelonly_clu = get_rsvd(inode, end, es, &rc);
+		rinfo->ndelonly_blk = rc.ndelonly_blk;
+	}
 out:
 	return err;
 }
@@ -1496,8 +1512,8 @@ void ext4_es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
 			   ext4_lblk_t len)
 {
 	ext4_lblk_t end;
+	struct rsvd_info rinfo;
 	int err = 0;
-	int reserved = 0;
 	struct extent_status *es = NULL;
 
 	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
@@ -1522,7 +1538,7 @@ void ext4_es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
 	 * is reclaimed.
 	 */
 	write_lock(&EXT4_I(inode)->i_es_lock);
-	err = __es_remove_extent(inode, lblk, end, &reserved, es);
+	err = __es_remove_extent(inode, lblk, end, &rinfo, es);
 	/* Free preallocated extent if it didn't get used. */
 	if (es) {
 		if (!es->es_len)
@@ -1534,7 +1550,7 @@ void ext4_es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
 		goto retry;
 
 	ext4_es_print_tree(inode);
-	ext4_da_release_space(inode, reserved);
+	ext4_da_release_space(inode, rinfo.ndelonly_clu);
 	return;
 }

From patchwork Thu Aug 24 09:26:08 2023
X-Patchwork-Submitter: Zhang Yi
X-Patchwork-Id: 1825266
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com,
    yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com
Subject: [RFC PATCH 05/16] ext4: pass real delayed status into ext4_es_insert_extent()
Date: Thu, 24 Aug 2023 17:26:08 +0800
Message-Id: <20230824092619.1327976-6-yi.zhang@huaweicloud.com>
In-Reply-To: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>
References: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>

From: Zhang Yi

Commit d2dc317d564a ("ext4: fix data corruption caused by unwritten and
delayed extents") fixed a data corruption issue by no longer passing the
delayed status into ext4_es_insert_extent() when the mapping range has
been written. Change it to pass the real delayed status again and handle
the 'delayed && written' case inside ext4_es_insert_extent(). If the
delayed bit is set in the status, it means that the delayed allocation
path is still running and that this insertion is not allocating delayed
allocated blocks.
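The 'delayed && written' fixup this patch moves into ext4_es_insert_extent() is a pure flag computation, which can be sketched on its own. The bit values below are illustrative, not the kernel's actual extent-status encoding:

```c
#include <assert.h>

/* Illustrative status bits, modeled after ext4's extent-status flags. */
#define TOY_STATUS_WRITTEN	0x1u
#define TOY_STATUS_DELAYED	0x8u

/*
 * Sketch of the policy described above: callers may pass the real
 * delayed status, and when the extent has also been written the
 * delayed flag is silently dropped instead of warning in the caller.
 */
static unsigned int toy_fixup_status(unsigned int status)
{
	if ((status & TOY_STATUS_DELAYED) && (status & TOY_STATUS_WRITTEN))
		status &= ~TOY_STATUS_DELAYED;
	return status;
}
```

A delayed-only or written-only status passes through unchanged; only the conflicting combination is rewritten.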
Signed-off-by: Zhang Yi
---
 fs/ext4/extents_status.c | 13 +++++++------
 fs/ext4/inode.c          |  2 --
 2 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index 3a004ed04570..62191c772b82 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -873,13 +873,14 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
 
 	BUG_ON(end < lblk);
 
+	/*
+	 * Insert extent as delayed and written which can potentially cause
+	 * data lose, and the extent has been written, it's safe to remove
+	 * the delayed flag even it's still delayed.
+	 */
 	if ((status & EXTENT_STATUS_DELAYED) &&
-	    (status & EXTENT_STATUS_WRITTEN)) {
-		ext4_warning(inode->i_sb, "Inserting extent [%u/%u] as "
-				" delayed and written which can potentially "
-				" cause data loss.", lblk, len);
-		WARN_ON(1);
-	}
+	    (status & EXTENT_STATUS_WRITTEN))
+		status &= ~EXTENT_STATUS_DELAYED;
 
 	newes.es_lblk = lblk;
 	newes.es_len = len;
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 6c490f05e2ba..82115d6656d3 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -563,7 +563,6 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
 		status = map->m_flags & EXT4_MAP_UNWRITTEN ?
 				EXTENT_STATUS_UNWRITTEN : EXTENT_STATUS_WRITTEN;
 		if (!(flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE) &&
-		    !(status & EXTENT_STATUS_WRITTEN) &&
 		    ext4_es_scan_range(inode, &ext4_es_is_delayed,
 				       map->m_lblk, map->m_lblk + map->m_len - 1))
 			status |= EXTENT_STATUS_DELAYED;
@@ -673,7 +672,6 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
 		status = map->m_flags & EXT4_MAP_UNWRITTEN ?
 				EXTENT_STATUS_UNWRITTEN : EXTENT_STATUS_WRITTEN;
 		if (!(flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE) &&
-		    !(status & EXTENT_STATUS_WRITTEN) &&
 		    ext4_es_scan_range(inode, &ext4_es_is_delayed,
 				       map->m_lblk, map->m_lblk + map->m_len - 1))
 			status |= EXTENT_STATUS_DELAYED;

From patchwork Thu Aug 24 09:26:09 2023
X-Patchwork-Submitter: Zhang Yi
X-Patchwork-Id: 1825268
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com,
    yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com
Subject: [RFC PATCH 06/16] ext4: move delalloc data reserve spcae updating into ext4_es_insert_extent()
Date: Thu, 24 Aug 2023 17:26:09 +0800
Message-Id: <20230824092619.1327976-7-yi.zhang@huaweicloud.com>
In-Reply-To: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>
References: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>

From: Zhang Yi

We update the data reserved space for delalloc after allocating new
blocks in ext4_{ind|ext}_map_blocks(). If the bigalloc feature is
enabled, we also need to query the extents_status tree to calculate the
exact reserved clusters. This becomes simpler if we move the update into
ext4_es_insert_extent(), just after dropping the delalloc extents_status
entry, because __es_remove_extent() has already done most of the work
and ext4_es_delayed_clu() can then be removed entirely.

One important thing to take care of is that if bigalloc is enabled, we
should update the data reserved count when first converting some of the
delayed-only es entries of a cluster that still has other delayed-only
entries left over.

   |      one cluster       |
   --------------------------------------------------------
   | da es 0 | .. | da es 1 | .. | da es 2 | .. | da es 3 |
   --------------------------------------------------------
        ^           ^
        |___________|  <- first allocating this delayed extent

The later allocations in that cluster will not be counted again. We can
achieve this by counting the newly inserted pending clusters.

Another important thing is quota claiming and the i_blocks count. If the
delayed allocation has been raced by another non-delayed allocation
(from fallocate, filemap, DIO, ...), we cannot claim quota as usual
because the racer has already done it. We can distinguish this case
easily by checking EXTENT_STATUS_DELAYED and the reserved-only block
count returned by __es_remove_extent(). If EXTENT_STATUS_DELAYED is set,
it always means that the allocation is not from the delayed allocation
path. On the contrary, we can only draw the opposite conclusion if
bigalloc is not enabled. If bigalloc is enabled, we could be raced by
another fallocate that is writing to other non-delayed areas of the same
cluster. In that case, EXTENT_STATUS_DELAYED is not set but we cannot
claim quota again.

   |       one cluster       |
   -------------------------------------------
   |             | delayed es |
   -------------------------------------------
    ^           ^
    | fallocate |

So we also need to check the counted reserved-only blocks: if it is
zero, the allocation is not from the delayed allocation path, and we
should release the reserved quota instead of claiming it.
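The claim-or-release decision described above can be condensed into a small predicate. This is a sketch under the simplifying assumption that only two inputs matter (the delayed status bit and the delayed-only block count from `__es_remove_extent()`); it mirrors the `!delayed && rinfo.ndelonly_blk` expression in the patch but is not kernel code:

```c
#include <stdbool.h>
#include <assert.h>

/*
 * Returns true when the reservation should be claimed as quota, false
 * when it should be released because a racing non-delayed allocation
 * (fallocate, filemap, DIO, ...) has already claimed it.
 */
static bool toy_should_claim_quota(bool delayed_status, int ndelonly_blk)
{
	/* delayed status set: the allocation is not from the delalloc path */
	if (delayed_status)
		return false;
	/* no delayed-only blocks were removed: e.g. a racing fallocate
	 * wrote the non-delayed part of the same bigalloc cluster */
	if (ndelonly_blk == 0)
		return false;
	return true;
}
```

Only the case "delayed bit clear and some delayed-only blocks removed" claims quota; both race scenarios from the commit message fall through to release.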
Signed-off-by: Zhang Yi
---
 fs/ext4/extents.c        |  37 -------------
 fs/ext4/extents_status.c | 115 +++++++++------------------------------
 fs/ext4/extents_status.h |   2 -
 fs/ext4/indirect.c       |   7 ---
 fs/ext4/inode.c          |   5 +-
 5 files changed, 30 insertions(+), 136 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index e4115d338f10..592383effe80 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4323,43 +4323,6 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
 		goto out;
 	}
 
-	/*
-	 * Reduce the reserved cluster count to reflect successful deferred
-	 * allocation of delayed allocated clusters or direct allocation of
-	 * clusters discovered to be delayed allocated. Once allocated, a
-	 * cluster is not included in the reserved count.
-	 */
-	if (test_opt(inode->i_sb, DELALLOC) && allocated_clusters) {
-		if (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE) {
-			/*
-			 * When allocating delayed allocated clusters, simply
-			 * reduce the reserved cluster count and claim quota
-			 */
-			ext4_da_update_reserve_space(inode, allocated_clusters,
-							1);
-		} else {
-			ext4_lblk_t lblk, len;
-			unsigned int n;
-
-			/*
-			 * When allocating non-delayed allocated clusters
-			 * (from fallocate, filemap, DIO, or clusters
-			 * allocated when delalloc has been disabled by
-			 * ext4_nonda_switch), reduce the reserved cluster
-			 * count by the number of allocated clusters that
-			 * have previously been delayed allocated. Quota
-			 * has been claimed by ext4_mb_new_blocks() above,
-			 * so release the quota reservations made for any
-			 * previously delayed allocated clusters.
-			 */
-			lblk = EXT4_LBLK_CMASK(sbi, map->m_lblk);
-			len = allocated_clusters << sbi->s_cluster_bits;
-			n = ext4_es_delayed_clu(inode, lblk, len);
-			if (n > 0)
-				ext4_da_update_reserve_space(inode, (int) n, 0);
-		}
-	}
-
 	/*
 	 * Cache the extent and update transaction to commit on fdatasync only
 	 * when it is _not_ an unwritten extent.
diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index 62191c772b82..34164c2827f2 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -856,11 +856,14 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
 	struct extent_status newes;
 	ext4_lblk_t end = lblk + len - 1;
 	int err1 = 0, err2 = 0, err3 = 0;
+	struct rsvd_info rinfo;
+	int pending = 0;
 	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 	struct extent_status *es1 = NULL;
 	struct extent_status *es2 = NULL;
 	struct pending_reservation *pr = NULL;
 	bool revise_pending = false;
+	bool delayed = false;
 
 	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
 		return;
@@ -878,6 +881,7 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
 	 * data lose, and the extent has been written, it's safe to remove
 	 * the delayed flag even it's still delayed.
 	 */
+	delayed = status & EXTENT_STATUS_DELAYED;
 	if ((status & EXTENT_STATUS_DELAYED) &&
 	    (status & EXTENT_STATUS_WRITTEN))
 		status &= ~EXTENT_STATUS_DELAYED;
@@ -902,7 +906,7 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
 		pr = __alloc_pending(true);
 
 	write_lock(&EXT4_I(inode)->i_es_lock);
-	err1 = __es_remove_extent(inode, lblk, end, NULL, es1);
+	err1 = __es_remove_extent(inode, lblk, end, &rinfo, es1);
 	if (err1 != 0)
 		goto error;
 	/* Free preallocated extent if it didn't get used. */
@@ -932,9 +936,30 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
 			__free_pending(pr);
 			pr = NULL;
 		}
+		/*
+		 * In the first partial allocating some delayed extents of
+		 * one cluster, we also need to count the data cluster when
+		 * allocating delay only extent entries.
+		 */
+		pending = err3;
 	}
 error:
 	write_unlock(&EXT4_I(inode)->i_es_lock);
+	/*
+	 * If EXTENT_STATUS_DELAYED is not set and delayed only blocks is
+	 * not zero, we are allocating delayed allocated clusters, simply
+	 * reduce the reserved cluster count and claim quota.
+	 *
+	 * Otherwise, we aren't allocating delayed allocated clusters
+	 * (from fallocate, filemap, DIO, or clusters allocated when
+	 * delalloc has been disabled by ext4_nonda_switch()), reduce the
+	 * reserved cluster count by the number of allocated clusters that
+	 * have previously been delayed allocated. Quota has been claimed
+	 * by ext4_mb_new_blocks(), so release the quota reservations made
+	 * for any previously delayed allocated clusters.
+	 */
+	ext4_da_update_reserve_space(inode, rinfo.ndelonly_clu + pending,
+				     !delayed && rinfo.ndelonly_blk);
 
 	if (err1 || err2 || err3 < 0)
 		goto retry;
@@ -2146,94 +2171,6 @@ void ext4_es_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk,
 	return;
 }
 
-/*
- * __es_delayed_clu - count number of clusters containing blocks that
- *                    are delayed only
- *
- * @inode - file containing block range
- * @start - logical block defining start of range
- * @end - logical block defining end of range
- *
- * Returns the number of clusters containing only delayed (not delayed
- * and unwritten) blocks in the range specified by @start and @end.  Any
- * cluster or part of a cluster within the range and containing a delayed
- * and not unwritten block within the range is counted as a whole cluster.
- */
-static unsigned int __es_delayed_clu(struct inode *inode, ext4_lblk_t start,
-				     ext4_lblk_t end)
-{
-	struct ext4_es_tree *tree = &EXT4_I(inode)->i_es_tree;
-	struct extent_status *es;
-	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
-	struct rb_node *node;
-	ext4_lblk_t first_lclu, last_lclu;
-	unsigned long long last_counted_lclu;
-	unsigned int n = 0;
-
-	/* guaranteed to be unequal to any ext4_lblk_t value */
-	last_counted_lclu = ~0ULL;
-
-	es = __es_tree_search(&tree->root, start);
-
-	while (es && (es->es_lblk <= end)) {
-		if (ext4_es_is_delonly(es)) {
-			if (es->es_lblk <= start)
-				first_lclu = EXT4_B2C(sbi, start);
-			else
-				first_lclu = EXT4_B2C(sbi, es->es_lblk);
-
-			if (ext4_es_end(es) >= end)
-				last_lclu = EXT4_B2C(sbi, end);
-			else
-				last_lclu = EXT4_B2C(sbi, ext4_es_end(es));
-
-			if (first_lclu == last_counted_lclu)
-				n += last_lclu - first_lclu;
-			else
-				n += last_lclu - first_lclu + 1;
-			last_counted_lclu = last_lclu;
-		}
-		node = rb_next(&es->rb_node);
-		if (!node)
-			break;
-		es = rb_entry(node, struct extent_status, rb_node);
-	}
-
-	return n;
-}
-
-/*
- * ext4_es_delayed_clu - count number of clusters containing blocks that
- *                       are both delayed and unwritten
- *
- * @inode - file containing block range
- * @lblk - logical block defining start of range
- * @len - number of blocks in range
- *
- * Locking for external use of __es_delayed_clu().
- */
-unsigned int ext4_es_delayed_clu(struct inode *inode, ext4_lblk_t lblk,
-				 ext4_lblk_t len)
-{
-	struct ext4_inode_info *ei = EXT4_I(inode);
-	ext4_lblk_t end;
-	unsigned int n;
-
-	if (len == 0)
-		return 0;
-
-	end = lblk + len - 1;
-	WARN_ON(end < lblk);
-
-	read_lock(&ei->i_es_lock);
-
-	n = __es_delayed_clu(inode, lblk, end);
-
-	read_unlock(&ei->i_es_lock);
-
-	return n;
-}
-
 /*
  * __revise_pending - makes, cancels, or leaves unchanged pending cluster
  *		      reservations for a specified block range depending
diff --git a/fs/ext4/extents_status.h b/fs/ext4/extents_status.h
index d9847a4a25db..7344667eb2cd 100644
--- a/fs/ext4/extents_status.h
+++ b/fs/ext4/extents_status.h
@@ -251,8 +251,6 @@ extern void ext4_remove_pending(struct inode *inode, ext4_lblk_t lblk);
 extern bool ext4_is_pending(struct inode *inode, ext4_lblk_t lblk);
 extern void ext4_es_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk,
 					 bool allocated);
-extern unsigned int ext4_es_delayed_clu(struct inode *inode, ext4_lblk_t lblk,
-					ext4_lblk_t len);
 extern void ext4_clear_inode_es(struct inode *inode);
 
 #endif /* _EXT4_EXTENTS_STATUS_H */
diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c
index a9f3716119d3..448401e02c55 100644
--- a/fs/ext4/indirect.c
+++ b/fs/ext4/indirect.c
@@ -652,13 +652,6 @@ int ext4_ind_map_blocks(handle_t *handle, struct inode *inode,
 	ext4_update_inode_fsync_trans(handle, inode, 1);
 	count = ar.len;
 
-	/*
-	 * Update reserved blocks/metadata blocks after successful block
-	 * allocation which had been deferred till now.
-	 */
-	if (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE)
-		ext4_da_update_reserve_space(inode, count, 1);
-
 got_it:
 	map->m_flags |= EXT4_MAP_MAPPED;
 	map->m_pblk = le32_to_cpu(chain[depth-1].key);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 82115d6656d3..546a3b09fd0a 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -330,11 +330,14 @@ qsize_t *ext4_get_reserved_space(struct inode *inode)
  * ext4_discard_preallocations() from here.
  */
 void ext4_da_update_reserve_space(struct inode *inode,
-					int used, int quota_claim)
+				  int used, int quota_claim)
 {
 	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 	struct ext4_inode_info *ei = EXT4_I(inode);
 
+	if (!used)
+		return;
+
 	spin_lock(&ei->i_block_reservation_lock);
 	trace_ext4_da_update_reserve_space(inode, used, quota_claim);
 	if (unlikely(used > ei->i_reserved_data_blocks)) {

From patchwork Thu Aug 24 09:26:10 2023
X-Patchwork-Submitter: Zhang Yi
X-Patchwork-Id: 1825264
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com,
    yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com
Subject: [RFC PATCH 07/16] ext4: count inode's total delalloc data blocks into ext4_es_tree
Date: Thu, 24 Aug 2023 17:26:10 +0800
Message-Id: <20230824092619.1327976-8-yi.zhang@huaweicloud.com>
In-Reply-To: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>
References: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>

From: Zhang Yi

Add a counter to struct ext4_es_tree to track each inode's total number
of delalloc data blocks. It will be used to calculate reserved metadata
blocks when creating a new delalloc extent entry, when mapping a
delalloc entry to a real one, and when releasing a delalloc entry.
Signed-off-by: Zhang Yi
---
 fs/ext4/extents_status.c | 19 +++++++++++++++++++
 fs/ext4/extents_status.h |  1 +
 2 files changed, 20 insertions(+)

diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index 34164c2827f2..b098c3316189 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -178,6 +178,7 @@ void ext4_es_init_tree(struct ext4_es_tree *tree)
 {
 	tree->root = RB_ROOT;
 	tree->cache_es = NULL;
+	tree->da_es_len = 0;
 }
 
 #ifdef ES_DEBUG__
@@ -787,6 +788,20 @@ static inline void ext4_es_insert_extent_check(struct inode *inode,
 }
 #endif
 
+/*
+ * Update total delay allocated extent length.
+ */
+static inline void ext4_es_update_da_block(struct inode *inode, long es_len)
+{
+	struct ext4_es_tree *tree = &EXT4_I(inode)->i_es_tree;
+
+	if (!es_len)
+		return;
+
+	tree->da_es_len += es_len;
+	es_debug("update da blocks %ld, to %u\n", es_len, tree->da_es_len);
+}
+
 static int __es_insert_extent(struct inode *inode, struct extent_status *newes,
 			      struct extent_status *prealloc)
 {
@@ -915,6 +930,7 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
 		__es_free_extent(es1);
 		es1 = NULL;
 	}
+	ext4_es_update_da_block(inode, -rinfo.ndelonly_blk);
 
 	err2 = __es_insert_extent(inode, &newes, es2);
 	if (err2 == -ENOMEM && !ext4_es_must_keep(&newes))
@@ -1571,6 +1587,7 @@ void ext4_es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
 		__es_free_extent(es);
 		es = NULL;
 	}
+	ext4_es_update_da_block(inode, -rinfo.ndelonly_blk);
 	write_unlock(&EXT4_I(inode)->i_es_lock);
 	if (err)
 		goto retry;
@@ -2161,6 +2178,8 @@ void ext4_es_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk,
 			pr = NULL;
 		}
 	}
+
+	ext4_es_update_da_block(inode, newes.es_len);
 error:
 	write_unlock(&EXT4_I(inode)->i_es_lock);
 	if (err1 || err2 || err3 < 0)

diff --git a/fs/ext4/extents_status.h b/fs/ext4/extents_status.h
index 7344667eb2cd..ee873b305210 100644
--- a/fs/ext4/extents_status.h
+++ b/fs/ext4/extents_status.h
@@ -66,6 +66,7 @@ struct extent_status {
 struct ext4_es_tree {
 	struct rb_root root;
 	struct extent_status *cache_es;	/* recently accessed extent */
+	ext4_lblk_t da_es_len;	/* total delalloc len */
 };
 
 struct ext4_es_stats {

From patchwork Thu Aug 24 09:26:11 2023
X-Patchwork-Submitter: Zhang Yi
X-Patchwork-Id: 1825263
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com
Subject: [RFC PATCH 08/16] ext4: refactor delalloc space reservation
Date: Thu, 24 Aug 2023 17:26:11 +0800
Message-Id: <20230824092619.1327976-9-yi.zhang@huaweicloud.com>
In-Reply-To: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>
References: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>

From: Zhang Yi

Clean up the delalloc space reservation code: split it out from the
bigalloc checks and call ext4_da_reserve_space() only when there are
unmapped blocks that need a reservation. No logical changes.

Signed-off-by: Zhang Yi
---
 fs/ext4/inode.c | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 546a3b09fd0a..861602903b4d 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1623,8 +1623,9 @@ static void ext4_print_free_blocks(struct inode *inode)
 static int ext4_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk)
 {
 	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
-	int ret;
+	unsigned int rsv_dlen = 1;
 	bool allocated = false;
+	int ret;
 
 	/*
 	 * If the cluster containing lblk is shared with a delayed,
@@ -1637,11 +1638,8 @@ static int ext4_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk)
 	 * it's necessary to examine the extent tree if a search of the
 	 * extents status tree doesn't get a match.
 	 */
-	if (sbi->s_cluster_ratio == 1) {
-		ret = ext4_da_reserve_space(inode);
-		if (ret != 0)		/* ENOSPC */
-			return ret;
-	} else {			/* bigalloc */
+	if (sbi->s_cluster_ratio > 1) {
+		rsv_dlen = 0;
 		if (!ext4_es_scan_clu(inode, &ext4_es_is_delonly, lblk)) {
 			if (!ext4_es_scan_clu(inode, &ext4_es_is_mapped, lblk)) {
@@ -1649,19 +1647,22 @@ static int ext4_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk)
 							EXT4_B2C(sbi, lblk));
 				if (ret < 0)
 					return ret;
-				if (ret == 0) {
-					ret = ext4_da_reserve_space(inode);
-					if (ret != 0)	/* ENOSPC */
-						return ret;
-				} else {
+				if (ret == 0)
+					rsv_dlen = 1;
+				else
 					allocated = true;
-				}
 			} else {
 				allocated = true;
 			}
 		}
 	}
 
+	if (rsv_dlen > 0) {
+		ret = ext4_da_reserve_space(inode);
+		if (ret)	/* ENOSPC */
+			return ret;
+	}
+
 	ext4_es_insert_delayed_block(inode, lblk, allocated);
 	return 0;
 }

From patchwork Thu Aug 24 09:26:12 2023
X-Patchwork-Submitter: Zhang Yi
X-Patchwork-Id: 1825272
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com
Subject: [RFC PATCH 09/16] ext4: count reserved metadata blocks for delalloc per inode
Date: Thu, 24 Aug 2023 17:26:12 +0800
Message-Id: <20230824092619.1327976-10-yi.zhang@huaweicloud.com>
In-Reply-To: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>
References: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>

From: Zhang Yi

Add a new counter, ei->i_reserved_ext_blocks, in preparation for
reserving metadata blocks for delalloc. It counts the inode's total
reserved metadata blocks, and its value should always be zero when the
inode is dying. Also update the corresponding tracepoints and debug
interface.
Signed-off-by: Zhang Yi
---
 fs/ext4/ext4.h              |  1 +
 fs/ext4/inode.c             |  2 ++
 fs/ext4/super.c             | 10 +++++++---
 include/trace/events/ext4.h | 25 +++++++++++++++++--------
 4 files changed, 27 insertions(+), 11 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 84618c46f239..ee2dbbde176e 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1104,6 +1104,7 @@ struct ext4_inode_info {
 	/* allocation reservation info for delalloc */
 	/* In case of bigalloc, this refer to clusters rather than blocks */
 	unsigned int i_reserved_data_blocks;
+	unsigned int i_reserved_ext_blocks;
 
 	/* pending cluster reservations for bigalloc file systems */
 	struct ext4_pending_tree i_pending_tree;

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 861602903b4d..dda17b3340ce 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1606,6 +1606,8 @@ static void ext4_print_free_blocks(struct inode *inode)
 	ext4_msg(sb, KERN_CRIT, "Block reservation details");
 	ext4_msg(sb, KERN_CRIT, "i_reserved_data_blocks=%u",
 		 ei->i_reserved_data_blocks);
+	ext4_msg(sb, KERN_CRIT, "i_reserved_ext_blocks=%u",
+		 ei->i_reserved_ext_blocks);
 	return;
 }

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index bb42525de8d0..7bc7c8c0ed71 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1436,6 +1436,7 @@ static struct inode *ext4_alloc_inode(struct super_block *sb)
 	ei->i_es_shk_nr = 0;
 	ei->i_es_shrink_lblk = 0;
 	ei->i_reserved_data_blocks = 0;
+	ei->i_reserved_ext_blocks = 0;
 	spin_lock_init(&(ei->i_block_reservation_lock));
 	ext4_init_pending_tree(&ei->i_pending_tree);
 #ifdef CONFIG_QUOTA
@@ -1487,11 +1488,14 @@ static void ext4_destroy_inode(struct inode *inode)
 		dump_stack();
 	}
 
-	if (EXT4_I(inode)->i_reserved_data_blocks)
+	if (EXT4_I(inode)->i_reserved_data_blocks ||
+	    EXT4_I(inode)->i_reserved_ext_blocks)
 		ext4_msg(inode->i_sb, KERN_ERR,
-			 "Inode %lu (%p): i_reserved_data_blocks (%u) not cleared!",
+			 "Inode %lu (%p): i_reserved_data_blocks (%u) or "
+			 "i_reserved_ext_blocks (%u) not cleared!",
 			 inode->i_ino, EXT4_I(inode),
-			 EXT4_I(inode)->i_reserved_data_blocks);
+			 EXT4_I(inode)->i_reserved_data_blocks,
+			 EXT4_I(inode)->i_reserved_ext_blocks);
 }
 
 static void ext4_shutdown(struct super_block *sb)

diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
index 65029dfb92fb..115f96f444ff 100644
--- a/include/trace/events/ext4.h
+++ b/include/trace/events/ext4.h
@@ -1224,6 +1224,7 @@ TRACE_EVENT(ext4_da_update_reserve_space,
 		__field(	__u64,	i_blocks		)
 		__field(	int,	used_blocks		)
 		__field(	int,	reserved_data_blocks	)
+		__field(	int,	reserved_ext_blocks	)
 		__field(	int,	quota_claim		)
 		__field(	__u16,	mode			)
 	),
@@ -1233,18 +1234,19 @@ TRACE_EVENT(ext4_da_update_reserve_space,
 		__entry->ino = inode->i_ino;
 		__entry->i_blocks = inode->i_blocks;
 		__entry->used_blocks = used_blocks;
-		__entry->reserved_data_blocks =
-				EXT4_I(inode)->i_reserved_data_blocks;
+		__entry->reserved_data_blocks = EXT4_I(inode)->i_reserved_data_blocks;
+		__entry->reserved_ext_blocks = EXT4_I(inode)->i_reserved_ext_blocks;
 		__entry->quota_claim = quota_claim;
 		__entry->mode = inode->i_mode;
 	),
 
 	TP_printk("dev %d,%d ino %lu mode 0%o i_blocks %llu used_blocks %d "
-		  "reserved_data_blocks %d quota_claim %d",
+		  "reserved_data_blocks %d reserved_ext_blocks %d quota_claim %d",
 		  MAJOR(__entry->dev), MINOR(__entry->dev),
 		  (unsigned long) __entry->ino, __entry->mode,
 		  __entry->i_blocks,
-		  __entry->used_blocks, __entry->reserved_data_blocks,
+		  __entry->used_blocks,
+		  __entry->reserved_data_blocks, __entry->reserved_ext_blocks,
 		  __entry->quota_claim)
 );
 
@@ -1258,6 +1260,7 @@ TRACE_EVENT(ext4_da_reserve_space,
 		__field(	ino_t,	ino			)
 		__field(	__u64,	i_blocks		)
 		__field(	int,	reserved_data_blocks	)
+		__field(	int,	reserved_ext_blocks	)
 		__field(	__u16,	mode			)
 	),
 
@@ -1266,15 +1269,17 @@ TRACE_EVENT(ext4_da_reserve_space,
 		__entry->ino = inode->i_ino;
 		__entry->i_blocks = inode->i_blocks;
 		__entry->reserved_data_blocks = EXT4_I(inode)->i_reserved_data_blocks;
+		__entry->reserved_ext_blocks = EXT4_I(inode)->i_reserved_ext_blocks;
 		__entry->mode = inode->i_mode;
 	),
 
 	TP_printk("dev %d,%d ino %lu mode 0%o i_blocks %llu "
-		  "reserved_data_blocks %d",
+		  "reserved_data_blocks %d reserved_ext_blocks %d",
 		  MAJOR(__entry->dev), MINOR(__entry->dev),
 		  (unsigned long) __entry->ino, __entry->mode,
 		  __entry->i_blocks,
-		  __entry->reserved_data_blocks)
+		  __entry->reserved_data_blocks,
+		  __entry->reserved_ext_blocks)
 );
 
 TRACE_EVENT(ext4_da_release_space,
@@ -1288,6 +1293,7 @@ TRACE_EVENT(ext4_da_release_space,
 		__field(	__u64,	i_blocks		)
 		__field(	int,	freed_blocks		)
 		__field(	int,	reserved_data_blocks	)
+		__field(	int,	reserved_ext_blocks	)
 		__field(	__u16,	mode			)
 	),
 
@@ -1297,15 +1303,18 @@ TRACE_EVENT(ext4_da_release_space,
 		__entry->i_blocks = inode->i_blocks;
 		__entry->freed_blocks = freed_blocks;
 		__entry->reserved_data_blocks = EXT4_I(inode)->i_reserved_data_blocks;
+		__entry->reserved_ext_blocks = EXT4_I(inode)->i_reserved_ext_blocks;
 		__entry->mode = inode->i_mode;
 	),
 
 	TP_printk("dev %d,%d ino %lu mode 0%o i_blocks %llu freed_blocks %d "
-		  "reserved_data_blocks %d",
+		  "reserved_data_blocks %d reserved_ext_blocks %d",
 		  MAJOR(__entry->dev), MINOR(__entry->dev),
 		  (unsigned long) __entry->ino, __entry->mode,
 		  __entry->i_blocks,
-		  __entry->freed_blocks, __entry->reserved_data_blocks)
+		  __entry->freed_blocks,
+		  __entry->reserved_data_blocks,
+		  __entry->reserved_ext_blocks)
 );
 
 DECLARE_EVENT_CLASS(ext4__bitmap_load,

From patchwork Thu Aug 24 09:26:13 2023
X-Patchwork-Submitter: Zhang Yi
X-Patchwork-Id: 1825258
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com
Subject: [RFC PATCH 10/16] ext4: reserve meta blocks in ext4_da_reserve_space()
Date: Thu, 24 Aug 2023 17:26:13 +0800
Message-Id: <20230824092619.1327976-11-yi.zhang@huaweicloud.com>
In-Reply-To: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>
References: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>
From: Zhang Yi

Prepare to reserve metadata blocks for delayed allocation in
ext4_da_reserve_space(): claim the reserved space from the global
sbi->s_dirtyclusters_counter, and update the tracepoints to show the
newly reserved metadata blocks. This patch is only a preparation; the
reserved ext_len is still always zero.

Signed-off-by: Zhang Yi
---
 fs/ext4/inode.c             | 28 ++++++++++++++++------------
 include/trace/events/ext4.h | 10 ++++++++--
 2 files changed, 24 insertions(+), 14 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index dda17b3340ce..a189009d20fa 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1439,31 +1439,37 @@ static int ext4_journalled_write_end(struct file *file,
 }
 
 /*
- * Reserve space for a single cluster
+ * Reserve space for a 'rsv_dlen' data blocks/clusters and 'rsv_extlen'
+ * extent metadata blocks.
  */
-static int ext4_da_reserve_space(struct inode *inode)
+static int ext4_da_reserve_space(struct inode *inode, unsigned int rsv_dlen,
+				 unsigned int rsv_extlen)
 {
 	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 	struct ext4_inode_info *ei = EXT4_I(inode);
 	int ret;
 
+	if (!rsv_dlen && !rsv_extlen)
+		return 0;
+
 	/*
 	 * We will charge metadata quota at writeout time; this saves
 	 * us from metadata over-estimation, though we may go over by
 	 * a small amount in the end.  Here we just reserve for data.
 	 */
-	ret = dquot_reserve_block(inode, EXT4_C2B(sbi, 1));
+	ret = dquot_reserve_block(inode, EXT4_C2B(sbi, rsv_dlen));
 	if (ret)
 		return ret;
 
 	spin_lock(&ei->i_block_reservation_lock);
-	if (ext4_claim_free_clusters(sbi, 1, 0)) {
+	if (ext4_claim_free_clusters(sbi, rsv_dlen + rsv_extlen, 0)) {
 		spin_unlock(&ei->i_block_reservation_lock);
-		dquot_release_reservation_block(inode, EXT4_C2B(sbi, 1));
+		dquot_release_reservation_block(inode, EXT4_C2B(sbi, rsv_dlen));
 		return -ENOSPC;
 	}
-	ei->i_reserved_data_blocks++;
-	trace_ext4_da_reserve_space(inode);
+	ei->i_reserved_data_blocks += rsv_dlen;
+	ei->i_reserved_ext_blocks += rsv_extlen;
+	trace_ext4_da_reserve_space(inode, rsv_dlen, rsv_extlen);
 	spin_unlock(&ei->i_block_reservation_lock);
 
 	return 0;	/* success */
@@ -1659,11 +1665,9 @@ static int ext4_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk)
 		}
 	}
 
-	if (rsv_dlen > 0) {
-		ret = ext4_da_reserve_space(inode);
-		if (ret)	/* ENOSPC */
-			return ret;
-	}
+	ret = ext4_da_reserve_space(inode, rsv_dlen, 0);
+	if (ret)	/* ENOSPC */
+		return ret;
 
 	ext4_es_insert_delayed_block(inode, lblk, allocated);
 	return 0;

diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
index 115f96f444ff..7a9839f2d681 100644
--- a/include/trace/events/ext4.h
+++ b/include/trace/events/ext4.h
@@ -1251,14 +1251,16 @@ TRACE_EVENT(ext4_da_update_reserve_space,
 );
 
 TRACE_EVENT(ext4_da_reserve_space,
-	TP_PROTO(struct inode *inode),
+	TP_PROTO(struct inode *inode, int data_blocks, int meta_blocks),
 
-	TP_ARGS(inode),
+	TP_ARGS(inode, data_blocks, meta_blocks),
 
 	TP_STRUCT__entry(
 		__field(	dev_t,	dev			)
 		__field(	ino_t,	ino			)
 		__field(	__u64,	i_blocks		)
+		__field(	int,	data_blocks		)
+		__field(	int,	meta_blocks		)
 		__field(	int,	reserved_data_blocks	)
 		__field(	int,	reserved_ext_blocks	)
 		__field(	__u16,	mode			)
@@ -1268,16 +1270,20 @@ TRACE_EVENT(ext4_da_reserve_space,
 		__entry->dev = inode->i_sb->s_dev;
 		__entry->ino = inode->i_ino;
 		__entry->i_blocks = inode->i_blocks;
+		__entry->data_blocks = data_blocks;
+		__entry->meta_blocks = meta_blocks;
 		__entry->reserved_data_blocks = EXT4_I(inode)->i_reserved_data_blocks;
 		__entry->reserved_ext_blocks = EXT4_I(inode)->i_reserved_ext_blocks;
 		__entry->mode = inode->i_mode;
 	),
 
 	TP_printk("dev %d,%d ino %lu mode 0%o i_blocks %llu "
+		  "data_blocks %d meta_blocks %d "
 		  "reserved_data_blocks %d reserved_ext_blocks %d",
 		  MAJOR(__entry->dev), MINOR(__entry->dev),
 		  (unsigned long) __entry->ino, __entry->mode,
 		  __entry->i_blocks,
+		  __entry->data_blocks, __entry->meta_blocks,
 		  __entry->reserved_data_blocks,
 		  __entry->reserved_ext_blocks)
 );

From patchwork Thu Aug 24 09:26:14 2023
X-Patchwork-Submitter: Zhang Yi
X-Patchwork-Id: 1825260
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com
Subject: [RFC PATCH 11/16] ext4: factor out common part of ext4_da_{release|update_reserve}_space()
Date: Thu, 24 Aug 2023 17:26:14 +0800
Message-Id: <20230824092619.1327976-12-yi.zhang@huaweicloud.com>
In-Reply-To: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>
References: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>

From: Zhang Yi

The reserved-blocks updating parts of ext4_da_release_space() and
ext4_da_update_reserve_space() are almost the same, so factor them out
into a common helper.
Signed-off-by: Zhang Yi
---
 fs/ext4/inode.c | 60 +++++++++++++++++++++----------------------
 1 file changed, 25 insertions(+), 35 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index a189009d20fa..13036cecbcc0 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -325,6 +325,27 @@ qsize_t *ext4_get_reserved_space(struct inode *inode)
 }
 #endif
 
+static void __ext4_da_update_reserve_space(const char *where,
+					   struct inode *inode,
+					   int data_len)
+{
+	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
+	struct ext4_inode_info *ei = EXT4_I(inode);
+
+	if (unlikely(data_len > ei->i_reserved_data_blocks)) {
+		ext4_warning(inode->i_sb, "%s: ino %lu, clear %d "
+			     "with only %d reserved data blocks",
+			     where, inode->i_ino, data_len,
+			     ei->i_reserved_data_blocks);
+		WARN_ON(1);
+		data_len = ei->i_reserved_data_blocks;
+	}
+
+	/* Update per-inode reservations */
+	ei->i_reserved_data_blocks -= data_len;
+	percpu_counter_sub(&sbi->s_dirtyclusters_counter, data_len);
+}
+
 /*
  * Called with i_data_sem down, which is important since we can call
  * ext4_discard_preallocations() from here.
@@ -340,19 +361,7 @@ void ext4_da_update_reserve_space(struct inode *inode,
 	spin_lock(&ei->i_block_reservation_lock);
 	trace_ext4_da_update_reserve_space(inode, used, quota_claim);
-	if (unlikely(used > ei->i_reserved_data_blocks)) {
-		ext4_warning(inode->i_sb, "%s: ino %lu, used %d "
-			     "with only %d reserved data blocks",
-			     __func__, inode->i_ino, used,
-			     ei->i_reserved_data_blocks);
-		WARN_ON(1);
-		used = ei->i_reserved_data_blocks;
-	}
-
-	/* Update per-inode reservations */
-	ei->i_reserved_data_blocks -= used;
-	percpu_counter_sub(&sbi->s_dirtyclusters_counter, used);
-
+	__ext4_da_update_reserve_space(__func__, inode, used);
 	spin_unlock(&ei->i_block_reservation_lock);
 
 	/* Update quota subsystem for data blocks */
@@ -1483,29 +1492,10 @@ void ext4_da_release_space(struct inode *inode, int to_free)
 	if (!to_free)
 		return;		/* Nothing to release, exit */
 
-	spin_lock(&EXT4_I(inode)->i_block_reservation_lock);
-
+	spin_lock(&ei->i_block_reservation_lock);
 	trace_ext4_da_release_space(inode, to_free);
-	if (unlikely(to_free > ei->i_reserved_data_blocks)) {
-		/*
-		 * if there aren't enough reserved blocks, then the
-		 * counter is messed up somewhere. Since this
-		 * function is called from invalidate page, it's
-		 * harmless to return without any action.
-		 */
-		ext4_warning(inode->i_sb, "ext4_da_release_space: "
-			     "ino %lu, to_free %d with only %d reserved "
-			     "data blocks", inode->i_ino, to_free,
-			     ei->i_reserved_data_blocks);
-		WARN_ON(1);
-		to_free = ei->i_reserved_data_blocks;
-	}
-	ei->i_reserved_data_blocks -= to_free;
-
-	/* update fs dirty data blocks counter */
-	percpu_counter_sub(&sbi->s_dirtyclusters_counter, to_free);
-
-	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
+	__ext4_da_update_reserve_space(__func__, inode, to_free);
+	spin_unlock(&ei->i_block_reservation_lock);
 
 	dquot_release_reservation_block(inode, EXT4_C2B(sbi, to_free));
 }

From patchwork Thu Aug 24 09:26:15 2023
X-Patchwork-Submitter: Zhang Yi
X-Patchwork-Id: 1825274
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com
Subject: [RFC PATCH 12/16] ext4: update reserved meta blocks in ext4_da_{release|update_reserve}_space()
Date: Thu, 24 Aug 2023 17:26:15 +0800
Message-Id: <20230824092619.1327976-13-yi.zhang@huaweicloud.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>
References: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>
X-Mailing-List: linux-ext4@vger.kernel.org

From: Zhang Yi

As in ext4_da_reserve_space(), we also need to update the reserved
metadata blocks when we release or convert a delalloc range in
ext4_da_release_space() and ext4_da_update_reserve_space(). So prepare
to reserve metadata blocks in these two functions as well; the
reservation logic is the same as for data blocks. This patch is only a
preparation: the reserved ext_len is always zero.
Signed-off-by: Zhang Yi
---
 fs/ext4/ext4.h              |  4 ++--
 fs/ext4/inode.c             | 47 +++++++++++++++++++++----------------
 include/trace/events/ext4.h | 28 ++++++++++++++--------
 3 files changed, 47 insertions(+), 32 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index ee2dbbde176e..3e0a39653469 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2998,9 +2998,9 @@ extern int ext4_zero_partial_blocks(handle_t *handle, struct inode *inode,
 extern vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf);
 extern qsize_t *ext4_get_reserved_space(struct inode *inode);
 extern int ext4_get_projid(struct inode *inode, kprojid_t *projid);
-extern void ext4_da_release_space(struct inode *inode, int to_free);
+extern void ext4_da_release_space(struct inode *inode, unsigned int data_len);
 extern void ext4_da_update_reserve_space(struct inode *inode,
-					 int used, int quota_claim);
+					 unsigned int data_len, int quota_claim);
 extern int ext4_issue_zeroout(struct inode *inode, ext4_lblk_t lblk,
 			      ext4_fsblk_t pblk, ext4_lblk_t len);

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 13036cecbcc0..38c47ce1333b 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -327,53 +327,59 @@ qsize_t *ext4_get_reserved_space(struct inode *inode)
 
 static void __ext4_da_update_reserve_space(const char *where,
 					   struct inode *inode,
-					   int data_len)
+					   unsigned int data_len, int ext_len)
 {
 	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 	struct ext4_inode_info *ei = EXT4_I(inode);
 
-	if (unlikely(data_len > ei->i_reserved_data_blocks)) {
-		ext4_warning(inode->i_sb, "%s: ino %lu, clear %d "
-			     "with only %d reserved data blocks",
-			     where, inode->i_ino, data_len,
-			     ei->i_reserved_data_blocks);
+	if (unlikely(data_len > ei->i_reserved_data_blocks ||
+		     ext_len > (long)ei->i_reserved_ext_blocks)) {
+		ext4_warning(inode->i_sb, "%s: ino %lu, clear %d,%d "
+			     "with only %d,%d reserved data blocks",
+			     where, inode->i_ino, data_len, ext_len,
+			     ei->i_reserved_data_blocks,
+			     ei->i_reserved_ext_blocks);
 		WARN_ON(1);
-		data_len = ei->i_reserved_data_blocks;
+		data_len = min(data_len, ei->i_reserved_data_blocks);
+		ext_len = min_t(unsigned int, ext_len, ei->i_reserved_ext_blocks);
 	}
 
 	/* Update per-inode reservations */
 	ei->i_reserved_data_blocks -= data_len;
-	percpu_counter_sub(&sbi->s_dirtyclusters_counter, data_len);
+	ei->i_reserved_ext_blocks -= ext_len;
+	percpu_counter_sub(&sbi->s_dirtyclusters_counter, (s64)data_len + ext_len);
 }
 
 /*
  * Called with i_data_sem down, which is important since we can call
  * ext4_discard_preallocations() from here.
  */
-void ext4_da_update_reserve_space(struct inode *inode,
-				  int used, int quota_claim)
+void ext4_da_update_reserve_space(struct inode *inode, unsigned int data_len,
+				  int quota_claim)
 {
 	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 	struct ext4_inode_info *ei = EXT4_I(inode);
+	int ext_len = 0;
 
-	if (!used)
+	if (!data_len)
 		return;
 
 	spin_lock(&ei->i_block_reservation_lock);
-	trace_ext4_da_update_reserve_space(inode, used, quota_claim);
-	__ext4_da_update_reserve_space(__func__, inode, used);
+	trace_ext4_da_update_reserve_space(inode, data_len, ext_len,
+					   quota_claim);
+	__ext4_da_update_reserve_space(__func__, inode, data_len, ext_len);
 	spin_unlock(&ei->i_block_reservation_lock);
 
 	/* Update quota subsystem for data blocks */
 	if (quota_claim)
-		dquot_claim_block(inode, EXT4_C2B(sbi, used));
+		dquot_claim_block(inode, EXT4_C2B(sbi, data_len));
 	else {
 		/*
 		 * We did fallocate with an offset that is already delayed
 		 * allocated. So on delayed allocated writeback we should
 		 * not re-claim the quota for fallocated blocks.
 		 */
-		dquot_release_reservation_block(inode, EXT4_C2B(sbi, used));
+		dquot_release_reservation_block(inode, EXT4_C2B(sbi, data_len));
 	}
 
 /*
@@ -1484,20 +1490,21 @@ static int ext4_da_reserve_space(struct inode *inode, unsigned int rsv_dlen,
 	return 0;	/* success */
 }
 
-void ext4_da_release_space(struct inode *inode, int to_free)
+void ext4_da_release_space(struct inode *inode, unsigned int data_len)
 {
 	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 	struct ext4_inode_info *ei = EXT4_I(inode);
+	int ext_len = 0;
 
-	if (!to_free)
+	if (!data_len)
 		return;		/* Nothing to release, exit */
 
 	spin_lock(&ei->i_block_reservation_lock);
-	trace_ext4_da_release_space(inode, to_free);
-	__ext4_da_update_reserve_space(__func__, inode, to_free);
+	trace_ext4_da_release_space(inode, data_len, ext_len);
+	__ext4_da_update_reserve_space(__func__, inode, data_len, ext_len);
 	spin_unlock(&ei->i_block_reservation_lock);
 
-	dquot_release_reservation_block(inode, EXT4_C2B(sbi, to_free));
+	dquot_release_reservation_block(inode, EXT4_C2B(sbi, data_len));
 }
 
 /*

diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
index 7a9839f2d681..e1e9d7ead20f 100644
--- a/include/trace/events/ext4.h
+++ b/include/trace/events/ext4.h
@@ -1214,15 +1214,19 @@ TRACE_EVENT(ext4_forget,
 );
 
 TRACE_EVENT(ext4_da_update_reserve_space,
-	TP_PROTO(struct inode *inode, int used_blocks, int quota_claim),
+	TP_PROTO(struct inode *inode,
+		 int data_blocks,
+		 int meta_blocks,
+		 int quota_claim),
 
-	TP_ARGS(inode, used_blocks, quota_claim),
+	TP_ARGS(inode, data_blocks, meta_blocks, quota_claim),
 
 	TP_STRUCT__entry(
 		__field( dev_t, dev )
 		__field( ino_t, ino )
 		__field( __u64, i_blocks )
-		__field( int, used_blocks )
+		__field( int, data_blocks )
+		__field( int, meta_blocks )
 		__field( int, reserved_data_blocks )
 		__field( int, reserved_ext_blocks )
 		__field( int, quota_claim )
@@ -1233,19 +1237,20 @@ TRACE_EVENT(ext4_da_update_reserve_space,
 		__entry->dev = inode->i_sb->s_dev;
 		__entry->ino = inode->i_ino;
 		__entry->i_blocks = inode->i_blocks;
-		__entry->used_blocks = used_blocks;
+		__entry->data_blocks = data_blocks;
+		__entry->meta_blocks = meta_blocks;
 		__entry->reserved_data_blocks = EXT4_I(inode)->i_reserved_data_blocks;
 		__entry->reserved_ext_blocks = EXT4_I(inode)->i_reserved_ext_blocks;
 		__entry->quota_claim = quota_claim;
 		__entry->mode = inode->i_mode;
 	),

-	TP_printk("dev %d,%d ino %lu mode 0%o i_blocks %llu used_blocks %d "
+	TP_printk("dev %d,%d ino %lu mode 0%o i_blocks %llu data_blocks %d meta_blocks %d "
 		  "reserved_data_blocks %d reserved_ext_blocks %d quota_claim %d",
 		  MAJOR(__entry->dev), MINOR(__entry->dev),
 		  (unsigned long) __entry->ino,
 		  __entry->mode, __entry->i_blocks,
-		  __entry->used_blocks,
+		  __entry->data_blocks, __entry->meta_blocks,
 		  __entry->reserved_data_blocks, __entry->reserved_ext_blocks,
 		  __entry->quota_claim)
 );

@@ -1289,15 +1294,16 @@ TRACE_EVENT(ext4_da_reserve_space,
 );
 
 TRACE_EVENT(ext4_da_release_space,
-	TP_PROTO(struct inode *inode, int freed_blocks),
+	TP_PROTO(struct inode *inode, int freed_blocks, int meta_blocks),
 
-	TP_ARGS(inode, freed_blocks),
+	TP_ARGS(inode, freed_blocks, meta_blocks),
 
 	TP_STRUCT__entry(
 		__field( dev_t, dev )
 		__field( ino_t, ino )
 		__field( __u64, i_blocks )
 		__field( int, freed_blocks )
+		__field( int, meta_blocks )
 		__field( int, reserved_data_blocks )
 		__field( int, reserved_ext_blocks )
 		__field( __u16, mode )
@@ -1308,17 +1314,19 @@ TRACE_EVENT(ext4_da_release_space,
 		__entry->ino = inode->i_ino;
 		__entry->i_blocks = inode->i_blocks;
 		__entry->freed_blocks = freed_blocks;
+		__entry->meta_blocks = meta_blocks;
 		__entry->reserved_data_blocks = EXT4_I(inode)->i_reserved_data_blocks;
 		__entry->reserved_ext_blocks = EXT4_I(inode)->i_reserved_ext_blocks;
 		__entry->mode = inode->i_mode;
 	),

-	TP_printk("dev %d,%d ino %lu mode 0%o i_blocks %llu freed_blocks %d "
+	TP_printk("dev %d,%d ino %lu mode 0%o i_blocks %llu "
+		  "freed_blocks %d meta_blocks %d "
 		  "reserved_data_blocks %d reserved_ext_blocks %d",
 		  MAJOR(__entry->dev),
 		  MINOR(__entry->dev),
 		  (unsigned long) __entry->ino,
 		  __entry->mode, __entry->i_blocks,
-		  __entry->freed_blocks,
+		  __entry->freed_blocks, __entry->meta_blocks,
 		  __entry->reserved_data_blocks, __entry->reserved_ext_blocks)
 );

From patchwork Thu Aug 24 09:26:16 2023
X-Patchwork-Submitter: Zhang Yi
X-Patchwork-Id: 1825262
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com
Subject: [RFC PATCH 13/16] ext4: calculate the worst extent blocks needed of a delalloc es entry
Date: Thu, 24 Aug 2023 17:26:16 +0800
Message-Id: <20230824092619.1327976-14-yi.zhang@huaweicloud.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>
References: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>
X-Mailing-List: linux-ext4@vger.kernel.org

From: Zhang Yi

Add a new helper to calculate the worst-case number of extent tree
blocks needed while mapping a new delalloc extent_status entry. In the
worst case, each delayed data block consumes one extent entry, so the
worst-case cost is 'leaf blocks + index blocks + (max depth - depth
increasing costs)'. The detailed calculation formula is:

         / DIV_ROUND_UP(da_blocks, ext_per_block);  (i = 0)
  f(i) =
         \ DIV_ROUND_UP(f(i-1), idx_per_block);     (0 < i < max_depth)

  SUM = f(0) + .. + f(n) + max_depth - n - 1;  (0 <= n < max_depth, f(n) > 0)

For example, with the default 4k block size, both ext_per_block and
idx_per_block default to 340.
(1) If we map 50 blocks, the worst-case extent block count is
    DIV_ROUND_UP(50, 340) + EXT4_MAX_EXTENT_DEPTH - 1 = 5;
(2) if we map 500 blocks, it is
    DIV_ROUND_UP(500, 340) + DIV_ROUND_UP(DIV_ROUND_UP(500, 340), 340)
    + EXT4_MAX_EXTENT_DEPTH - 2 = 6;
and so on.

This is a preparation for reserving meta blocks for delalloc.
Signed-off-by: Zhang Yi
---
 fs/ext4/ext4.h    |  2 ++
 fs/ext4/extents.c | 28 ++++++++++++++++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 3e0a39653469..11813382fbcc 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3699,6 +3699,8 @@ extern int ext4_swap_extents(handle_t *handle, struct inode *inode1,
 				ext4_lblk_t lblk2, ext4_lblk_t count,
 				int mark_unwritten,int *err);
 extern int ext4_clu_mapped(struct inode *inode, ext4_lblk_t lclu);
+extern unsigned int ext4_map_worst_ext_blocks(struct inode *inode,
+					      unsigned int len);
 extern int ext4_datasem_ensure_credits(handle_t *handle, struct inode *inode,
 				       int check_cred, int restart_cred,
 				       int revoke_cred);

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 592383effe80..43c251a42144 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -5797,6 +5797,34 @@ int ext4_clu_mapped(struct inode *inode, ext4_lblk_t lclu)
 	return err ? err : mapped;
 }
 
+/*
+ * Calculate the worst case of extents blocks needed while mapping 'len'
+ * data blocks.
+ */
+unsigned int ext4_map_worst_ext_blocks(struct inode *inode, unsigned int len)
+{
+	unsigned int ext_blocks = 0;
+	int max_entries;
+	int depth, max_depth;
+
+	if (!len)
+		return 0;
+
+	max_entries = ext4_ext_space_block(inode, 0);
+	max_depth = EXT4_MAX_EXTENT_DEPTH;
+
+	for (depth = 0; depth < max_depth; depth++) {
+		len = DIV_ROUND_UP(len, max_entries);
+		ext_blocks += len;
+		if (len == 1)
+			break;
+		if (depth == 0)
+			max_entries = ext4_ext_space_block_idx(inode, 0);
+	}
+
+	return ext_blocks + max_depth - depth - 1;
+}
+
 /*
  * Updates physical block address and unwritten status of extent
  * starting at lblk start and of len.
 * If such an extent doesn't exist,

From patchwork Thu Aug 24 09:26:17 2023
X-Patchwork-Submitter: Zhang Yi
X-Patchwork-Id: 1825267
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com
Subject: [RFC PATCH 14/16] ext4: reserve extent blocks for delalloc
Date: Thu, 24 Aug 2023 17:26:17 +0800
Message-Id: <20230824092619.1327976-15-yi.zhang@huaweicloud.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>
References: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>
X-Mailing-List: linux-ext4@vger.kernel.org

From: Zhang Yi

Now ext4 only reserves data blocks for delalloc in
ext4_da_reserve_space(), and switches to nodelalloc mode if the free
blocks are less than 150% of the dirty blocks or below the watermark.
In addition, commit 27dd43854227 ("ext4: introduce reserved space")
reserves some of the file system space (2% or 4096 clusters, whichever
is smaller). Both try to ensure that space is not exhausted while
mapping delalloc entries, but neither can guarantee it: under highly
concurrent writes, ext4_nonda_switch() does not work because it only
reads the counter on the current CPU, and the reserved clusters can
also be exhausted easily. So we can end up in an infinite loop in
ext4_do_writepages(); consider the case where only one free block is
left while we want to allocate a data block plus a new extent block in
ext4_writepages().
ext4_do_writepages()                 // <-- 1
  mpage_map_and_submit_extent()
    mpage_map_one_extent()
      ext4_map_blocks()
        ext4_ext_map_blocks()
          ext4_mb_new_blocks()       // allocate the last free block
          ext4_ext_insert_extent()   // allocation failed
            ext4_free_blocks()       // free the data block just allocated
            return -ENOSPC;
  ext4_count_free_clusters()         // is true
  return -ENOSPC;                    // --> goto 1, infinite loop

What's more, it can also lead to data loss and trigger the error
messages below:

 EXT4-fs (sda): delayed block allocation failed for inode X at logical
                offset X with max blocks X with error -28
 EXT4-fs (sda): This should not happen!! Data will be lost

The best solution is to calculate and reserve the extent blocks
(metadata blocks) that could be allocated when mapping a delalloc es
entry. The reservation is tricky because it depends on the continuity
of the physical blocks. An effective way is to reserve for the worst
case, which assumes every block is discontiguous and costs one extent
entry; ext4_map_worst_ext_blocks() does this calculation. We have
already counted the total delayed data blocks in the ext4_es_tree, so
we can use it to calculate the worst-case metadata blocks that should
be reserved and save the result in the prepared
ei->i_reserved_ext_blocks; once the delalloc entry is mapped,
recalculate it and release the unused reservation.
Signed-off-by: Zhang Yi
---
 fs/ext4/ext4.h              |  6 ++++--
 fs/ext4/extents_status.c    | 29 +++++++++++++++++++-------
 fs/ext4/inode.c             | 41 ++++++++++++++++++++++++++++---------
 include/trace/events/ext4.h | 25 +++++++++++++++-------
 4 files changed, 75 insertions(+), 26 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 11813382fbcc..67b12f9ffc50 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2998,9 +2998,11 @@ extern int ext4_zero_partial_blocks(handle_t *handle, struct inode *inode,
 extern vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf);
 extern qsize_t *ext4_get_reserved_space(struct inode *inode);
 extern int ext4_get_projid(struct inode *inode, kprojid_t *projid);
-extern void ext4_da_release_space(struct inode *inode, unsigned int data_len);
+extern void ext4_da_release_space(struct inode *inode, unsigned int data_len,
+				  unsigned int total_da_len, long da_len);
 extern void ext4_da_update_reserve_space(struct inode *inode,
-					 unsigned int data_len, int quota_claim);
+					 unsigned int data_len, unsigned int total_da_len,
+					 long da_len, int quota_claim);
 extern int ext4_issue_zeroout(struct inode *inode, ext4_lblk_t lblk,
 			      ext4_fsblk_t pblk, ext4_lblk_t len);

diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index b098c3316189..8e0dec27f967 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -789,17 +789,20 @@ static inline void ext4_es_insert_extent_check(struct inode *inode,
 #endif
 
 /*
- * Update total delay allocated extent length.
+ * Update and return total delay allocated extent length.
  */
-static inline void ext4_es_update_da_block(struct inode *inode, long es_len)
+static inline unsigned int ext4_es_update_da_block(struct inode *inode,
+						   long es_len)
 {
 	struct ext4_es_tree *tree = &EXT4_I(inode)->i_es_tree;
 
 	if (!es_len)
-		return;
+		goto out;
 
 	tree->da_es_len += es_len;
 	es_debug("update da blocks %ld, to %u\n", es_len, tree->da_es_len);
+out:
+	return tree->da_es_len;
 }
 
 static int __es_insert_extent(struct inode *inode, struct extent_status *newes,
@@ -870,6 +873,7 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
 {
 	struct extent_status newes;
 	ext4_lblk_t end = lblk + len - 1;
+	ext4_lblk_t da_blocks = 0;
 	int err1 = 0, err2 = 0, err3 = 0;
 	struct rsvd_info rinfo;
 	int pending = 0;
@@ -930,7 +934,7 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
 		__es_free_extent(es1);
 		es1 = NULL;
 	}
-	ext4_es_update_da_block(inode, -rinfo.ndelonly_blk);
+	da_blocks = ext4_es_update_da_block(inode, -rinfo.ndelonly_blk);
 
 	err2 = __es_insert_extent(inode, &newes, es2);
 	if (err2 == -ENOMEM && !ext4_es_must_keep(&newes))
@@ -975,6 +979,7 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
 	 * for any previously delayed allocated clusters.
 	 */
 	ext4_da_update_reserve_space(inode, rinfo.ndelonly_clu + pending,
+				     da_blocks, -rinfo.ndelonly_blk,
 				     !delayed && rinfo.ndelonly_blk);
 	if (err1 || err2 || err3 < 0)
 		goto retry;
@@ -1554,6 +1559,7 @@ void ext4_es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
 			   ext4_lblk_t len)
 {
 	ext4_lblk_t end;
+	ext4_lblk_t da_blocks = 0;
 	struct rsvd_info rinfo;
 	int err = 0;
 	struct extent_status *es = NULL;
@@ -1587,13 +1593,14 @@ void ext4_es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
 		__es_free_extent(es);
 		es = NULL;
 	}
-	ext4_es_update_da_block(inode, -rinfo.ndelonly_blk);
+	da_blocks = ext4_es_update_da_block(inode, -rinfo.ndelonly_blk);
 	write_unlock(&EXT4_I(inode)->i_es_lock);
 	if (err)
 		goto retry;
 
 	ext4_es_print_tree(inode);
-	ext4_da_release_space(inode, rinfo.ndelonly_clu);
+	ext4_da_release_space(inode, rinfo.ndelonly_clu, da_blocks,
+			      -rinfo.ndelonly_blk);
 	return;
 }
 
@@ -2122,6 +2129,7 @@ void ext4_es_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk,
 				 bool allocated)
 {
 	struct extent_status newes;
+	ext4_lblk_t da_blocks;
 	int err1 = 0, err2 = 0, err3 = 0;
 	struct extent_status *es1 = NULL;
 	struct extent_status *es2 = NULL;
@@ -2179,12 +2187,19 @@ void ext4_es_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk,
 		}
 	}
 
-	ext4_es_update_da_block(inode, newes.es_len);
+	da_blocks = ext4_es_update_da_block(inode, newes.es_len);
 error:
 	write_unlock(&EXT4_I(inode)->i_es_lock);
 	if (err1 || err2 || err3 < 0)
 		goto retry;
 
+	/*
+	 * New reserved meta space has been claimed for a single newly added
+	 * delayed block in ext4_da_reserve_space(), but most of the reserved
+	 * count of meta blocks could be merged, so recalculate it according
+ */ + ext4_da_update_reserve_space(inode, 0, da_blocks, newes.es_len, 0); ext4_es_print_tree(inode); ext4_print_pending_tree(inode); return; diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 38c47ce1333b..d714bf2e4171 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -332,6 +332,9 @@ static void __ext4_da_update_reserve_space(const char *where, struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb); struct ext4_inode_info *ei = EXT4_I(inode); + if (!data_len && !ext_len) + return; + if (unlikely(data_len > ei->i_reserved_data_blocks || ext_len > (long)ei->i_reserved_ext_blocks)) { ext4_warning(inode->i_sb, "%s: ino %lu, clear %d,%d " @@ -355,21 +358,30 @@ static void __ext4_da_update_reserve_space(const char *where, * ext4_discard_preallocations() from here. */ void ext4_da_update_reserve_space(struct inode *inode, unsigned int data_len, + unsigned int total_da_len, long da_len, int quota_claim) { struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb); struct ext4_inode_info *ei = EXT4_I(inode); - int ext_len = 0; + unsigned int new_ext_len; + int ext_len; - if (!data_len) + if (!data_len && !da_len) return; + if (da_len) + new_ext_len = ext4_map_worst_ext_blocks(inode, total_da_len); + spin_lock(&ei->i_block_reservation_lock); - trace_ext4_da_update_reserve_space(inode, data_len, ext_len, - quota_claim); + ext_len = da_len ? 
ei->i_reserved_ext_blocks - new_ext_len : 0; + trace_ext4_da_update_reserve_space(inode, data_len, total_da_len, + ext_len, quota_claim); __ext4_da_update_reserve_space(__func__, inode, data_len, ext_len); spin_unlock(&ei->i_block_reservation_lock); + if (!data_len) + return; + /* Update quota subsystem for data blocks */ if (quota_claim) dquot_claim_block(inode, EXT4_C2B(sbi, data_len)); @@ -1490,21 +1502,28 @@ static int ext4_da_reserve_space(struct inode *inode, unsigned int rsv_dlen, return 0; /* success */ } -void ext4_da_release_space(struct inode *inode, unsigned int data_len) +void ext4_da_release_space(struct inode *inode, unsigned int data_len, + unsigned int total_da_len, long da_len) { struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb); struct ext4_inode_info *ei = EXT4_I(inode); - int ext_len = 0; + unsigned int new_ext_len; + int ext_len; - if (!data_len) + if (!data_len && !da_len) return; /* Nothing to release, exit */ + if (da_len) + new_ext_len = ext4_map_worst_ext_blocks(inode, total_da_len); + spin_lock(&ei->i_block_reservation_lock); - trace_ext4_da_release_space(inode, data_len, ext_len); + ext_len = da_len ? 
(ei->i_reserved_ext_blocks - new_ext_len) : 0; + trace_ext4_da_release_space(inode, data_len, total_da_len, ext_len); __ext4_da_update_reserve_space(__func__, inode, data_len, ext_len); spin_unlock(&ei->i_block_reservation_lock); - dquot_release_reservation_block(inode, EXT4_C2B(sbi, data_len)); + if (data_len) + dquot_release_reservation_block(inode, EXT4_C2B(sbi, data_len)); } /* @@ -1629,6 +1648,7 @@ static int ext4_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk) { struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb); unsigned int rsv_dlen = 1; + unsigned int rsv_extlen; bool allocated = false; int ret; @@ -1662,7 +1682,8 @@ static int ext4_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk) } } - ret = ext4_da_reserve_space(inode, rsv_dlen, 0); + rsv_extlen = ext4_map_worst_ext_blocks(inode, 1); + ret = ext4_da_reserve_space(inode, rsv_dlen, rsv_extlen); if (ret) /* ENOSPC */ return ret; diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h index e1e9d7ead20f..6916b1c5dff6 100644 --- a/include/trace/events/ext4.h +++ b/include/trace/events/ext4.h @@ -1216,16 +1216,18 @@ TRACE_EVENT(ext4_forget, TRACE_EVENT(ext4_da_update_reserve_space, TP_PROTO(struct inode *inode, int data_blocks, + unsigned int total_da_blocks, int meta_blocks, int quota_claim), - TP_ARGS(inode, data_blocks, meta_blocks, quota_claim), + TP_ARGS(inode, data_blocks, total_da_blocks, meta_blocks, quota_claim), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( __u64, i_blocks ) __field( int, data_blocks ) + __field( unsigned int, total_da_blocks ) __field( int, meta_blocks ) __field( int, reserved_data_blocks ) __field( int, reserved_ext_blocks ) @@ -1238,6 +1240,7 @@ TRACE_EVENT(ext4_da_update_reserve_space, __entry->ino = inode->i_ino; __entry->i_blocks = inode->i_blocks; __entry->data_blocks = data_blocks; + __entry->total_da_blocks = total_da_blocks; __entry->meta_blocks = meta_blocks; __entry->reserved_data_blocks = 
EXT4_I(inode)->i_reserved_data_blocks; __entry->reserved_ext_blocks = EXT4_I(inode)->i_reserved_ext_blocks; @@ -1245,12 +1248,14 @@ TRACE_EVENT(ext4_da_update_reserve_space, __entry->mode = inode->i_mode; ), - TP_printk("dev %d,%d ino %lu mode 0%o i_blocks %llu data_blocks %d meta_blocks %d " + TP_printk("dev %d,%d ino %lu mode 0%o i_blocks %llu " + "data_blocks %d total_da_blocks %u meta_blocks %d " "reserved_data_blocks %d reserved_ext_blocks %d quota_claim %d", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->mode, __entry->i_blocks, - __entry->data_blocks, __entry->meta_blocks, + __entry->data_blocks, + __entry->total_da_blocks, __entry->meta_blocks, __entry->reserved_data_blocks, __entry->reserved_ext_blocks, __entry->quota_claim) ); @@ -1294,15 +1299,19 @@ TRACE_EVENT(ext4_da_reserve_space, ); TRACE_EVENT(ext4_da_release_space, - TP_PROTO(struct inode *inode, int freed_blocks, int meta_blocks), + TP_PROTO(struct inode *inode, + int freed_blocks, + unsigned int total_da_blocks, + int meta_blocks), - TP_ARGS(inode, freed_blocks, meta_blocks), + TP_ARGS(inode, freed_blocks, total_da_blocks, meta_blocks), TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( __u64, i_blocks ) __field( int, freed_blocks ) + __field( unsigned int, total_da_blocks ) __field( int, meta_blocks ) __field( int, reserved_data_blocks ) __field( int, reserved_ext_blocks ) @@ -1314,6 +1323,7 @@ TRACE_EVENT(ext4_da_release_space, __entry->ino = inode->i_ino; __entry->i_blocks = inode->i_blocks; __entry->freed_blocks = freed_blocks; + __entry->total_da_blocks = total_da_blocks; __entry->meta_blocks = meta_blocks; __entry->reserved_data_blocks = EXT4_I(inode)->i_reserved_data_blocks; __entry->reserved_ext_blocks = EXT4_I(inode)->i_reserved_ext_blocks; @@ -1321,12 +1331,13 @@ TRACE_EVENT(ext4_da_release_space, ), TP_printk("dev %d,%d ino %lu mode 0%o i_blocks %llu " - "freed_blocks %d meta_blocks %d" + "freed_blocks %d total_da_blocks %u, 
meta_blocks %d" "reserved_data_blocks %d reserved_ext_blocks %d", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long) __entry->ino, __entry->mode, __entry->i_blocks, - __entry->freed_blocks, __entry->meta_blocks, + __entry->freed_blocks, + __entry->total_da_blocks, __entry->meta_blocks, __entry->reserved_data_blocks, __entry->reserved_ext_blocks) );

From patchwork Thu Aug 24 09:26:18 2023
X-Patchwork-Submitter: Zhang Yi
X-Patchwork-Id: 1825261
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com
Subject: [RFC PATCH 15/16] ext4: flush delalloc blocks if no free space
Date: Thu, 24 Aug 2023 17:26:18 +0800
Message-Id: <20230824092619.1327976-16-yi.zhang@huaweicloud.com>
In-Reply-To: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>
References: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>
X-Mailing-List: linux-ext4@vger.kernel.org

From: Zhang Yi

For delalloc, the reserved metadata block count is calculated for the worst case, so the reservation can be larger than is actually needed, which can lead to a false-positive -ENOSPC when claiming free space. So start a worker to flush delalloc blocks in ext4_should_retry_alloc(). If s_dirtyclusters_counter is non-zero, there may be some delalloc metadata block reservations that could be freed.
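The decision logic in the new ext4_flush_da_blocks() is a two-step counter check: a cheap, possibly-stale fast read first, then an exact (expensive) sum only when the fast read says zero. A userspace toy model of that pattern is sketched below; `toy_percpu_counter` and `should_flush` are hypothetical stand-ins, with `approx` playing the role of percpu_counter_read_positive() and `exact` playing percpu_counter_sum().

```c
#include <assert.h>

/*
 * Toy model of the counter check in ext4_flush_da_blocks(): the common
 * "clearly dirty" case pays only the cheap approximate read, and the
 * precise walk is done only to confirm an apparent zero. Names and
 * fields here are illustrative, not kernel API.
 */
struct toy_percpu_counter {
	long approx; /* cheap read, may be stale or drifted low */
	long exact;  /* precise sum across all CPUs, expensive */
};

/* Return 1 if there are delalloc blocks worth flushing, 0 otherwise. */
static int should_flush(const struct toy_percpu_counter *dirty)
{
	if (dirty->approx > 0)
		return 1; /* fast path: counter is clearly non-zero */
	return dirty->exact > 0; /* slow path: confirm the zero reading */
}
```

In the patch itself this check gates queueing s_da_flush_work and waiting for it with flush_work(), so writers retrying an allocation only block on writeback when there really is reserved delalloc space to reclaim.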
Signed-off-by: Zhang Yi --- fs/ext4/balloc.c | 47 +++++++++++++++++++++++++++++++++++++++++------ fs/ext4/ext4.h | 5 +++++ fs/ext4/super.c | 12 ++++++++++++ 3 files changed, 58 insertions(+), 6 deletions(-) diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c index 79b20d6ae39e..e8acc21ef56d 100644 --- a/fs/ext4/balloc.c +++ b/fs/ext4/balloc.c @@ -667,6 +667,30 @@ int ext4_claim_free_clusters(struct ext4_sb_info *sbi, return -ENOSPC; } +void ext4_writeback_da_blocks(struct work_struct *work) +{ + struct ext4_sb_info *sbi = container_of(work, struct ext4_sb_info, + s_da_flush_work); + + try_to_writeback_inodes_sb(sbi->s_sb, WB_REASON_FS_FREE_SPACE); +} + +/* + * Writeback delallocated blocks and try to free unused reserved extent + * blocks, return 0 if no delalloc blocks need to writeback, 1 otherwise. + */ +static int ext4_flush_da_blocks(struct ext4_sb_info *sbi) +{ + if (!percpu_counter_read_positive(&sbi->s_dirtyclusters_counter) && + !percpu_counter_sum(&sbi->s_dirtyclusters_counter)) + return 0; + + if (!work_busy(&sbi->s_da_flush_work)) + queue_work(sbi->s_da_flush_wq, &sbi->s_da_flush_work); + flush_work(&sbi->s_da_flush_work); + return 1; +} + /** * ext4_should_retry_alloc() - check if a block allocation should be retried * @sb: superblock @@ -681,15 +705,22 @@ int ext4_claim_free_clusters(struct ext4_sb_info *sbi, int ext4_should_retry_alloc(struct super_block *sb, int *retries) { struct ext4_sb_info *sbi = EXT4_SB(sb); - - if (!sbi->s_journal) - return 0; + int result = 0; if (++(*retries) > 3) { percpu_counter_inc(&sbi->s_sra_exceeded_retry_limit); return 0; } + /* + * Flush allocated delalloc blocks and try to free unused + * reserved extent blocks. 
+ */ + if (test_opt(sb, DELALLOC)) + result += ext4_flush_da_blocks(sbi); + + if (!sbi->s_journal) + goto out; /* * if there's no indication that blocks are about to be freed it's * possible we just missed a transaction commit that did so @@ -701,16 +732,20 @@ int ext4_should_retry_alloc(struct super_block *sb, int *retries) flush_work(&sbi->s_discard_work); atomic_dec(&sbi->s_retry_alloc_pending); } - return ext4_has_free_clusters(sbi, 1, 0); + result += ext4_has_free_clusters(sbi, 1, 0); + goto out; } /* * it's possible we've just missed a transaction commit here, * so ignore the returned status */ - ext4_debug("%s: retrying operation after ENOSPC\n", sb->s_id); + result += 1; (void) jbd2_journal_force_commit_nested(sbi->s_journal); - return 1; +out: + if (result) + ext4_debug("%s: retrying operation after ENOSPC\n", sb->s_id); + return result; } /* diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index 67b12f9ffc50..6f4259ea6751 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -1627,6 +1627,10 @@ struct ext4_sb_info { /* workqueue for reserved extent conversions (buffered io) */ struct workqueue_struct *rsv_conversion_wq; + /* workqueue for delalloc buffer IO flushing */ + struct workqueue_struct *s_da_flush_wq; + struct work_struct s_da_flush_work; + /* timer for periodic error stats printing */ struct timer_list s_err_report; @@ -2716,6 +2720,7 @@ extern int ext4_wait_block_bitmap(struct super_block *sb, struct buffer_head *bh); extern struct buffer_head *ext4_read_block_bitmap(struct super_block *sb, ext4_group_t block_group); +extern void ext4_writeback_da_blocks(struct work_struct *work); extern unsigned ext4_free_clusters_after_init(struct super_block *sb, ext4_group_t block_group, struct ext4_group_desc *gdp); diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 7bc7c8c0ed71..6f50975ba42e 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -1335,6 +1335,8 @@ static void ext4_put_super(struct super_block *sb) flush_work(&sbi->s_sb_upd_work); 
destroy_workqueue(sbi->rsv_conversion_wq); + flush_work(&sbi->s_da_flush_work); + destroy_workqueue(sbi->s_da_flush_wq); ext4_release_orphan_info(sb); if (sbi->s_journal) { @@ -5491,6 +5493,14 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb) goto failed_mount4; } + INIT_WORK(&sbi->s_da_flush_work, ext4_writeback_da_blocks); + sbi->s_da_flush_wq = alloc_workqueue("ext4_delalloc_flush", WQ_UNBOUND, 1); + if (!sbi->s_da_flush_wq) { + printk(KERN_ERR "EXT4-fs: failed to create workqueue\n"); + err = -ENOMEM; + goto failed_mount4; + } + /* * The jbd2_journal_load will have done any necessary log recovery, * so we can safely mount the rest of the filesystem now. @@ -5660,6 +5670,8 @@ failed_mount9: __maybe_unused sb->s_root = NULL; failed_mount4: ext4_msg(sb, KERN_ERR, "mount failed"); + if (sbi->s_da_flush_wq) + destroy_workqueue(sbi->s_da_flush_wq); if (EXT4_SB(sb)->rsv_conversion_wq) destroy_workqueue(EXT4_SB(sb)->rsv_conversion_wq); failed_mount_wq:

From patchwork Thu Aug 24 09:26:19 2023
X-Patchwork-Submitter: Zhang Yi
X-Patchwork-Id: 1825259
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com
Subject: [RFC PATCH 16/16] ext4: drop ext4_nonda_switch()
Date: Thu, 24 Aug 2023 17:26:19 +0800
Message-Id: <20230824092619.1327976-17-yi.zhang@huaweicloud.com>
In-Reply-To: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>
References: <20230824092619.1327976-1-yi.zhang@huaweicloud.com>
X-Mailing-List: linux-ext4@vger.kernel.org

From: Zhang Yi

Now that we reserve enough metadata blocks for delalloc, ext4_nonda_switch() can be dropped. It is safe to stay in delalloc mode for buffered writes even when dirty space is high and free space is low, because we can ensure that metadata block allocation always succeeds while mapping
delalloc entries in ext4_writepages(). Signed-off-by: Zhang Yi --- fs/ext4/extents_status.c | 9 ++++----- fs/ext4/inode.c | 39 ++------------------------------------- 2 files changed, 6 insertions(+), 42 deletions(-) diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c index 8e0dec27f967..954c6e49182e 100644 --- a/fs/ext4/extents_status.c +++ b/fs/ext4/extents_status.c @@ -971,11 +971,10 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk, * reduce the reserved cluster count and claim quota. * * Otherwise, we aren't allocating delayed allocated clusters - * (from fallocate, filemap, DIO, or clusters allocated when - * delalloc has been disabled by ext4_nonda_switch()), reduce the - * reserved cluster count by the number of allocated clusters that - * have previously been delayed allocated. Quota has been claimed - * by ext4_mb_new_blocks(), so release the quota reservations made + * (from fallocate, filemap, DIO), reduce the reserved cluster + * count by the number of allocated clusters that have previously + * been delayed allocated. Quota has been claimed by + * ext4_mb_new_blocks(), so release the quota reservations made * for any previously delayed allocated clusters. */ ext4_da_update_reserve_space(inode, rinfo.ndelonly_clu + pending, diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index d714bf2e4171..0a76c99ea8c6 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -2838,40 +2838,6 @@ static int ext4_dax_writepages(struct address_space *mapping, return ret; } -static int ext4_nonda_switch(struct super_block *sb) -{ - s64 free_clusters, dirty_clusters; - struct ext4_sb_info *sbi = EXT4_SB(sb); - - /* - * switch to non delalloc mode if we are running low - * on free block. The free block accounting via percpu - * counters can get slightly wrong with percpu_counter_batch getting - * accumulated on each CPU without updating global counters - * Delalloc need an accurate free block accounting. 
So switch - * to non delalloc when we are near to error range. - */ - free_clusters = - percpu_counter_read_positive(&sbi->s_freeclusters_counter); - dirty_clusters = - percpu_counter_read_positive(&sbi->s_dirtyclusters_counter); - /* - * Start pushing delalloc when 1/2 of free blocks are dirty. - */ - if (dirty_clusters && (free_clusters < 2 * dirty_clusters)) - try_to_writeback_inodes_sb(sb, WB_REASON_FS_FREE_SPACE); - - if (2 * free_clusters < 3 * dirty_clusters || - free_clusters < (dirty_clusters + EXT4_FREECLUSTERS_WATERMARK)) { - /* - * free block count is less than 150% of dirty blocks - * or free blocks is less than watermark - */ - return 1; - } - return 0; -} - static int ext4_da_write_begin(struct file *file, struct address_space *mapping, loff_t pos, unsigned len, struct page **pagep, void **fsdata) @@ -2886,7 +2852,7 @@ static int ext4_da_write_begin(struct file *file, struct address_space *mapping, index = pos >> PAGE_SHIFT; - if (ext4_nonda_switch(inode->i_sb) || ext4_verity_in_progress(inode)) { + if (ext4_verity_in_progress(inode)) { *fsdata = (void *)FALL_BACK_TO_NONDELALLOC; return ext4_write_begin(file, mapping, pos, len, pagep, fsdata); @@ -6117,8 +6083,7 @@ vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf) goto retry_alloc; /* Delalloc case is easy... */ - if (test_opt(inode->i_sb, DELALLOC) && - !ext4_nonda_switch(inode->i_sb)) { + if (test_opt(inode->i_sb, DELALLOC)) { do { err = block_page_mkwrite(vma, vmf, ext4_da_get_block_prep);