{"id":2226014,"url":"http://patchwork.ozlabs.org/api/1.2/patches/2226014/?format=json","web_url":"http://patchwork.ozlabs.org/project/linux-ext4/patch/20260422021042.4157510-9-yi.zhang@huaweicloud.com/","project":{"id":8,"url":"http://patchwork.ozlabs.org/api/1.2/projects/8/?format=json","name":"Linux ext4 filesystem development","link_name":"linux-ext4","list_id":"linux-ext4.vger.kernel.org","list_email":"linux-ext4@vger.kernel.org","web_url":null,"scm_url":null,"webscm_url":null,"list_archive_url":"","list_archive_url_format":"","commit_url_format":""},"msgid":"<20260422021042.4157510-9-yi.zhang@huaweicloud.com>","list_archive_url":null,"date":"2026-04-22T02:10:28","name":"[v3,08/22] ext4: implement buffered write path using iomap","commit_ref":null,"pull_url":null,"state":"new","archived":false,"hash":"8567bd7aae206da17f6470cda99566e4a0815a0a","submitter":{"id":85428,"url":"http://patchwork.ozlabs.org/api/1.2/people/85428/?format=json","name":"Zhang Yi","email":"yi.zhang@huaweicloud.com"},"delegate":null,"mbox":"http://patchwork.ozlabs.org/project/linux-ext4/patch/20260422021042.4157510-9-yi.zhang@huaweicloud.com/mbox/","series":[{"id":500911,"url":"http://patchwork.ozlabs.org/api/1.2/series/500911/?format=json","web_url":"http://patchwork.ozlabs.org/project/linux-ext4/list/?series=500911","date":"2026-04-22T02:10:23","name":"ext4: use iomap for regular file's buffered I/O path","version":3,"mbox":"http://patchwork.ozlabs.org/series/500911/mbox/"}],"comments":"http://patchwork.ozlabs.org/api/patches/2226014/comments/","check":"pending","checks":"http://patchwork.ozlabs.org/api/patches/2226014/checks/","tags":{},"related":[],"headers":{"Return-Path":"\n <SRS0=Tjbi=CV=vger.kernel.org=linux-ext4+bounces-15973-patchwork-incoming=ozlabs.org@ozlabs.org>","X-Original-To":["incoming@patchwork.ozlabs.org","linux-ext4@vger.kernel.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","patchwork-incoming@ozlabs.org"],"Authentication-Results":["legolas.ozlabs.org;\n 
spf=pass (sender SPF authorized) smtp.mailfrom=ozlabs.org\n (client-ip=2404:9400:2221:ea00::3; helo=mail.ozlabs.org;\n envelope-from=srs0=tjbi=cv=vger.kernel.org=linux-ext4+bounces-15973-patchwork-incoming=ozlabs.org@ozlabs.org;\n receiver=patchwork.ozlabs.org)","gandalf.ozlabs.org;\n arc=pass smtp.remote-ip=\"2600:3c0a:e001:db::12fc:5321\"\n arc.chain=subspace.kernel.org","gandalf.ozlabs.org;\n dmarc=none (p=none dis=none) header.from=huaweicloud.com","gandalf.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org\n (client-ip=2600:3c0a:e001:db::12fc:5321; helo=sea.lore.kernel.org;\n envelope-from=linux-ext4+bounces-15973-patchwork-incoming=ozlabs.org@vger.kernel.org;\n receiver=ozlabs.org)","smtp.subspace.kernel.org;\n arc=none smtp.client-ip=45.249.212.56","smtp.subspace.kernel.org;\n dmarc=none (p=none dis=none) header.from=huaweicloud.com","smtp.subspace.kernel.org;\n spf=pass smtp.mailfrom=huaweicloud.com"],"Received":["from mail.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange x25519)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4g0jfC6s5qz1yD5\n\tfor <incoming@patchwork.ozlabs.org>; Wed, 22 Apr 2026 12:21:55 +1000 (AEST)","from mail.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3])\n\tby gandalf.ozlabs.org (Postfix) with ESMTP id 4g0jfC6Q3kz4w1l\n\tfor <incoming@patchwork.ozlabs.org>; Wed, 22 Apr 2026 12:21:55 +1000 (AEST)","by gandalf.ozlabs.org (Postfix)\n\tid 4g0jfC6MT4z4wKJ; Wed, 22 Apr 2026 12:21:55 +1000 (AEST)","from sea.lore.kernel.org (sea.lore.kernel.org\n [IPv6:2600:3c0a:e001:db::12fc:5321])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange x25519)\n\t(No client certificate requested)\n\tby gandalf.ozlabs.org (Postfix) with ESMTPS id 4g0jf83wCCz4w1l\n\tfor <patchwork-incoming@ozlabs.org>; Wed, 22 Apr 2026 12:21:52 +1000 (AEST)","from 
smtp.subspace.kernel.org (conduit.subspace.kernel.org\n [100.90.174.1])\n\tby sea.lore.kernel.org (Postfix) with ESMTP id B6F04308BF02\n\tfor <patchwork-incoming@ozlabs.org>; Wed, 22 Apr 2026 02:17:21 +0000 (UTC)","from localhost.localdomain (localhost.localdomain [127.0.0.1])\n\tby smtp.subspace.kernel.org (Postfix) with ESMTP id 62E0034B19A;\n\tWed, 22 Apr 2026 02:17:03 +0000 (UTC)","from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com\n [45.249.212.56])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby smtp.subspace.kernel.org (Postfix) with ESMTPS id 80792310620;\n\tWed, 22 Apr 2026 02:17:00 +0000 (UTC)","from mail.maildlp.com (unknown [172.19.163.198])\n\tby dggsgout12.his.huawei.com (SkyGuard) with ESMTPS id 4g0jX421tVzKHMQS;\n\tWed, 22 Apr 2026 10:16:36 +0800 (CST)","from mail02.huawei.com (unknown [10.116.40.252])\n\tby mail.maildlp.com (Postfix) with ESMTP id 5152240604;\n\tWed, 22 Apr 2026 10:16:56 +0800 (CST)","from huaweicloud.com (unknown [10.50.85.155])\n\tby APP3 (Coremail) with SMTP id _Ch0CgB3JL6PL+hpqkgUBQ--.2635S12;\n\tWed, 22 Apr 2026 10:16:56 +0800 (CST)"],"ARC-Seal":["i=2; a=rsa-sha256; d=ozlabs.org; s=201707; t=1776824515; cv=pass;\n\tb=ae4fFrc5WHbgBm8qDOznf5QP0UM5W7q482KLvMZstErsiWTkRp7oGB8Xx2Ah3UYYidBp7guT70xlmBFWMNirHwOvhnKOZU1LI8CTyzipHl2vRUa6KFy2OXlv1/xI1VdoKATn9Zqo76xSfQ9f7pLVfw1WhMrse2rIG8k4cBhvN0y3KLv5WXRkHeFoB22PpFzglt/NGtcKc06dOFoeKLEXA0CFKjM4LGNAsgLxJfowH7qSZh1MIW0whrHoPWMN6dM08Hi/MSkGfK3KpYFurj7Yn71aK+eja5+1Ub+vHxdOIAFI2vcSkuRQVhv4UYvtmL3q2HHi+6TfQRaQwzPv/KXauQ==","i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;\n\tt=1776824223; cv=none;\n b=lAC1zK0zRPASPDnnT+SOmIsv0h/TPSD82ReeYrtTHaKGubETvpPoHxt7OUMxILa7b3vN88EOMPHUMthLhwA/jqaCD25HWpk7K4y15hIzC37N2t33AuFI439XxRXSfmRdVSCkEasLEAUrknmUU3FJ5yh2g4D+n6OzlDSLlvAn3Cg="],"ARC-Message-Signature":["i=2; a=rsa-sha256; d=ozlabs.org; s=201707;\n\tt=1776824515; 
c=relaxed/relaxed;\n\tbh=w2xhrKLUmGbG8aHF/Y+DgKM8K8wc77JY0lXWHG+f2pE=;\n\th=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:\n\t MIME-Version;\n b=jpj5Ab/8hoh7yAcIeE2qnMLO3bJ2zKgt0q0YbWD8WZQ5mg+t9p0uCr2Brsjp2y6fZGX53R9+9nMd77IUTDxtAROrp3N03OjpFC0+YpEe/rGLDFaVTpHuBBrIa4k9iJvIUAovkaOIcg27lkjoj7HmeQvzHlO4SKZ9BGuh4+xTo6rUUhLZ73zDThWbl/5cKrqp2DbbnSU0LXzuM1avV8rVfp2tMDjaLIASJi08kW2bVQON79gGDNtgZHRDgqy1wj4pA3W9xLITE+8Ni2I9WUMO8a7q57AEsaWsitWTasirVx+VuoAR1vcfCVYCGNDQhoTYF63cVWbERn/7jQzuSp2Dmg==","i=1; a=rsa-sha256; d=subspace.kernel.org;\n\ts=arc-20240116; t=1776824223; c=relaxed/simple;\n\tbh=6HObb2LAWjwIuKdbRXu5Ab0/G5LsBqg7PpPg8bLypfg=;\n\th=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:\n\t MIME-Version;\n b=uu/KeUSFUfg4O/RUZdGbTHqr/H7J6ZNm8xt9MVTzWPP/7RSmjcv1NBS2k9RUGVqWduCM/LjTtmG3rvkFP3W8DzrESpFsljdSu99E87A4LPtn9AdRBj7Xv+BWOWyEsj2+AbD2WwhkZeMDBCoy0MTRIzlNO0WR1ZNWHB4Zkq+k/b4="],"ARC-Authentication-Results":["i=2; gandalf.ozlabs.org;\n dmarc=none (p=none dis=none) header.from=huaweicloud.com;\n spf=pass (client-ip=2600:3c0a:e001:db::12fc:5321; helo=sea.lore.kernel.org;\n envelope-from=linux-ext4+bounces-15973-patchwork-incoming=ozlabs.org@vger.kernel.org;\n receiver=ozlabs.org) smtp.mailfrom=vger.kernel.org","i=1; smtp.subspace.kernel.org;\n dmarc=none (p=none dis=none) header.from=huaweicloud.com;\n spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.56"],"From":"Zhang Yi <yi.zhang@huaweicloud.com>","To":"linux-ext4@vger.kernel.org,\n\tlinux-fsdevel@vger.kernel.org","Cc":"linux-kernel@vger.kernel.org,\n\ttytso@mit.edu,\n\tadilger.kernel@dilger.ca,\n\tlibaokun@linux.alibaba.com,\n\tjack@suse.cz,\n\tojaswin@linux.ibm.com,\n\tritesh.list@gmail.com,\n\tdjwong@kernel.org,\n\thch@infradead.org,\n\tyi.zhang@huawei.com,\n\tyi.zhang@huaweicloud.com,\n\tyizhang089@gmail.com,\n\tyangerkun@huawei.com,\n\tyukuai@fnnas.com","Subject":"[PATCH v3 08/22] ext4: implement buffered write path using iomap","Date":"Wed, 22 Apr 2026 10:10:28 
+0800","Message-ID":"<20260422021042.4157510-9-yi.zhang@huaweicloud.com>","X-Mailer":"git-send-email 2.52.0","In-Reply-To":"<20260422021042.4157510-1-yi.zhang@huaweicloud.com>","References":"<20260422021042.4157510-1-yi.zhang@huaweicloud.com>","Precedence":"bulk","X-Mailing-List":"linux-ext4@vger.kernel.org","List-Id":"<linux-ext4.vger.kernel.org>","List-Subscribe":"<mailto:linux-ext4+subscribe@vger.kernel.org>","List-Unsubscribe":"<mailto:linux-ext4+unsubscribe@vger.kernel.org>","MIME-Version":"1.0","Content-Transfer-Encoding":"8bit","X-CM-TRANSID":"_Ch0CgB3JL6PL+hpqkgUBQ--.2635S12","X-Coremail-Antispam":"1UD129KBjvJXoW3Aw1kJFWrAw1fWFW7WF18Grg_yoWkuF4kpF\n\t90kry5GFsrXr97uF4ftF47Zr1F93WxtrW7CrW3Wrn8XryqyrWIqF40gFyayF15trZ7Cr4j\n\tqF4Ykry8Wr4UCrDanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2\n\t9KBjDU0xBIdaVrnRJUUUmS14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0\n\trVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI\n\tkIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2\n\tz4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F\n\t4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq\n\t3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7\n\tIYx2IY67AKxVWUGVWUXwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U\n\tM4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2\n\tkIc2xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE\n\tbVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67\n\tAF67kF1VAFwI0_GFv_WrylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI\n\t42IY6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF\n\t4lIxAIcVC2z280aVAFwI0_Jr0_Gr1lIxAIcVC2z280aVCY1x0267AKxVW8Jr0_Cr1UYxBI\n\tdaVFxhVjvjDU0xZFpf9x0JUWMKtUUUUU=","X-CM-SenderInfo":"d1lo6xhdqjqx5xdzvxpfor3voofrz/","X-Spam-Status":"No, score=-1.1 required=5.0 
tests=ARC_SIGNED,ARC_VALID,\n\tDMARC_MISSING,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,\n\tSPF_HELO_NONE,SPF_PASS autolearn=disabled version=4.0.1","X-Spam-Checker-Version":"SpamAssassin 4.0.1 (2024-03-25) on gandalf.ozlabs.org"},"content":"From: Zhang Yi <yi.zhang@huawei.com>\n\nIntroduce two new iomap_ops instances, ext4_iomap_buffered_write_ops and\next4_iomap_buffered_da_write_ops, to implement the iomap write paths for\next4. ext4_iomap_buffered_da_write_begin() invokes ext4_da_map_blocks()\nto map delayed allocation extents, and ext4_iomap_buffered_write_begin()\ninvokes ext4_iomap_get_blocks() to directly allocate blocks in\nnon-delayed allocation mode. Additionally, add ext4_iomap_valid() to let\nthe iomap infrastructure check the validity of cached extents.\n\nKey changes:\n\n - Since we don't use data=ordered mode to prevent exposing stale data\n   in the non-delayed allocation path, we always allocate unwritten\n   extents for new blocks.\n\n - The iomap write path maps multiple blocks at a time in the\n   iomap_begin() callbacks, so we must remove the stale delayed\n   allocation range in case of short writes and write failures.\n   Otherwise, this could result in a range of delayed extents being\n   covered by a clean folio, which would lead to inaccurate space\n   reservation.\n\n - The lock ordering of the folio lock and transaction start is the\n   opposite of that in the buffer_head buffered write path. So we have\n   to stop the journal handle in the iomap_begin() callbacks. 
The lock\n   ordering documentation in super.c has been updated accordingly.\n\nSigned-off-by: Zhang Yi <yi.zhang@huawei.com>\n---\n fs/ext4/ext4.h  |   4 ++\n fs/ext4/file.c  |  20 +++++-\n fs/ext4/inode.c | 164 +++++++++++++++++++++++++++++++++++++++++++++++-\n fs/ext4/super.c |  10 ++-\n 4 files changed, 191 insertions(+), 7 deletions(-)","diff":"diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h\nindex fe3491ad2129..be92ff648362 100644\n--- a/fs/ext4/ext4.h\n+++ b/fs/ext4/ext4.h\n@@ -3057,6 +3057,7 @@ int ext4_walk_page_buffers(handle_t *handle,\n int do_journal_get_write_access(handle_t *handle, struct inode *inode,\n \t\t\t\tstruct buffer_head *bh);\n void ext4_set_inode_mapping_order(struct inode *inode);\n+int ext4_nonda_switch(struct super_block *sb);\n #define FALL_BACK_TO_NONDELALLOC 1\n #define CONVERT_INLINE_DATA\t 2\n \n@@ -3943,6 +3944,9 @@ static inline void ext4_clear_io_unwritten_flag(ext4_io_end_t *io_end)\n \n extern const struct iomap_ops ext4_iomap_ops;\n extern const struct iomap_ops ext4_iomap_report_ops;\n+extern const struct iomap_ops ext4_iomap_buffered_write_ops;\n+extern const struct iomap_ops ext4_iomap_buffered_da_write_ops;\n+extern const struct iomap_write_ops ext4_iomap_write_ops;\n \n static inline int ext4_buffer_uptodate(struct buffer_head *bh)\n {\ndiff --git a/fs/ext4/file.c b/fs/ext4/file.c\nindex eb1a323962b1..7f9bfbbc4a4e 100644\n--- a/fs/ext4/file.c\n+++ b/fs/ext4/file.c\n@@ -299,6 +299,21 @@ static ssize_t ext4_write_checks(struct kiocb *iocb, struct iov_iter *from)\n \treturn count;\n }\n \n+static ssize_t ext4_iomap_buffered_write(struct kiocb *iocb,\n+\t\t\t\t\t struct iov_iter *from)\n+{\n+\tstruct inode *inode = file_inode(iocb->ki_filp);\n+\tconst struct iomap_ops *iomap_ops;\n+\n+\tif (test_opt(inode->i_sb, DELALLOC) && !ext4_nonda_switch(inode->i_sb))\n+\t\tiomap_ops = &ext4_iomap_buffered_da_write_ops;\n+\telse\n+\t\tiomap_ops = &ext4_iomap_buffered_write_ops;\n+\n+\treturn iomap_file_buffered_write(iocb, from, 
iomap_ops,\n+\t\t\t\t\t &ext4_iomap_write_ops, NULL);\n+}\n+\n static ssize_t ext4_buffered_write_iter(struct kiocb *iocb,\n \t\t\t\t\tstruct iov_iter *from)\n {\n@@ -313,7 +328,10 @@ static ssize_t ext4_buffered_write_iter(struct kiocb *iocb,\n \tif (ret <= 0)\n \t\tgoto out;\n \n-\tret = generic_perform_write(iocb, from);\n+\tif (ext4_inode_buffered_iomap(inode))\n+\t\tret = ext4_iomap_buffered_write(iocb, from);\n+\telse\n+\t\tret = generic_perform_write(iocb, from);\n \n out:\n \tinode_unlock(inode);\ndiff --git a/fs/ext4/inode.c b/fs/ext4/inode.c\nindex 5ffd6aeb3485..0ca303a90249 100644\n--- a/fs/ext4/inode.c\n+++ b/fs/ext4/inode.c\n@@ -3097,7 +3097,7 @@ static int ext4_dax_writepages(struct address_space *mapping,\n \treturn ret;\n }\n \n-static int ext4_nonda_switch(struct super_block *sb)\n+int ext4_nonda_switch(struct super_block *sb)\n {\n \ts64 free_clusters, dirty_clusters;\n \tstruct ext4_sb_info *sbi = EXT4_SB(sb);\n@@ -3467,6 +3467,15 @@ static bool ext4_inode_datasync_dirty(struct inode *inode)\n \treturn inode_state_read_once(inode) & I_DIRTY_DATASYNC;\n }\n \n+static bool ext4_iomap_valid(struct inode *inode, const struct iomap *iomap)\n+{\n+\treturn iomap->validity_cookie == READ_ONCE(EXT4_I(inode)->i_es_seq);\n+}\n+\n+const struct iomap_write_ops ext4_iomap_write_ops = {\n+\t.iomap_valid = ext4_iomap_valid,\n+};\n+\n static void ext4_set_iomap(struct inode *inode, struct iomap *iomap,\n \t\t\t   struct ext4_map_blocks *map, loff_t offset,\n \t\t\t   loff_t length, unsigned int flags)\n@@ -3501,6 +3510,8 @@ static void ext4_set_iomap(struct inode *inode, struct iomap *iomap,\n \t    !ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))\n \t\tiomap->flags |= IOMAP_F_MERGED;\n \n+\tiomap->validity_cookie = map->m_seq;\n+\n \t/*\n \t * Flags passed to ext4_map_blocks() for direct I/O writes can result\n \t * in m_flags having both EXT4_MAP_MAPPED and EXT4_MAP_UNWRITTEN bits\n@@ -3908,8 +3919,12 @@ const struct iomap_ops ext4_iomap_report_ops = {\n 
\t.iomap_begin = ext4_iomap_begin_report,\n };\n \n+/* Map blocks */\n+typedef int (ext4_get_blocks_t)(struct inode *, struct ext4_map_blocks *);\n+\n static int ext4_iomap_map_blocks(struct inode *inode, loff_t offset,\n-\t\tloff_t length, struct ext4_map_blocks *map)\n+\t\tloff_t length, ext4_get_blocks_t get_blocks,\n+\t\tstruct ext4_map_blocks *map)\n {\n \tu8 blkbits = inode->i_blkbits;\n \n@@ -3921,6 +3936,9 @@ static int ext4_iomap_map_blocks(struct inode *inode, loff_t offset,\n \tmap->m_len = min_t(loff_t, (offset + length - 1) >> blkbits,\n \t\t\t   EXT4_MAX_LOGICAL_BLOCK) - map->m_lblk + 1;\n \n+\tif (get_blocks)\n+\t\treturn get_blocks(inode, map);\n+\n \treturn ext4_map_blocks(NULL, inode, map, 0);\n }\n \n@@ -3938,7 +3956,7 @@ static int ext4_iomap_buffered_read_begin(struct inode *inode, loff_t offset,\n \tif (WARN_ON_ONCE(ext4_has_inline_data(inode)))\n \t\treturn -ERANGE;\n \n-\tret = ext4_iomap_map_blocks(inode, offset, length, &map);\n+\tret = ext4_iomap_map_blocks(inode, offset, length, NULL, &map);\n \tif (ret < 0)\n \t\treturn ret;\n \n@@ -3946,6 +3964,146 @@ static int ext4_iomap_buffered_read_begin(struct inode *inode, loff_t offset,\n \treturn 0;\n }\n \n+static int ext4_iomap_get_blocks(struct inode *inode,\n+\t\t\t\t struct ext4_map_blocks *map)\n+{\n+\tloff_t i_size = i_size_read(inode);\n+\thandle_t *handle;\n+\tint ret, needed_blocks;\n+\n+\t/*\n+\t * Check if the blocks have already been allocated, this could\n+\t * avoid initiating a new journal transaction and return the\n+\t * mapping information directly.\n+\t */\n+\tif ((map->m_lblk + map->m_len) <=\n+\t    round_up(i_size, i_blocksize(inode)) >> inode->i_blkbits) {\n+\t\tret = ext4_map_blocks(NULL, inode, map, 0);\n+\t\tif (ret < 0)\n+\t\t\treturn ret;\n+\t\tif (map->m_flags & (EXT4_MAP_MAPPED | EXT4_MAP_UNWRITTEN |\n+\t\t\t\t    EXT4_MAP_DELAYED))\n+\t\t\treturn 0;\n+\t}\n+\n+\t/*\n+\t * Reserve one block more for addition to orphan list in case\n+\t * we allocate blocks but 
write fails for some reason.\n+\t */\n+\tneeded_blocks = ext4_chunk_trans_blocks(inode, map->m_len) + 1;\n+\thandle = ext4_journal_start(inode, EXT4_HT_WRITE_PAGE, needed_blocks);\n+\tif (IS_ERR(handle))\n+\t\treturn PTR_ERR(handle);\n+\n+\tret = ext4_map_blocks(handle, inode, map,\n+\t\t\t      EXT4_GET_BLOCKS_CREATE_UNWRIT_EXT);\n+\t/*\n+\t * Stop handle here following the lock ordering of the folio lock\n+\t * and the transaction start.\n+\t */\n+\text4_journal_stop(handle);\n+\n+\treturn ret;\n+}\n+\n+static int ext4_iomap_buffered_do_write_begin(struct inode *inode,\n+\t\tloff_t offset, loff_t length, unsigned int flags,\n+\t\tstruct iomap *iomap, struct iomap *srcmap, bool delalloc)\n+{\n+\tint ret, retries = 0;\n+\tstruct ext4_map_blocks map;\n+\text4_get_blocks_t *get_blocks;\n+\n+\tret = ext4_emergency_state(inode->i_sb);\n+\tif (unlikely(ret))\n+\t\treturn ret;\n+\n+\t/* Inline data support is not yet available. */\n+\tif (WARN_ON_ONCE(ext4_has_inline_data(inode)))\n+\t\treturn -ERANGE;\n+\tif (WARN_ON_ONCE(!(flags & IOMAP_WRITE)))\n+\t\treturn -EINVAL;\n+\n+\tif (delalloc)\n+\t\tget_blocks = ext4_da_map_blocks;\n+\telse\n+\t\tget_blocks = ext4_iomap_get_blocks;\n+retry:\n+\tret = ext4_iomap_map_blocks(inode, offset, length, get_blocks, &map);\n+\tif (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries))\n+\t\tgoto retry;\n+\tif (ret < 0)\n+\t\treturn ret;\n+\n+\text4_set_iomap(inode, iomap, &map, offset, length, flags);\n+\treturn 0;\n+}\n+\n+static int ext4_iomap_buffered_write_begin(struct inode *inode,\n+\t\tloff_t offset, loff_t length, unsigned int flags,\n+\t\tstruct iomap *iomap, struct iomap *srcmap)\n+{\n+\treturn ext4_iomap_buffered_do_write_begin(inode, offset, length, flags,\n+\t\t\t\t\t\t  iomap, srcmap, false);\n+}\n+\n+static int ext4_iomap_buffered_da_write_begin(struct inode *inode,\n+\t\tloff_t offset, loff_t length, unsigned int flags,\n+\t\tstruct iomap *iomap, struct iomap *srcmap)\n+{\n+\treturn 
ext4_iomap_buffered_do_write_begin(inode, offset, length, flags,\n+\t\t\t\t\t\t  iomap, srcmap, true);\n+}\n+\n+/*\n+ * Drop the staled delayed allocation range from the write failure,\n+ * including both start and end blocks. If not, we could leave a range\n+ * of delayed extents covered by a clean folio, it could lead to\n+ * inaccurate space reservation.\n+ */\n+static void ext4_iomap_punch_delalloc(struct inode *inode, loff_t offset,\n+\t\t\t\t     loff_t length, struct iomap *iomap)\n+{\n+\tdown_write(&EXT4_I(inode)->i_data_sem);\n+\text4_es_remove_extent(inode, offset >> inode->i_blkbits,\n+\t\t\tDIV_ROUND_UP_ULL(length, EXT4_BLOCK_SIZE(inode->i_sb)));\n+\tup_write(&EXT4_I(inode)->i_data_sem);\n+}\n+\n+static int ext4_iomap_buffered_da_write_end(struct inode *inode, loff_t offset,\n+\t\t\t\t\t    loff_t length, ssize_t written,\n+\t\t\t\t\t    unsigned int flags,\n+\t\t\t\t\t    struct iomap *iomap)\n+{\n+\tloff_t start_byte, end_byte;\n+\n+\t/* If we didn't reserve the blocks, we're not allowed to punch them. 
*/\n+\tif (iomap->type != IOMAP_DELALLOC || !(iomap->flags & IOMAP_F_NEW))\n+\t\treturn 0;\n+\n+\t/* Nothing to do if we've written the entire delalloc extent */\n+\tstart_byte = iomap_last_written_block(inode, offset, written);\n+\tend_byte = round_up(offset + length, i_blocksize(inode));\n+\tif (start_byte >= end_byte)\n+\t\treturn 0;\n+\n+\tfilemap_invalidate_lock(inode->i_mapping);\n+\tiomap_write_delalloc_release(inode, start_byte, end_byte, flags,\n+\t\t\t\t     iomap, ext4_iomap_punch_delalloc);\n+\tfilemap_invalidate_unlock(inode->i_mapping);\n+\treturn 0;\n+}\n+\n+\n+const struct iomap_ops ext4_iomap_buffered_write_ops = {\n+\t.iomap_begin = ext4_iomap_buffered_write_begin,\n+};\n+\n+const struct iomap_ops ext4_iomap_buffered_da_write_ops = {\n+\t.iomap_begin = ext4_iomap_buffered_da_write_begin,\n+\t.iomap_end = ext4_iomap_buffered_da_write_end,\n+};\n+\n const struct iomap_ops ext4_iomap_buffered_read_ops = {\n \t.iomap_begin = ext4_iomap_buffered_read_begin,\n };\ndiff --git a/fs/ext4/super.c b/fs/ext4/super.c\nindex 6a77db4d3124..9bc294b769db 100644\n--- a/fs/ext4/super.c\n+++ b/fs/ext4/super.c\n@@ -104,9 +104,13 @@ static const struct fs_parameter_spec ext4_param_specs[];\n  *   -> page lock -> i_data_sem (rw)\n  *\n  * buffered write path:\n- * sb_start_write -> i_mutex -> mmap_lock\n- * sb_start_write -> i_mutex -> transaction start -> page lock ->\n- *   i_data_sem (rw)\n+ * sb_start_write -> i_rwsem (w) -> mmap_lock\n+ * - buffer_head path:\n+ *   sb_start_write -> i_rwsem (w) -> transaction start -> folio lock ->\n+ *     i_data_sem (rw)\n+ * - iomap path:\n+ *   sb_start_write -> i_rwsem (w) -> transaction start -> i_data_sem (rw)\n+ *   sb_start_write -> i_rwsem (w) -> folio lock (not under an active handle)\n  *\n  * truncate:\n  * sb_start_write -> i_mutex -> invalidate_lock (w) -> i_mmap_rwsem (w) ->\n","prefixes":["v3","08/22"]}