{"id":2186636,"url":"http://patchwork.ozlabs.org/api/1.2/patches/2186636/?format=json","web_url":"http://patchwork.ozlabs.org/project/linux-ext4/patch/20260120112538.132774-6-me@linux.beauty/","project":{"id":8,"url":"http://patchwork.ozlabs.org/api/1.2/projects/8/?format=json","name":"Linux ext4 filesystem development","link_name":"linux-ext4","list_id":"linux-ext4.vger.kernel.org","list_email":"linux-ext4@vger.kernel.org","web_url":null,"scm_url":null,"webscm_url":null,"list_archive_url":"","list_archive_url_format":"","commit_url_format":""},"msgid":"<20260120112538.132774-6-me@linux.beauty>","list_archive_url":null,"date":"2026-01-20T11:25:34","name":"[RFC,v4,5/7] ext4: fast commit: avoid i_data_sem by dropping ext4_map_blocks() in snapshots","commit_ref":null,"pull_url":null,"state":"superseded","archived":false,"hash":"b5c5a5f444b6e2be92c129e2e8eaf8e51d7e8b93","submitter":{"id":84264,"url":"http://patchwork.ozlabs.org/api/1.2/people/84264/?format=json","name":"Li Chen","email":"me@linux.beauty"},"delegate":null,"mbox":"http://patchwork.ozlabs.org/project/linux-ext4/patch/20260120112538.132774-6-me@linux.beauty/mbox/","series":[{"id":489032,"url":"http://patchwork.ozlabs.org/api/1.2/series/489032/?format=json","web_url":"http://patchwork.ozlabs.org/project/linux-ext4/list/?series=489032","date":"2026-01-20T11:25:32","name":"ext4: fast commit: snapshot inode state for FC log","version":4,"mbox":"http://patchwork.ozlabs.org/series/489032/mbox/"}],"comments":"http://patchwork.ozlabs.org/api/patches/2186636/comments/","check":"pending","checks":"http://patchwork.ozlabs.org/api/patches/2186636/checks/","tags":{},"related":[],"headers":{"Return-Path":"\n <SRS0=56k8=7Z=vger.kernel.org=linux-ext4+bounces-13103-patchwork-incoming=ozlabs.org@ozlabs.org>","X-Original-To":["incoming@patchwork.ozlabs.org","linux-ext4@vger.kernel.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","patchwork-incoming@ozlabs.org"],"Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (1024-bit key;\n unprotected) header.d=linux.beauty header.i=me@linux.beauty\n header.a=rsa-sha256 header.s=zmail header.b=dssfFFWw;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=ozlabs.org\n (client-ip=2404:9400:2221:ea00::3; helo=mail.ozlabs.org;\n envelope-from=srs0=56k8=7z=vger.kernel.org=linux-ext4+bounces-13103-patchwork-incoming=ozlabs.org@ozlabs.org;\n receiver=patchwork.ozlabs.org)","gandalf.ozlabs.org;\n arc=pass smtp.remote-ip=\"2a01:60a::1994:3:14\"\n arc.chain=\"subspace.kernel.org:zohomail.com\"","gandalf.ozlabs.org;\n dmarc=none (p=none dis=none) header.from=linux.beauty","gandalf.ozlabs.org;\n spf=fail smtp.mailfrom=vger.kernel.org","gandalf.ozlabs.org;\n\tdkim=pass (1024-bit key;\n unprotected) header.d=linux.beauty header.i=me@linux.beauty\n header.a=rsa-sha256 header.s=zmail header.b=dssfFFWw;\n\tdkim-atps=neutral","gandalf.ozlabs.org;\n spf=softfail (domain owner discourages use of this host)\n smtp.mailfrom=vger.kernel.org (client-ip=2a01:60a::1994:3:14;\n helo=ams.mirrors.kernel.org;\n envelope-from=linux-ext4+bounces-13103-patchwork-incoming=ozlabs.org@vger.kernel.org;\n receiver=ozlabs.org)","smtp.subspace.kernel.org;\n\tdkim=pass (1024-bit key) header.d=linux.beauty header.i=me@linux.beauty\n header.b=\"dssfFFWw\"","smtp.subspace.kernel.org;\n arc=pass smtp.client-ip=136.143.188.112","smtp.subspace.kernel.org;\n dmarc=none (p=none dis=none) header.from=linux.beauty","smtp.subspace.kernel.org;\n spf=pass smtp.mailfrom=linux.beauty"],"Received":["from mail.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange x25519)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4dwQJb4z4Mz1xsg\n\tfor <incoming@patchwork.ozlabs.org>; Tue, 20 Jan 2026 22:36:31 +1100 (AEDT)","from mail.ozlabs.org (mail.ozlabs.org [IPv6:2404:9400:2221:ea00::3])\n\tby gandalf.ozlabs.org (Postfix) with ESMTP id 4dwQJb4Tq1z4w26\n\tfor <incoming@patchwork.ozlabs.org>; Tue, 20 Jan 2026 22:36:31 +1100 (AEDT)","by gandalf.ozlabs.org (Postfix)\n\tid 4dwQJb4MBDz4wB9; Tue, 20 Jan 2026 22:36:31 +1100 (AEDT)","from ams.mirrors.kernel.org (ams.mirrors.kernel.org\n [IPv6:2a01:60a::1994:3:14])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange x25519)\n\t(No client certificate requested)\n\tby gandalf.ozlabs.org (Postfix) with ESMTPS id 4dwQJW6vHwz4w26\n\tfor <patchwork-incoming@ozlabs.org>; Tue, 20 Jan 2026 22:36:27 +1100 (AEDT)","from smtp.subspace.kernel.org (relay.kernel.org [52.25.139.140])\n\t(using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby ams.mirrors.kernel.org (Postfix) with ESMTPS id A27FC42B200\n\tfor <patchwork-incoming@ozlabs.org>; Tue, 20 Jan 2026 11:28:07 +0000 (UTC)","from localhost.localdomain (localhost.localdomain [127.0.0.1])\n\tby smtp.subspace.kernel.org (Postfix) with ESMTP id 7609B3EF0A6;\n\tTue, 20 Jan 2026 11:27:08 +0000 (UTC)","from sender4-pp-f112.zoho.com (sender4-pp-f112.zoho.com\n [136.143.188.112])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby smtp.subspace.kernel.org (Postfix) with ESMTPS id 115703A7E17;\n\tTue, 20 Jan 2026 11:27:04 +0000 (UTC)","by mx.zohomail.com with SMTPS id 1768908374161511.0399864679431;\n\tTue, 20 Jan 2026 03:26:14 -0800 (PST)"],"ARC-Seal":["i=3; a=rsa-sha256; d=ozlabs.org; s=201707; t=1768908991; cv=pass;\n\tb=WfnHrQCudSWRtO+ACFecSYYzzZuzVaVRyIjUcUKMmksHhDo7mEJ9fa+wVSLlWQQbF8wrSqVZlLUbcuAJBdmE2YR+Qsf0N9WSMq2WvhC/NN6xVQREwk4Qtpj1m5tq9US64Dc6mK5Lv4IsMvzdAvB+pqq30Vap9ssCUiMcQaSMse2DH+GWS7FhOgLY6/0eGnilkiD3fYZiyKX/j2QTSXSrJ1ynUiWXCfv52AQSjEZP9fQ+Cmm35lw/ZClw4xib3s0hnwY9ByGti1UX5pcBikAP2gf9t8FNGXu4FsFI4IWXHvdb/pDgJQKbkhsMO8SmyytNeTcz+4w2n8rQc58vUoXjcQ==","i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;\n\tt=1768908428; cv=pass;\n b=iuTJABehYwseucsuEPZgIfa0qGq76RBywyWa4zG95YAvpfnTv5NPn4rd0OzSpKzVEtg8aSlnvEZzfJ60VAQ3TJYQBZNU5nDA4ftCxwjtKpq6LOfxBVyNJLpzDOL3qvHW4hhWvKUJ632Crth5ALPLYp/8uK9cE3MGaCXOZ0Rzk7M=","i=1; a=rsa-sha256; t=1768908376; cv=none;\n\td=zohomail.com; s=zohoarc;\n\tb=jGDIZTbujvvce5VZKEIxUzXMSU1TaCI/IwRGA9uIGC93kKsNbKDW4kCJi4MIrXoT0rcGvyOli/CnqDUPA3ErKTJ9wNty7uVrJJB2CMLNjRtmY2/L7uc3El8kxklW5SQmaMxHWUsOJyYg3wtb/xELhSF8VZjihLEKLpTLnhxWquQ="],"ARC-Message-Signature":["i=3; a=rsa-sha256; d=ozlabs.org; s=201707;\n\tt=1768908991; c=relaxed/relaxed;\n\tbh=7QB2hp2pU43ouGnaXacTBNanuoWV/l+MNLizhNnEfwc=;\n\th=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:\n\t MIME-Version;\n b=YeVrpbwquduQuEZLqdrdPX1K8SC3a2gJyFzQgXBIZikVS+kacn/KJ8oBPf+N+L610DWKL4q7+uFUhR9mELccSbvUQ1/vyD2soGPkt9+hPCFl0AjHMWCW9LfugYON3AVT1iRyuoe/ZanuUqcN2JPzRxuM9F3g1Uyte+3g+grPonk+r/BnaVILeK/t7wP+p/kaMh7YEBVjkP3JSEkw03H9qLiZX2x2ymzDC2Tombxw4uWp6Aa5kXhZbYrhP/6ztvGoNW5ZH7YcoCDWrqIWV6Fx0bQX85t/f9Ne9SE1DKgCf2zK+jkgVxhLYNX5lFyKkoqanuL5crghj+cv57rX9sbojQ==","i=2; a=rsa-sha256; d=subspace.kernel.org;\n\ts=arc-20240116; t=1768908428; c=relaxed/simple;\n\tbh=0tNznCmPvl+G0JZYbmOuIAjjHfoX714+rfjHwE9O3qw=;\n\th=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:\n\t MIME-Version;\n b=ESOc4mrj4ReTv6xB+qkQzWVkLOfFhDCkPqH4o/4NHr6RHXyTZfQ0uWmIcnakQH1P5BNzAYkfym14pl8HXuVGxzm/DHG5nfmPuy80YgNi2AdlWeMX/0ngxtQU23Yeah6QwE8PD59A9aoK0DU145zMPeUejdlttECTE/zNQvnvJi4=","i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com;\n s=zohoarc;\n\tt=1768908376;\n h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:MIME-Version:Message-ID:References:Subject:Subject:To:To:Message-Id:Reply-To;\n\tbh=7QB2hp2pU43ouGnaXacTBNanuoWV/l+MNLizhNnEfwc=;\n\tb=aZOntE1/IJ2zgSGA4cHojKF96BaSscY87ZHxbp/VJc01+vDBJFT5hAr9F5notsoY9okGeNPYGpgU+zVvFwVcZWpsVtOTApHdUkt+UTp6V8BcL+4FTurj0Gi6HPRhiiNDuIKw2kiYN+KWnSfYNuriSOVdeLze2B3i8IoS3pUsbJ4="],"ARC-Authentication-Results":["i=3; gandalf.ozlabs.org;\n dmarc=none (p=none dis=none) header.from=linux.beauty;\n spf=fail smtp.mailfrom=vger.kernel.org; dkim=pass (1024-bit key;\n unprotected) header.d=linux.beauty header.i=me@linux.beauty\n header.a=rsa-sha256 header.s=zmail header.b=dssfFFWw; dkim-atps=neutral;\n spf=softfail (client-ip=2a01:60a::1994:3:14; helo=ams.mirrors.kernel.org;\n envelope-from=linux-ext4+bounces-13103-patchwork-incoming=ozlabs.org@vger.kernel.org;\n receiver=ozlabs.org) smtp.mailfrom=vger.kernel.org","i=2; smtp.subspace.kernel.org;\n dmarc=none (p=none dis=none) header.from=linux.beauty;\n spf=pass smtp.mailfrom=linux.beauty;\n dkim=pass (1024-bit key) header.d=linux.beauty header.i=me@linux.beauty\n header.b=dssfFFWw; arc=pass smtp.client-ip=136.143.188.112","i=1; mx.zohomail.com;\n\tdkim=pass  header.i=linux.beauty;\n\tspf=pass  smtp.mailfrom=me@linux.beauty;\n\tdmarc=pass header.from=<me@linux.beauty>"],"DKIM-Signature":"v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; t=1768908376;\n\ts=zmail; d=linux.beauty; i=me@linux.beauty;\n\th=From:From:To:To:Cc:Cc:Subject:Subject:Date:Date:Message-ID:In-Reply-To:References:MIME-Version:Content-Transfer-Encoding:Message-Id:Reply-To;\n\tbh=7QB2hp2pU43ouGnaXacTBNanuoWV/l+MNLizhNnEfwc=;\n\tb=dssfFFWwillSLFOwDQBbEskXz6VqihPlfGVuUwalQAK3u+vM/11AAwz76HcYBw1R\n\touljNIZC3VAKgTXvnNeviT3zRbKT+KOFKcuGOmc7P2KvWzNiV9GLDPCBz+wpM1KUQb3\n\taiyfHURAc2K09TAslMGhe69Ik8FUlsGwGKxvCJ7o=","From":"Li Chen <me@linux.beauty>","To":"Zhang Yi <yi.zhang@huaweicloud.com>,\n\t\"Theodore Ts'o\" <tytso@mit.edu>,\n\tAndreas Dilger <adilger.kernel@dilger.ca>,\n\tlinux-ext4@vger.kernel.org,\n\tlinux-kernel@vger.kernel.org","Cc":"Li Chen <me@linux.beauty>","Subject":"[RFC v4 5/7] ext4: fast commit: avoid i_data_sem by dropping\n ext4_map_blocks() in snapshots","Date":"Tue, 20 Jan 2026 19:25:34 +0800","Message-ID":"<20260120112538.132774-6-me@linux.beauty>","X-Mailer":"git-send-email 2.52.0","In-Reply-To":"<20260120112538.132774-1-me@linux.beauty>","References":"<20260120112538.132774-1-me@linux.beauty>","Precedence":"bulk","X-Mailing-List":"linux-ext4@vger.kernel.org","List-Id":"<linux-ext4.vger.kernel.org>","List-Subscribe":"<mailto:linux-ext4+subscribe@vger.kernel.org>","List-Unsubscribe":"<mailto:linux-ext4+unsubscribe@vger.kernel.org>","MIME-Version":"1.0","Content-Transfer-Encoding":"8bit","X-ZohoMailClient":"External","X-Spam-Status":"No, score=-0.2 required=5.0 tests=ARC_SIGNED,ARC_VALID,\n\tDKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DMARC_MISSING,\n\tHEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE,\n\tSPF_SOFTFAIL autolearn=disabled version=4.0.1","X-Spam-Checker-Version":"SpamAssassin 4.0.1 (2024-03-25) on gandalf.ozlabs.org"},"content":"Commit-time snapshots run under jbd2_journal_lock_updates(), so the work\ndone there must stay bounded.\n\nThe snapshot path still used ext4_map_blocks() to build data ranges. This\ncan take i_data_sem and pulls the mapping code into the snapshot logic.\nBuild inode data range snapshots from the extent status tree instead.\n\nThe extent status tree is a cache, not an authoritative source. If the\nneeded information is missing or unstable (e.g. delayed allocation), treat\nthe transaction as fast commit ineligible and fall back to full commit.\n\nAlso cap the number of inodes and ranges snapshotted per fast commit and\nallocate range records from a dedicated slab cache. The inode pointer\narray is allocated outside the updates-locked window.\n\nTesting: QEMU/KVM guest, virtio-pmem + dax, ext4 -O fast_commit, mounted\ndax,noatime. Ran python3 500x {4K write + fsync}, fallocate 256M, and\npython3 500x {creat + fsync(dir)} without lockdep splats or errors.\n\nSigned-off-by: Li Chen <me@linux.beauty>\n---\n fs/ext4/fast_commit.c | 253 +++++++++++++++++++++++++++++-------------\n 1 file changed, 177 insertions(+), 76 deletions(-)","diff":"diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c\nindex 966211a3342a..d1eefee60912 100644\n--- a/fs/ext4/fast_commit.c\n+++ b/fs/ext4/fast_commit.c\n@@ -183,6 +183,15 @@\n \n #include <trace/events/ext4.h>\n static struct kmem_cache *ext4_fc_dentry_cachep;\n+static struct kmem_cache *ext4_fc_range_cachep;\n+\n+/*\n+ * Avoid spending unbounded time/memory snapshotting highly fragmented files\n+ * under jbd2_journal_lock_updates(). If we exceed this limit, fall back to\n+ * full commit.\n+ */\n+#define EXT4_FC_SNAPSHOT_MAX_INODES\t1024\n+#define EXT4_FC_SNAPSHOT_MAX_RANGES\t2048\n \n static void ext4_end_buffer_io_sync(struct buffer_head *bh, int uptodate)\n {\n@@ -954,7 +963,7 @@ static void ext4_fc_free_ranges(struct list_head *head)\n \n \tlist_for_each_entry_safe(range, range_n, head, list) {\n \t\tlist_del(&range->list);\n-\t\tkfree(range);\n+\t\tkmem_cache_free(ext4_fc_range_cachep, range);\n \t}\n }\n \n@@ -972,16 +981,19 @@ static void ext4_fc_free_inode_snap(struct inode *inode)\n }\n \n static int ext4_fc_snapshot_inode_data(struct inode *inode,\n-\t\t\t\t       struct list_head *ranges)\n+\t\t\t\t       struct list_head *ranges,\n+\t\t\t\t       unsigned int nr_ranges_total,\n+\t\t\t\t       unsigned int *nr_rangesp)\n {\n \tstruct ext4_inode_info *ei = EXT4_I(inode);\n+\tunsigned int nr_ranges = 0;\n \text4_lblk_t start_lblk, end_lblk, cur_lblk;\n-\tstruct ext4_map_blocks map;\n-\tint ret;\n \n \tspin_lock(&ei->i_fc_lock);\n \tif (ei->i_fc_lblk_len == 0) {\n \t\tspin_unlock(&ei->i_fc_lock);\n+\t\tif (nr_rangesp)\n+\t\t\t*nr_rangesp = 0;\n \t\treturn 0;\n \t}\n \tstart_lblk = ei->i_fc_lblk_start;\n@@ -994,61 +1006,78 @@ static int ext4_fc_snapshot_inode_data(struct inode *inode,\n \t\t   start_lblk, end_lblk, inode->i_ino);\n \n \twhile (cur_lblk <= end_lblk) {\n+\t\tstruct extent_status es;\n \t\tstruct ext4_fc_range *range;\n+\t\text4_lblk_t len;\n \n-\t\tmap.m_lblk = cur_lblk;\n-\t\tmap.m_len = end_lblk - cur_lblk + 1;\n-\t\tret = ext4_map_blocks(NULL, inode, &map,\n-\t\t\t\t      EXT4_GET_BLOCKS_IO_SUBMIT |\n-\t\t\t\t      EXT4_EX_NOCACHE);\n-\t\tif (ret < 0)\n-\t\t\treturn -ECANCELED;\n+\t\tif (!ext4_es_lookup_extent(inode, cur_lblk, NULL, &es, NULL))\n+\t\t\treturn -EAGAIN;\n+\n+\t\tif (ext4_es_is_delayed(&es))\n+\t\t\treturn -EAGAIN;\n \n-\t\tif (map.m_len == 0) {\n+\t\tlen = es.es_len - (cur_lblk - es.es_lblk);\n+\t\tif (len > end_lblk - cur_lblk + 1)\n+\t\t\tlen = end_lblk - cur_lblk + 1;\n+\t\tif (len == 0) {\n \t\t\tcur_lblk++;\n \t\t\tcontinue;\n \t\t}\n \n-\t\trange = kmalloc(sizeof(*range), GFP_NOFS);\n+\t\tif (nr_ranges_total + nr_ranges >= EXT4_FC_SNAPSHOT_MAX_RANGES)\n+\t\t\treturn -E2BIG;\n+\n+\t\trange = kmem_cache_alloc(ext4_fc_range_cachep, GFP_NOFS);\n \t\tif (!range)\n \t\t\treturn -ENOMEM;\n+\t\tnr_ranges++;\n \n-\t\trange->lblk = map.m_lblk;\n-\t\trange->len = map.m_len;\n+\t\trange->lblk = cur_lblk;\n+\t\trange->len = len;\n \t\trange->pblk = 0;\n \t\trange->unwritten = false;\n \n-\t\tif (ret == 0) {\n+\t\tif (ext4_es_is_hole(&es)) {\n \t\t\trange->tag = EXT4_FC_TAG_DEL_RANGE;\n-\t\t} else {\n-\t\t\tunsigned int max = (map.m_flags & EXT4_MAP_UNWRITTEN) ?\n-\t\t\t\tEXT_UNWRITTEN_MAX_LEN : EXT_INIT_MAX_LEN;\n-\n-\t\t\t/* Limit the number of blocks in one extent */\n-\t\t\tmap.m_len = min(max, map.m_len);\n+\t\t} else if (ext4_es_is_written(&es) ||\n+\t\t\t   ext4_es_is_unwritten(&es)) {\n+\t\t\tunsigned int max;\n \n \t\t\trange->tag = EXT4_FC_TAG_ADD_RANGE;\n-\t\t\trange->len = map.m_len;\n-\t\t\trange->pblk = map.m_pblk;\n-\t\t\trange->unwritten = !!(map.m_flags & EXT4_MAP_UNWRITTEN);\n+\t\t\trange->pblk = ext4_es_pblock(&es) +\n+\t\t\t\t      (cur_lblk - es.es_lblk);\n+\t\t\trange->unwritten = ext4_es_is_unwritten(&es);\n+\n+\t\t\tmax = range->unwritten ? EXT_UNWRITTEN_MAX_LEN :\n+\t\t\t\t\t\t EXT_INIT_MAX_LEN;\n+\t\t\tif (range->len > max)\n+\t\t\t\trange->len = max;\n+\t\t} else {\n+\t\t\tkmem_cache_free(ext4_fc_range_cachep, range);\n+\t\t\treturn -EAGAIN;\n \t\t}\n \n \t\tINIT_LIST_HEAD(&range->list);\n \t\tlist_add_tail(&range->list, ranges);\n \n-\t\tcur_lblk += map.m_len;\n+\t\tcur_lblk += range->len;\n \t}\n \n+\tif (nr_rangesp)\n+\t\t*nr_rangesp = nr_ranges;\n \treturn 0;\n }\n \n-static int ext4_fc_snapshot_inode(struct inode *inode)\n+static int ext4_fc_snapshot_inode(struct inode *inode,\n+\t\t\t\t  unsigned int nr_ranges_total,\n+\t\t\t\t  unsigned int *nr_rangesp)\n {\n \tstruct ext4_inode_info *ei = EXT4_I(inode);\n \tstruct ext4_fc_inode_snap *snap;\n \tint inode_len = EXT4_GOOD_OLD_INODE_SIZE;\n \tstruct ext4_iloc iloc;\n \tLIST_HEAD(ranges);\n+\tunsigned int nr_ranges = 0;\n \tint ret;\n \tint alloc_ctx;\n \n@@ -1072,7 +1101,8 @@ static int ext4_fc_snapshot_inode(struct inode *inode)\n \tmemcpy(snap->inode_buf, (u8 *)ext4_raw_inode(&iloc), inode_len);\n \tbrelse(iloc.bh);\n \n-\tret = ext4_fc_snapshot_inode_data(inode, &ranges);\n+\tret = ext4_fc_snapshot_inode_data(inode, &ranges, nr_ranges_total,\n+\t\t\t\t\t  &nr_ranges);\n \tif (ret) {\n \t\tkfree(snap);\n \t\text4_fc_free_ranges(&ranges);\n@@ -1085,10 +1115,11 @@ static int ext4_fc_snapshot_inode(struct inode *inode)\n \tlist_splice_tail_init(&ranges, &snap->data_list);\n \text4_fc_unlock(inode->i_sb, alloc_ctx);\n \n+\tif (nr_rangesp)\n+\t\t*nr_rangesp = nr_ranges;\n \treturn 0;\n }\n \n-\n /* Flushes data of all the inodes in the commit queue. */\n static int ext4_fc_flush_data(journal_t *journal)\n {\n@@ -1167,49 +1198,32 @@ static int ext4_fc_commit_dentry_updates(journal_t *journal, u32 *crc)\n \treturn 0;\n }\n \n-static int ext4_fc_snapshot_inodes(journal_t *journal)\n+static int ext4_fc_alloc_snapshot_inodes(struct super_block *sb,\n+\t\t\t\t\t struct inode ***inodesp,\n+\t\t\t\t\t unsigned int *nr_inodesp);\n+\n+static int ext4_fc_snapshot_inodes(journal_t *journal, struct inode **inodes,\n+\t\t\t\t   unsigned int inodes_size)\n {\n \tstruct super_block *sb = journal->j_private;\n \tstruct ext4_sb_info *sbi = EXT4_SB(sb);\n \tstruct ext4_inode_info *iter;\n \tstruct ext4_fc_dentry_update *fc_dentry;\n-\tstruct inode **inodes;\n-\tunsigned int nr_inodes = 0;\n \tunsigned int i = 0;\n+\tunsigned int idx;\n+\tunsigned int nr_ranges = 0;\n \tint ret = 0;\n \tint alloc_ctx;\n \n-\talloc_ctx = ext4_fc_lock(sb);\n-\tlist_for_each_entry(iter, &sbi->s_fc_q[FC_Q_MAIN], i_fc_list)\n-\t\tnr_inodes++;\n-\n-\tlist_for_each_entry(fc_dentry, &sbi->s_fc_dentry_q[FC_Q_MAIN], fcd_list) {\n-\t\tstruct ext4_inode_info *ei;\n-\n-\t\tif (fc_dentry->fcd_op != EXT4_FC_TAG_CREAT)\n-\t\t\tcontinue;\n-\t\tif (list_empty(&fc_dentry->fcd_dilist))\n-\t\t\tcontinue;\n-\n-\t\t/* See the comment in ext4_fc_commit_dentry_updates(). */\n-\t\tei = list_first_entry(&fc_dentry->fcd_dilist,\n-\t\t\t\t      struct ext4_inode_info, i_fc_dilist);\n-\t\tif (!list_empty(&ei->i_fc_list))\n-\t\t\tcontinue;\n-\n-\t\tnr_inodes++;\n-\t}\n-\text4_fc_unlock(sb, alloc_ctx);\n-\n-\tif (!nr_inodes)\n+\tif (!inodes_size)\n \t\treturn 0;\n \n-\tinodes = kvcalloc(nr_inodes, sizeof(*inodes), GFP_NOFS);\n-\tif (!inodes)\n-\t\treturn -ENOMEM;\n-\n \talloc_ctx = ext4_fc_lock(sb);\n \tlist_for_each_entry(iter, &sbi->s_fc_q[FC_Q_MAIN], i_fc_list) {\n+\t\tif (i >= inodes_size) {\n+\t\t\tret = -E2BIG;\n+\t\t\tgoto unlock;\n+\t\t}\n \t\tinodes[i++] = &iter->vfs_inode;\n \t}\n \n@@ -1229,6 +1243,10 @@ static int ext4_fc_snapshot_inodes(journal_t *journal)\n \t\tif (!list_empty(&ei->i_fc_list))\n \t\t\tcontinue;\n \n+\t\tif (i >= inodes_size) {\n+\t\t\tret = -E2BIG;\n+\t\t\tgoto unlock;\n+\t\t}\n \t\t/*\n \t\t * Create-only inodes may only be referenced via fcd_dilist and\n \t\t * not appear on s_fc_q[MAIN]. They may hit the last iput while\n@@ -1240,15 +1258,22 @@ static int ext4_fc_snapshot_inodes(journal_t *journal)\n \t\text4_set_inode_state(inode, EXT4_STATE_FC_COMMITTING);\n \t\tinodes[i++] = inode;\n \t}\n+unlock:\n \text4_fc_unlock(sb, alloc_ctx);\n \n-\tfor (nr_inodes = 0; nr_inodes < i; nr_inodes++) {\n-\t\tret = ext4_fc_snapshot_inode(inodes[nr_inodes]);\n+\tif (ret)\n+\t\treturn ret;\n+\n+\tfor (idx = 0; idx < i; idx++) {\n+\t\tunsigned int inode_ranges = 0;\n+\n+\t\tret = ext4_fc_snapshot_inode(inodes[idx], nr_ranges,\n+\t\t\t\t\t     &inode_ranges);\n \t\tif (ret)\n \t\t\tbreak;\n+\t\tnr_ranges += inode_ranges;\n \t}\n \n-\tkvfree(inodes);\n \treturn ret;\n }\n \n@@ -1259,6 +1284,8 @@ static int ext4_fc_perform_commit(journal_t *journal)\n \tstruct ext4_inode_info *iter;\n \tstruct ext4_fc_head head;\n \tstruct inode *inode;\n+\tstruct inode **inodes;\n+\tunsigned int inodes_size;\n \tstruct blk_plug plug;\n \tint ret = 0;\n \tu32 crc = 0;\n@@ -1311,6 +1338,10 @@ static int ext4_fc_perform_commit(journal_t *journal)\n \t\treturn ret;\n \n \n+\tret = ext4_fc_alloc_snapshot_inodes(sb, &inodes, &inodes_size);\n+\tif (ret)\n+\t\treturn ret;\n+\n \t/* Step 4: Mark all inodes as being committed. */\n \tjbd2_journal_lock_updates(journal);\n \t/*\n@@ -1326,8 +1357,9 @@ static int ext4_fc_perform_commit(journal_t *journal)\n \t}\n \text4_fc_unlock(sb, alloc_ctx);\n \n-\tret = ext4_fc_snapshot_inodes(journal);\n+\tret = ext4_fc_snapshot_inodes(journal, inodes, inodes_size);\n \tjbd2_journal_unlock_updates(journal);\n+\tkvfree(inodes);\n \tif (ret)\n \t\treturn ret;\n \n@@ -1383,6 +1415,64 @@ static int ext4_fc_perform_commit(journal_t *journal)\n \treturn ret;\n }\n \n+static unsigned int ext4_fc_count_snapshot_inodes(struct super_block *sb)\n+{\n+\tstruct ext4_sb_info *sbi = EXT4_SB(sb);\n+\tstruct ext4_inode_info *iter;\n+\tstruct ext4_fc_dentry_update *fc_dentry;\n+\tunsigned int nr_inodes = 0;\n+\tint alloc_ctx;\n+\n+\talloc_ctx = ext4_fc_lock(sb);\n+\tlist_for_each_entry(iter, &sbi->s_fc_q[FC_Q_MAIN], i_fc_list)\n+\t\tnr_inodes++;\n+\n+\tlist_for_each_entry(fc_dentry, &sbi->s_fc_dentry_q[FC_Q_MAIN], fcd_list) {\n+\t\tstruct ext4_inode_info *ei;\n+\n+\t\tif (fc_dentry->fcd_op != EXT4_FC_TAG_CREAT)\n+\t\t\tcontinue;\n+\t\tif (list_empty(&fc_dentry->fcd_dilist))\n+\t\t\tcontinue;\n+\n+\t\t/* See the comment in ext4_fc_commit_dentry_updates(). */\n+\t\tei = list_first_entry(&fc_dentry->fcd_dilist,\n+\t\t\t\t      struct ext4_inode_info, i_fc_dilist);\n+\t\tif (!list_empty(&ei->i_fc_list))\n+\t\t\tcontinue;\n+\n+\t\tnr_inodes++;\n+\t}\n+\text4_fc_unlock(sb, alloc_ctx);\n+\n+\treturn nr_inodes;\n+}\n+\n+static int ext4_fc_alloc_snapshot_inodes(struct super_block *sb,\n+\t\t\t\t\t struct inode ***inodesp,\n+\t\t\t\t\t unsigned int *nr_inodesp)\n+{\n+\tunsigned int nr_inodes = ext4_fc_count_snapshot_inodes(sb);\n+\tstruct inode **inodes;\n+\n+\t*inodesp = NULL;\n+\t*nr_inodesp = 0;\n+\n+\tif (!nr_inodes)\n+\t\treturn 0;\n+\n+\tif (nr_inodes > EXT4_FC_SNAPSHOT_MAX_INODES)\n+\t\treturn -E2BIG;\n+\n+\tinodes = kvcalloc(nr_inodes, sizeof(*inodes), GFP_NOFS);\n+\tif (!inodes)\n+\t\treturn -ENOMEM;\n+\n+\t*inodesp = inodes;\n+\t*nr_inodesp = nr_inodes;\n+\treturn 0;\n+}\n+\n static void ext4_fc_update_stats(struct super_block *sb, int status,\n \t\t\t\t u64 commit_time, int nblks, tid_t commit_tid)\n {\n@@ -1475,7 +1565,10 @@ int ext4_fc_commit(journal_t *journal, tid_t commit_tid)\n \tfc_bufs_before = (sbi->s_fc_bytes + bsize - 1) / bsize;\n \tret = ext4_fc_perform_commit(journal);\n \tif (ret < 0) {\n-\t\tstatus = EXT4_FC_STATUS_FAILED;\n+\t\tif (ret == -EAGAIN || ret == -E2BIG || ret == -ECANCELED)\n+\t\t\tstatus = EXT4_FC_STATUS_INELIGIBLE;\n+\t\telse\n+\t\t\tstatus = EXT4_FC_STATUS_FAILED;\n \t\tgoto fallback;\n \t}\n \tnblks = (sbi->s_fc_bytes + bsize - 1) / bsize - fc_bufs_before;\n@@ -1559,34 +1652,35 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)\n \n \twhile (!list_empty(&sbi->s_fc_dentry_q[FC_Q_MAIN])) {\n \t\tfc_dentry = list_first_entry(&sbi->s_fc_dentry_q[FC_Q_MAIN],\n-\t\t\t\t\t     struct ext4_fc_dentry_update,\n-\t\t\t\t\t     fcd_list);\n+\t\t\t\t\t\t struct ext4_fc_dentry_update,\n+\t\t\t\t\t\t fcd_list);\n \t\tlist_del_init(&fc_dentry->fcd_list);\n \t\tif (fc_dentry->fcd_op == EXT4_FC_TAG_CREAT &&\n-\t\t    !list_empty(&fc_dentry->fcd_dilist)) {\n+\t\t\t!list_empty(&fc_dentry->fcd_dilist)) {\n \t\t\t/* See the comment in ext4_fc_commit_dentry_updates(). */\n \t\t\tei = list_first_entry(&fc_dentry->fcd_dilist,\n-\t\t\t\t\t      struct ext4_inode_info,\n-\t\t\t\t\t      i_fc_dilist);\n+\t\t\t\t\t\t  struct ext4_inode_info,\n+\t\t\t\t\t\t  i_fc_dilist);\n \t\t\text4_fc_free_inode_snap(&ei->vfs_inode);\n \t\t\tspin_lock(&ei->i_fc_lock);\n \t\t\text4_clear_inode_state(&ei->vfs_inode,\n-\t\t\t\t\t       EXT4_STATE_FC_REQUEUE);\n+\t\t\t\t\t\t   EXT4_STATE_FC_REQUEUE);\n \t\t\text4_clear_inode_state(&ei->vfs_inode,\n-\t\t\t\t\t       EXT4_STATE_FC_COMMITTING);\n+\t\t\t\t\t\t   EXT4_STATE_FC_COMMITTING);\n \t\t\tspin_unlock(&ei->i_fc_lock);\n \t\t\t/*\n \t\t\t * Make sure clearing of EXT4_STATE_FC_COMMITTING is\n-\t\t\t * visible before we send the wakeup. Pairs with implicit\n-\t\t\t * barrier in prepare_to_wait() in ext4_fc_del().\n+\t\t\t * visible before we send the wakeup. Pairs with\n+\t\t\t * implicit barrier in prepare_to_wait() in\n+\t\t\t * ext4_fc_del().\n \t\t\t */\n \t\t\tsmp_mb();\n #if (BITS_PER_LONG < 64)\n \t\t\twake_up_bit(&ei->i_state_flags,\n-\t\t\t\t    EXT4_STATE_FC_COMMITTING);\n+\t\t\t\t\tEXT4_STATE_FC_COMMITTING);\n #else\n \t\t\twake_up_bit(&ei->i_flags,\n-\t\t\t\t    EXT4_STATE_FC_COMMITTING);\n+\t\t\t\t\tEXT4_STATE_FC_COMMITTING);\n #endif\n \t\t}\n \t\tlist_del_init(&fc_dentry->fcd_dilist);\n@@ -2582,13 +2676,20 @@ int __init ext4_fc_init_dentry_cache(void)\n \text4_fc_dentry_cachep = KMEM_CACHE(ext4_fc_dentry_update,\n \t\t\t\t\t   SLAB_RECLAIM_ACCOUNT);\n \n-\tif (ext4_fc_dentry_cachep == NULL)\n+\tif (!ext4_fc_dentry_cachep)\n \t\treturn -ENOMEM;\n \n+\text4_fc_range_cachep = KMEM_CACHE(ext4_fc_range, SLAB_RECLAIM_ACCOUNT);\n+\tif (!ext4_fc_range_cachep) {\n+\t\tkmem_cache_destroy(ext4_fc_dentry_cachep);\n+\t\treturn -ENOMEM;\n+\t}\n+\n \treturn 0;\n }\n \n void ext4_fc_destroy_dentry_cache(void)\n {\n+\tkmem_cache_destroy(ext4_fc_range_cachep);\n \tkmem_cache_destroy(ext4_fc_dentry_cachep);\n }\n","prefixes":["RFC","v4","5/7"]}