{"id":2220436,"url":"http://patchwork.ozlabs.org/api/1.1/patches/2220436/?format=json","web_url":"http://patchwork.ozlabs.org/project/kvm-riscv/patch/2026040717163343908VqFt1HjxIYObFoWo2Xe@zte.com.cn/","project":{"id":70,"url":"http://patchwork.ozlabs.org/api/1.1/projects/70/?format=json","name":"Linux KVM RISC-V","link_name":"kvm-riscv","list_id":"kvm-riscv.lists.infradead.org","list_email":"kvm-riscv@lists.infradead.org","web_url":"","scm_url":"","webscm_url":""},"msgid":"<2026040717163343908VqFt1HjxIYObFoWo2Xe@zte.com.cn>","date":"2026-04-07T09:16:33","name":"[3/3] RISC-V: KVM: Recover gstage huge page mappings during disable-dirty-log","commit_ref":null,"pull_url":null,"state":"new","archived":false,"hash":"f1a0df008d74e88c8c527115ebb1d71c6eab8cb4","submitter":{"id":91800,"url":"http://patchwork.ozlabs.org/api/1.1/people/91800/?format=json","name":"","email":"wang.yechao255@zte.com.cn"},"delegate":null,"mbox":"http://patchwork.ozlabs.org/project/kvm-riscv/patch/2026040717163343908VqFt1HjxIYObFoWo2Xe@zte.com.cn/mbox/","series":[{"id":498947,"url":"http://patchwork.ozlabs.org/api/1.1/series/498947/?format=json","web_url":"http://patchwork.ozlabs.org/project/kvm-riscv/list/?series=498947","date":"2026-04-07T09:10:52","name":"RISC-V: KVM: Huge page recovery during disable-dirty-log","version":1,"mbox":"http://patchwork.ozlabs.org/series/498947/mbox/"}],"comments":"http://patchwork.ozlabs.org/api/patches/2220436/comments/","check":"pending","checks":"http://patchwork.ozlabs.org/api/patches/2220436/checks/","tags":{},"headers":{"Return-Path":"\n <kvm-riscv-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org>","X-Original-To":"incoming@patchwork.ozlabs.org","Delivered-To":"patchwork-incoming@legolas.ozlabs.org","Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (2048-bit key;\n secure) header.d=lists.infradead.org header.i=@lists.infradead.org\n header.a=rsa-sha256 header.s=bombadil.20210309 
header.b=sLgd6uuJ;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=none (no SPF record) smtp.mailfrom=lists.infradead.org\n (client-ip=2607:7c80:54:3::133; helo=bombadil.infradead.org;\n envelope-from=kvm-riscv-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org;\n receiver=patchwork.ozlabs.org)"],"Received":["from bombadil.infradead.org (bombadil.infradead.org\n [IPv6:2607:7c80:54:3::133])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange x25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4fqgYy22w2z1yD3\n\tfor <incoming@patchwork.ozlabs.org>; Tue, 07 Apr 2026 19:16:54 +1000 (AEST)","from localhost ([::1] helo=bombadil.infradead.org)\n\tby bombadil.infradead.org with esmtp (Exim 4.98.2 #2 (Red Hat Linux))\n\tid 1wA2YG-00000006D9u-2Rbx;\n\tTue, 07 Apr 2026 09:16:52 +0000","from mxct.zte.com.cn ([183.62.165.209])\n\tby bombadil.infradead.org with esmtps (Exim 4.98.2 #2 (Red Hat Linux))\n\tid 1wA2YD-00000006D9A-2yAy;\n\tTue, 07 Apr 2026 09:16:51 +0000","from mse-fl2.zte.com.cn (unknown [10.5.228.133])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange x25519 server-signature RSA-PSS (2048 bits) server-digest\n SHA256)\n\t(No client certificate requested)\n\tby mxct.zte.com.cn (FangMail) with ESMTPS id 4fqgYg4SmVz501bD;\n\tTue, 07 Apr 2026 17:16:39 +0800 (CST)","from szxlzmapp04.zte.com.cn ([10.5.231.166])\n\tby mse-fl2.zte.com.cn with SMTP id 6379GVZl050576;\n\tTue, 7 Apr 2026 17:16:31 +0800 (+08)\n\t(envelope-from wang.yechao255@zte.com.cn)","from mapi (szxlzmapp03[null])\n\tby mapi (Zmail) with MAPI id mid12;\n\tTue, 7 Apr 2026 17:16:33 +0800 (CST)"],"DKIM-Signature":"v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed;\n\td=lists.infradead.org; s=bombadil.20210309; 
h=Sender:\n\tContent-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post:\n\tList-Archive:List-Unsubscribe:List-Id:Subject:Cc:To:From:Mime-Version:Date:\n\tReferences:In-Reply-To:Message-ID:Reply-To:Content-ID:Content-Description:\n\tResent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:\n\tList-Owner; bh=VSJz1ynpVBgtZ2ToR64xouKKhbxwiJ2q1cJG4hFW9Wo=; b=sLgd6uuJkQuApu\n\tfjkMgVR0r7DGP3BZe/98MrgqXLQ05B4spGpXatVei0zBUzObnh/qZERURyatCwbxFEuAdDbXcJH07\n\tTZq0m9rbMhXzMtiBT1Hap4wKJ+V0noSJPxr8mHIz69d+jiLvbl6orx9yhUm10MEDmu8zSOVZMqy7r\n\t/LU0NB5YVu+8/TKUlJUpAkIM3unubpGQRbE4J7NNUIBedoT8StJPflhVGJJYRJkhE1KJK1RQ/SQLe\n\t5c3e8rZzacLxaaXpOSpQr1Iy+AxxvQB7f29v5yAVcjeiZ5gylJVTQxr355pde6jjGgUmjOzf6UAzg\n\tJLRzERGXLYZh7WGJSQpA==;","X-Zmail-TransId":"2b0569d4cb716e6-99116","X-Mailer":"Zmail v1.0","Message-ID":"<2026040717163343908VqFt1HjxIYObFoWo2Xe@zte.com.cn>","In-Reply-To":"<20260407171052241tmZDFGusMP_wlEsBVVtJo@zte.com.cn>","References":"20260407171052241tmZDFGusMP_wlEsBVVtJo@zte.com.cn","Date":"Tue, 7 Apr 2026 17:16:33 +0800 (CST)","Mime-Version":"1.0","From":"<wang.yechao255@zte.com.cn>","To":"<anup@brainfault.org>, <atish.patra@linux.dev>, <pjw@kernel.org>,\n        <palmer@dabbelt.com>, <aou@eecs.berkeley.edu>, <alex@ghiti.fr>","Cc":"<kvm@vger.kernel.org>, <kvm-riscv@lists.infradead.org>,\n        <linux-riscv@lists.infradead.org>, <linux-kernel@vger.kernel.org>","Subject":"=?utf-8?q?=5BPATCH_3/3=5D_RISC-V=3A_KVM=3A_Recover_gstage_huge_page?=\n\t=?utf-8?q?_mappings_during_disable-dirty-log?=","X-MAIL":"mse-fl2.zte.com.cn 6379GVZl050576","X-TLS":"YES","X-SPF-DOMAIN":"zte.com.cn","X-ENVELOPE-SENDER":"wang.yechao255@zte.com.cn","X-SPF":"None","X-SOURCE-IP":"10.5.228.133 unknown Tue, 07 Apr 2026 17:16:39 +0800","X-Fangmail-Anti-Spam-Filtered":"true","X-Fangmail-MID-QID":"69D4CB77.000/4fqgYg4SmVz501bD","X-Bad-Reply":"References and In-Reply-To but no 'Re:' in Subject.","X-CRM114-Version":"20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 
","X-CRM114-CacheID":"sfid-20260407_021650_086545_6123B129 ","X-CRM114-Status":"GOOD (  14.24  )","X-Spam-Score":"-4.2 (----)","X-Spam-Report":"Spam detection software,\n running on the system \"bombadil.infradead.org\",\n has NOT identified this incoming email as spam.  The original\n message has been attached to this so you can view it or label\n similar future email.  If you have any questions, see\n the administrator of that system for details.\n Content preview:  From: Wang Yechao <wang.yechao255@zte.com.cn> When dirty\n logging\n    is enabled,\n the gstage mappings are split into 4K pages to track dirty pages.\n    If the migration fails or is canceled,\n in order to keep the VM's performance\n    consistent with that befor [...]\n Content analysis details:   (-4.2 points, 5.0 required)\n  pts rule name              description\n ---- ----------------------\n --------------------------------------------------\n -2.3 RCVD_IN_DNSWL_MED      RBL: Sender listed at https://www.dnswl.org/,\n                             medium trust\n                             [183.62.165.209 listed in list.dnswl.org]\n -0.0 SPF_PASS               SPF: sender matches SPF record\n  0.0 SPF_HELO_NONE          SPF: HELO does not publish an SPF Record\n -1.9 BAYES_00               BODY: Bayes spam probability is 0 to 1%\n                             [score: 0.0000]\n  0.0 RCVD_IN_VALIDITY_CERTIFIED_BLOCKED RBL: ADMINISTRATOR NOTICE: The\n                             query to Validity was blocked.  See\n                             https://knowledge.validity.com/hc/en-us/articles/20961730681243\n                              for more information.\n                         [183.62.165.209 listed in\n sa-trusted.bondedsender.org]\n  0.0 RCVD_IN_VALIDITY_SAFE_BLOCKED RBL: ADMINISTRATOR NOTICE: The query to\n                              Validity was blocked.  
See\n                             https://knowledge.validity.com/hc/en-us/articles/20961730681243\n                              for more information.\n                             [183.62.165.209 listed in sa-accredit.habeas.com]\n  0.0 UNPARSEABLE_RELAY      Informational: message has unparseable relay\n lines\n  0.0 RCVD_IN_VALIDITY_RPBL_BLOCKED RBL: ADMINISTRATOR NOTICE: The query to\n                              Validity was blocked.  See\n                             https://knowledge.validity.com/hc/en-us/articles/20961730681243\n                              for more information.\n                            [183.62.165.209 listed in\n bl.score.senderscore.com]","X-BeenThere":"kvm-riscv@lists.infradead.org","X-Mailman-Version":"2.1.34","Precedence":"list","List-Id":"<kvm-riscv.lists.infradead.org>","List-Unsubscribe":"<http://lists.infradead.org/mailman/options/kvm-riscv>,\n <mailto:kvm-riscv-request@lists.infradead.org?subject=unsubscribe>","List-Archive":"<http://lists.infradead.org/pipermail/kvm-riscv/>","List-Post":"<mailto:kvm-riscv@lists.infradead.org>","List-Help":"<mailto:kvm-riscv-request@lists.infradead.org?subject=help>","List-Subscribe":"<http://lists.infradead.org/mailman/listinfo/kvm-riscv>,\n <mailto:kvm-riscv-request@lists.infradead.org?subject=subscribe>","Content-Type":"text/plain; charset=\"us-ascii\"","Content-Transfer-Encoding":"7bit","Sender":"\"kvm-riscv\" <kvm-riscv-bounces@lists.infradead.org>","Errors-To":"kvm-riscv-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org"},"content":"From: Wang Yechao <wang.yechao255@zte.com.cn>\n\nWhen dirty logging is enabled, the gstage mappings are split into\n4K pages to track dirty pages. 
If the migration fails or is canceled,\nin order to keep the VM's performance consistent with that before\ndirty logging was enabled, the gstage huge page mappings are recovered\nwhen dirty logging is disabled.\n\nWith this patch, dirty_log_perf_test shows a decrease in the number of\nvCPU faults:\n\n$ perf stat -e kvm:kvm_page_fault \\\n/dirty_log_perf_test -s anonymous_hugetlb_1gb -v 1 -e -b 1G\n\nBefore: 524,819      kvm:kvm_page_fault\nAfter : 263,211      kvm:kvm_page_fault\n\nSigned-off-by: Wang Yechao <wang.yechao255@zte.com.cn>\n---\n arch/riscv/include/asm/kvm_gstage.h |  4 +++\n arch/riscv/kvm/gstage.c             | 42 ++++++++++++++++++++++++\n arch/riscv/kvm/mmu.c                | 51 +++++++++++++++++++++++++++++\n 3 files changed, 97 insertions(+)","diff":"diff --git a/arch/riscv/include/asm/kvm_gstage.h b/arch/riscv/include/asm/kvm_gstage.h\nindex 373748c6745e..6e5aaa487adf 100644\n--- a/arch/riscv/include/asm/kvm_gstage.h\n+++ b/arch/riscv/include/asm/kvm_gstage.h\n@@ -57,6 +57,10 @@ int kvm_riscv_gstage_split_huge(struct kvm_gstage *gstage,\n                                 struct kvm_mmu_memory_cache *pcache,\n                                 gpa_t addr, u32 target_level, bool flush);\n\n+void kvm_riscv_gstage_recover_huge(struct kvm_gstage *gstage, gpa_t addr,\n+\t\t\t\t   unsigned long target_page_size,\n+\t\t\t\t   unsigned long *page_size);\n+\n enum kvm_riscv_gstage_op {\n \tGSTAGE_OP_NOP = 0,\t/* Nothing */\n \tGSTAGE_OP_CLEAR,\t/* Clear/Unmap */\ndiff --git a/arch/riscv/kvm/gstage.c b/arch/riscv/kvm/gstage.c\nindex ffec3e5ddcaf..54881a38b363 100644\n--- a/arch/riscv/kvm/gstage.c\n+++ b/arch/riscv/kvm/gstage.c\n@@ -335,6 +335,48 @@ int kvm_riscv_gstage_split_huge(struct kvm_gstage *gstage,\n \treturn 0;\n }\n\n+void kvm_riscv_gstage_recover_huge(struct kvm_gstage *gstage, gpa_t addr,\n+\t\t\t\t   unsigned long target_page_size,\n+\t\t\t\t   unsigned long *page_size)\n+{\n+\tu32 current_level = kvm_riscv_gstage_pgd_levels - 1;\n+\tpte_t 
*next_ptep = (pte_t *)gstage->pgd;\n+\tu32 target_level, out_level;\n+\tpte_t *ptep, *child_ptep;\n+\tint ret;\n+\n+\tout_level = 0;\n+\tret = gstage_page_size_to_level(target_page_size, &target_level);\n+\tif (ret)\n+\t\tgoto out;\n+\n+\twhile (current_level >= target_level) {\n+\t\tptep = (pte_t *)&next_ptep[gstage_pte_index(addr, current_level)];\n+\n+\t\tout_level = current_level;\n+\t\tif (!pte_val(ptep_get(ptep)))\n+\t\t\tgoto out;\n+\n+\t\t/* The mapping is already a huge page mapping. */\n+\t\tif (gstage_pte_leaf(ptep))\n+\t\t\tgoto out;\n+\n+\t\tnext_ptep = (pte_t *)gstage_pte_page_vaddr(ptep_get(ptep));\n+\t\tcurrent_level--;\n+\t}\n+\n+\t/* Replace the huge PTE with the first PTE entry of the child page table.*/\n+\tchild_ptep = (pte_t *)&next_ptep[0];\n+\tset_pte(ptep, __pte(pte_val(ptep_get(child_ptep))));\n+\n+\tgstage_tlb_flush(gstage, target_level, addr);\n+\n+\tput_page(virt_to_page(next_ptep));\n+\n+out:\n+\tgstage_level_to_page_size(out_level, page_size);\n+}\n+\n void kvm_riscv_gstage_op_pte(struct kvm_gstage *gstage, gpa_t addr,\n \t\t\t     pte_t *ptep, u32 ptep_level, enum kvm_riscv_gstage_op op)\n {\ndiff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c\nindex d2116c09c589..0b7077946e90 100644\n--- a/arch/riscv/kvm/mmu.c\n+++ b/arch/riscv/kvm/mmu.c\n@@ -16,6 +16,8 @@\n #include <asm/kvm_mmu.h>\n #include <asm/kvm_nacl.h>\n\n+static void kvm_mmu_recover_huge_pages(struct kvm *kvm, int slot);\n+\n static void mmu_wp_memory_region(struct kvm *kvm, int slot)\n {\n \tstruct kvm_memslots *slots = kvm_memslots(kvm);\n@@ -175,6 +177,17 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,\n \t\tif (kvm_dirty_log_manual_protect_and_init_set(kvm))\n \t\t\treturn;\n \t\tmmu_wp_memory_region(kvm, new->id);\n+\t} else {\n+\t\t/*\n+\t\t * Recover huge page mappings in the slot now that dirty logging\n+\t\t * is disabled, i.e. 
now that KVM does not have to track guest\n+\t\t * writes at 4KiB granularity.\n+\t\t *\n+\t\t * Dirty logging might be disabled by userspace if an ongoing VM\n+\t\t * live migration is cancelled and the VM must continue running\n+\t\t * on the source.\n+\t\t */\n+\t\tkvm_mmu_recover_huge_pages(kvm, new->id);\n \t}\n }\n\n@@ -620,3 +633,41 @@ void kvm_riscv_mmu_update_hgatp(struct kvm_vcpu *vcpu)\n \tif (!kvm_riscv_gstage_vmid_bits())\n \t\tkvm_riscv_local_hfence_gvma_all();\n }\n+\n+static void kvm_mmu_recover_huge_pages(struct kvm *kvm, int slot)\n+{\n+\tstruct kvm_memslots *slots = kvm_memslots(kvm);\n+\tstruct kvm_memory_slot *memslot = id_to_memslot(slots, slot);\n+\tunsigned long hva = gfn_to_hva(kvm, memslot->base_gfn);\n+\tphys_addr_t start = memslot->base_gfn << PAGE_SHIFT;\n+\tphys_addr_t end = (memslot->base_gfn + memslot->npages) << PAGE_SHIFT;\n+\tphys_addr_t addr = start;\n+\tstruct kvm_gstage gstage;\n+\tunsigned long page_size;\n+\n+\tif (!fault_supports_gstage_huge_mapping(memslot, hva))\n+\t\treturn;\n+\n+\tgstage.kvm = kvm;\n+\tgstage.flags = 0;\n+\tgstage.vmid = READ_ONCE(kvm->arch.vmid.vmid);\n+\tgstage.pgd = kvm->arch.pgd;\n+\n+\tspin_lock(&kvm->mmu_lock);\n+\n+\twhile (addr < end) {\n+\t\tcond_resched_lock(&kvm->mmu_lock);\n+\n+\t\tif (get_hva_mapping_size(kvm, hva) < PMD_SIZE) {\n+\t\t\taddr += PMD_SIZE;\n+\t\t\thva  += PMD_SIZE;\n+\t\t\tcontinue;\n+\t\t}\n+\n+\t\tkvm_riscv_gstage_recover_huge(&gstage, addr, PMD_SIZE, &page_size);\n+\n+\t\taddr += page_size;\n+\t\thva  += page_size;\n+\t}\n+\tspin_unlock(&kvm->mmu_lock);\n+}\n","prefixes":["3/3"]}