From patchwork Wed Mar 6 00:50:31 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Krister Johansen X-Patchwork-Id: 1908496 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ubuntu.com (client-ip=185.125.189.65; helo=lists.ubuntu.com; envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=patchwork.ozlabs.org) Received: from lists.ubuntu.com (lists.ubuntu.com [185.125.189.65]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4TqDQ26wTXz1yX4 for ; Wed, 6 Mar 2024 11:51:06 +1100 (AEDT) Received: from localhost ([127.0.0.1] helo=lists.ubuntu.com) by lists.ubuntu.com with esmtp (Exim 4.86_2) (envelope-from ) id 1rhfUj-0003uq-Se; Wed, 06 Mar 2024 00:50:54 +0000 Received: from buffalo.pear.relay.mailchannels.net ([23.83.216.24]) by lists.ubuntu.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2) (envelope-from ) id 1rhfUZ-0003r5-TL for kernel-team@lists.ubuntu.com; Wed, 06 Mar 2024 00:50:44 +0000 X-Sender-Id: dreamhost|x-authsender|kjlx@templeofstupid.com Received: from relay.mailchannels.net (localhost [127.0.0.1]) by relay.mailchannels.net (Postfix) with ESMTP id 1F86082254 for ; Wed, 6 Mar 2024 00:50:41 +0000 (UTC) Received: from pdx1-sub0-mail-a210.dreamhost.com (unknown [127.0.0.6]) (Authenticated sender: dreamhost) by relay.mailchannels.net (Postfix) with ESMTPA id 4EDFC80900 for ; Wed, 6 Mar 2024 00:50:40 +0000 (UTC) ARC-Seal: i=1; s=arc-2022; d=mailchannels.net; t=1709686240; a=rsa-sha256; cv=none; b=YJ1O1CvKKIhWiw3s1od+sU7k8kohfAXL5/6adKQBXlOopUZTb10LJkE+gSrxYtDQ90oJ9n IBk68VWzuW8Nf4V7GupxqfcZ3UQ47LNxyK3zGVFNslwbcd66ofKxafCbYLcoErpj63bEou db5T8y/MbTRFGRZBx6k5SFHS1JIxZmug4ShymJUI9DNofi5Z1pxxwPS9iNVgO562JbO8wl CHRvY6db6ZWrGwyMf8n3H3m1p4SKwF/AjnelrrpF7JolxH/xlBuWAqcURr2vWuTWX3+/nT L+/cYa3kSBj3QISQnPcTUSlyhRy/ydxLaj9goz2ppGJRbk73Tin5NBH0qyMlBw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=mailchannels.net; s=arc-2022; t=1709686240; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references:dkim-signature; bh=b9PDxuEw0v6kwct9+h44GACXQ4O73HeIyNcEQqZn5AU=; b=5dtrPOjcE7pUov4Pn/ILTzdOIKGCScgWJvWMQLlM1P1RfgGmy5rQv1HLt7SfIgpqcPFojh 5iHn6SC7yAqdoWPRgJRQIK2F/3g9Cn+fQ2Ye6rRKuEbGsCA5EIBzBtIYH7YBtMvyFGiugy BuRJXYnZGupCbuQqEketl3pgkBTydPHI4gIZ3+lqR5U7JTkS75LtTu8kss75UuuwUXwM7R R+aMsXycp8gcSz/CRfsqy3yYKLQqi9Pn5dVFyCV+vsGMRqdesPFUrPQe1hcqqFnDLumk+v LA+8ybmqiFfECLubC8HMj15Wc8v6Ey8zLTAjNQXWPGyx1JrL/LxUcTqz1tmd1A== ARC-Authentication-Results: i=1; rspamd-7f9dd9fb96-nfzbq; auth=pass smtp.auth=dreamhost smtp.mailfrom=kjlx@templeofstupid.com X-Sender-Id: dreamhost|x-authsender|kjlx@templeofstupid.com X-MC-Relay: Good X-MailChannels-SenderId: dreamhost|x-authsender|kjlx@templeofstupid.com X-MailChannels-Auth-Id: dreamhost X-Imminent-Desert: 54fd05913760db26_1709686240703_3364561649 X-MC-Loop-Signature: 1709686240703:2481507800 X-MC-Ingress-Time: 1709686240702 Received: from pdx1-sub0-mail-a210.dreamhost.com (pop.dreamhost.com [64.90.62.162]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384) by 100.104.192.87 (trex/6.9.2); Wed, 06 Mar 2024 00:50:40 +0000 Received: from kmjvbox.templeofstupid.com (c-73-222-159-162.hsd1.ca.comcast.net [73.222.159.162]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: kjlx@templeofstupid.com) by pdx1-sub0-mail-a210.dreamhost.com (Postfix) with ESMTPSA id 4TqDPX0ffCzBC for ; Tue, 5 Mar 2024 16:50:40 -0800 (PST) Received: from johansen (uid 1000) (envelope-from kjlx@templeofstupid.com) id e0082 by kmjvbox.templeofstupid.com (DragonFly Mail Agent v0.12); Tue, 05 Mar 2024 16:50:31 -0800 Date: Tue, 5 Mar 2024 16:50:31 -0800 From: Krister Johansen To: kernel-team@lists.ubuntu.com Subject: [SRU][J][PATCH v2 1/2] KVM: arm64: Work out supported block level at compile time Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Received-SPF: pass client-ip=23.83.216.24; envelope-from=kjlx@templeofstupid.com; helo=buffalo.pear.relay.mailchannels.net X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.20 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: "kernel-team" From: Oliver Upton BugLink: https://bugs.launchpad.net/bugs/2056227 Work out the minimum page table level where KVM supports block mappings at compile time. While at it, rewrite the comment around supported block mappings to directly describe what KVM supports instead of phrasing in terms of what it does not. Signed-off-by: Oliver Upton Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20221007234151.461779-2-oliver.upton@linux.dev (cherry picked from commit 3b5c082bbfa20d9a57924edd655bbe63fe98ab06) Signed-off-by: Krister Johansen --- arch/arm64/include/asm/kvm_pgtable.h | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-) diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h index 027783829584..87e782eec925 100644 --- a/arch/arm64/include/asm/kvm_pgtable.h +++ b/arch/arm64/include/asm/kvm_pgtable.h @@ -13,6 +13,18 @@ #define KVM_PGTABLE_MAX_LEVELS 4U +/* + * The largest supported block sizes for KVM (no 52-bit PA support): + * - 4K (level 1): 1GB + * - 16K (level 2): 32MB + * - 64K (level 2): 512MB + */ +#ifdef CONFIG_ARM64_4K_PAGES +#define KVM_PGTABLE_MIN_BLOCK_LEVEL 1U +#else +#define KVM_PGTABLE_MIN_BLOCK_LEVEL 2U +#endif + static inline u64 kvm_get_parange(u64 mmfr0) { u64 parange = cpuid_feature_extract_unsigned_field(mmfr0, @@ -58,11 +70,7 @@ static inline u64 kvm_granule_size(u32 level) static inline bool kvm_level_supports_block_mapping(u32 level) { - /* - * Reject invalid block mappings and don't bother with 4TB mappings for - * 52-bit PAs. - */ - return !(level == 0 || (PAGE_SIZE != SZ_4K && level == 1)); + return level >= KVM_PGTABLE_MIN_BLOCK_LEVEL; } /** From patchwork Wed Mar 6 00:50:42 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Krister Johansen X-Patchwork-Id: 1908497 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ubuntu.com (client-ip=185.125.189.65; helo=lists.ubuntu.com; envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=patchwork.ozlabs.org) Received: from lists.ubuntu.com (lists.ubuntu.com [185.125.189.65]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4TqDQB5GYTz1yX4 for ; Wed, 6 Mar 2024 11:51:14 +1100 (AEDT) Received: from localhost ([127.0.0.1] helo=lists.ubuntu.com) by lists.ubuntu.com with esmtp (Exim 4.86_2) (envelope-from ) id 1rhfUv-0003zB-61; Wed, 06 Mar 2024 00:51:05 +0000 Received: from honeydew.cherry.relay.mailchannels.net ([23.83.223.83]) by lists.ubuntu.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2) (envelope-from ) id 1rhfUj-0003ui-E6 for kernel-team@lists.ubuntu.com; Wed, 06 Mar 2024 00:50:54 +0000 X-Sender-Id: dreamhost|x-authsender|kjlx@templeofstupid.com Received: from relay.mailchannels.net (localhost [127.0.0.1]) by relay.mailchannels.net (Postfix) with ESMTP id A5B3B84231A for ; Wed, 6 Mar 2024 00:50:51 +0000 (UTC) Received: from pdx1-sub0-mail-a210.dreamhost.com (unknown [127.0.0.6]) (Authenticated sender: dreamhost) by relay.mailchannels.net (Postfix) with ESMTPA id 3F05D8409CB for ; Wed, 6 Mar 2024 00:50:51 +0000 (UTC) ARC-Seal: i=1; s=arc-2022; d=mailchannels.net; t=1709686251; a=rsa-sha256; cv=none; b=vtFNn0jc4IwwMZIh573YXQLBMrOCsGJcttQgtuUHLgDEEaaEPu10lWsrjfEINPlpyHiIKo ZGNEM2AApJQNHRTJmaZUs0hhcvfn80DH0Pw7vf9kzM5oM2HzAH0TWAzZBFKKrRrmfftiGL 0FFLtSZ7adpED5gbEYDCs0r83NZZ9dMbCNDVlr9Tb7m8GbXU7/kcbk5LbvmpeyxFMSIRcR wi9p2rQzlQMIEkQDuGA7MQGTMiWwLmUCrjWHtY4SfsMbOXjTn3mI8Oj9qgQIb5FiSpW27M OzSH8XxQLE1ibzbYrxdRyVWfbiiVlxzW5DhpIRdXRW8pc4Z0sgdJ9v8Rhy8/jQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=mailchannels.net; s=arc-2022; t=1709686251; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references:dkim-signature; bh=2U3i8JqbSbVuoW+XUqHQf8cx5lHM/QuoqsxKwqagnMg=; b=zv4XEYqCn67z0VThZuXxxk2OIAZCKUNIqt7webTDd+ZQBqh0yikgcPcLgZInz9aSBjCSDT lsWujpcK77XV/9gExukL1Xyxhgg13dvFApCbUc0dbE30vZsbB5Wc+VQ6Bunn/XXM9/STaW W4fIfWxP76ud1AKUptcloUbC9gE0Fydqsushhmeuo35Z9hFRgxRz64um3lKRDYuZ+0n3sK YLHLJoSipQ+avt7zQE4waEa1zy6AcDElHVvnGrjWyKJa+pSM0Z27CLvipFEwW9fwCxmPsa ZYO99PGF9g124henq3ihGfkjeMd88kT33CNtToSeSEPl0uQgR3IVrn5LzdymDQ== ARC-Authentication-Results: i=1; rspamd-7f9dd9fb96-fw85s; auth=pass smtp.auth=dreamhost smtp.mailfrom=kjlx@templeofstupid.com X-Sender-Id: dreamhost|x-authsender|kjlx@templeofstupid.com X-MC-Relay: Good X-MailChannels-SenderId: dreamhost|x-authsender|kjlx@templeofstupid.com X-MailChannels-Auth-Id: dreamhost X-Company-Whimsical: 3484c5027aa73249_1709686251501_3821709081 X-MC-Loop-Signature: 1709686251501:1903304566 X-MC-Ingress-Time: 1709686251500 Received: from pdx1-sub0-mail-a210.dreamhost.com (pop.dreamhost.com [64.90.62.162]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384) by 100.117.236.39 (trex/6.9.2); Wed, 06 Mar 2024 00:50:51 +0000 Received: from kmjvbox.templeofstupid.com (c-73-222-159-162.hsd1.ca.comcast.net [73.222.159.162]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: kjlx@templeofstupid.com) by pdx1-sub0-mail-a210.dreamhost.com (Postfix) with ESMTPSA id 4TqDPk74BWz5B for ; Tue, 5 Mar 2024 16:50:50 -0800 (PST) Received: from johansen (uid 1000) (envelope-from kjlx@templeofstupid.com) id e0082 by kmjvbox.templeofstupid.com (DragonFly Mail Agent v0.12); Tue, 05 Mar 2024 16:50:42 -0800 Date: Tue, 5 Mar 2024 16:50:42 -0800 From: Krister Johansen To: kernel-team@lists.ubuntu.com Subject: [SRU][J][PATCH v2 2/2] KVM: arm64: Limit stage2_apply_range() batch size to largest block Message-ID: <5bb6c686e115c79e746686df03aa8c43af97592d.1709685750.git.kjlx@templeofstupid.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Received-SPF: pass client-ip=23.83.223.83; envelope-from=kjlx@templeofstupid.com; helo=honeydew.cherry.relay.mailchannels.net X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.20 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: "kernel-team" From: Oliver Upton BugLink: https://bugs.launchpad.net/bugs/2056227 Presently stage2_apply_range() works on a batch of memory addressed by a stage 2 root table entry for the VM. Depending on the IPA limit of the VM and PAGE_SIZE of the host, this could address a massive range of memory. Some examples: 4 level, 4K paging -> 512 GB batch size 3 level, 64K paging -> 4TB batch size Unsurprisingly, working on such a large range of memory can lead to soft lockups. When running dirty_log_perf_test: ./dirty_log_perf_test -m -2 -s anonymous_thp -b 4G -v 48 watchdog: BUG: soft lockup - CPU#0 stuck for 45s! [dirty_log_perf_:16703] Modules linked in: vfat fat cdc_ether usbnet mii xhci_pci xhci_hcd sha3_generic gq(O) CPU: 0 PID: 16703 Comm: dirty_log_perf_ Tainted: G O 6.0.0-smp-DEV #1 pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : dcache_clean_inval_poc+0x24/0x38 lr : clean_dcache_guest_page+0x28/0x4c sp : ffff800021763990 pmr_save: 000000e0 x29: ffff800021763990 x28: 0000000000000005 x27: 0000000000000de0 x26: 0000000000000001 x25: 00400830b13bc77f x24: ffffad4f91ead9c0 x23: 0000000000000000 x22: ffff8000082ad9c8 x21: 0000fffafa7bc000 x20: ffffad4f9066ce50 x19: 0000000000000003 x18: ffffad4f92402000 x17: 000000000000011b x16: 000000000000011b x15: 0000000000000124 x14: ffff07ff8301d280 x13: 0000000000000000 x12: 00000000ffffffff x11: 0000000000010001 x10: fffffc0000000000 x9 : ffffad4f9069e580 x8 : 000000000000000c x7 : 0000000000000000 x6 : 000000000000003f x5 : ffff07ffa2076980 x4 : 0000000000000001 x3 : 000000000000003f x2 : 0000000000000040 x1 : ffff0830313bd000 x0 : ffff0830313bcc40 Call trace: dcache_clean_inval_poc+0x24/0x38 stage2_unmap_walker+0x138/0x1ec __kvm_pgtable_walk+0x130/0x1d4 __kvm_pgtable_walk+0x170/0x1d4 __kvm_pgtable_walk+0x170/0x1d4 __kvm_pgtable_walk+0x170/0x1d4 kvm_pgtable_stage2_unmap+0xc4/0xf8 kvm_arch_flush_shadow_memslot+0xa4/0x10c kvm_set_memslot+0xb8/0x454 __kvm_set_memory_region+0x194/0x244 kvm_vm_ioctl_set_memory_region+0x58/0x7c kvm_vm_ioctl+0x49c/0x560 __arm64_sys_ioctl+0x9c/0xd4 invoke_syscall+0x4c/0x124 el0_svc_common+0xc8/0x194 do_el0_svc+0x38/0xc0 el0_svc+0x2c/0xa4 el0t_64_sync_handler+0x84/0xf0 el0t_64_sync+0x1a0/0x1a4 Use the largest supported block mapping for the configured page size as the batch granularity. In so doing the walker is guaranteed to visit a leaf only once. Signed-off-by: Oliver Upton Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20221007234151.461779-3-oliver.upton@linux.dev (cherry picked from commit 5994bc9e05c2f8811f233aa434e391cd2783f0f5) Signed-off-by: Krister Johansen --- arch/arm64/include/asm/stage2_pgtable.h | 20 -------------------- arch/arm64/kvm/mmu.c | 9 ++++++++- 2 files changed, 8 insertions(+), 21 deletions(-) diff --git a/arch/arm64/include/asm/stage2_pgtable.h b/arch/arm64/include/asm/stage2_pgtable.h index fe341a6578c3..c8dca8ae359c 100644 --- a/arch/arm64/include/asm/stage2_pgtable.h +++ b/arch/arm64/include/asm/stage2_pgtable.h @@ -10,13 +10,6 @@ #include -/* - * PGDIR_SHIFT determines the size a top-level page table entry can map - * and depends on the number of levels in the page table. Compute the - * PGDIR_SHIFT for a given number of levels. - */ -#define pt_levels_pgdir_shift(lvls) ARM64_HW_PGTABLE_LEVEL_SHIFT(4 - (lvls)) - /* * The hardware supports concatenation of up to 16 tables at stage2 entry * level and we use the feature whenever possible, which means we resolve 4 @@ -30,11 +23,6 @@ #define stage2_pgtable_levels(ipa) ARM64_HW_PGTABLE_LEVELS((ipa) - 4) #define kvm_stage2_levels(kvm) VTCR_EL2_LVLS(kvm->arch.vtcr) -/* stage2_pgdir_shift() is the size mapped by top-level stage2 entry for the VM */ -#define stage2_pgdir_shift(kvm) pt_levels_pgdir_shift(kvm_stage2_levels(kvm)) -#define stage2_pgdir_size(kvm) (1ULL << stage2_pgdir_shift(kvm)) -#define stage2_pgdir_mask(kvm) ~(stage2_pgdir_size(kvm) - 1) - /* * kvm_mmmu_cache_min_pages() is the number of pages required to install * a stage-2 translation. We pre-allocate the entry level page table at @@ -42,12 +30,4 @@ */ #define kvm_mmu_cache_min_pages(kvm) (kvm_stage2_levels(kvm) - 1) -static inline phys_addr_t -stage2_pgd_addr_end(struct kvm *kvm, phys_addr_t addr, phys_addr_t end) -{ - phys_addr_t boundary = (addr + stage2_pgdir_size(kvm)) & stage2_pgdir_mask(kvm); - - return (boundary - 1 < end - 1) ? boundary : end; -} - #endif /* __ARM64_S2_PGTABLE_H_ */ diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 38a8095744a0..db667b4ad103 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -31,6 +31,13 @@ static phys_addr_t hyp_idmap_vector; static unsigned long io_map_base; +static phys_addr_t stage2_range_addr_end(phys_addr_t addr, phys_addr_t end) +{ + phys_addr_t size = kvm_granule_size(KVM_PGTABLE_MIN_BLOCK_LEVEL); + phys_addr_t boundary = ALIGN_DOWN(addr + size, size); + + return (boundary - 1 < end - 1) ? boundary : end; +} /* * Release kvm_mmu_lock periodically if the memory region is large. Otherwise, @@ -52,7 +59,7 @@ static int stage2_apply_range(struct kvm *kvm, phys_addr_t addr, if (!pgt) return -EINVAL; - next = stage2_pgd_addr_end(kvm, addr, end); + next = stage2_range_addr_end(addr, end); ret = fn(pgt, addr, next - addr); if (ret) break;