From patchwork Tue Sep 26 18:57:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 1839912 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=DO81J8Y/; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Rw8H20lNqz1ynX for ; Wed, 27 Sep 2023 05:01:34 +1000 (AEST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qlDHI-0000Ia-UY; Tue, 26 Sep 2023 14:59:24 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qlDHG-0000Dm-TS for qemu-devel@nongnu.org; Tue, 26 Sep 2023 14:59:22 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qlDHE-00039D-TI for qemu-devel@nongnu.org; Tue, 26 Sep 2023 14:59:22 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1695754760; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=kZcRdwyNP50tSHDs9hyBXUz08rwwVIv8MNAI/C060fQ=; b=DO81J8Y/fFpz3S8w1U/STlM9PwCZer9olhgLT1tOwdF5JGQggpgCocHO0iG61IJ8m8mcwT Nce/S6fPWnL0AsUtOZrPiE+nWm9mGRnelAv1e6jShNYvgfoMbETnukFlWfaeGw59wMIC0w QsFbn9Q+4PEdo2rYrfvde7g4as8Z9uQ= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-642-kL5N-bImMhus8JdUVGCtYg-1; Tue, 26 Sep 2023 14:59:16 -0400 X-MC-Unique: kL5N-bImMhus8JdUVGCtYg-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 0B08C805B29; Tue, 26 Sep 2023 18:59:16 +0000 (UTC) Received: from t14s.fritz.box (unknown [10.39.192.33]) by smtp.corp.redhat.com (Postfix) with ESMTP id 1C8382026D4B; Tue, 26 Sep 2023 18:59:09 +0000 (UTC) From: David Hildenbrand To: qemu-devel@nongnu.org Cc: David Hildenbrand , Paolo Bonzini , Igor Mammedov , Xiao Guangrong , "Michael S. Tsirkin" , Peter Xu , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , Eduardo Habkost , Marcel Apfelbaum , Yanan Wang , Michal Privoznik , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Gavin Shan , Alex Williamson , Stefan Hajnoczi , "Maciej S . Szmigiero" , kvm@vger.kernel.org, "Maciej S . Szmigiero" Subject: [PATCH v4 09/18] memory-device, vhost: Support memory devices that dynamically consume memslots Date: Tue, 26 Sep 2023 20:57:29 +0200 Message-ID: <20230926185738.277351-10-david@redhat.com> In-Reply-To: <20230926185738.277351-1-david@redhat.com> References: <20230926185738.277351-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 Received-SPF: pass client-ip=170.10.133.124; envelope-from=david@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org We want to support memory devices that have a dynamically managed memory region container as device memory region. This device memory region maps multiple RAM memory subregions (e.g., aliases to the same RAM memory region), whereby these subregions can be (un)mapped on demand. Each RAM subregion will consume a memslot in KVM and vhost, resulting in such a new device consuming memslots dynamically, and initially usually 0. We already track the number of used vs. required memslots for all memslots. From that, we can derive the number of reserved memslots that must not be used otherwise. The target use case is virtio-mem and the hyper-v balloon, which will dynamically map aliases to RAM memory region into their device memory region container. Properly document what's supported and what's not and extend the vhost memslot check accordingly. Reviewed-by: Maciej S. Szmigiero Signed-off-by: David Hildenbrand --- hw/mem/memory-device.c | 29 +++++++++++++++++++++++++++-- hw/virtio/vhost.c | 18 ++++++++++++++---- include/hw/mem/memory-device.h | 7 +++++++ stubs/memory_device.c | 5 +++++ 4 files changed, 53 insertions(+), 6 deletions(-) diff --git a/hw/mem/memory-device.c b/hw/mem/memory-device.c index d37cfbd65d..1b14ba5661 100644 --- a/hw/mem/memory-device.c +++ b/hw/mem/memory-device.c @@ -62,19 +62,44 @@ static unsigned int memory_device_get_memslots(MemoryDeviceState *md) return 1; } +/* + * Memslots that are reserved by memory devices (required but still reported + * as free from KVM / vhost). + */ +static unsigned int get_reserved_memslots(MachineState *ms) +{ + if (ms->device_memory->used_memslots > + ms->device_memory->required_memslots) { + /* This is unexpected, and we warned already in the memory notifier. */ + return 0; + } + return ms->device_memory->required_memslots - + ms->device_memory->used_memslots; +} + +unsigned int memory_devices_get_reserved_memslots(void) +{ + if (!current_machine->device_memory) { + return 0; + } + return get_reserved_memslots(current_machine); +} + static void memory_device_check_addable(MachineState *ms, MemoryDeviceState *md, MemoryRegion *mr, Error **errp) { const uint64_t used_region_size = ms->device_memory->used_region_size; const uint64_t size = memory_region_size(mr); const unsigned int required_memslots = memory_device_get_memslots(md); + const unsigned int reserved_memslots = get_reserved_memslots(ms); /* we will need memory slots for kvm and vhost */ - if (kvm_enabled() && kvm_get_free_memslots() < required_memslots) { + if (kvm_enabled() && + kvm_get_free_memslots() < required_memslots + reserved_memslots) { error_setg(errp, "hypervisor has not enough free memory slots left"); return; } - if (vhost_get_free_memslots() < required_memslots) { + if (vhost_get_free_memslots() < required_memslots + reserved_memslots) { error_setg(errp, "a used vhost backend has not enough free memory slots left"); return; } diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c index 8e84dca246..f7e1ac12a8 100644 --- a/hw/virtio/vhost.c +++ b/hw/virtio/vhost.c @@ -23,6 +23,7 @@ #include "qemu/log.h" #include "standard-headers/linux/vhost_types.h" #include "hw/virtio/virtio-bus.h" +#include "hw/mem/memory-device.h" #include "migration/blocker.h" #include "migration/qemu-file-types.h" #include "sysemu/dma.h" @@ -1423,7 +1424,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque, VhostBackendType backend_type, uint32_t busyloop_timeout, Error **errp) { - unsigned int used; + unsigned int used, reserved, limit; uint64_t features; int i, r, n_initialized_vqs = 0; @@ -1529,9 +1530,18 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque, } else { used = used_memslots; } - if (used > hdev->vhost_ops->vhost_backend_memslots_limit(hdev)) { - error_setg(errp, "vhost backend memory slots limit is less" - " than current number of present memory slots"); + /* + * We assume that all reserved memslots actually require a real memslot + * in our vhost backend. This might not be true, for example, if the + * memslot would be ROM. If ever relevant, we can optimize for that -- + * but we'll need additional information about the reservations. + */ + reserved = memory_devices_get_reserved_memslots(); + limit = hdev->vhost_ops->vhost_backend_memslots_limit(hdev); + if (used + reserved > limit) { + error_setg(errp, "vhost backend memory slots limit (%d) is less" + " than current number of used (%d) and reserved (%d)" + " memory slots for memory devices.", limit, used, reserved); r = -EINVAL; goto fail_busyloop; } diff --git a/include/hw/mem/memory-device.h b/include/hw/mem/memory-device.h index b51a579fb9..c7b624da6a 100644 --- a/include/hw/mem/memory-device.h +++ b/include/hw/mem/memory-device.h @@ -46,6 +46,12 @@ typedef struct MemoryDeviceState MemoryDeviceState; * single RAM memory region or a memory region container with subregions * that are RAM memory regions or aliases to RAM memory regions. Other * memory regions or subregions are not supported. + * + * If the device memory region returned via @get_memory_region is a + * memory region container, it's supported to dynamically (un)map subregions + * as long as the number of memslots returned by @get_memslots() won't + * be exceeded and as long as all memory regions are of the same kind (e.g., + * all RAM or all ROM). */ struct MemoryDeviceClass { /* private */ @@ -125,6 +131,7 @@ struct MemoryDeviceClass { MemoryDeviceInfoList *qmp_memory_device_list(void); uint64_t get_plugged_memory_size(void); +unsigned int memory_devices_get_reserved_memslots(void); void memory_device_pre_plug(MemoryDeviceState *md, MachineState *ms, const uint64_t *legacy_align, Error **errp); void memory_device_plug(MemoryDeviceState *md, MachineState *ms); diff --git a/stubs/memory_device.c b/stubs/memory_device.c index e75cac62dc..318a5d4187 100644 --- a/stubs/memory_device.c +++ b/stubs/memory_device.c @@ -10,3 +10,8 @@ uint64_t get_plugged_memory_size(void) { return (uint64_t)-1; } + +unsigned int memory_devices_get_reserved_memslots(void) +{ + return 0; +}