From patchwork Tue Aug 15 08:58:31 2017
X-Patchwork-Submitter: Dou Liyang
X-Patchwork-Id: 801481
From: Dou Liyang <douly.fnst@cn.fujitsu.com>
Date: Tue, 15 Aug 2017 16:58:31 +0800
Message-ID: <1502787511-20894-1-git-send-email-douly.fnst@cn.fujitsu.com>
Subject: [Qemu-devel] [PATCH] hw/acpi: Select a node with memory for
 mapping memory hole to
Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>, ehabkost@redhat.com,
 david@redhat.com, mst@redhat.com, armbru@redhat.com, dgilbert@redhat.com,
 imammedo@redhat.com

Currently, QEMU builds a wrong NUMA topology when the first node on the
machine has no memory.

With this example command line:
  ... \
  -m 1024M,slots=4,maxmem=32G \
  -numa node,nodeid=0 \
  -numa node,mem=1024M,nodeid=1 \
  -numa node,nodeid=2 \
  -numa node,nodeid=3 \

the guest reports "No NUMA configuration found" and the NUMA topology is
wrong.

This is because, when QEMU builds the ACPI SRAT, it treats node0 as the
default node for the memory hole (640K-1M). That requires node0 to have
some memory (>1M), but it may in fact have none.

Fix this by replacing node0 with the first node that has memory. Add a
new function, build_srat_node_entry(), that builds the SRAT memory
entries for one node and is called for each node. Also do some cleanup.

Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
---
 hw/i386/acpi-build.c | 76 +++++++++++++++++++++++++++++++++-------------------
 1 file changed, 48 insertions(+), 28 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 98dd424..e5f57d2 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -2318,15 +2318,43 @@ build_tpm2(GArray *table_data, BIOSLinker *linker)
                  (void *)tpm2_ptr, "TPM2", sizeof(*tpm2_ptr), 4, NULL, NULL);
 }
 
+static uint64_t
+build_srat_node_entry(GArray *table_data, PCMachineState *pcms,
+                      int i, uint64_t mem_base, uint64_t mem_len)
+{
+    AcpiSratMemoryAffinity *numamem;
+    uint64_t next_base;
+
+    next_base = mem_base + mem_len;
+
+    /* Cut out the ACPI_PCI hole */
+    if (mem_base <= pcms->below_4g_mem_size &&
+        next_base > pcms->below_4g_mem_size) {
+        mem_len -= next_base - pcms->below_4g_mem_size;
+        if (mem_len > 0) {
+            numamem = acpi_data_push(table_data, sizeof *numamem);
+            build_srat_memory(numamem, mem_base, mem_len, i,
+                              MEM_AFFINITY_ENABLED);
+        }
+        mem_base = 1ULL << 32;
+        mem_len = next_base - pcms->below_4g_mem_size;
+        next_base += (1ULL << 32) - pcms->below_4g_mem_size;
+    }
+    numamem = acpi_data_push(table_data, sizeof *numamem);
+    build_srat_memory(numamem, mem_base, mem_len, i,
+                      MEM_AFFINITY_ENABLED);
+    return next_base;
+}
+
 static void
 build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
 {
     AcpiSystemResourceAffinityTable *srat;
     AcpiSratMemoryAffinity *numamem;
 
-    int i;
+    int i, node;
     int srat_start, numa_start, slots;
-    uint64_t mem_len, mem_base, next_base;
+    uint64_t mem_len, mem_base;
     MachineClass *mc = MACHINE_GET_CLASS(machine);
     const CPUArchIdList *apic_ids = mc->possible_cpu_arch_ids(machine);
     PCMachineState *pcms = PC_MACHINE(machine);
@@ -2370,36 +2398,28 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
     /* the memory map is a bit tricky, it contains at least one hole
      * from 640k-1M and possibly another one from 3.5G-4G.
      */
-    next_base = 0;
+
     numa_start = table_data->len;
 
+    /* get the first node which has memory and map the hole from 640K-1M */
+    for (node = 0;
+         node < pcms->numa_nodes && pcms->node_mem[node] == 0;
+         node++);
     numamem = acpi_data_push(table_data, sizeof *numamem);
-    build_srat_memory(numamem, 0, 640 * 1024, 0, MEM_AFFINITY_ENABLED);
-    next_base = 1024 * 1024;
-    for (i = 1; i < pcms->numa_nodes + 1; ++i) {
-        mem_base = next_base;
-        mem_len = pcms->node_mem[i - 1];
-        if (i == 1) {
-            mem_len -= 1024 * 1024;
-        }
-        next_base = mem_base + mem_len;
-
-        /* Cut out the ACPI_PCI hole */
-        if (mem_base <= pcms->below_4g_mem_size &&
-            next_base > pcms->below_4g_mem_size) {
-            mem_len -= next_base - pcms->below_4g_mem_size;
-            if (mem_len > 0) {
-                numamem = acpi_data_push(table_data, sizeof *numamem);
-                build_srat_memory(numamem, mem_base, mem_len, i - 1,
-                                  MEM_AFFINITY_ENABLED);
-            }
-            mem_base = 1ULL << 32;
-            mem_len = next_base - pcms->below_4g_mem_size;
-            next_base += (1ULL << 32) - pcms->below_4g_mem_size;
+    build_srat_memory(numamem, 0, 640 * 1024, node, MEM_AFFINITY_ENABLED);
+
+    /* map the rest of memory from 1M */
+    mem_base = 1024 * 1024;
+    mem_len = pcms->node_mem[node] - mem_base;
+    mem_base = build_srat_node_entry(table_data, pcms, node,
+                                     mem_base, mem_len);
+
+    for (i = 0; i < pcms->numa_nodes; i++) {
+        if (i == node) {
+            continue;
         }
-        numamem = acpi_data_push(table_data, sizeof *numamem);
-        build_srat_memory(numamem, mem_base, mem_len, i - 1,
-                          MEM_AFFINITY_ENABLED);
+        mem_base = build_srat_node_entry(table_data, pcms, i,
+                                         mem_base, pcms->node_mem[i]);
     }
     slots = (table_data->len - numa_start) / sizeof *numamem;
     for (; slots < pcms->numa_nodes + 2; slots++) {
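
To make the layout described in the commit message easier to follow, here
is a minimal standalone sketch of the same idea outside of QEMU: pick the
first node that has memory, give it the 0-640K range, then lay out the
rest of the memory from 1M and split any range that crosses the hole
below 4G. SratRange, first_node_with_memory(), add_node_ranges() and
build_ranges() are hypothetical names used only for illustration and are
not part of QEMU. Unlike the patch, the sketch skips zero-length ranges
and returns no ranges at all when no node has more than 1M of memory;
both choices are assumptions of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KIB (1024ULL)
#define MIB (1024ULL * 1024ULL)

/* Hypothetical stand-in for one SRAT memory affinity entry. */
typedef struct {
    uint64_t base;
    uint64_t len;
    int node;
} SratRange;

/* Return the index of the first node with memory, or -1 if none has any. */
static int first_node_with_memory(const uint64_t *node_mem, int nb_nodes)
{
    for (int i = 0; i < nb_nodes; i++) {
        if (node_mem[i] != 0) {
            return i;
        }
    }
    return -1;
}

/*
 * Append the ranges for one node starting at mem_base. A range that
 * crosses below_4g is split, and the upper part is moved to 4G.
 * Returns the base address for the next node.
 */
static uint64_t add_node_ranges(SratRange *out, size_t *count,
                                uint64_t below_4g, int node,
                                uint64_t mem_base, uint64_t mem_len)
{
    uint64_t next_base = mem_base + mem_len;

    if (mem_base <= below_4g && next_base > below_4g) {
        mem_len -= next_base - below_4g;
        if (mem_len > 0) {
            out[(*count)++] = (SratRange){ mem_base, mem_len, node };
        }
        mem_base = 1ULL << 32;
        mem_len = next_base - below_4g;
        next_base += (1ULL << 32) - below_4g;
    }
    /* Sketch-only choice: do not emit empty ranges. */
    if (mem_len > 0) {
        out[(*count)++] = (SratRange){ mem_base, mem_len, node };
    }
    return next_base;
}

/*
 * Fill out[] (caller provides room for 1 + 2 * nb_nodes entries) and
 * return the number of ranges written.
 */
size_t build_ranges(SratRange *out, const uint64_t *node_mem,
                    int nb_nodes, uint64_t below_4g)
{
    size_t count = 0;
    int first;
    uint64_t base;

    assert(out != NULL && node_mem != NULL);

    first = first_node_with_memory(node_mem, nb_nodes);
    if (first < 0 || node_mem[first] <= MIB) {
        return 0; /* Sketch-only choice: nothing sensible to lay out. */
    }

    /* 0-640K goes to the first node with memory; 640K-1M is the hole. */
    out[count++] = (SratRange){ 0, 640 * KIB, first };
    base = add_node_ranges(out, &count, below_4g, first,
                           MIB, node_mem[first] - MIB);

    for (int i = 0; i < nb_nodes; i++) {
        if (i != first) {
            base = add_node_ranges(out, &count, below_4g, i,
                                   base, node_mem[i]);
        }
    }
    return count;
}
```

With the node sizes from the example command line ({0, 1024M, 0, 0}, all
of it below 4G), build_ranges() assigns both the 0-640K range and the
range starting at 1M to node 1, which is the behaviour the patch aims for.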