From patchwork Sat Jun 6 00:09:10 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Verma, Vishal L"
X-Patchwork-Id: 1304449
From: Vishal Verma
Subject: [PATCH v5 2/3] hw/acpi/nvdimm: add a helper to augment SRAT generation
Date: Fri, 5 Jun 2020 18:09:10 -0600
Message-Id: <20200606000911.9896-3-vishal.l.verma@intel.com>
X-Mailer: git-send-email 2.21.3
In-Reply-To: <20200606000911.9896-1-vishal.l.verma@intel.com>
References: <20200606000911.9896-1-vishal.l.verma@intel.com>
Cc: Laurent Vivier, Thomas Huth, Xiao Guangrong, "Michael S. Tsirkin",
 jingqi.liu@intel.com, Dave Hansen, Eduardo Habkost, Vishal Verma,
 Paolo Bonzini, Igor Mammedov, Dan Williams, Richard Henderson

NVDIMMs can belong to their own proximity domains, as described by the
NFIT. In such cases, the SRAT needs Memory Affinity structures for these
NVDIMMs; otherwise Linux doesn't populate node data structures properly
during NUMA initialization. See the following for an example failure
case.

https://lore.kernel.org/linux-nvdimm/20200416225438.15208-1-vishal.l.verma@intel.com/

Introduce a new helper, nvdimm_build_srat(), and call it for both the
i386 and arm versions of 'build_srat()' to augment the SRAT with
memory affinity information for NVDIMMs.

The relevant command line options to exercise this are below. Nodes 0-1
contain CPUs and regular memory, and nodes 2-3 are the NVDIMM address
space.

    -object memory-backend-ram,id=mem0,size=2048M
    -numa node,nodeid=0,memdev=mem0,
    -numa cpu,node-id=0,socket-id=0
    -object memory-backend-ram,id=mem1,size=2048M
    -numa node,nodeid=1,memdev=mem1,
    -numa cpu,node-id=1,socket-id=1
    -numa node,nodeid=2,
    -object memory-backend-file,id=nvmem0,share,mem-path=nvdimm-0,size=16384M,align=1G
    -device nvdimm,memdev=nvmem0,id=nv0,label-size=2M,node=2
    -numa node,nodeid=3,
    -object memory-backend-file,id=nvmem1,share,mem-path=nvdimm-1,size=16384M,align=1G
    -device nvdimm,memdev=nvmem1,id=nv1,label-size=2M,node=3

Cc: Jingqi Liu
Cc: Michael S. Tsirkin
Reviewed-by: Jingqi Liu
Reviewed-by: Igor Mammedov
Signed-off-by: Vishal Verma
---
 hw/acpi/nvdimm.c         | 23 +++++++++++++++++++++++
 hw/arm/virt-acpi-build.c |  4 ++++
 hw/i386/acpi-build.c     |  5 +++++
 include/hw/mem/nvdimm.h  |  1 +
 4 files changed, 33 insertions(+)

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index 9316d12b70..8f7cc16add 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -28,6 +28,7 @@
 
 #include "qemu/osdep.h"
 #include "qemu/uuid.h"
+#include "qapi/error.h"
 #include "hw/acpi/acpi.h"
 #include "hw/acpi/aml-build.h"
 #include "hw/acpi/bios-linker-loader.h"
@@ -1334,6 +1335,28 @@ static void nvdimm_build_ssdt(GArray *table_offsets, GArray *table_data,
     free_aml_allocator();
 }
 
+void nvdimm_build_srat(GArray *table_data)
+{
+    GSList *device_list, *list = nvdimm_get_device_list();
+
+    for (device_list = list; device_list; device_list = device_list->next) {
+        AcpiSratMemoryAffinity *numamem = NULL;
+        DeviceState *dev = device_list->data;
+        Object *obj = OBJECT(dev);
+        uint64_t addr, size;
+        int node;
+
+        node = object_property_get_int(obj, PC_DIMM_NODE_PROP, &error_abort);
+        addr = object_property_get_uint(obj, PC_DIMM_ADDR_PROP, &error_abort);
+        size = object_property_get_uint(obj, PC_DIMM_SIZE_PROP, &error_abort);
+
+        numamem = acpi_data_push(table_data, sizeof *numamem);
+        build_srat_memory(numamem, addr, size, node,
+                          MEM_AFFINITY_ENABLED | MEM_AFFINITY_NON_VOLATILE);
+    }
+    g_slist_free(list);
+}
+
 void nvdimm_build_acpi(GArray *table_offsets, GArray *table_data,
                        BIOSLinker *linker, NVDIMMState *state,
                        uint32_t ram_slots)
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 1b0a584c7b..2cbccd5fe2 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -539,6 +539,10 @@ build_srat(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
         }
     }
 
+    if (ms->nvdimms_state->is_enabled) {
+        nvdimm_build_srat(table_data);
+    }
+
     if (ms->device_memory) {
         numamem = acpi_data_push(table_data, sizeof *numamem);
         build_srat_memory(numamem, ms->device_memory->base,
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 2e15f6848e..d996525e2c 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -2428,6 +2428,11 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
                               MEM_AFFINITY_ENABLED);
         }
     }
+
+    if (machine->nvdimms_state->is_enabled) {
+        nvdimm_build_srat(table_data);
+    }
+
     slots = (table_data->len - numa_start) / sizeof *numamem;
     for (; slots < pcms->numa_nodes + 2; slots++) {
         numamem = acpi_data_push(table_data, sizeof *numamem);
diff --git a/include/hw/mem/nvdimm.h b/include/hw/mem/nvdimm.h
index a3c08955e8..b67a1aedf6 100644
--- a/include/hw/mem/nvdimm.h
+++ b/include/hw/mem/nvdimm.h
@@ -155,6 +155,7 @@ typedef struct NVDIMMState NVDIMMState;
 void nvdimm_init_acpi_state(NVDIMMState *state, MemoryRegion *io,
                             struct AcpiGenericAddress dsm_io,
                             FWCfgState *fw_cfg, Object *owner);
+void nvdimm_build_srat(GArray *table_data);
 void nvdimm_build_acpi(GArray *table_offsets, GArray *table_data,
                        BIOSLinker *linker, NVDIMMState *state,
                        uint32_t ram_slots);
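
For readers unfamiliar with the table layout, here is a minimal standalone
sketch of what one SRAT Memory Affinity entry per NVDIMM looks like on the
wire. It is not QEMU code: every sketch_* name, the singly linked device
list, and the byte-buffer serialization are assumptions made for
illustration. The 40-byte layout and the flag bits (bit 0 enabled, bit 2
non-volatile) follow the ACPI Memory Affinity Structure, to the best of my
reading of the spec.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Size of one SRAT Memory Affinity structure, in bytes. */
enum { SKETCH_SRAT_MEM_LEN = 40 };

/* Flag bits of the Memory Affinity structure. */
#define SKETCH_MEM_ENABLED      (1u << 0)
#define SKETCH_MEM_NON_VOLATILE (1u << 2)

/* Hypothetical stand-in for an NVDIMM device: address, size, NUMA node. */
struct sketch_nvdimm {
    uint64_t addr;
    uint64_t size;
    uint32_t node;
    const struct sketch_nvdimm *next;
};

/* Store the low nbytes of v at p in little-endian order. */
static void sketch_put_le(uint8_t *p, uint64_t v, size_t nbytes)
{
    for (size_t i = 0; i < nbytes; i++) {
        p[i] = (uint8_t)(v >> (8 * i));
    }
}

/* Fill one 40-byte entry; reserved fields stay zero. */
static void sketch_build_mem_affinity(uint8_t *out, uint64_t base,
                                      uint64_t len, uint32_t node,
                                      uint32_t flags)
{
    memset(out, 0, SKETCH_SRAT_MEM_LEN);
    out[0] = 1;                        /* type: memory affinity */
    out[1] = SKETCH_SRAT_MEM_LEN;      /* structure length */
    sketch_put_le(out + 2, node, 4);   /* proximity domain */
    sketch_put_le(out + 8, base, 8);   /* base address, low then high */
    sketch_put_le(out + 16, len, 8);   /* range length, low then high */
    sketch_put_le(out + 28, flags, 4); /* flags */
}

/*
 * Append one entry per device, marked enabled and non-volatile.
 * The cursor d walks the list, so the caller's head pointer is left
 * intact and can still be used to release the list afterwards.
 * Returns the number of bytes written to table.
 */
static size_t sketch_nvdimm_build_srat(uint8_t *table, size_t cap,
                                       const struct sketch_nvdimm *list)
{
    size_t used = 0;

    for (const struct sketch_nvdimm *d = list; d; d = d->next) {
        assert(used + SKETCH_SRAT_MEM_LEN <= cap);
        sketch_build_mem_affinity(table + used, d->addr, d->size, d->node,
                                  SKETCH_MEM_ENABLED |
                                  SKETCH_MEM_NON_VOLATILE);
        used += SKETCH_SRAT_MEM_LEN;
    }
    return used;
}
```

With two devices on nodes 2 and 3, as in the command line above, this would
emit two consecutive 40-byte entries. The guest addresses used to exercise
it are made up; in QEMU they come from the PC_DIMM_ADDR_PROP property.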