From patchwork Wed Jul 11 10:31:54 2012
X-Patchwork-Submitter: Vasilis Liaskovitis
X-Patchwork-Id: 170445
From: Vasilis Liaskovitis
To: qemu-devel@nongnu.org, kvm@vger.kernel.org, seabios@seabios.org
Date: Wed, 11 Jul 2012 12:31:54 +0200
Message-Id: <1342002726-18258-10-git-send-email-vasilis.liaskovitis@profitbricks.com>
In-Reply-To: <1342002726-18258-1-git-send-email-vasilis.liaskovitis@profitbricks.com>
References: <1342002726-18258-1-git-send-email-vasilis.liaskovitis@profitbricks.com>
Cc: gleb@redhat.com, Vasilis Liaskovitis, kevin@koconnor.net, avi@redhat.com,
    anthony@codemonkey.ws, imammedo@redhat.com
Subject: [Qemu-devel] [RFC PATCH v2 09/21] pc: Add dimm paravirt SRAT info

The numa_fw_cfg paravirt interface is extended to include SRAT information for
all hotpluggable dimms. There are 3 words for each hotpluggable memory slot,
denoting start address, size and node proximity. The new info is appended after
the existing numa info, so that the existing fw_cfg layout does not break. This
information is used by SeaBIOS to build hotplug memory device objects at
runtime. nb_numa_nodes is set to 1 by default (not 0), so that we always pass
SRAT info to SeaBIOS.

v1->v2:
Dimm SRAT info (#dimms) is appended at the end of the existing numa fw_cfg, in
order not to break the existing layout.
Documentation of the new fw_cfg layout is included in docs/specs/fwcfg.txt.

Signed-off-by: Vasilis Liaskovitis
---
 docs/specs/fwcfg.txt |   28 ++++++++++++++++++++++++++
 hw/pc.c              |   53 ++++++++++++++++++++++++++++++++++++++++++++++++-
 vl.c                 |    2 +-
 3 files changed, 80 insertions(+), 3 deletions(-)
 create mode 100644 docs/specs/fwcfg.txt

diff --git a/docs/specs/fwcfg.txt b/docs/specs/fwcfg.txt
new file mode 100644
index 0000000..e6fcd8f
--- /dev/null
+++ b/docs/specs/fwcfg.txt
@@ -0,0 +1,28 @@
+QEMU<->BIOS Paravirt Documentation
+--------------------------------------
+
+This document describes paravirt data structures passed from QEMU to BIOS.
+
+fw_cfg SRAT paravirt info
+-------------------------
+The SRAT info passed from QEMU to BIOS has the following layout:
+
+-----------------------------------------------------------------------------------------------
+#nodes | cpu0_pxm | cpu1_pxm | ... | cpulast_pxm | node0_mem | node1_mem | ... | nodelast_mem
+
+-----------------------------------------------------------------------------------------------
+#dimms | dimm0_start | dimm0_sz | dimm0_pxm | ... | dimmlast_start | dimmlast_sz | dimmlast_pxm
+
+Entry 0 contains the number of numa nodes (nb_numa_nodes).
+
+Entries 1..max_cpus: The next max_cpus entries describe node proximity for each
+one of the vCPUs in the system.
+
+Entries max_cpus+1..max_cpus+nb_numa_nodes: The next nb_numa_nodes entries
+describe the memory size for each one of the NUMA nodes in the system.
+
+Entry max_cpus+nb_numa_nodes+1 contains the number of memory dimms (nb_hp_dimms).
+
+The last 3 * nb_hp_dimms entries are organized in triplets: each triplet contains
+the physical address offset, size (in bytes), and node proximity for the
+respective dimm.
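The blob described above is just an array of little-endian 64-bit words, so a
consumer (the BIOS, for instance) can walk it with plain index arithmetic. The
reader below is only an illustrative sketch added for this write-up; the helper
names get_le64() and parse_srat_blob() are made up and are not part of this
patch, of QEMU, or of SeaBIOS:

#include <stdint.h>
#include <stdio.h>

/* Hypothetical reader for the fw_cfg SRAT blob described above.  "blob"
 * points at the array of little-endian 64-bit words; "max_cpus" is known
 * to the consumer independently (it is not stored in the blob). */
uint64_t get_le64(const uint64_t *blob, uint64_t idx)
{
    const uint8_t *p = (const uint8_t *)&blob[idx];
    uint64_t v = 0;
    int i;

    /* assemble the value byte by byte, least significant byte first */
    for (i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

void parse_srat_blob(const uint64_t *blob, uint64_t max_cpus)
{
    uint64_t nb_nodes = get_le64(blob, 0);
    /* entries 1 .. max_cpus:                   per-vCPU node proximity */
    /* entries max_cpus+1 .. max_cpus+nb_nodes: per-node memory size    */
    uint64_t nb_dimms = get_le64(blob, 1 + max_cpus + nb_nodes);
    uint64_t base = 2 + max_cpus + nb_nodes;   /* first dimm triplet */
    uint64_t d;

    for (d = 0; d < nb_dimms; d++) {
        uint64_t start = get_le64(blob, base + 3 * d);
        uint64_t size  = get_le64(blob, base + 3 * d + 1);
        uint64_t pxm   = get_le64(blob, base + 3 * d + 2);
        printf("dimm %llu: start=0x%llx size=0x%llx node=%llu\n",
               (unsigned long long)d, (unsigned long long)start,
               (unsigned long long)size, (unsigned long long)pxm);
    }
}

Note that max_cpus is not part of the blob; the consumer is expected to know
its own CPU count, which is why the per-vCPU entries can be located by index
alone.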
diff --git a/hw/pc.c b/hw/pc.c
index ef9901a..cf651d0 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -598,12 +598,15 @@ int e820_add_entry(uint64_t address, uint64_t length, uint32_t type)
     return index;
 }
 
+static void setup_hp_dimms(uint64_t *fw_cfg_slots);
+
 static void *bochs_bios_init(void)
 {
     void *fw_cfg;
     uint8_t *smbios_table;
     size_t smbios_len;
     uint64_t *numa_fw_cfg;
+    uint64_t *hp_dimms_fw_cfg;
     int i, j;
 
     register_ioport_write(0x400, 1, 2, bochs_bios_write, NULL);
@@ -638,8 +641,10 @@ static void *bochs_bios_init(void)
     /* allocate memory for the NUMA channel: one (64bit) word for the number
      * of nodes, one word for each VCPU->node and one word for each node to
      * hold the amount of memory.
+     * Finally one word for the number of hotplug memory slots and three words
+     * for each hotplug memory slot (start address, size and node proximity).
      */
-    numa_fw_cfg = g_malloc0((1 + max_cpus + nb_numa_nodes) * 8);
+    numa_fw_cfg = g_malloc0((2 + max_cpus + nb_numa_nodes + 3 * nb_hp_dimms) * 8);
     numa_fw_cfg[0] = cpu_to_le64(nb_numa_nodes);
     for (i = 0; i < max_cpus; i++) {
         for (j = 0; j < nb_numa_nodes; j++) {
@@ -652,8 +657,15 @@ static void *bochs_bios_init(void)
     for (i = 0; i < nb_numa_nodes; i++) {
         numa_fw_cfg[max_cpus + 1 + i] = cpu_to_le64(node_mem[i]);
     }
+
+    numa_fw_cfg[1 + max_cpus + nb_numa_nodes] = cpu_to_le64(nb_hp_dimms);
+
+    hp_dimms_fw_cfg = numa_fw_cfg + 2 + max_cpus + nb_numa_nodes;
+    if (nb_hp_dimms)
+        setup_hp_dimms(hp_dimms_fw_cfg);
+
     fw_cfg_add_bytes(fw_cfg, FW_CFG_NUMA, (uint8_t *)numa_fw_cfg,
-                     (1 + max_cpus + nb_numa_nodes) * 8);
+                     (2 + max_cpus + nb_numa_nodes + 3 * nb_hp_dimms) * 8);
 
     return fw_cfg;
 }
@@ -1223,3 +1235,40 @@ target_phys_addr_t pc_set_hp_memory_offset(uint64_t size)
 
     return ret;
 }
+
+static void setup_hp_dimms(uint64_t *fw_cfg_slots)
+{
+    int i = 0;
+    Error *err = NULL;
+    DeviceState *dev;
+    DimmState *slot;
+    const char *type;
+    BusChild *kid;
+    BusState *bus = sysbus_get_default();
+
+    QTAILQ_FOREACH(kid, &bus->children, sibling) {
+        dev = kid->child;
+        type = object_property_get_str(OBJECT(dev), "type", &err);
+        if (err) {
+            error_free(err);
+            fprintf(stderr, "error getting device type\n");
+            exit(1);
+        }
+
+        if (!strcmp(type, "dimm")) {
+            if (!dev->id) {
+                fprintf(stderr, "error getting dimm device id\n");
+                exit(1);
+            }
+            slot = DIMM(dev);
+            /* determine starting physical address for this memory slot */
+            assert(slot->start);
+            fw_cfg_slots[3 * slot->idx] = cpu_to_le64(slot->start);
+            fw_cfg_slots[3 * slot->idx + 1] = cpu_to_le64(slot->size);
+            fw_cfg_slots[3 * slot->idx + 2] = cpu_to_le64(slot->node);
+            i++;
+        }
+    }
+    assert(i == nb_hp_dimms);
+}
+
diff --git a/vl.c b/vl.c
index 0ff8818..37c9798 100644
--- a/vl.c
+++ b/vl.c
@@ -2335,7 +2335,7 @@ int main(int argc, char **argv, char **envp)
         node_cpumask[i] = 0;
     }
 
-    nb_numa_nodes = 0;
+    nb_numa_nodes = 1;
     nb_nics = 0;
 
     autostart= 1;
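As a quick sanity check of the sizing and indexing used in bochs_bios_init(),
here is a small worked example with purely illustrative values (max_cpus = 4,
nb_numa_nodes = 2, nb_hp_dimms = 3); it is a standalone sketch and not part of
the patch:

#include <assert.h>
#include <stddef.h>

/* Worked example of the fw_cfg sizing/indexing used in bochs_bios_init(),
 * with illustrative values: max_cpus = 4, nb_numa_nodes = 2, nb_hp_dimms = 3. */
int main(void)
{
    const size_t max_cpus = 4, nb_numa_nodes = 2, nb_hp_dimms = 3;

    /* one word for #nodes, one per vCPU, one per node,
     * one word for #dimms, three words per hotplug dimm */
    size_t total_words = 2 + max_cpus + nb_numa_nodes + 3 * nb_hp_dimms;

    size_t dimm_count_idx = 1 + max_cpus + nb_numa_nodes; /* word holding nb_hp_dimms */
    size_t dimm_base_idx  = 2 + max_cpus + nb_numa_nodes; /* first dimm triplet */

    assert(total_words == 17 && total_words * 8 == 136);
    assert(dimm_count_idx == 7 && dimm_base_idx == 8);
    return 0;
}

With these values the buffer grows from (1 + 4 + 2) * 8 = 56 bytes before the
patch to 136 bytes after it, and the dimm triplets start at word index 8, which
is exactly where hp_dimms_fw_cfg points in the patched bochs_bios_init().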