From patchwork Wed Jun 18 16:19:28 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Michael S. Tsirkin"
X-Patchwork-Id: 361630
Date: Wed, 18 Jun 2014 19:19:28 +0300
From: "Michael S. Tsirkin"
To: qemu-devel@nongnu.org
Message-ID: <1403108034-32054-62-git-send-email-mst@redhat.com>
References: <1403108034-32054-1-git-send-email-mst@redhat.com>
In-Reply-To: <1403108034-32054-1-git-send-email-mst@redhat.com>
Cc: "Edgar E. Iglesias", Peter Maydell, Fam Zheng, Eduardo Habkost,
 Andre Przywara, Hu Tao, Michael Tokarev, Blue Swirl, Anthony Liguori,
 Jan Kiszka, Paolo Bonzini, Stefan Hajnoczi, Andreas Färber, Wanlong Gao,
 Richard Henderson
Subject: [Qemu-devel] [PULL v2 061/106] NUMA: move numa related code to new file numa.c

From: Wanlong Gao

Signed-off-by: Wanlong Gao
Reviewed-by: Eduardo Habkost
Signed-off-by: Paolo Bonzini
Signed-off-by: Hu Tao
Signed-off-by: Blue Swirl
Signed-off-by: Andre Przywara
Signed-off-by: Michael S. Tsirkin
Acked-by: Michael S. Tsirkin

MST: comment tweaks
---
 Makefile.target           |   2 +-
 include/exec/cpu-all.h    |   2 -
 include/exec/cpu-common.h |   2 +
 include/sysemu/cpus.h     |   1 -
 include/sysemu/sysemu.h   |   3 +
 cpus.c                    |  14 ----
 numa.c                    | 185 ++++++++++++++++++++++++++++++++++++++++++++++
 vl.c                      | 139 +---------------------------------
 8 files changed, 192 insertions(+), 156 deletions(-)
 create mode 100644 numa.c

diff --git a/Makefile.target b/Makefile.target
index 06c1e59..fc5827c 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -119,7 +119,7 @@ endif #CONFIG_BSD_USER
 #########################################################
 # System emulator target
 ifdef CONFIG_SOFTMMU
-obj-y += arch_init.o cpus.o monitor.o gdbstub.o balloon.o ioport.o
+obj-y += arch_init.o cpus.o monitor.o gdbstub.o balloon.o ioport.o numa.o
 obj-y += qtest.o
 obj-y += hw/
 obj-$(CONFIG_FDT) += device_tree.o
diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index e8363d7..ed28f1e 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -297,8 +297,6 @@ CPUArchState *cpu_copy(CPUArchState *env);
 
 /* memory API */
 
-extern ram_addr_t ram_size;
-
 /* RAM is pre-allocated and passed into qemu_ram_alloc_from_ptr */
 #define RAM_PREALLOC_MASK (1 << 0)
 
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index a21b65a..e8c7970 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -45,6 +45,8 @@ typedef uintptr_t ram_addr_t;
 # define RAM_ADDR_FMT "%" PRIxPTR
 #endif
 
+extern ram_addr_t ram_size;
+
 /* memory API */
 
 typedef void CPUWriteMemoryFunc(void *opaque, hwaddr addr, uint32_t value);
diff --git a/include/sysemu/cpus.h b/include/sysemu/cpus.h
index 6502488..4f79081 100644
--- a/include/sysemu/cpus.h
+++ b/include/sysemu/cpus.h
@@ -23,7 +23,6 @@ extern int smp_threads;
 #define smp_threads 1
 #endif
 
-void set_numa_modes(void);
 void list_cpus(FILE *f, fprintf_function cpu_fprintf, const char *optarg);
 
 #endif
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index ba5c7f8..565c8f6 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -144,6 +144,9 @@ extern QEMUClockType rtc_clock;
 extern int nb_numa_nodes;
 extern uint64_t node_mem[MAX_NODES];
 extern unsigned long *node_cpumask[MAX_NODES];
+void numa_add(const char *optarg);
+void set_numa_nodes(void);
+void set_numa_modes(void);
 
 #define MAX_OPTION_ROMS 16
 typedef struct QEMUOptionRom {
diff --git a/cpus.c b/cpus.c
index dd7ac13..ce668b7 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1312,20 +1312,6 @@ static void tcg_exec_all(void)
     exit_request = 0;
 }
 
-void set_numa_modes(void)
-{
-    CPUState *cpu;
-    int i;
-
-    CPU_FOREACH(cpu) {
-        for (i = 0; i < nb_numa_nodes; i++) {
-            if (test_bit(cpu->cpu_index, node_cpumask[i])) {
-                cpu->numa_node = i;
-            }
-        }
-    }
-}
-
 void list_cpus(FILE *f, fprintf_function cpu_fprintf, const char *optarg)
 {
     /* XXX: implement xxx_cpu_list for targets that still miss it */
diff --git a/numa.c b/numa.c
new file mode 100644
index 0000000..bd0d2b7
--- /dev/null
+++ b/numa.c
@@ -0,0 +1,185 @@
+/*
+ * NUMA parameter parsing routines
+ *
+ * Copyright (c) 2014 Fujitsu Ltd.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "sysemu/sysemu.h"
+#include "exec/cpu-common.h"
+#include "qemu/bitmap.h"
+#include "qom/cpu.h"
+
+static void numa_node_parse_cpus(int nodenr, const char *cpus)
+{
+    char *endptr;
+    unsigned long long value, endvalue;
+
+    /* Empty CPU range strings will be considered valid, they will simply
+     * not set any bit in the CPU bitmap.
+     */
+    if (!*cpus) {
+        return;
+    }
+
+    if (parse_uint(cpus, &value, &endptr, 10) < 0) {
+        goto error;
+    }
+    if (*endptr == '-') {
+        if (parse_uint_full(endptr + 1, &endvalue, 10) < 0) {
+            goto error;
+        }
+    } else if (*endptr == '\0') {
+        endvalue = value;
+    } else {
+        goto error;
+    }
+
+    if (endvalue >= MAX_CPUMASK_BITS) {
+        endvalue = MAX_CPUMASK_BITS - 1;
+        fprintf(stderr,
+            "qemu: NUMA: A max of %d VCPUs are supported\n",
+            MAX_CPUMASK_BITS);
+    }
+
+    if (endvalue < value) {
+        goto error;
+    }
+
+    bitmap_set(node_cpumask[nodenr], value, endvalue-value+1);
+    return;
+
+error:
+    fprintf(stderr, "qemu: Invalid NUMA CPU range: %s\n", cpus);
+    exit(1);
+}
+
+void numa_add(const char *optarg)
+{
+    char option[128];
+    char *endptr;
+    unsigned long long nodenr;
+
+    optarg = get_opt_name(option, 128, optarg, ',');
+    if (*optarg == ',') {
+        optarg++;
+    }
+    if (!strcmp(option, "node")) {
+
+        if (nb_numa_nodes >= MAX_NODES) {
+            fprintf(stderr, "qemu: too many NUMA nodes\n");
+            exit(1);
+        }
+
+        if (get_param_value(option, 128, "nodeid", optarg) == 0) {
+            nodenr = nb_numa_nodes;
+        } else {
+            if (parse_uint_full(option, &nodenr, 10) < 0) {
+                fprintf(stderr, "qemu: Invalid NUMA nodeid: %s\n", option);
+                exit(1);
+            }
+        }
+
+        if (nodenr >= MAX_NODES) {
+            fprintf(stderr, "qemu: invalid NUMA nodeid: %llu\n", nodenr);
+            exit(1);
+        }
+
+        if (get_param_value(option, 128, "mem", optarg) == 0) {
+            node_mem[nodenr] = 0;
+        } else {
+            int64_t sval;
+            sval = strtosz(option, &endptr);
+            if (sval < 0 || *endptr) {
+                fprintf(stderr, "qemu: invalid numa mem size: %s\n", optarg);
+                exit(1);
+            }
+            node_mem[nodenr] = sval;
+        }
+        if (get_param_value(option, 128, "cpus", optarg) != 0) {
+            numa_node_parse_cpus(nodenr, option);
+        }
+        nb_numa_nodes++;
+    } else {
+        fprintf(stderr, "Invalid -numa option: %s\n", option);
+        exit(1);
+    }
+}
+
+void set_numa_nodes(void)
+{
+    if (nb_numa_nodes > 0) {
+        int i;
+
+        if (nb_numa_nodes > MAX_NODES) {
+            nb_numa_nodes = MAX_NODES;
+        }
+
+        /* If no memory size is given for any node, assume the default case
+         * and distribute the available memory equally across all nodes
+         */
+        for (i = 0; i < nb_numa_nodes; i++) {
+            if (node_mem[i] != 0) {
+                break;
+            }
+        }
+        if (i == nb_numa_nodes) {
+            uint64_t usedmem = 0;
+
+            /* On Linux, each node's border has to be 8MB aligned,
+             * the final node gets the rest.
+             */
+            for (i = 0; i < nb_numa_nodes - 1; i++) {
+                node_mem[i] = (ram_size / nb_numa_nodes) & ~((1 << 23UL) - 1);
+                usedmem += node_mem[i];
+            }
+            node_mem[i] = ram_size - usedmem;
+        }
+
+        for (i = 0; i < nb_numa_nodes; i++) {
+            if (!bitmap_empty(node_cpumask[i], MAX_CPUMASK_BITS)) {
+                break;
+            }
+        }
+        /* assigning the VCPUs round-robin is easier to implement, guest OSes
+         * must cope with this anyway, because there are BIOSes out there in
+         * real machines which also use this scheme.
+         */
+        if (i == nb_numa_nodes) {
+            for (i = 0; i < max_cpus; i++) {
+                set_bit(i, node_cpumask[i % nb_numa_nodes]);
+            }
+        }
+    }
+}
+
+void set_numa_modes(void)
+{
+    CPUState *cpu;
+    int i;
+
+    CPU_FOREACH(cpu) {
+        for (i = 0; i < nb_numa_nodes; i++) {
+            if (test_bit(cpu->cpu_index, node_cpumask[i])) {
+                cpu->numa_node = i;
+            }
+        }
+    }
+}
diff --git a/vl.c b/vl.c
index 4c6d6df..c87c7d8 100644
--- a/vl.c
+++ b/vl.c
@@ -1275,102 +1275,6 @@ char *get_boot_devices_list(size_t *size, bool ignore_suffixes)
     return list;
 }
 
-static void numa_node_parse_cpus(int nodenr, const char *cpus)
-{
-    char *endptr;
-    unsigned long long value, endvalue;
-
-    /* Empty CPU range strings will be considered valid, they will simply
-     * not set any bit in the CPU bitmap.
-     */
-    if (!*cpus) {
-        return;
-    }
-
-    if (parse_uint(cpus, &value, &endptr, 10) < 0) {
-        goto error;
-    }
-    if (*endptr == '-') {
-        if (parse_uint_full(endptr + 1, &endvalue, 10) < 0) {
-            goto error;
-        }
-    } else if (*endptr == '\0') {
-        endvalue = value;
-    } else {
-        goto error;
-    }
-
-    if (endvalue >= MAX_CPUMASK_BITS) {
-        endvalue = MAX_CPUMASK_BITS - 1;
-        fprintf(stderr,
-            "qemu: NUMA: A max of %d VCPUs are supported\n",
-            MAX_CPUMASK_BITS);
-    }
-
-    if (endvalue < value) {
-        goto error;
-    }
-
-    bitmap_set(node_cpumask[nodenr], value, endvalue-value+1);
-    return;
-
-error:
-    fprintf(stderr, "qemu: Invalid NUMA CPU range: %s\n", cpus);
-    exit(1);
-}
-
-static void numa_add(const char *optarg)
-{
-    char option[128];
-    char *endptr;
-    unsigned long long nodenr;
-
-    optarg = get_opt_name(option, 128, optarg, ',');
-    if (*optarg == ',') {
-        optarg++;
-    }
-    if (!strcmp(option, "node")) {
-
-        if (nb_numa_nodes >= MAX_NODES) {
-            fprintf(stderr, "qemu: too many NUMA nodes\n");
-            exit(1);
-        }
-
-        if (get_param_value(option, 128, "nodeid", optarg) == 0) {
-            nodenr = nb_numa_nodes;
-        } else {
-            if (parse_uint_full(option, &nodenr, 10) < 0) {
-                fprintf(stderr, "qemu: Invalid NUMA nodeid: %s\n", option);
-                exit(1);
-            }
-        }
-
-        if (nodenr >= MAX_NODES) {
-            fprintf(stderr, "qemu: invalid NUMA nodeid: %llu\n", nodenr);
-            exit(1);
-        }
-
-        if (get_param_value(option, 128, "mem", optarg) == 0) {
-            node_mem[nodenr] = 0;
-        } else {
-            int64_t sval;
-            sval = strtosz(option, &endptr);
-            if (sval < 0 || *endptr) {
-                fprintf(stderr, "qemu: invalid numa mem size: %s\n", optarg);
-                exit(1);
-            }
-            node_mem[nodenr] = sval;
-        }
-        if (get_param_value(option, 128, "cpus", optarg) != 0) {
-            numa_node_parse_cpus(nodenr, option);
-        }
-        nb_numa_nodes++;
-    } else {
-        fprintf(stderr, "Invalid -numa option: %s\n", option);
-        exit(1);
-    }
-}
-
 static QemuOptsList qemu_smp_opts = {
     .name = "smp-opts",
     .implied_opt_name = "cpus",
@@ -4400,48 +4304,7 @@ int main(int argc, char **argv, char **envp)
     default_drive(default_floppy, snapshot, IF_FLOPPY, 0, FD_OPTS);
     default_drive(default_sdcard, snapshot, IF_SD, 0, SD_OPTS);
 
-    if (nb_numa_nodes > 0) {
-        int i;
-
-        if (nb_numa_nodes > MAX_NODES) {
-            nb_numa_nodes = MAX_NODES;
-        }
-
-        /* If no memory size if given for any node, assume the default case
-         * and distribute the available memory equally across all nodes
-         */
-        for (i = 0; i < nb_numa_nodes; i++) {
-            if (node_mem[i] != 0)
-                break;
-        }
-        if (i == nb_numa_nodes) {
-            uint64_t usedmem = 0;
-
-            /* On Linux, the each node's border has to be 8MB aligned,
-             * the final node gets the rest.
-             */
-            for (i = 0; i < nb_numa_nodes - 1; i++) {
-                node_mem[i] = (ram_size / nb_numa_nodes) & ~((1 << 23UL) - 1);
-                usedmem += node_mem[i];
-            }
-            node_mem[i] = ram_size - usedmem;
-        }
-
-        for (i = 0; i < nb_numa_nodes; i++) {
-            if (!bitmap_empty(node_cpumask[i], MAX_CPUMASK_BITS)) {
-                break;
-            }
-        }
-        /* assigning the VCPUs round-robin is easier to implement, guest OSes
-         * must cope with this anyway, because there are BIOSes out there in
-         * real machines which also use this scheme.
-         */
-        if (i == nb_numa_nodes) {
-            for (i = 0; i < max_cpus; i++) {
-                set_bit(i, node_cpumask[i % nb_numa_nodes]);
-            }
-        }
-    }
+    set_numa_nodes();
 
     if (qemu_opts_foreach(qemu_find_opts("mon"), mon_init_func, NULL, 1) != 0) {
         exit(1);