From patchwork Thu Mar 19 19:26:13 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Eduardo Habkost X-Patchwork-Id: 452191 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id F3C81140079 for ; Fri, 20 Mar 2015 06:31:10 +1100 (AEDT) Received: from localhost ([::1]:40950 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YYg9t-0007Za-77 for incoming@patchwork.ozlabs.org; Thu, 19 Mar 2015 15:31:09 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:53220) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YYg5o-0008FG-Ke for qemu-devel@nongnu.org; Thu, 19 Mar 2015 15:26:59 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YYg5j-0005eM-0d for qemu-devel@nongnu.org; Thu, 19 Mar 2015 15:26:56 -0400 Received: from mx1.redhat.com ([209.132.183.28]:39727) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YYg5i-0005e9-O6 for qemu-devel@nongnu.org; Thu, 19 Mar 2015 15:26:50 -0400 Received: from int-mx13.intmail.prod.int.phx2.redhat.com (int-mx13.intmail.prod.int.phx2.redhat.com [10.5.11.26]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id t2JJQnfd004779 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Thu, 19 Mar 2015 15:26:49 -0400 Received: from localhost ([10.3.113.19]) by int-mx13.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id t2JJQmB8014856; Thu, 19 Mar 2015 15:26:48 -0400 From: Eduardo Habkost To: Peter Maydell Date: Thu, 19 Mar 2015 16:26:13 -0300 Message-Id: <1426793174-19012-6-git-send-email-ehabkost@redhat.com> In-Reply-To: <1426793174-19012-1-git-send-email-ehabkost@redhat.com> References: <1426793174-19012-1-git-send-email-ehabkost@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.68 on 10.5.11.26 X-MIME-Autoconverted: from 8bit to quoted-printable by mx1.redhat.com id t2JJQnfd004779 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x X-Received-From: 209.132.183.28 Cc: qemu-devel@nongnu.org, Paolo Bonzini , Igor Mammedov , =?UTF-8?q?Andreas=20F=C3=A4rber?= , "Michael S. Tsirkin" Subject: [Qemu-devel] [PULL 5/6] pc: fix default VCPU to NUMA node mapping X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org From: Igor Mammedov Since commit dd0247e0 pc: acpi: mark all possible CPUs as enabled in SRAT Linux kernel actually tries to use CPU to Node mapping from QEMU provided SRAT table instead of discarding it, and that in some cases breaks build_sched_domains() which expects sane mapping where cores/threads belonging to the same socket are on the same NUMA node. With current default round-robin mapping of VCPUs to nodes guest ends-up with cores/threads belonging to the same socket being on different NUMA nodes. For example with following CLI: qemu-system-x86_64 -m 4G \ -cpu Opteron_G3,vendor=AuthenticAMD \ -smp 5,sockets=1,cores=4,threads=1,maxcpus=8 \ -numa node,nodeid=0 -numa node,nodeid=1 2.6.32 based kernels will hang on boot due to incorrectly built sched_group-s list in update_sd_lb_stats() Replacing default mapping with a manual, where VCPUs belonging to the same socket are on the same NUMA node, fixes the issue for guests which can't handle nonsense topology i.e. changing CLI to: -numa node,nodeid=0,cpus=0-3 -numa node,nodeid=1,cpus=4-7 So instead of simply scattering VCPUs around nodes, provide callback to map the same socket VCPUs to the same NUMA node, which is what guests would expect from a sane hardware/BIOS. Signed-off-by: Igor Mammedov Reviewed-by: Andreas Färber Signed-off-by: Eduardo Habkost --- hw/i386/pc.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 4b46c29..a52d2af 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -1851,6 +1851,14 @@ static void pc_machine_initfn(Object *obj) NULL, NULL); } +static unsigned pc_cpu_index_to_socket_id(unsigned cpu_index) +{ + unsigned pkg_id, core_id, smt_id; + x86_topo_ids_from_idx(smp_cores, smp_threads, cpu_index, + &pkg_id, &core_id, &smt_id); + return pkg_id; +} + static void pc_machine_class_init(ObjectClass *oc, void *data) { MachineClass *mc = MACHINE_CLASS(oc); @@ -1859,6 +1867,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data) pcmc->get_hotplug_handler = mc->get_hotplug_handler; mc->get_hotplug_handler = pc_get_hotpug_handler; + mc->cpu_index_to_socket_id = pc_cpu_index_to_socket_id; hc->plug = pc_machine_device_plug_cb; hc->unplug_request = pc_machine_device_unplug_request_cb; hc->unplug = pc_machine_device_unplug_cb;