From patchwork Mon Nov 20 10:03:44 2017
X-Patchwork-Submitter: Cédric Le Goater
X-Patchwork-Id: 839482
From: Cédric Le Goater
To: qemu-ppc@nongnu.org, qemu-devel@nongnu.org, David Gibson, Nikunj A Dadhania, Benjamin Herrenschmidt
Cc: Cédric Le Goater
Date: Mon, 20 Nov 2017 11:03:44 +0100
Message-Id: <20171120100347.8601-1-clg@kaod.org>
Subject: [Qemu-devel] [PATCH for-2.12 v3 0/3] disable the decrementer interrupt when a CPU is unplugged

Hello,

When a CPU is stopped with the 'stop-self' RTAS call, its 'halted' state
is set to 1 and, from then on, the cpu_has_work() routine no longer takes
the MSR into account. Only pending hardware interrupts are checked, each
against its LPCR:PECE* enablement bit.

If the DECR timer fires after 'stop-self' is called and before the CPU
reaches the 'stop' state, the nearly-dead CPU has some work to do and the
guest crashes. This happens very frequently with the not yet upstream P9
XIVE exploitation mode. In XICS mode, the DECR does occasionally fire, but
only after the 'stop' state is reached, so there is no work to do and the
guest survives.

I suspect there is a race between the QEMU mainloop triggering the timers
and the TCG CPU thread, but I could not quite identify the root cause.

To be safe, let's disable the decrementer interrupt in the LPCR when the
CPU is halted and re-enable it when the CPU is restarted.
Resetting the MSR is now pointless, so remove this dubious workaround.

Thanks,

C.

Changes in v3:

 - removed the ppc_cpu_pvr_match() routine testing the CPU family.
 - introduced a cpu_ppc_papr_pece_bits() helper to gather the PECE bits
   depending on the CPU family.
 - enabled Power-saving mode Exit Cause exceptions only on the boot CPU.

Changes in v2:

 - used a new routine ppc_cpu_pvr_match() to discriminate CPU versions
 - removed the LPCR:PECE* enablement bit when the CPU is initialized if
   it is a secondary
 - included Nikunj's fix to reboot SMP TCG guests

Cédric Le Goater (3):
  spapr/rtas: disable the decrementer interrupt when a CPU is unplugged
  spapr/rtas: fix reboot of a SMP TCG guest
  spapr/rtas: do not reset the MSR in stop-self command

 hw/ppc/spapr_cpu_core.c     |  7 +++++++
 hw/ppc/spapr_rtas.c         | 19 +++++++++----------
 target/ppc/cpu.h            |  1 +
 target/ppc/translate_init.c | 33 +++++++++++++++++++++++++--------
 4 files changed, 42 insertions(+), 18 deletions(-)
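
For readers who want the wake-up condition spelled out, here is a minimal
standalone model of the behaviour this cover letter describes. It is only
a sketch and not the code of the series: the ToyCPU type, the TOY_* bit
values and the toy_* functions are all hypothetical, and the non-halted
branch is deliberately simplified. It assumes just what is stated above:
a halted CPU ignores the MSR, a pending interrupt counts as work only if
its PECE bit is set in the LPCR, and the fix clears the decrementer bit
on 'stop-self' and sets it again on restart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only (not QEMU's definitions). */
#define TOY_IRQ_EXT        (1u << 0)  /* external interrupt pending */
#define TOY_IRQ_DECR       (1u << 1)  /* decrementer interrupt pending */
#define TOY_LPCR_PECE_EXT  (1u << 0)  /* external irq may exit power-saving */
#define TOY_LPCR_PECE_DECR (1u << 1)  /* decrementer may exit power-saving */

typedef struct ToyCPU {
    bool halted;       /* set once the CPU has called 'stop-self' */
    bool msr_ee;       /* external-interrupt enable bit of the MSR */
    uint32_t lpcr;     /* only the PECE-like bits are modelled */
    uint32_t pending;  /* pending hardware interrupts */
} ToyCPU;

/* Simplified stand-in for the cpu_has_work() logic described above. */
static bool toy_cpu_has_work(const ToyCPU *cpu)
{
    if (cpu->halted) {
        /* The MSR is ignored: only PECE-enabled pending interrupts count. */
        if ((cpu->pending & TOY_IRQ_EXT) && (cpu->lpcr & TOY_LPCR_PECE_EXT)) {
            return true;
        }
        if ((cpu->pending & TOY_IRQ_DECR) && (cpu->lpcr & TOY_LPCR_PECE_DECR)) {
            return true;
        }
        return false;
    }
    /* Running CPU (simplified): interrupts are gated by MSR[EE]. */
    return cpu->msr_ee && cpu->pending != 0;
}

/* Proposed behaviour: mask the decrementer as a wake-up cause when halting. */
static void toy_stop_self(ToyCPU *cpu)
{
    cpu->halted = true;
    cpu->lpcr &= ~TOY_LPCR_PECE_DECR;
}

/* Restart path: allow the decrementer to wake the CPU again. */
static void toy_start_cpu(ToyCPU *cpu)
{
    cpu->lpcr |= TOY_LPCR_PECE_DECR;
    cpu->halted = false;
}

/* Replays the race: the DECR fires right after 'stop-self'. Returns 0. */
static int toy_demo_race(void)
{
    ToyCPU cpu = {
        .halted = false,
        .msr_ee = false,
        .lpcr = TOY_LPCR_PECE_EXT | TOY_LPCR_PECE_DECR,
        .pending = 0,
    };

    /* Without the fix: halted, DECR still enabled, late timer wakes it. */
    cpu.halted = true;
    cpu.pending = TOY_IRQ_DECR;
    assert(toy_cpu_has_work(&cpu));

    /* With the fix: the same late timer no longer counts as work. */
    toy_stop_self(&cpu);
    assert(!toy_cpu_has_work(&cpu));
    return 0;
}
```

Calling toy_demo_race() replays the scenario twice on the same toy CPU:
first with the decrementer still enabled as a wake-up cause, where the
late timer makes the halted CPU runnable, then after toy_stop_self(),
where it does not. toy_start_cpu() restores the bit for a later restart.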