From patchwork Mon Nov 4 13:51:21 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Michael S. Tsirkin"
X-Patchwork-Id: 288194
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11])
 (using TLSv1 with cipher AES256-SHA (256/256 bits))
 (Client did not present a certificate)
 by ozlabs.org (Postfix) with ESMTPS id B73792C0086
 for ; Tue, 5 Nov 2013 00:50:18 +1100 (EST)
Received: from localhost ([::1]:50010 helo=lists.gnu.org)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1VdKXo-0007KB-Sb
 for incoming@patchwork.ozlabs.org; Mon, 04 Nov 2013 08:50:16 -0500
Received: from eggs.gnu.org ([2001:4830:134:3::10]:37982)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1VdKWC-0005Jq-3w
 for qemu-devel@nongnu.org; Mon, 04 Nov 2013 08:48:42 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1VdKW6-0007My-54
 for qemu-devel@nongnu.org; Mon, 04 Nov 2013 08:48:36 -0500
Received: from mx1.redhat.com ([209.132.183.28]:12509)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1VdKW5-0007Mu-Th
 for qemu-devel@nongnu.org; Mon, 04 Nov 2013 08:48:30 -0500
Received: from int-mx01.intmail.prod.int.phx2.redhat.com
 (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11])
 by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id rA4DmS8i013256
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK);
 Mon, 4 Nov 2013 08:48:28 -0500
Received: from redhat.com (vpn1-6-193.ams2.redhat.com [10.36.6.193])
 by int-mx01.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8)
 with SMTP id rA4DmPIC021159;
 Mon, 4 Nov 2013 08:48:26 -0500
Date: Mon, 4 Nov 2013 15:51:21 +0200
From: "Michael S. Tsirkin"
To: qemu-devel@nongnu.org
Message-ID: <1383572851-28326-4-git-send-email-mst@redhat.com>
References: <1383572851-28326-1-git-send-email-mst@redhat.com>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <1383572851-28326-1-git-send-email-mst@redhat.com>
X-Mutt-Fcc: =sent
X-Scanned-By: MIMEDefang 2.67 on 10.5.11.11
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x
X-Received-From: 209.132.183.28
Cc: Paolo Bonzini, Laszlo Ersek, Anthony Liguori, Luiz Capitulino
Subject: [Qemu-devel] [PULL 3/3] vl: allow "cont" from panicked state
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org
Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org

From: Paolo Bonzini

After reporting the GUEST_PANICKED monitor event, QEMU stops the VM.
The reason for this is that events are edge-triggered, and can be lost if
management dies at the wrong time.  Stopping a panicked VM lets management
know of a panic even if it has crashed; management can learn about the
panic when it restarts and queries running QEMU processes.  The downside
is of course that the VM will be paused while management is not running,
but that is acceptable if it only happens with explicit "-device pvpanic".

Upon learning of a panic, management (if configured to do so) can pick a
variety of behaviors: leave the VM paused, reset it, destroy it.  In
addition to all of these behaviors, it is possible to dump the VM core
from the host.

However, right now, the panicked state is irreversible, and can only be
exited by resetting the machine.  This means that any policy decision
is entirely in the hands of the host.  In particular there is no
way to use the "reboot on panic" option together with pvpanic.

This patch makes the panicked state reversible (and removes various
workarounds that were there because of the state being irreversible).
With this change, management has a wider set of possible policies: it
can just log the crash and leave policy to the guest, it can leave the
VM paused.  In particular, the "log the crash and continue" is implemented
simply by sending a "cont" as soon as management learns about the panic.
Management could also implement the "irreversible paused state" itself.
And again, all such actions can be coupled with dumping the VM core.

Unfortunately we cannot change the behavior of 1.6.0.  Thus, even if
it uses "-device pvpanic", management should check for "cont" failures.
If "cont" fails, management can then log that the VM remained paused
and urge the administrator to update QEMU.

Reviewed-by: Laszlo Ersek
Reviewed-by: Luiz Capitulino
Acked-by: Michael S. Tsirkin
Signed-off-by: Paolo Bonzini
Signed-off-by: Michael S. Tsirkin
---
 gdbstub.c | 3 ---
 vl.c      | 6 ++----
 2 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/gdbstub.c b/gdbstub.c
index 0e5a3f5..e8ab0b2 100644
--- a/gdbstub.c
+++ b/gdbstub.c
@@ -368,9 +368,6 @@ static inline void gdb_continue(GDBState *s)
 #ifdef CONFIG_USER_ONLY
     s->running_state = 1;
 #else
-    if (runstate_check(RUN_STATE_GUEST_PANICKED)) {
-        runstate_set(RUN_STATE_DEBUG);
-    }
     if (!runstate_needs_reset()) {
         vm_start();
     }
diff --git a/vl.c b/vl.c
index efbff65..4ad15b8 100644
--- a/vl.c
+++ b/vl.c
@@ -638,9 +638,8 @@ static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
     { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
 
-    { RUN_STATE_GUEST_PANICKED, RUN_STATE_PAUSED },
+    { RUN_STATE_GUEST_PANICKED, RUN_STATE_RUNNING },
     { RUN_STATE_GUEST_PANICKED, RUN_STATE_FINISH_MIGRATE },
-    { RUN_STATE_GUEST_PANICKED, RUN_STATE_DEBUG },
 
     { RUN_STATE_MAX, RUN_STATE_MAX },
 };
@@ -686,8 +685,7 @@ int runstate_is_running(void)
 bool runstate_needs_reset(void)
 {
     return runstate_check(RUN_STATE_INTERNAL_ERROR) ||
-        runstate_check(RUN_STATE_SHUTDOWN) ||
-        runstate_check(RUN_STATE_GUEST_PANICKED);
+        runstate_check(RUN_STATE_SHUTDOWN);
 }
 
 StatusInfo *qmp_query_status(Error **errp)
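
For readers who want to see the net effect of the vl.c hunks in isolation,
here is a minimal standalone sketch. It is not QEMU code: the enum is a
reduced, hypothetical subset of RunState, the table holds only the
GUEST_PANICKED rows as they read after this patch, and the helpers
transition_allowed(), needs_reset() and check_patched_behavior() are
illustrative names invented for this sketch. It assumes that the real
runstate_set() consults the transition table in a similar way.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Reduced, hypothetical subset of QEMU's RunState enum. */
typedef enum {
    RUN_STATE_DEBUG,
    RUN_STATE_FINISH_MIGRATE,
    RUN_STATE_INTERNAL_ERROR,
    RUN_STATE_PAUSED,
    RUN_STATE_RUNNING,
    RUN_STATE_SHUTDOWN,
    RUN_STATE_GUEST_PANICKED,
    RUN_STATE_MAX
} RunState;

typedef struct {
    RunState from;
    RunState to;
} RunStateTransition;

/* Only the GUEST_PANICKED rows, as they read after this patch. */
const RunStateTransition panicked_transitions[] = {
    { RUN_STATE_GUEST_PANICKED, RUN_STATE_RUNNING },
    { RUN_STATE_GUEST_PANICKED, RUN_STATE_FINISH_MIGRATE },

    { RUN_STATE_MAX, RUN_STATE_MAX }, /* sentinel, as in vl.c */
};

/* Illustrative helper: linear scan of the table up to the sentinel. */
bool transition_allowed(RunState from, RunState to)
{
    for (size_t i = 0; panicked_transitions[i].from != RUN_STATE_MAX; i++) {
        if (panicked_transitions[i].from == from &&
            panicked_transitions[i].to == to) {
            return true;
        }
    }
    return false;
}

/* Mirrors the patched runstate_needs_reset(), with the state passed in. */
bool needs_reset(RunState current)
{
    return current == RUN_STATE_INTERNAL_ERROR ||
           current == RUN_STATE_SHUTDOWN;
}

/* Asserts the properties the commit message describes. */
void check_patched_behavior(void)
{
    /* "cont" from the panicked state is now a legal transition. */
    assert(transition_allowed(RUN_STATE_GUEST_PANICKED, RUN_STATE_RUNNING));
    /* The PAUSED and DEBUG rows were removed by the patch. */
    assert(!transition_allowed(RUN_STATE_GUEST_PANICKED, RUN_STATE_PAUSED));
    assert(!transition_allowed(RUN_STATE_GUEST_PANICKED, RUN_STATE_DEBUG));
    /* A panicked guest no longer forces a reset before it can continue. */
    assert(!needs_reset(RUN_STATE_GUEST_PANICKED));
    assert(needs_reset(RUN_STATE_SHUTDOWN));
}
```

Calling check_patched_behavior() from a main() exercises all of the
assertions. Against the pre-patch table and the pre-patch
runstate_needs_reset(), the first and fourth assertions would fail, which
matches the "cont" failure that management is told to expect from 1.6.0.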