Patchwork [RESEND,2/2,v3] deal with guest panicked event

Submitter Wen Congyang
Date March 8, 2012, 10:15 a.m.
Message ID <4F5886C4.7040100@cn.fujitsu.com>
Permalink /patch/145478/
State New

Comments

Wen Congyang - March 8, 2012, 10:15 a.m.
When the host knows the guest has panicked, it sets exit_reason to
KVM_EXIT_GUEST_PANICKED. When qemu receives this exit_reason, it sends
an event to tell the management application that the guest has panicked,
and sets the guest status to RUN_STATE_PANICKED.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 kvm-all.c        |    5 +++++
 monitor.c        |    3 +++
 monitor.h        |    1 +
 qapi-schema.json |    2 +-
 qmp.c            |    3 ++-
 vl.c             |    1 +
 6 files changed, 13 insertions(+), 2 deletions(-)
Avi Kivity - March 8, 2012, 11:28 a.m.
On 03/08/2012 12:15 PM, Wen Congyang wrote:
> When the host knows the guest is panicked, it will set
> exit_reason to KVM_EXIT_GUEST_PANICKED. So if qemu receive
> this exit_reason, we can send a event to tell management
> application that the guest is panicked and set the guest
> status to RUN_STATE_PANICKED.
>
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> ---
>  kvm-all.c        |    5 +++++
>  monitor.c        |    3 +++
>  monitor.h        |    1 +
>  qapi-schema.json |    2 +-
>  qmp.c            |    3 ++-
>  vl.c             |    1 +
>  6 files changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/kvm-all.c b/kvm-all.c
> index 77eadf6..b3c9a83 100644
> --- a/kvm-all.c
> +++ b/kvm-all.c
> @@ -1290,6 +1290,11 @@ int kvm_cpu_exec(CPUState *env)
>                      (uint64_t)run->hw.hardware_exit_reason);
>              ret = -1;
>              break;
> +        case KVM_EXIT_GUEST_PANICKED:
> +            monitor_protocol_event(QEVENT_GUEST_PANICKED, NULL);
> +            vm_stop(RUN_STATE_PANICKED);
> +            ret = -1;
> +            break;
>

If the management application is not aware of this event, then it will
never resume the guest, so it will appear hung.
Daniel P. Berrange - March 8, 2012, 11:36 a.m.
On Thu, Mar 08, 2012 at 01:28:56PM +0200, Avi Kivity wrote:
> On 03/08/2012 12:15 PM, Wen Congyang wrote:
> > When the host knows the guest is panicked, it will set
> > exit_reason to KVM_EXIT_GUEST_PANICKED. So if qemu receive
> > this exit_reason, we can send a event to tell management
> > application that the guest is panicked and set the guest
> > status to RUN_STATE_PANICKED.
> >
> > Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> > ---
> >  kvm-all.c        |    5 +++++
> >  monitor.c        |    3 +++
> >  monitor.h        |    1 +
> >  qapi-schema.json |    2 +-
> >  qmp.c            |    3 ++-
> >  vl.c             |    1 +
> >  6 files changed, 13 insertions(+), 2 deletions(-)
> >
> > diff --git a/kvm-all.c b/kvm-all.c
> > index 77eadf6..b3c9a83 100644
> > --- a/kvm-all.c
> > +++ b/kvm-all.c
> > @@ -1290,6 +1290,11 @@ int kvm_cpu_exec(CPUState *env)
> >                      (uint64_t)run->hw.hardware_exit_reason);
> >              ret = -1;
> >              break;
> > +        case KVM_EXIT_GUEST_PANICKED:
> > +            monitor_protocol_event(QEVENT_GUEST_PANICKED, NULL);
> > +            vm_stop(RUN_STATE_PANICKED);
> > +            ret = -1;
> > +            break;
> >
> 
> If the management application is not aware of this event, then it will
> never resume the guest, so it will appear hung.

Even if the mgmt app doesn't know about QEVENT_GUEST_PANICKED, it should
still see the QEVENT_STOP event emitted by vm_stop(), surely? So it will
know the guest CPUs have been stopped, even if it isn't aware of the
reason why, which seems fine to me.

Daniel
Avi Kivity - March 8, 2012, 11:52 a.m.
On 03/08/2012 01:36 PM, Daniel P. Berrange wrote:
> On Thu, Mar 08, 2012 at 01:28:56PM +0200, Avi Kivity wrote:
> > On 03/08/2012 12:15 PM, Wen Congyang wrote:
> > > When the host knows the guest is panicked, it will set
> > > exit_reason to KVM_EXIT_GUEST_PANICKED. So if qemu receive
> > > this exit_reason, we can send a event to tell management
> > > application that the guest is panicked and set the guest
> > > status to RUN_STATE_PANICKED.
> > >
> > > Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> > > ---
> > >  kvm-all.c        |    5 +++++
> > >  monitor.c        |    3 +++
> > >  monitor.h        |    1 +
> > >  qapi-schema.json |    2 +-
> > >  qmp.c            |    3 ++-
> > >  vl.c             |    1 +
> > >  6 files changed, 13 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/kvm-all.c b/kvm-all.c
> > > index 77eadf6..b3c9a83 100644
> > > --- a/kvm-all.c
> > > +++ b/kvm-all.c
> > > @@ -1290,6 +1290,11 @@ int kvm_cpu_exec(CPUState *env)
> > >                      (uint64_t)run->hw.hardware_exit_reason);
> > >              ret = -1;
> > >              break;
> > > +        case KVM_EXIT_GUEST_PANICKED:
> > > +            monitor_protocol_event(QEVENT_GUEST_PANICKED, NULL);
> > > +            vm_stop(RUN_STATE_PANICKED);
> > > +            ret = -1;
> > > +            break;
> > >
> > 
> > If the management application is not aware of this event, then it will
> > never resume the guest, so it will appear hung.
>
> Even if the mgmt app doesn't know about the QEVENT_GUEST_PANICKED, it should
> still see a QEVENT_STOP event emitted by vm_stop() surely ? So it will
> know the guest CPUs have been stopped, even if it isn't aware of the
> reason why, which seems fine to me.

No.  The guest is stopped, and there's no reason to suppose that the
management app will restart it.  Behaviour has changed.

Suppose the guest has reboot_on_panic set; now the behaviour change is
even more visible - service will stop completely instead of being
interrupted for a bit while the guest reboots.
Daniel P. Berrange - March 8, 2012, 11:56 a.m.
On Thu, Mar 08, 2012 at 01:52:45PM +0200, Avi Kivity wrote:
> On 03/08/2012 01:36 PM, Daniel P. Berrange wrote:
> > On Thu, Mar 08, 2012 at 01:28:56PM +0200, Avi Kivity wrote:
> > > On 03/08/2012 12:15 PM, Wen Congyang wrote:
> > > > When the host knows the guest is panicked, it will set
> > > > exit_reason to KVM_EXIT_GUEST_PANICKED. So if qemu receive
> > > > this exit_reason, we can send a event to tell management
> > > > application that the guest is panicked and set the guest
> > > > status to RUN_STATE_PANICKED.
> > > >
> > > > Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> > > > ---
> > > >  kvm-all.c        |    5 +++++
> > > >  monitor.c        |    3 +++
> > > >  monitor.h        |    1 +
> > > >  qapi-schema.json |    2 +-
> > > >  qmp.c            |    3 ++-
> > > >  vl.c             |    1 +
> > > >  6 files changed, 13 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/kvm-all.c b/kvm-all.c
> > > > index 77eadf6..b3c9a83 100644
> > > > --- a/kvm-all.c
> > > > +++ b/kvm-all.c
> > > > @@ -1290,6 +1290,11 @@ int kvm_cpu_exec(CPUState *env)
> > > >                      (uint64_t)run->hw.hardware_exit_reason);
> > > >              ret = -1;
> > > >              break;
> > > > +        case KVM_EXIT_GUEST_PANICKED:
> > > > +            monitor_protocol_event(QEVENT_GUEST_PANICKED, NULL);
> > > > +            vm_stop(RUN_STATE_PANICKED);
> > > > +            ret = -1;
> > > > +            break;
> > > >
> > > 
> > > If the management application is not aware of this event, then it will
> > > never resume the guest, so it will appear hung.
> >
> > Even if the mgmt app doesn't know about the QEVENT_GUEST_PANICKED, it should
> > still see a QEVENT_STOP event emitted by vm_stop() surely ? So it will
> > know the guest CPUs have been stopped, even if it isn't aware of the
> > reason why, which seems fine to me.
> 
> No.  The guest is stopped, and there's no reason to suppose that the
> management app will restart it.  Behaviour has changed.
> 
> Suppose the guest has reboot_on_panic set; now the behaviour change is
> even more visible - service will stop completely instead of being
> interrupted for a bit while the guest reboots.

Hmm, so this calls for a new command line argument to control the behaviour,
similar to what we do for disk werror, e.g. something like

  --onpanic "report|pause|stop|..."

where

 report - emit QEVENT_GUEST_PANICKED only
 pause  - emit QEVENT_GUEST_PANICKED and pause VM
 stop   - emit QEVENT_GUEST_PANICKED and quit VM

This would map fairly well into libvirt, where we already have config
parameters for controlling what to do with a guest when it panics.
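
To make the proposal concrete, here is a minimal sketch of how the
argument value could be mapped to an action. It is not part of the
posted patch: the PanicAction enum and parse_panic_action() are
hypothetical names, and only the three values listed above are handled.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical action set for the proposed --onpanic argument. */
typedef enum PanicAction {
    PANIC_ACTION_REPORT,   /* emit QEVENT_GUEST_PANICKED only */
    PANIC_ACTION_PAUSE,    /* emit the event and pause the VM */
    PANIC_ACTION_STOP,     /* emit the event and quit the VM */
    PANIC_ACTION_INVALID,  /* unrecognised or missing value */
} PanicAction;

/* Map the option string to an action; anything else is rejected. */
static PanicAction parse_panic_action(const char *arg)
{
    if (arg == NULL) {
        return PANIC_ACTION_INVALID;
    }
    if (strcmp(arg, "report") == 0) {
        return PANIC_ACTION_REPORT;
    }
    if (strcmp(arg, "pause") == 0) {
        return PANIC_ACTION_PAUSE;
    }
    if (strcmp(arg, "stop") == 0) {
        return PANIC_ACTION_STOP;
    }
    return PANIC_ACTION_INVALID;
}
```

The KVM_EXIT_GUEST_PANICKED case in kvm_cpu_exec() could then switch on
the parsed action instead of always calling vm_stop().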

Regards,
Daniel
Marcelo Tosatti - March 9, 2012, 10:22 p.m.
On Thu, Mar 08, 2012 at 11:56:56AM +0000, Daniel P. Berrange wrote:
> On Thu, Mar 08, 2012 at 01:52:45PM +0200, Avi Kivity wrote:
> > On 03/08/2012 01:36 PM, Daniel P. Berrange wrote:
> > > On Thu, Mar 08, 2012 at 01:28:56PM +0200, Avi Kivity wrote:
> > > > On 03/08/2012 12:15 PM, Wen Congyang wrote:
> > > > > When the host knows the guest is panicked, it will set
> > > > > exit_reason to KVM_EXIT_GUEST_PANICKED. So if qemu receive
> > > > > this exit_reason, we can send a event to tell management
> > > > > application that the guest is panicked and set the guest
> > > > > status to RUN_STATE_PANICKED.
> > > > >
> > > > > Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> > > > > ---
> > > > >  kvm-all.c        |    5 +++++
> > > > >  monitor.c        |    3 +++
> > > > >  monitor.h        |    1 +
> > > > >  qapi-schema.json |    2 +-
> > > > >  qmp.c            |    3 ++-
> > > > >  vl.c             |    1 +
> > > > >  6 files changed, 13 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/kvm-all.c b/kvm-all.c
> > > > > index 77eadf6..b3c9a83 100644
> > > > > --- a/kvm-all.c
> > > > > +++ b/kvm-all.c
> > > > > @@ -1290,6 +1290,11 @@ int kvm_cpu_exec(CPUState *env)
> > > > >                      (uint64_t)run->hw.hardware_exit_reason);
> > > > >              ret = -1;
> > > > >              break;
> > > > > +        case KVM_EXIT_GUEST_PANICKED:
> > > > > +            monitor_protocol_event(QEVENT_GUEST_PANICKED, NULL);
> > > > > +            vm_stop(RUN_STATE_PANICKED);
> > > > > +            ret = -1;
> > > > > +            break;
> > > > >
> > > > 
> > > > If the management application is not aware of this event, then it will
> > > > never resume the guest, so it will appear hung.
> > >
> > > Even if the mgmt app doesn't know about the QEVENT_GUEST_PANICKED, it should
> > > still see a QEVENT_STOP event emitted by vm_stop() surely ? So it will
> > > know the guest CPUs have been stopped, even if it isn't aware of the
> > > reason why, which seems fine to me.
> > 
> > No.  The guest is stopped, and there's no reason to suppose that the
> > management app will restart it.  Behaviour has changed.
> > 
> > Suppose the guest has reboot_on_panic set; now the behaviour change is
> > even more visible - service will stop completely instead of being
> > interrupted for a bit while the guest reboots.
> 
> Hmm, so this calls for a new command line argument to control behaviour,
> similar to what we do for disk werror, eg something like
> 
>   --onpanic "report|pause|stop|..."
> 
> where
> 
>  report - emit QEVENT_GUEST_PANICKED only

Should be the default.

>  pause  - emit QEVENT_GUEST_PANICKED and pause VM
>  stop   - emit QEVENT_GUEST_PANICKED and quit VM

"quit" is a better name than "stop".

> This would map fairly well into libvirt, where we already have config
> parameters for controlling what todo with a guest when it panics.
> 
> Regards,
> Daniel
Wen Congyang - March 12, 2012, 1:46 a.m.
At 03/08/2012 07:56 PM, Daniel P. Berrange Wrote:
> On Thu, Mar 08, 2012 at 01:52:45PM +0200, Avi Kivity wrote:
>> On 03/08/2012 01:36 PM, Daniel P. Berrange wrote:
>>> On Thu, Mar 08, 2012 at 01:28:56PM +0200, Avi Kivity wrote:
>>>> On 03/08/2012 12:15 PM, Wen Congyang wrote:
>>>>> When the host knows the guest is panicked, it will set
>>>>> exit_reason to KVM_EXIT_GUEST_PANICKED. So if qemu receive
>>>>> this exit_reason, we can send a event to tell management
>>>>> application that the guest is panicked and set the guest
>>>>> status to RUN_STATE_PANICKED.
>>>>>
>>>>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>>>>> ---
>>>>>  kvm-all.c        |    5 +++++
>>>>>  monitor.c        |    3 +++
>>>>>  monitor.h        |    1 +
>>>>>  qapi-schema.json |    2 +-
>>>>>  qmp.c            |    3 ++-
>>>>>  vl.c             |    1 +
>>>>>  6 files changed, 13 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/kvm-all.c b/kvm-all.c
>>>>> index 77eadf6..b3c9a83 100644
>>>>> --- a/kvm-all.c
>>>>> +++ b/kvm-all.c
>>>>> @@ -1290,6 +1290,11 @@ int kvm_cpu_exec(CPUState *env)
>>>>>                      (uint64_t)run->hw.hardware_exit_reason);
>>>>>              ret = -1;
>>>>>              break;
>>>>> +        case KVM_EXIT_GUEST_PANICKED:
>>>>> +            monitor_protocol_event(QEVENT_GUEST_PANICKED, NULL);
>>>>> +            vm_stop(RUN_STATE_PANICKED);
>>>>> +            ret = -1;
>>>>> +            break;
>>>>>
>>>>
>>>> If the management application is not aware of this event, then it will
>>>> never resume the guest, so it will appear hung.
>>>
>>> Even if the mgmt app doesn't know about the QEVENT_GUEST_PANICKED, it should
>>> still see a QEVENT_STOP event emitted by vm_stop() surely ? So it will
>>> know the guest CPUs have been stopped, even if it isn't aware of the
>>> reason why, which seems fine to me.
>>
>> No.  The guest is stopped, and there's no reason to suppose that the
>> management app will restart it.  Behaviour has changed.
>>
>> Suppose the guest has reboot_on_panic set; now the behaviour change is
>> even more visible - service will stop completely instead of being
>> interrupted for a bit while the guest reboots.
> 
> Hmm, so this calls for a new command line argument to control behaviour,
> similar to what we do for disk werror, eg something like
> 
>   --onpanic "report|pause|stop|..."
> 
> where
> 
>  report - emit QEVENT_GUEST_PANICKED only

If the guest panics while libvirt is stopped and we only emit an event,
libvirt cannot know that the guest has panicked when it starts again.

So I added a new RunState to solve this problem. Stopping the guest when
it panics changes the existing behaviour, so I think the new RunState
should be a running state.
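
The difference between an event and a state can be shown with a small
standalone model. This is only an illustration, not qemu code: SketchVM,
sketch_guest_panicked() and sketch_query_status() are made-up names. An
event reaches only a client that is connected at the time, while the
recorded state can still be queried later.

```c
#include <stddef.h>

/* Illustrative run states; not the real qemu RunState enum. */
typedef enum SketchRunState {
    SKETCH_RUNNING,
    SKETCH_PAUSED,
    SKETCH_PANICKED,
} SketchRunState;

typedef struct SketchVM {
    SketchRunState state;
    int events_delivered;  /* events seen by a connected client */
} SketchVM;

/* A panic is always recorded in the state; the event is delivered
 * only if a management client happens to be connected right now. */
static void sketch_guest_panicked(SketchVM *vm, int client_connected)
{
    if (client_connected) {
        vm->events_delivered++;
    }
    vm->state = SKETCH_PANICKED;
}

/* What a client that connects later would learn from a status query. */
static const char *sketch_query_status(const SketchVM *vm)
{
    switch (vm->state) {
    case SKETCH_RUNNING:
        return "running";
    case SKETCH_PAUSED:
        return "paused";
    case SKETCH_PANICKED:
        return "panicked";
    }
    return "unknown";
}
```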

Thanks
Wen Congyang

>  pause  - emit QEVENT_GUEST_PANICKED and pause VM
>  stop   - emit QEVENT_GUEST_PANICKED and quit VM
> 
> This would map fairly well into libvirt, where we already have config
> parameters for controlling what todo with a guest when it panics.
> 
> Regards,
> Daniel
Anthony Liguori - March 21, 2012, 7:01 p.m.
On 03/09/2012 04:22 PM, Marcelo Tosatti wrote:
> On Thu, Mar 08, 2012 at 11:56:56AM +0000, Daniel P. Berrange wrote:
>> On Thu, Mar 08, 2012 at 01:52:45PM +0200, Avi Kivity wrote:
>>> On 03/08/2012 01:36 PM, Daniel P. Berrange wrote:
>>>> On Thu, Mar 08, 2012 at 01:28:56PM +0200, Avi Kivity wrote:
>>>>> On 03/08/2012 12:15 PM, Wen Congyang wrote:
>>>>>> When the host knows the guest is panicked, it will set
>>>>>> exit_reason to KVM_EXIT_GUEST_PANICKED. So if qemu receive
>>>>>> this exit_reason, we can send a event to tell management
>>>>>> application that the guest is panicked and set the guest
>>>>>> status to RUN_STATE_PANICKED.
>>>>>>
>>>>>> Signed-off-by: Wen Congyang<wency@cn.fujitsu.com>
>>>>>> ---
>>>>>>   kvm-all.c        |    5 +++++
>>>>>>   monitor.c        |    3 +++
>>>>>>   monitor.h        |    1 +
>>>>>>   qapi-schema.json |    2 +-
>>>>>>   qmp.c            |    3 ++-
>>>>>>   vl.c             |    1 +
>>>>>>   6 files changed, 13 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/kvm-all.c b/kvm-all.c
>>>>>> index 77eadf6..b3c9a83 100644
>>>>>> --- a/kvm-all.c
>>>>>> +++ b/kvm-all.c
>>>>>> @@ -1290,6 +1290,11 @@ int kvm_cpu_exec(CPUState *env)
>>>>>>                       (uint64_t)run->hw.hardware_exit_reason);
>>>>>>               ret = -1;
>>>>>>               break;
>>>>>> +        case KVM_EXIT_GUEST_PANICKED:
>>>>>> +            monitor_protocol_event(QEVENT_GUEST_PANICKED, NULL);
>>>>>> +            vm_stop(RUN_STATE_PANICKED);
>>>>>> +            ret = -1;
>>>>>> +            break;
>>>>>>
>>>>>
>>>>> If the management application is not aware of this event, then it will
>>>>> never resume the guest, so it will appear hung.
>>>>
>>>> Even if the mgmt app doesn't know about the QEVENT_GUEST_PANICKED, it should
>>>> still see a QEVENT_STOP event emitted by vm_stop() surely ? So it will
>>>> know the guest CPUs have been stopped, even if it isn't aware of the
>>>> reason why, which seems fine to me.
>>>
>>> No.  The guest is stopped, and there's no reason to suppose that the
>>> management app will restart it.  Behaviour has changed.
>>>
>>> Suppose the guest has reboot_on_panic set; now the behaviour change is
>>> even more visible - service will stop completely instead of being
>>> interrupted for a bit while the guest reboots.
>>
>> Hmm, so this calls for a new command line argument to control behaviour,
>> similar to what we do for disk werror, eg something like
>>
>>    --onpanic "report|pause|stop|..."
>>
>> where
>>
>>   report - emit QEVENT_GUEST_PANICKED only
>
> Should be the default.

Should we just have a mechanism to stop the guest on certain types of QMP 
events?  For instance:

-stop-on guest-panicked,block-ioerror

Likewise, we could have a:

-quit-on guest-panicked.

At the very least, we should make what we use for rerror,werror an enumeration 
that's shared here.
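
As a rough sketch of the idea, a comma-separated event list such as the
one given to -stop-on could be parsed into a bitmask. None of this
exists in qemu: the EV_* constants, the event_names table and
parse_event_list() are hypothetical, and the two event names are just
the ones used in the example above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical event bits for a -stop-on / -quit-on style option. */
enum {
    EV_GUEST_PANICKED = 1u << 0,
    EV_BLOCK_IOERROR  = 1u << 1,
};

static const struct {
    const char *name;
    unsigned bit;
} event_names[] = {
    { "guest-panicked", EV_GUEST_PANICKED },
    { "block-ioerror",  EV_BLOCK_IOERROR },
};

/* Parse "name,name,..." into *mask.
 * Returns 0 on success, -1 on an unknown or empty name
 * (in which case *mask is left untouched). */
static int parse_event_list(const char *list, unsigned *mask)
{
    unsigned result = 0;
    const char *p = list;

    while (*p != '\0') {
        size_t len = strcspn(p, ",");  /* length of the current token */
        size_t i;
        int found = 0;

        for (i = 0; i < sizeof(event_names) / sizeof(event_names[0]); i++) {
            if (strlen(event_names[i].name) == len &&
                strncmp(p, event_names[i].name, len) == 0) {
                result |= event_names[i].bit;
                found = 1;
                break;
            }
        }
        if (!found) {
            return -1;
        }
        p += len;
        if (*p == ',') {
            p++;  /* skip the separator */
        }
    }
    *mask = result;
    return 0;
}
```

The same table could back both -stop-on and -quit-on, with one mask per
option checked at the point where the event is emitted.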

Regards,

Anthony Liguori

>
>>   pause  - emit QEVENT_GUEST_PANICKED and pause VM
>>   stop   - emit QEVENT_GUEST_PANICKED and quit VM
>
> "quit" is a better name than "stop".
>
>> This would map fairly well into libvirt, where we already have config
>> parameters for controlling what todo with a guest when it panics.
>>
>> Regards,
>> Daniel
>
>

Patch

diff --git a/kvm-all.c b/kvm-all.c
index 77eadf6..b3c9a83 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1290,6 +1290,11 @@  int kvm_cpu_exec(CPUState *env)
                     (uint64_t)run->hw.hardware_exit_reason);
             ret = -1;
             break;
+        case KVM_EXIT_GUEST_PANICKED:
+            monitor_protocol_event(QEVENT_GUEST_PANICKED, NULL);
+            vm_stop(RUN_STATE_PANICKED);
+            ret = -1;
+            break;
         case KVM_EXIT_INTERNAL_ERROR:
             ret = kvm_handle_internal_error(env, run);
             break;
diff --git a/monitor.c b/monitor.c
index cbdfbad..a0ad0a9 100644
--- a/monitor.c
+++ b/monitor.c
@@ -494,6 +494,9 @@  void monitor_protocol_event(MonitorEvent event, QObject *data)
         case QEVENT_WAKEUP:
             event_name = "WAKEUP";
             break;
+        case QEVENT_GUEST_PANICKED:
+            event_name = "GUEST_PANICKED";
+            break;
         default:
             abort();
             break;
diff --git a/monitor.h b/monitor.h
index 0d49800..94e8a3c 100644
--- a/monitor.h
+++ b/monitor.h
@@ -41,6 +41,7 @@  typedef enum MonitorEvent {
     QEVENT_DEVICE_TRAY_MOVED,
     QEVENT_SUSPEND,
     QEVENT_WAKEUP,
+    QEVENT_GUEST_PANICKED,
     QEVENT_MAX,
 } MonitorEvent;
 
diff --git a/qapi-schema.json b/qapi-schema.json
index 5f293c4..4f1ae20 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -121,7 +121,7 @@ 
 { 'enum': 'RunState',
   'data': [ 'debug', 'inmigrate', 'internal-error', 'io-error', 'paused',
             'postmigrate', 'prelaunch', 'finish-migrate', 'restore-vm',
-            'running', 'save-vm', 'shutdown', 'watchdog' ] }
+            'running', 'save-vm', 'shutdown', 'watchdog', 'panicked' ] }
 
 ##
 # @StatusInfo:
diff --git a/qmp.c b/qmp.c
index a182b51..e535969 100644
--- a/qmp.c
+++ b/qmp.c
@@ -148,7 +148,8 @@  void qmp_cont(Error **errp)
         error_set(errp, QERR_MIGRATION_EXPECTED);
         return;
     } else if (runstate_check(RUN_STATE_INTERNAL_ERROR) ||
-               runstate_check(RUN_STATE_SHUTDOWN)) {
+               runstate_check(RUN_STATE_SHUTDOWN) ||
+               runstate_check(RUN_STATE_PANICKED)) {
         error_set(errp, QERR_RESET_REQUIRED);
         return;
     }
diff --git a/vl.c b/vl.c
index 97ab2b9..65390fa 100644
--- a/vl.c
+++ b/vl.c
@@ -359,6 +359,7 @@  static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_RUNNING, RUN_STATE_SAVE_VM },
     { RUN_STATE_RUNNING, RUN_STATE_SHUTDOWN },
     { RUN_STATE_RUNNING, RUN_STATE_WATCHDOG },
+    { RUN_STATE_RUNNING, RUN_STATE_PANICKED },
 
     { RUN_STATE_SAVE_VM, RUN_STATE_RUNNING },