diff mbox

[1/1] qmp: process system-reset event in paused state

Message ID 1450256449-23779-1-git-send-email-den@openvz.org
State New
Headers show

Commit Message

Denis V. Lunev Dec. 16, 2015, 9 a.m. UTC
With pvpanic or HyperV panic devices could be moved into the paused state
with ' <on_crash>preserve</on_crash>'. In this state VM reacts only to
'virsh destroy' or 'continue'.

'virsh reset' command is usually used to force guest reset. The expectation
of the behavior of this command is that the guest will be force restarted.
This is not true at the moment.

Thus it is quite natural to process 'virh reset' aka qmp_system_reset
this way, i.e. allow to reset the guest. This behavior is similar to
one observed with 'reset' button on real hardware :)

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Markus Armbruster <armbru@redhat.com>
CC: Dmitry Andreev <dandreev@virtuozzo.com>
---
 qmp.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Paolo Bonzini Dec. 16, 2015, 9:12 a.m. UTC | #1
On 16/12/2015 10:00, Denis V. Lunev wrote:
> With pvpanic or HyperV panic devices could be moved into the paused state
> with ' <on_crash>preserve</on_crash>'. In this state VM reacts only to
> 'virsh destroy' or 'continue'.
> 
> 'virsh reset' command is usually used to force guest reset. The expectation
> of the behavior of this command is that the guest will be force restarted.
> This is not true at the moment.

Does "virsh reset" + "virsh continue" work, and if not why?

> Thus it is quite natural to process 'virh reset' aka qmp_system_reset
> this way, i.e. allow to reset the guest. This behavior is similar to
> one observed with 'reset' button on real hardware :)

Paolo

> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Markus Armbruster <armbru@redhat.com>
> CC: Dmitry Andreev <dandreev@virtuozzo.com>
> ---
>  qmp.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/qmp.c b/qmp.c
> index 0a1fa19..df17a33 100644
> --- a/qmp.c
> +++ b/qmp.c
> @@ -112,6 +112,10 @@ void qmp_stop(Error **errp)
>  void qmp_system_reset(Error **errp)
>  {
>      qemu_system_reset_request();
> +
> +    if (!runstate_is_running()) {
> +        vm_start();
> +    }
>  }
>  
>  void qmp_system_powerdown(Error **erp)
>
Denis V. Lunev Dec. 16, 2015, 9:32 a.m. UTC | #2
On 12/16/2015 12:12 PM, Paolo Bonzini wrote:
>
> On 16/12/2015 10:00, Denis V. Lunev wrote:
>> With pvpanic or HyperV panic devices could be moved into the paused state
>> with ' <on_crash>preserve</on_crash>'. In this state VM reacts only to
>> 'virsh destroy' or 'continue'.
>>
>> 'virsh reset' command is usually used to force guest reset. The expectation
>> of the behavior of this command is that the guest will be force restarted.
>> This is not true at the moment.
> Does "virsh reset" + "virsh continue" work, and if not why?
as far as I can see there is no such command in virsh at all :(

hades ~/src/qemu $ dpkg -l libvirt-bin
Desired=Unknown/Install/Remove/Purge/Hold
| 
Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name           Version      Architecture Description
+++-==============-============-============-=================================
ii  libvirt-bin    1.2.16-2ubun amd64        programs for the libvirt 
library
hades ~/src/qemu $

hades ~/src/qemu $ virsh continue fedora20
error: unknown command: 'continue'
hades ~/src/qemu $

hades ~/src/libvirt $ cfind . | xargs fgrep \"continue\"
./src/conf/nwfilter_conf.c:              "continue");
hades ~/src/libvirt $ cfind . | xargs fgrep \"reset\"
./src/util/virpci.c:            STREQ(ent->d_name, "reset")) {
./src/conf/domain_conf.c:              "reset",
./src/conf/domain_conf.c:            if (STREQ(tmp, "reset")) {
./src/qemu/qemu_monitor_json.c:              "none", "pause", "reset", 
"poweroff", "shutdown", "debug", "inject-nmi");
./src/vmware/vmware_driver.c:        "reset", PROGRAM_SENTINEL, "soft", NULL
./src/access/viraccessperm.c:              "start", "stop", "reset",
./tools/virsh-domain.c:    {.name = "reset",
./tools/virsh-domain.c:    if (vshCommandOptBool(cmd, "reset"))
./tools/virsh-domain.c: * "reset" command
./tools/virsh-domain.c:              N_("reset"),
./tools/virsh-domain.c:    {.name = "reset",
hades ~/src/libvirt $

Do you propose to kludge libvirt and send 'continue' unconditionally
before a 'reset'? We can but there is not much sense not to implement
this is QEMU.

Sending command in 'crashed' state only is a bit racy. The event
moving domain into 'crashed' state could be not processed at
the moment.

Den
Paolo Bonzini Dec. 16, 2015, 9:35 a.m. UTC | #3
On 16/12/2015 10:32, Denis V. Lunev wrote:
>>>
>>>
>>> 'virsh reset' command is usually used to force guest reset. The
>>> expectation
>>> of the behavior of this command is that the guest will be force
>>> restarted.
>>> This is not true at the moment.
>> Does "virsh reset" + "virsh continue" work, and if not why?
> as far as I can see there is no such command in virsh at all :(

Right, it's "virsh resume". :)

Paolo
Peter Krempa Dec. 16, 2015, 9:37 a.m. UTC | #4
On Wed, Dec 16, 2015 at 12:32:13 +0300, Denis V. Lunev wrote:
> On 12/16/2015 12:12 PM, Paolo Bonzini wrote:
> >
> > On 16/12/2015 10:00, Denis V. Lunev wrote:
> >> With pvpanic or HyperV panic devices could be moved into the paused state
> >> with ' <on_crash>preserve</on_crash>'. In this state VM reacts only to
> >> 'virsh destroy' or 'continue'.
> >>
> >> 'virsh reset' command is usually used to force guest reset. The expectation
> >> of the behavior of this command is that the guest will be force restarted.
> >> This is not true at the moment.
> > Does "virsh reset" + "virsh continue" work, and if not why?
> as far as I can see there is no such command in virsh at all :(

Paolo probably meant 'virsh resume $VM'.
Denis V. Lunev Dec. 16, 2015, 9:50 a.m. UTC | #5
On 12/16/2015 12:35 PM, Paolo Bonzini wrote:
>
> On 16/12/2015 10:32, Denis V. Lunev wrote:
>>>>
>>>> 'virsh reset' command is usually used to force guest reset. The
>>>> expectation
>>>> of the behavior of this command is that the guest will be force
>>>> restarted.
>>>> This is not true at the moment.
>>> Does "virsh reset" + "virsh continue" work, and if not why?
>> as far as I can see there is no such command in virsh at all :(
> Right, it's "virsh resume". :)
>
> Paolo

actually it does not.

'virsh resume' does not emit command to trigger qmp_cont
in this state. With manual 'virsh qemu-monitor event {"execute": "cont"}'
it starts to work.

Do you propose to kludge libvirt and send 'continue' unconditionally
before a 'reset'? We can but there is not much sense not to implement
this is libvirt due to 'unconditionality'...

Sending command in 'crashed' state only is a bit racy. The event
moving domain into 'crashed' state could be not processed at
the moment.

Den
Peter Krempa Dec. 16, 2015, 9:50 a.m. UTC | #6
On Wed, Dec 16, 2015 at 10:12:20 +0100, Paolo Bonzini wrote:
> 
> 
> On 16/12/2015 10:00, Denis V. Lunev wrote:
> > With pvpanic or HyperV panic devices could be moved into the paused state
> > with ' <on_crash>preserve</on_crash>'. In this state VM reacts only to
> > 'virsh destroy' or 'continue'.
> > 
> > 'virsh reset' command is usually used to force guest reset. The expectation
> > of the behavior of this command is that the guest will be force restarted.
> > This is not true at the moment.
> 
> Does "virsh reset" + "virsh continue" work, and if not why?

So .. it won't work:

    state = virDomainObjGetState(vm, NULL);
    if (state == VIR_DOMAIN_PMSUSPENDED) {
        virReportError(VIR_ERR_OPERATION_INVALID,
                       "%s", _("domain is pmsuspended"));
        goto endjob;
    } else if (state == VIR_DOMAIN_PAUSED) {
        if (qemuProcessStartCPUs(driver, vm, dom->conn,
                                 VIR_DOMAIN_RUNNING_UNPAUSED,
                                 QEMU_ASYNC_JOB_NONE) < 0) {
            if (virGetLastError() == NULL)
                virReportError(VIR_ERR_OPERATION_FAILED,
                               "%s", _("resume operation failed"));
            goto endjob;
        }
        event = virDomainEventLifecycleNewFromObj(vm,
                                         VIR_DOMAIN_EVENT_RESUMED,
                                         VIR_DOMAIN_EVENT_RESUMED_UNPAUSED);
    }


We check that the state is "paused" and continue the vCPUs only in that
case. The panic devices will move the VM to 'crashed' state.
The code that is issuing 'system_reset' does not modify the state
in any way.

> 
> > Thus it is quite natural to process 'virh reset' aka qmp_system_reset
> > this way, i.e. allow to reset the guest. This behavior is similar to
> > one observed with 'reset' button on real hardware :)
> 
> Paolo
> 
> > Signed-off-by: Denis V. Lunev <den@openvz.org>
> > CC: Markus Armbruster <armbru@redhat.com>
> > CC: Dmitry Andreev <dandreev@virtuozzo.com>
> > ---
> >  qmp.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/qmp.c b/qmp.c
> > index 0a1fa19..df17a33 100644
> > --- a/qmp.c
> > +++ b/qmp.c
> > @@ -112,6 +112,10 @@ void qmp_stop(Error **errp)
> >  void qmp_system_reset(Error **errp)
> >  {
> >      qemu_system_reset_request();
> > +
> > +    if (!runstate_is_running()) {
> > +        vm_start();
> > +    }

I'd say NACK here. This will break the possibility to reset a system
while the vCPUs are paused. The problem should be fixed in libvirt.

> >  }
> >  
> >  void qmp_system_powerdown(Error **erp)
> > 
> 

Peter
Denis V. Lunev Dec. 16, 2015, 9:55 a.m. UTC | #7
On 12/16/2015 12:50 PM, Peter Krempa wrote:
> On Wed, Dec 16, 2015 at 10:12:20 +0100, Paolo Bonzini wrote:
>>
>> On 16/12/2015 10:00, Denis V. Lunev wrote:
>>> With pvpanic or HyperV panic devices could be moved into the paused state
>>> with ' <on_crash>preserve</on_crash>'. In this state VM reacts only to
>>> 'virsh destroy' or 'continue'.
>>>
>>> 'virsh reset' command is usually used to force guest reset. The expectation
>>> of the behavior of this command is that the guest will be force restarted.
>>> This is not true at the moment.
>> Does "virsh reset" + "virsh continue" work, and if not why?
> So .. it won't work:
>
>      state = virDomainObjGetState(vm, NULL);
>      if (state == VIR_DOMAIN_PMSUSPENDED) {
>          virReportError(VIR_ERR_OPERATION_INVALID,
>                         "%s", _("domain is pmsuspended"));
>          goto endjob;
>      } else if (state == VIR_DOMAIN_PAUSED) {
>          if (qemuProcessStartCPUs(driver, vm, dom->conn,
>                                   VIR_DOMAIN_RUNNING_UNPAUSED,
>                                   QEMU_ASYNC_JOB_NONE) < 0) {
>              if (virGetLastError() == NULL)
>                  virReportError(VIR_ERR_OPERATION_FAILED,
>                                 "%s", _("resume operation failed"));
>              goto endjob;
>          }
>          event = virDomainEventLifecycleNewFromObj(vm,
>                                           VIR_DOMAIN_EVENT_RESUMED,
>                                           VIR_DOMAIN_EVENT_RESUMED_UNPAUSED);
>      }
>
>
> We check that the state is "paused" and continue the vCPUs only in that
> case. The panic devices will move the VM to 'crashed' state.
> The code that is issuing 'system_reset' does not modify the state
> in any way.
>
>>> Thus it is quite natural to process 'virh reset' aka qmp_system_reset
>>> this way, i.e. allow to reset the guest. This behavior is similar to
>>> one observed with 'reset' button on real hardware :)
>> Paolo
>>
>>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>>> CC: Markus Armbruster <armbru@redhat.com>
>>> CC: Dmitry Andreev <dandreev@virtuozzo.com>
>>> ---
>>>   qmp.c | 4 ++++
>>>   1 file changed, 4 insertions(+)
>>>
>>> diff --git a/qmp.c b/qmp.c
>>> index 0a1fa19..df17a33 100644
>>> --- a/qmp.c
>>> +++ b/qmp.c
>>> @@ -112,6 +112,10 @@ void qmp_stop(Error **errp)
>>>   void qmp_system_reset(Error **errp)
>>>   {
>>>       qemu_system_reset_request();
>>> +
>>> +    if (!runstate_is_running()) {
>>> +        vm_start();
>>> +    }
> I'd say NACK here. This will break the possibility to reset a system
> while the vCPUs are paused. The problem should be fixed in libvirt.

I do not get your explanation, sorry. If vCPUs are paused
original code just sets flags and do nothing else, i.e.
reset in this state will never happens until external kick.

Anyway, I will be fine with any solution here :) This is a
matter of consideration. QEMU kludge is just a way shorter.

Den
Paolo Bonzini Dec. 16, 2015, 12:02 p.m. UTC | #8
On 16/12/2015 10:50, Peter Krempa wrote:
> We check that the state is "paused" and continue the vCPUs only in
> that case. The panic devices will move the VM to 'crashed' state. 
> The code that is issuing 'system_reset' does not modify the state 
> in any way.

Ok, thanks.

> I'd say NACK here. This will break the possibility to reset a
> system while the vCPUs are paused. The problem should be fixed in
> libvirt.

It is indeed a QEMU bug, and it was introduced in commit df39076 ("vl:
allow "cont" from panicked state", 2013-11-04).

Until that commit, a system_reset in panicked state would change the
status to paused.  The commit changed that as a side effect of
removing VM_STATE_GUEST_PANICKED from runstate_needs_reset; see the
call to runstate_needs_reset in main_loop_should_exit.

IMO, after a reset, main_loop_should_exit should actually transition
to VM_STATE_PRELAUNCH (*not* RUN_STATE_PAUSED) for *all* states except
RUN_STATE_INMIGRATE, RUN_STATE_SAVE_VM (which I think cannot happen
there) and (of course) RUN_STATE_RUNNING.  Some changes will be required
to the transition table as well.

This will fix similar bugs for other runstates as well, though most of
them probably cannot be triggered from libvirt.

Thanks,

Paolo
Denis V. Lunev Dec. 16, 2015, 2:47 p.m. UTC | #9
On 12/16/2015 03:02 PM, Paolo Bonzini wrote:
>
> On 16/12/2015 10:50, Peter Krempa wrote:
>> We check that the state is "paused" and continue the vCPUs only in
>> that case. The panic devices will move the VM to 'crashed' state.
>> The code that is issuing 'system_reset' does not modify the state
>> in any way.
> Ok, thanks.
>
>> I'd say NACK here. This will break the possibility to reset a
>> system while the vCPUs are paused. The problem should be fixed in
>> libvirt.
> It is indeed a QEMU bug, and it was introduced in commit df39076 ("vl:
> allow "cont" from panicked state", 2013-11-04).
>
> Until that commit, a system_reset in panicked state would change the
> status to paused.  The commit changed that as a side effect of
> removing VM_STATE_GUEST_PANICKED from runstate_needs_reset; see the
> call to runstate_needs_reset in main_loop_should_exit.
>
> IMO, after a reset, main_loop_should_exit should actually transition
> to VM_STATE_PRELAUNCH (*not* RUN_STATE_PAUSED) for *all* states except
> RUN_STATE_INMIGRATE, RUN_STATE_SAVE_VM (which I think cannot happen
> there) and (of course) RUN_STATE_RUNNING.  Some changes will be required
> to the transition table as well.
>
> This will fix similar bugs for other runstates as well, though most of
> them probably cannot be triggered from libvirt.
>
> Thanks,
>
> Paolo
ok. Thank you for this input. I'll analyse this and come with
corrected patch :)

Den
Denis V. Lunev Jan. 11, 2016, 10:31 a.m. UTC | #10
On 12/16/2015 05:47 PM, Denis V. Lunev wrote:
> On 12/16/2015 03:02 PM, Paolo Bonzini wrote:
>>
>> On 16/12/2015 10:50, Peter Krempa wrote:
>>> We check that the state is "paused" and continue the vCPUs only in
>>> that case. The panic devices will move the VM to 'crashed' state.
>>> The code that is issuing 'system_reset' does not modify the state
>>> in any way.
>> Ok, thanks.
>>
>>> I'd say NACK here. This will break the possibility to reset a
>>> system while the vCPUs are paused. The problem should be fixed in
>>> libvirt.
>> It is indeed a QEMU bug, and it was introduced in commit df39076 ("vl:
>> allow "cont" from panicked state", 2013-11-04).
>>
>> Until that commit, a system_reset in panicked state would change the
>> status to paused.  The commit changed that as a side effect of
>> removing VM_STATE_GUEST_PANICKED from runstate_needs_reset; see the
>> call to runstate_needs_reset in main_loop_should_exit.
>>
>> IMO, after a reset, main_loop_should_exit should actually transition
>> to VM_STATE_PRELAUNCH (*not* RUN_STATE_PAUSED) for *all* states except
>> RUN_STATE_INMIGRATE, RUN_STATE_SAVE_VM (which I think cannot happen
>> there) and (of course) RUN_STATE_RUNNING.  Some changes will be required
>> to the transition table as well.
>>
>> This will fix similar bugs for other runstates as well, though most of
>> them probably cannot be triggered from libvirt.
>>
>> Thanks,
>>
>> Paolo
> ok. Thank you for this input. I'll analyse this and come with
> corrected patch :)
>
> Den

What would be correct procedure to handle this state?

Setting VM_STATE_PRELAUNCH in main_loop_should_exit does not
move QEMU into VM_STATE_RUNNING and thus subsequent 'resume'
command is necessary.

In this case the processing of 'reset' command should be different
in libvirt, i.e. libvirt should send two commands ('reset' and 'resume')
in this state.

Does I understand right?  In the other case we could stick to shortcut
proposed by me in the original patch.

Den
Paolo Bonzini Jan. 11, 2016, 11:29 a.m. UTC | #11
On 11/01/2016 11:31, Denis V. Lunev wrote:
>>>
>>> IMO, after a reset, main_loop_should_exit should actually transition
>>> to VM_STATE_PRELAUNCH (*not* RUN_STATE_PAUSED) for *all* states except
>>> RUN_STATE_INMIGRATE, RUN_STATE_SAVE_VM (which I think cannot happen
>>> there) and (of course) RUN_STATE_RUNNING.  Some changes will be required
>>> to the transition table as well.
>>>
>>> This will fix similar bugs for other runstates as well, though most of
>>> them probably cannot be triggered from libvirt.
>>
>> ok. Thank you for this input. I'll analyse this and come with
>> corrected patch :)
> 
> What would be correct procedure to handle this state?
> 
> Setting VM_STATE_PRELAUNCH in main_loop_should_exit does not
> move QEMU into VM_STATE_RUNNING and thus subsequent 'resume'
> command is necessary.
> 
> In this case the processing of 'reset' command should be different
> in libvirt, i.e. libvirt should send two commands ('reset' and 'resume')
> in this state.

Either that, or libvirt's client should send two of them.

As far as QEMU is concerned, a paused VM remains paused after a reset,
no matter why it was reset.  I'm not sure what are the desired semantics
for libvirt.

Paolo
diff mbox

Patch

diff --git a/qmp.c b/qmp.c
index 0a1fa19..df17a33 100644
--- a/qmp.c
+++ b/qmp.c
@@ -112,6 +112,10 @@  void qmp_stop(Error **errp)
 void qmp_system_reset(Error **errp)
 {
     qemu_system_reset_request();
+
+    if (!runstate_is_running()) {
+        vm_start();
+    }
 }
 
 void qmp_system_powerdown(Error **erp)