Patchwork [v4,05/11] suspend: add infrastructure

Submitter Gerd Hoffmann
Date Feb. 9, 2012, 5:05 p.m.
Message ID <1328807143-29499-6-git-send-email-kraxel@redhat.com>
Permalink /patch/140409/
State New

Comments

Gerd Hoffmann - Feb. 9, 2012, 5:05 p.m.
This patch adds infrastructure to qemu for handling suspend and
resume: functions to request state switches, notifiers for suspend
and wakeup, and a switch to enable or disable wakeup events:

 * qemu_system_suspend_request is supposed to be called when the
   guest asks to be suspended, for example via ACPI.

 * qemu_system_wakeup_request is supposed to be called on events
   which should wake up the guest.

 * qemu_register_suspend_notifier can be used to register a notifier
   which will be called when the guest is suspended.  Machine types
   and device models can hook in there to modify state if needed.

 * qemu_register_wakeup_notifier can be used to register a notifier
   which will be called when the guest is woken up.  Machine types
   and device models can hook in there to modify state if needed.

 * qemu_system_wakeup_enable can be used to enable/disable wakeup
   events.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
---
 sysemu.h |    9 +++++++++
 vl.c     |   49 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 58 insertions(+), 0 deletions(-)
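
As an illustration of how a device model might hook into the notifier
lists described in the commit message, here is a minimal, self-contained
sketch. It does not use QEMU's real headers: the Notifier and NotifierList
types below are simplified stand-ins (an array instead of QEMU's list
macros), and DemoDevice, demo_device_init, demo_suspend_request and
demo_wakeup_request are hypothetical names. The two request functions model
only the is_suspended flag and the notifier calls, not cpu_stop_current()
or the reset on wakeup.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for QEMU's Notifier/NotifierList (assumption:
 * the real types use a linked list; an array is enough for a sketch). */
typedef struct Notifier Notifier;
struct Notifier {
    void (*notify)(Notifier *notifier, void *data);
};

#define MAX_NOTIFIERS 8
typedef struct NotifierList {
    Notifier *items[MAX_NOTIFIERS];
    size_t count;
} NotifierList;

static void notifier_list_add(NotifierList *list, Notifier *notifier)
{
    if (list->count < MAX_NOTIFIERS) {
        list->items[list->count++] = notifier;
    }
}

static void notifier_list_notify(NotifierList *list, void *data)
{
    for (size_t i = 0; i < list->count; i++) {
        list->items[i]->notify(list->items[i], data);
    }
}

static NotifierList suspend_notifiers;
static NotifierList wakeup_notifiers;
static bool is_suspended;

/* Hypothetical device model that tracks its own power state. */
typedef struct DemoDevice {
    Notifier suspend;
    Notifier wakeup;
    bool powered;
} DemoDevice;

/* Recover the enclosing struct from a pointer to one of its members. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static void demo_on_suspend(Notifier *n, void *data)
{
    DemoDevice *dev = demo_container_of(n, DemoDevice, suspend);
    (void)data;
    dev->powered = false;
}

static void demo_on_wakeup(Notifier *n, void *data)
{
    DemoDevice *dev = demo_container_of(n, DemoDevice, wakeup);
    (void)data;
    dev->powered = true;
}

static void demo_device_init(DemoDevice *dev)
{
    dev->powered = true;
    dev->suspend.notify = demo_on_suspend;
    dev->wakeup.notify = demo_on_wakeup;
    notifier_list_add(&suspend_notifiers, &dev->suspend);
    notifier_list_add(&wakeup_notifiers, &dev->wakeup);
}

/* Reduced model of the suspend request: flag check plus notification. */
static void demo_suspend_request(void)
{
    if (is_suspended) {
        return;
    }
    notifier_list_notify(&suspend_notifiers, NULL);
    is_suspended = true;
}

/* Reduced model of the wakeup request (the patch passes &reason as data). */
static void demo_wakeup_request(void)
{
    if (!is_suspended) {
        return;
    }
    notifier_list_notify(&wakeup_notifiers, NULL);
    is_suspended = false;
}
```

Calling demo_device_init() on a DemoDevice and then demo_suspend_request()
clears the device's powered flag; a following demo_wakeup_request() sets it
again. A second suspend request while already suspended is a no-op, which
matches the early return in the patch.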
Gleb Natapov - Feb. 13, 2012, 9:21 a.m.
On Thu, Feb 09, 2012 at 06:05:37PM +0100, Gerd Hoffmann wrote:
> This patch adds infrastructure to qemu for handling suspend and
> resume: functions to request state switches, notifiers for suspend
> and wakeup, and a switch to enable or disable wakeup events:
> 
>  * qemu_system_suspend_request is supposed to be called when the
>    guest asks to be suspended, for example via ACPI.
> 
>  * qemu_system_wakeup_request is supposed to be called on events
>    which should wake up the guest.
> 
>  * qemu_register_suspend_notifier can be used to register a notifier
>    which will be called when the guest is suspended.  Machine types
>    and device models can hook in there to modify state if needed.
> 
>  * qemu_register_wakeup_notifier can be used to register a notifier
>    which will be called when the guest is woken up.  Machine types
>    and device models can hook in there to modify state if needed.
> 
>  * qemu_system_wakeup_enable can be used to enable/disable wakeup
>    events.
> 
> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
> ---
>  sysemu.h |    9 +++++++++
>  vl.c     |   49 +++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 58 insertions(+), 0 deletions(-)
> 
> diff --git a/sysemu.h b/sysemu.h
> index 9d5ce33..af73813 100644
> --- a/sysemu.h
> +++ b/sysemu.h
> @@ -38,7 +38,16 @@ void vm_start(void);
>  void vm_stop(RunState state);
>  void vm_stop_force_state(RunState state);
>  
> +typedef enum WakeupReason {
> +    QEMU_WAKEUP_REASON_OTHER = 0,
> +} WakeupReason;
> +
>  void qemu_system_reset_request(void);
> +void qemu_system_suspend_request(void);
> +void qemu_register_suspend_notifier(Notifier *notifier);
> +void qemu_system_wakeup_request(WakeupReason reason);
> +void qemu_system_wakeup_enable(WakeupReason reason, bool enabled);
> +void qemu_register_wakeup_notifier(Notifier *notifier);
>  void qemu_system_shutdown_request(void);
>  void qemu_system_powerdown_request(void);
>  void qemu_system_debug_request(void);
> diff --git a/vl.c b/vl.c
> index 63dd725..5095e06 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -1283,6 +1283,12 @@ static int shutdown_requested, shutdown_signal = -1;
>  static pid_t shutdown_pid;
>  static int powerdown_requested;
>  static int debug_requested;
> +static bool is_suspended;
> +static NotifierList suspend_notifiers =
> +    NOTIFIER_LIST_INITIALIZER(suspend_notifiers);
> +static NotifierList wakeup_notifiers =
> +    NOTIFIER_LIST_INITIALIZER(wakeup_notifiers);
> +static uint32_t wakeup_reason_mask = ~0;
>  static RunState vmstop_requested = RUN_STATE_MAX;
>  
>  int qemu_shutdown_requested_get(void)
> @@ -1398,6 +1404,49 @@ void qemu_system_reset_request(void)
>      qemu_notify_event();
>  }
>  
> +void qemu_system_suspend_request(void)
> +{
> +    if (is_suspended) {
> +        return;
> +    }
> +    cpu_stop_current();
> +    notifier_list_notify(&suspend_notifiers, NULL);
> +    is_suspended = true;
> +}
> +
Shouldn't we stop the whole VM at some point, not only the vcpu that
does the ACPI IO? Maybe I missed where it is done in the patch series.

> +void qemu_register_suspend_notifier(Notifier *notifier)
> +{
> +    notifier_list_add(&suspend_notifiers, notifier);
> +}
> +
> +void qemu_system_wakeup_request(WakeupReason reason)
> +{
> +    if (!is_suspended) {
> +        return;
> +    }
> +    if (!(wakeup_reason_mask & (1 << reason))) {
> +        return;
> +    }
> +    notifier_list_notify(&wakeup_notifiers, &reason);
> +    reset_requested = 1;
> +    qemu_notify_event();
> +    is_suspended = false;
> +}
> +
> +void qemu_system_wakeup_enable(WakeupReason reason, bool enabled)
> +{
> +    if (enabled) {
> +        wakeup_reason_mask |= (1 << reason);
> +    } else {
> +        wakeup_reason_mask &= ~(1 << reason);
> +    }
> +}
> +
> +void qemu_register_wakeup_notifier(Notifier *notifier)
> +{
> +    notifier_list_add(&wakeup_notifiers, notifier);
> +}
> +
>  void qemu_system_killed(int signal, pid_t pid)
>  {
>      shutdown_signal = signal;
> -- 
> 1.7.1

--
			Gleb.
Gerd Hoffmann - Feb. 14, 2012, 8:18 a.m.
Hi,

> Shouldn't we stop the whole VM at some point, not only the vcpu that
> does the ACPI IO? Maybe I missed where it is done in the patch series.

It isn't hidden elsewhere, qemu doesn't do it.   The code was like that
before, and I think the reason is that the guest has to stop the other
cpus before entering s3.

cheers,
  Gerd
Gleb Natapov - Feb. 14, 2012, 8:37 a.m.
On Tue, Feb 14, 2012 at 09:18:34AM +0100, Gerd Hoffmann wrote:
>   Hi,
> 
> > Shouldn't we stop the whole VM at some point, not only the vcpu that
> > does the ACPI IO? Maybe I missed where it is done in the patch series.
> 
> It isn't hidden elsewhere, qemu doesn't do it.   The code was like that
> before, and I think the reason is that the guest has to stop the other
> cpus before entering s3.
> 
The current code calls qemu_system_reset_request(), which takes care of
stopping (and resetting) all vcpus (and the rest of the machine) by setting
the reset_requested flag immediately on suspend. We cannot just stop the
current cpu on S3 and delay the reset till wakeup, since the guest can leave
other vcpus in a spinning state and they will take 100% of host cpu while
the guest is suspended. I think it is also important to reset all devices
immediately to ensure that no device will do DMA into main memory after
suspend. Technically, if this happens it would be a guest bug, since the
guest should make sure that devices are stopped before entering S3,
but I wouldn't want to debug such a bug report :)

--
			Gleb.
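
The concern about spinning vcpus can be made concrete with a toy model.
This is only a sketch under stated assumptions: DemoVcpu, demo_vcpus and
all demo_* functions are hypothetical and do not correspond to QEMU's CPU
state or to any function in the patch. It contrasts stopping only the vcpu
that performed the ACPI write with stopping every vcpu at suspend time.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_NR_VCPUS 4

/* Hypothetical per-vcpu state; a real vcpu carries far more than this. */
typedef struct DemoVcpu {
    bool stopped;
} DemoVcpu;

static DemoVcpu demo_vcpus[DEMO_NR_VCPUS];
static bool demo_is_suspended;

/* Models the behaviour in the patch: only the calling vcpu is stopped. */
static void demo_suspend_current_only(size_t current)
{
    if (demo_is_suspended) {
        return;
    }
    demo_vcpus[current].stopped = true;
    demo_is_suspended = true;
}

/* Models the behaviour asked for in the review: stop every vcpu. */
static void demo_suspend_all(void)
{
    if (demo_is_suspended) {
        return;
    }
    for (size_t i = 0; i < DEMO_NR_VCPUS; i++) {
        demo_vcpus[i].stopped = true;
    }
    demo_is_suspended = true;
}

/* Number of vcpus that could still burn host cpu time. */
static size_t demo_count_running(void)
{
    size_t running = 0;
    for (size_t i = 0; i < DEMO_NR_VCPUS; i++) {
        if (!demo_vcpus[i].stopped) {
            running++;
        }
    }
    return running;
}

/* Return the model to the running, non-suspended state. */
static void demo_reset(void)
{
    for (size_t i = 0; i < DEMO_NR_VCPUS; i++) {
        demo_vcpus[i].stopped = false;
    }
    demo_is_suspended = false;
}
```

With four vcpus, demo_suspend_current_only(0) leaves three vcpus running,
so a guest that parks them in a busy loop keeps consuming host cpu time.
After demo_suspend_all() none are left running, regardless of what the
guest did before entering S3.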
Gerd Hoffmann - Feb. 14, 2012, 8:57 a.m.
On 02/14/12 09:37, Gleb Natapov wrote:
> On Tue, Feb 14, 2012 at 09:18:34AM +0100, Gerd Hoffmann wrote:
>>   Hi,
>>
>>> Shouldn't we stop the whole VM at some point, not only the vcpu that
>>> does the ACPI IO? Maybe I missed where it is done in the patch series.
>>
>> It isn't hidden elsewhere, qemu doesn't do it.   The code was like that
>> before, and I think the reason is that the guest has to stop the other
>> cpus before entering s3.
>>
> The current code calls qemu_system_reset_request(), which takes care of
> stopping (and resetting) all vcpus (and the rest of the machine) by setting
> the reset_requested flag immediately on suspend.

No.  The current code simply has no separate suspend and wakeup steps.

> We cannot just stop the
> current cpu on S3 and delay the reset till wakeup, since the guest can leave
> other vcpus in a spinning state and they will take 100% of host cpu while
> the guest is suspended.

I see.  I had expected the guest OS to put them into a hlt loop or
some similar idle state.  Playing it safe and explicitly pausing them all
is certainly good from a robustness perspective.

> I think it is also important to reset all devices
> immediately to ensure that no device will do DMA into main memory after
> suspend.

Didn't investigate yet, but I suspect this could break wakeup from pci
devices (nic, usb-tablet via uhci) ...

> Technically, if this happens it would be a guest bug, since the
> guest should make sure that devices are stopped before entering S3,
> but I wouldn't want to debug such a bug report :)

... and it shouldn't be needed.  Although I agree that bugs in that area
are nasty to debug ...

cheers,
  Gerd
Gleb Natapov - Feb. 14, 2012, 9:08 a.m.
On Tue, Feb 14, 2012 at 09:57:01AM +0100, Gerd Hoffmann wrote:
> On 02/14/12 09:37, Gleb Natapov wrote:
> > On Tue, Feb 14, 2012 at 09:18:34AM +0100, Gerd Hoffmann wrote:
> >>   Hi,
> >>
> >>> Shouldn't we stop the whole VM at some point, not only the vcpu that
> >>> does the ACPI IO? Maybe I missed where it is done in the patch series.
> >>
> >> It isn't hidden elsewhere, qemu doesn't do it.   The code was like that
> >> before, and I think the reason is that the guest has to stop the other
> >> cpus before entering s3.
> >>
> > The current code calls qemu_system_reset_request(), which takes care of
> > stopping (and resetting) all vcpus (and the rest of the machine) by setting
> > the reset_requested flag immediately on suspend.
> 
> No.  The current code simply has no separate suspend and wakeup steps.
> 
Nod.

> > We cannot just stop the
> > current cpu on S3 and delay the reset till wakeup, since the guest can leave
> > other vcpus in a spinning state and they will take 100% of host cpu while
> > the guest is suspended.
> 
> I see.  I had expected the guest OS to put them into a hlt loop or
> some similar idle state.  Playing it safe and explicitly pausing them all
> is certainly good from a robustness perspective.
Yes. We should not trust a guest to do the "right thing".

> 
> > I think it is also important to reset all devices
> > immediately to ensure that no device will do DMA into main memory after
> > suspend.
> 
> Didn't investigate yet, but I suspect this could break wakeup from pci
> devices (nic, usb-tablet via uhci) ...

Yes. I can't say I fully understand how this works on real HW. I know
that there are separate "power planes" for different system states
(this is defined in the ACPI spec). So in S3 some devices (or even parts of
a device?) may be powered down, but others still have power. I'm not sure
we should dive into emulating that in this patch series.

> 
> > Technically, if this happens it would be a guest bug, since the
> > guest should make sure that devices are stopped before entering S3,
> > but I wouldn't want to debug such a bug report :)
> 
> ... and it shouldn't be needed.  Although I agree that bugs in that area
> are nasty to debug ...
> 
> cheers,
>   Gerd

--
			Gleb.

Patch

diff --git a/sysemu.h b/sysemu.h
index 9d5ce33..af73813 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -38,7 +38,16 @@  void vm_start(void);
 void vm_stop(RunState state);
 void vm_stop_force_state(RunState state);
 
+typedef enum WakeupReason {
+    QEMU_WAKEUP_REASON_OTHER = 0,
+} WakeupReason;
+
 void qemu_system_reset_request(void);
+void qemu_system_suspend_request(void);
+void qemu_register_suspend_notifier(Notifier *notifier);
+void qemu_system_wakeup_request(WakeupReason reason);
+void qemu_system_wakeup_enable(WakeupReason reason, bool enabled);
+void qemu_register_wakeup_notifier(Notifier *notifier);
 void qemu_system_shutdown_request(void);
 void qemu_system_powerdown_request(void);
 void qemu_system_debug_request(void);
diff --git a/vl.c b/vl.c
index 63dd725..5095e06 100644
--- a/vl.c
+++ b/vl.c
@@ -1283,6 +1283,12 @@  static int shutdown_requested, shutdown_signal = -1;
 static pid_t shutdown_pid;
 static int powerdown_requested;
 static int debug_requested;
+static bool is_suspended;
+static NotifierList suspend_notifiers =
+    NOTIFIER_LIST_INITIALIZER(suspend_notifiers);
+static NotifierList wakeup_notifiers =
+    NOTIFIER_LIST_INITIALIZER(wakeup_notifiers);
+static uint32_t wakeup_reason_mask = ~0;
 static RunState vmstop_requested = RUN_STATE_MAX;
 
 int qemu_shutdown_requested_get(void)
@@ -1398,6 +1404,49 @@  void qemu_system_reset_request(void)
     qemu_notify_event();
 }
 
+void qemu_system_suspend_request(void)
+{
+    if (is_suspended) {
+        return;
+    }
+    cpu_stop_current();
+    notifier_list_notify(&suspend_notifiers, NULL);
+    is_suspended = true;
+}
+
+void qemu_register_suspend_notifier(Notifier *notifier)
+{
+    notifier_list_add(&suspend_notifiers, notifier);
+}
+
+void qemu_system_wakeup_request(WakeupReason reason)
+{
+    if (!is_suspended) {
+        return;
+    }
+    if (!(wakeup_reason_mask & (1 << reason))) {
+        return;
+    }
+    notifier_list_notify(&wakeup_notifiers, &reason);
+    reset_requested = 1;
+    qemu_notify_event();
+    is_suspended = false;
+}
+
+void qemu_system_wakeup_enable(WakeupReason reason, bool enabled)
+{
+    if (enabled) {
+        wakeup_reason_mask |= (1 << reason);
+    } else {
+        wakeup_reason_mask &= ~(1 << reason);
+    }
+}
+
+void qemu_register_wakeup_notifier(Notifier *notifier)
+{
+    notifier_list_add(&wakeup_notifiers, notifier);
+}
+
 void qemu_system_killed(int signal, pid_t pid)
 {
     shutdown_signal = signal;
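
The wakeup mask handling in qemu_system_wakeup_enable and
qemu_system_wakeup_request can be tried in isolation with the sketch
below. It is not the patch code: the demo_* names are hypothetical, the
notifier call and reset request are left out, and the RTC and NIC reasons
are invented for illustration (the patch defines only
QEMU_WAKEUP_REASON_OTHER). The sketch also assumes an unsigned constant for
the shift, where the patch uses a plain 1.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum DemoWakeupReason {
    DEMO_WAKEUP_REASON_OTHER = 0, /* mirrors QEMU_WAKEUP_REASON_OTHER */
    DEMO_WAKEUP_REASON_RTC   = 1, /* hypothetical, for illustration */
    DEMO_WAKEUP_REASON_NIC   = 2, /* hypothetical, for illustration */
} DemoWakeupReason;

/* All reasons are enabled by default, as with wakeup_reason_mask = ~0. */
static uint32_t demo_wakeup_mask = ~UINT32_C(0);
static bool demo_suspended;

/* Set or clear the bit that corresponds to one wakeup reason. */
static void demo_wakeup_enable(DemoWakeupReason reason, bool enabled)
{
    if (enabled) {
        demo_wakeup_mask |= (UINT32_C(1) << reason);
    } else {
        demo_wakeup_mask &= ~(UINT32_C(1) << reason);
    }
}

/* Returns true if the request actually woke the (modelled) guest. */
static bool demo_wakeup_request(DemoWakeupReason reason)
{
    if (!demo_suspended) {
        return false; /* nothing to wake up */
    }
    if (!(demo_wakeup_mask & (UINT32_C(1) << reason))) {
        return false; /* this reason is currently disabled */
    }
    demo_suspended = false;
    return true;
}
```

With the guest marked as suspended, disabling DEMO_WAKEUP_REASON_NIC makes
a NIC wakeup request return false and leaves the guest suspended, while an
RTC request still wakes it. Using UINT32_C(1) keeps the shift well defined
even for reason 31, where shifting a signed int 1 would overflow.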