diff mbox

[v5,07/16] apic: Open-code timer save/restore

Message ID 61e59db37279bb3834b996c84e9a0523638f5e35.1323952403.git.jan.kiszka@siemens.com
State New
Headers show

Commit Message

Jan Kiszka Dec. 15, 2011, 12:33 p.m. UTC
To enable migration between accelerated and non-accelerated APIC models,
we will need to handle the timer saving and restoring specially and can
no longer rely on the automatics of VMSTATE_TIMER. Specifically,
accelerated model will not start any QEMUTimer.

This patch therefore factors out the generic bits into apic_next_timer
and introduces a post-load callback that can be implemented differently
by both models.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 hw/apic.c          |   30 ++++++++++++------------------
 hw/apic_common.c   |   51 +++++++++++++++++++++++++++++++++++++++++++++++++--
 hw/apic_internal.h |    3 +++
 3 files changed, 64 insertions(+), 20 deletions(-)

Comments

Anthony Liguori Dec. 19, 2011, 10:21 p.m. UTC | #1
On 12/15/2011 06:33 AM, Jan Kiszka wrote:
> To enable migration between accelerated and non-accelerated APIC models,
> we will need to handle the timer saving and restoring specially and can
> no longer rely on the automatics of VMSTATE_TIMER. Specifically,
> accelerated model will not start any QEMUTimer.
>
> This patch therefore factors out the generic bits into apic_next_timer
> and introduces a post-load callback that can be implemented differently
> by both models.
>
> Signed-off-by: Jan Kiszka<jan.kiszka@siemens.com>

So you basically want the timer to be a dummy field for the in-kernel apic?

Can you fix this up in a pre-save routine (put QEMUTimer into a state where 
there isn't an event pending)?

Regards,

Anthony Liguori
Jan Kiszka Dec. 19, 2011, 11:45 p.m. UTC | #2
On 2011-12-19 23:21, Anthony Liguori wrote:
> On 12/15/2011 06:33 AM, Jan Kiszka wrote:
>> To enable migration between accelerated and non-accelerated APIC models,
>> we will need to handle the timer saving and restoring specially and can
>> no longer rely on the automatics of VMSTATE_TIMER. Specifically,
>> accelerated model will not start any QEMUTimer.
>>
>> This patch therefore factors out the generic bits into apic_next_timer
>> and introduces a post-load callback that can be implemented differently
>> by both models.
>>
>> Signed-off-by: Jan Kiszka<jan.kiszka@siemens.com>
> 
> So you basically want the timer to be a dummy field for the in-kernel apic?
> 
> Can you fix this up in a pre-save routine (put QEMUTimer into a state
> where there isn't an event pending)?

It is not a dummy field, it contains the proper state in both cases. We
just need to convert it to an open-coded state to avoid the QEMUTimer
restoration magic in the in-kernel case (where there must be no QEMUTimer).

Jan
Anthony Liguori Dec. 20, 2011, 12:31 a.m. UTC | #3
On 12/19/2011 05:45 PM, Jan Kiszka wrote:
> On 2011-12-19 23:21, Anthony Liguori wrote:
>> On 12/15/2011 06:33 AM, Jan Kiszka wrote:
>>> To enable migration between accelerated and non-accelerated APIC models,
>>> we will need to handle the timer saving and restoring specially and can
>>> no longer rely on the automatics of VMSTATE_TIMER. Specifically,
>>> accelerated model will not start any QEMUTimer.
>>>
>>> This patch therefore factors out the generic bits into apic_next_timer
>>> and introduces a post-load callback that can be implemented differently
>>> by both models.
>>>
>>> Signed-off-by: Jan Kiszka<jan.kiszka@siemens.com>
>>
>> So you basically want the timer to be a dummy field for the in-kernel apic?
>>
>> Can you fix this up in a pre-save routine (put QEMUTimer into a state
>> where there isn't an event pending)?
>
> It is not a dummy field, it contains the proper state in both cases. We
> just need to convert it to an open-coded state to avoid the QEMUTimer
> restoration magic in the in-kernel case (where there must be no QEMUTimer).

So the state gets fed into the kernel instead of userspace?

This seems a bit much to me, can't we just have two VMStateDescriptions that 
happen to look the same and break migration between userspace and in-kernel?

Are we trying to solve a problem no one cares about?

If you want to avoid regressing migration compat in qemu-kvm, have the vmstate 
name both be the same, it can be two separate devices as the vmstate name is not 
tied to the qdev name right now.

Regards,

Anthony Liguori

>
> Jan
>
Jan Kiszka Dec. 20, 2011, 12:34 a.m. UTC | #4
On 2011-12-20 01:31, Anthony Liguori wrote:
> On 12/19/2011 05:45 PM, Jan Kiszka wrote:
>> On 2011-12-19 23:21, Anthony Liguori wrote:
>>> On 12/15/2011 06:33 AM, Jan Kiszka wrote:
>>>> To enable migration between accelerated and non-accelerated APIC
>>>> models,
>>>> we will need to handle the timer saving and restoring specially and can
>>>> no longer rely on the automatics of VMSTATE_TIMER. Specifically,
>>>> accelerated model will not start any QEMUTimer.
>>>>
>>>> This patch therefore factors out the generic bits into apic_next_timer
>>>> and introduces a post-load callback that can be implemented differently
>>>> by both models.
>>>>
>>>> Signed-off-by: Jan Kiszka<jan.kiszka@siemens.com>
>>>
>>> So you basically want the timer to be a dummy field for the in-kernel
>>> apic?
>>>
>>> Can you fix this up in a pre-save routine (put QEMUTimer into a state
>>> where there isn't an event pending)?
>>
>> It is not a dummy field, it contains the proper state in both cases. We
>> just need to convert it to an open-coded state to avoid the QEMUTimer
>> restoration magic in the in-kernel case (where there must be no
>> QEMUTimer).
> 
> So the state gets fed into the kernel instead of userspace?

Nope. It's kept for eventual use by a user space model.

> 
> This seems a bit much to me, can't we just have two VMStateDescriptions
> that happen to look the same and break migration between userspace and
> in-kernel?

There is nothing broken, at least according to my tests. Migration works
between both backend variants.

Jan
Anthony Liguori Dec. 20, 2011, 12:53 a.m. UTC | #5
On 12/19/2011 06:34 PM, Jan Kiszka wrote:
> On 2011-12-20 01:31, Anthony Liguori wrote:
>> On 12/19/2011 05:45 PM, Jan Kiszka wrote:
>>> On 2011-12-19 23:21, Anthony Liguori wrote:
>>>> On 12/15/2011 06:33 AM, Jan Kiszka wrote:
>>>>> To enable migration between accelerated and non-accelerated APIC
>>>>> models,
>>>>> we will need to handle the timer saving and restoring specially and can
>>>>> no longer rely on the automatics of VMSTATE_TIMER. Specifically,
>>>>> accelerated model will not start any QEMUTimer.
>>>>>
>>>>> This patch therefore factors out the generic bits into apic_next_timer
>>>>> and introduces a post-load callback that can be implemented differently
>>>>> by both models.
>>>>>
>>>>> Signed-off-by: Jan Kiszka<jan.kiszka@siemens.com>
>>>>
>>>> So you basically want the timer to be a dummy field for the in-kernel
>>>> apic?
>>>>
>>>> Can you fix this up in a pre-save routine (put QEMUTimer into a state
>>>> where there isn't an event pending)?
>>>
>>> It is not a dummy field, it contains the proper state in both cases. We
>>> just need to convert it to an open-coded state to avoid the QEMUTimer
>>> restoration magic in the in-kernel case (where there must be no
>>> QEMUTimer).
>>
>> So the state gets fed into the kernel instead of userspace?
>
> Nope. It's kept for eventual use by a user space model.

I think you misunderstood my comments.

When you are using the in-kernel APIC, the is no implementation for the 
post_load hook.  As far as I can tell, the state isn't used.

I know it's used by the user space model but from what I can tell, the value is 
essentially sync with the in-kernel APIC almost immediately as it happens during 
KVM_RUN.

So it's a QEMUTimer in the userspace model, but it's just an integer when used 
in the in-kernel APIC as the timer never fires.  It is just saved/restored from 
and to the kernel.

Is this correct?

Regards,

Anthony Liguori
Jan Kiszka Dec. 20, 2011, 1:24 a.m. UTC | #6
On 2011-12-20 01:53, Anthony Liguori wrote:
> On 12/19/2011 06:34 PM, Jan Kiszka wrote:
>> On 2011-12-20 01:31, Anthony Liguori wrote:
>>> On 12/19/2011 05:45 PM, Jan Kiszka wrote:
>>>> On 2011-12-19 23:21, Anthony Liguori wrote:
>>>>> On 12/15/2011 06:33 AM, Jan Kiszka wrote:
>>>>>> To enable migration between accelerated and non-accelerated APIC
>>>>>> models,
>>>>>> we will need to handle the timer saving and restoring specially
>>>>>> and can
>>>>>> no longer rely on the automatics of VMSTATE_TIMER. Specifically,
>>>>>> accelerated model will not start any QEMUTimer.
>>>>>>
>>>>>> This patch therefore factors out the generic bits into
>>>>>> apic_next_timer
>>>>>> and introduces a post-load callback that can be implemented
>>>>>> differently
>>>>>> by both models.
>>>>>>
>>>>>> Signed-off-by: Jan Kiszka<jan.kiszka@siemens.com>
>>>>>
>>>>> So you basically want the timer to be a dummy field for the in-kernel
>>>>> apic?
>>>>>
>>>>> Can you fix this up in a pre-save routine (put QEMUTimer into a state
>>>>> where there isn't an event pending)?
>>>>
>>>> It is not a dummy field, it contains the proper state in both cases. We
>>>> just need to convert it to an open-coded state to avoid the QEMUTimer
>>>> restoration magic in the in-kernel case (where there must be no
>>>> QEMUTimer).
>>>
>>> So the state gets fed into the kernel instead of userspace?
>>
>> Nope. It's kept for eventual use by a user space model.
> 
> I think you misunderstood my comments.
> 
> When you are using the in-kernel APIC, the is no implementation for the
> post_load hook.  As far as I can tell, the state isn't used.

Correct, it's just kept up to date.

> 
> I know it's used by the user space model but from what I can tell, the
> value is essentially sync with the in-kernel APIC almost immediately as
> it happens during KVM_RUN.
> 
> So it's a QEMUTimer in the userspace model, but it's just an integer
> when used in the in-kernel APIC as the timer never fires.  It is just
> saved/restored from and to the kernel.
> 
> Is this correct?

Almost. timer_expiry is calculated on get_apic_state based on the APIC
registers. And it is initialized on reset. But it is never saved into
the kernel nor does it otherwise affect the in-kernel model. It is
really just a compatibility field for potential user space apic usage.

Jan
diff mbox

Patch

diff --git a/hw/apic.c b/hw/apic.c
index 5fa3111..36d4ff3 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -521,25 +521,9 @@  static uint32_t apic_get_current_count(APICState *s)
 
 static void apic_timer_update(APICState *s, int64_t current_time)
 {
-    int64_t next_time, d;
-
-    if (!(s->lvt[APIC_LVT_TIMER] & APIC_LVT_MASKED)) {
-        d = (current_time - s->initial_count_load_time) >>
-            s->count_shift;
-        if (s->lvt[APIC_LVT_TIMER] & APIC_LVT_TIMER_PERIODIC) {
-            if (!s->initial_count)
-                goto no_timer;
-            d = ((d / ((uint64_t)s->initial_count + 1)) + 1) * ((uint64_t)s->initial_count + 1);
-        } else {
-            if (d >= s->initial_count)
-                goto no_timer;
-            d = (uint64_t)s->initial_count + 1;
-        }
-        next_time = s->initial_count_load_time + (d << s->count_shift);
-        qemu_mod_timer(s->timer, next_time);
-        s->next_time = next_time;
+    if (apic_next_timer(s, current_time)) {
+        qemu_mod_timer(s->timer, s->next_time);
     } else {
-    no_timer:
         qemu_del_timer(s->timer);
     }
 }
@@ -770,12 +754,22 @@  static void apic_backend_init(APICState *s)
     local_apics[s->idx] = s;
 }
 
+static void apic_post_load(APICState *s)
+{
+    if (s->timer_expiry != -1) {
+        qemu_mod_timer(s->timer, s->timer_expiry);
+    } else {
+        qemu_del_timer(s->timer);
+    }
+}
+
 static APICBackend apic_backend = {
     .name = "QEMU",
     .init = apic_backend_init,
     .set_base = apic_set_base,
     .set_tpr = apic_set_tpr,
     .external_nmi = apic_external_nmi,
+    .post_load = apic_post_load,
 };
 
 static void apic_register_devices(void)
diff --git a/hw/apic_common.c b/hw/apic_common.c
index 4cdc45c..3d345b9 100644
--- a/hw/apic_common.c
+++ b/hw/apic_common.c
@@ -89,6 +89,39 @@  void apic_deliver_nmi(DeviceState *d)
     s->backend->external_nmi(s);
 }
 
+bool apic_next_timer(APICState *s, int64_t current_time)
+{
+    int64_t d;
+
+    /* We need to store the timer state separately to support APIC
+     * implementations that maintain a non-QEMU timer, e.g. inside the
+     * host kernel. This open-coded state allows us to migrate between
+     * both models. */
+    s->timer_expiry = -1;
+
+    if (s->lvt[APIC_LVT_TIMER] & APIC_LVT_MASKED) {
+        return false;
+    }
+
+    d = (current_time - s->initial_count_load_time) >> s->count_shift;
+
+    if (s->lvt[APIC_LVT_TIMER] & APIC_LVT_TIMER_PERIODIC) {
+        if (!s->initial_count) {
+            return false;
+        }
+        d = ((d / ((uint64_t)s->initial_count + 1)) + 1) *
+            ((uint64_t)s->initial_count + 1);
+    } else {
+        if (d >= s->initial_count) {
+            return false;
+        }
+        d = (uint64_t)s->initial_count + 1;
+    }
+    s->next_time = s->initial_count_load_time + (d << s->count_shift);
+    s->timer_expiry = s->next_time;
+    return true;
+}
+
 void apic_init_reset(DeviceState *d)
 {
     APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
@@ -116,7 +149,10 @@  void apic_init_reset(DeviceState *d)
     s->next_time = 0;
     s->wait_for_sipi = 1;
 
-    qemu_del_timer(s->timer);
+    if (s->timer) {
+        qemu_del_timer(s->timer);
+    }
+    s->timer_expiry = -1;
 }
 
 static void apic_reset(DeviceState *d)
@@ -181,12 +217,23 @@  static int apic_load_old(QEMUFile *f, void *opaque, int version_id)
     return 0;
 }
 
+static int apic_dispatch_post_load(void *opaque, int version_id)
+{
+    APICState *s = opaque;
+
+    if (s->backend->post_load) {
+        s->backend->post_load(s);
+    }
+    return 0;
+}
+
 static const VMStateDescription vmstate_apic = {
     .name = "apic",
     .version_id = 3,
     .minimum_version_id = 3,
     .minimum_version_id_old = 1,
     .load_state_old = apic_load_old,
+    .post_load = apic_dispatch_post_load,
     .fields      = (VMStateField[]) {
         VMSTATE_UINT32(apicbase, APICState),
         VMSTATE_UINT8(id, APICState),
@@ -206,7 +253,7 @@  static const VMStateDescription vmstate_apic = {
         VMSTATE_UINT32(initial_count, APICState),
         VMSTATE_INT64(initial_count_load_time, APICState),
         VMSTATE_INT64(next_time, APICState),
-        VMSTATE_TIMER(timer, APICState),
+        VMSTATE_INT64(timer_expiry, APICState), /* open-coded timer state */
         VMSTATE_END_OF_LIST()
     }
 };
diff --git a/hw/apic_internal.h b/hw/apic_internal.h
index 6cbd901..898c93c 100644
--- a/hw/apic_internal.h
+++ b/hw/apic_internal.h
@@ -75,6 +75,7 @@  struct APICBackend {
     void (*set_base)(APICState *s, uint64_t val);
     void (*set_tpr)(APICState *s, uint8_t val);
     void (*external_nmi)(APICState *s);
+    void (*post_load)(APICState *s);
 
     QSIMPLEQ_ENTRY(APICBackend) entry;
 };
@@ -104,6 +105,7 @@  struct APICState {
     int64_t next_time;
     int idx;
     QEMUTimer *timer;
+    int64_t timer_expiry;
     int sipi_vector;
     int wait_for_sipi;
 
@@ -114,6 +116,7 @@  struct APICState {
 void apic_register_device(void);
 void apic_register_backend(APICBackend *backend);
 
+bool apic_next_timer(APICState *s, int64_t current_time);
 void apic_report_irq_delivered(int delivered);
 
 #endif /* !QEMU_APIC_INTERNAL_H */