diff mbox

[RFC,v7,15/21] replay: checkpoints

Message ID 20150112120111.3504.51955.stgit@PASHA-ISP
State New

Commit Message

Pavel Dovgalyuk Jan. 12, 2015, 12:01 p.m. UTC
This patch introduces checkpoints that synchronize the CPU thread and the
iothread. When a checkpoint is reached in the code, all asynchronous events
from the queue are executed.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
---
 block.c                  |   11 +++++++++++
 cpus.c                   |    7 ++++++-
 include/qemu/timer.h     |    6 ++++--
 main-loop.c              |    5 +++++
 qemu-timer.c             |   46 ++++++++++++++++++++++++++++++++++++++--------
 replay/replay-internal.h |    3 +++
 replay/replay.c          |   28 ++++++++++++++++++++++++++++
 replay/replay.h          |    6 ++++++
 stubs/replay.c           |   11 +++++++++++
 vl.c                     |    3 ++-
 10 files changed, 114 insertions(+), 12 deletions(-)

Comments

Paolo Bonzini Jan. 12, 2015, 12:13 p.m. UTC | #1
On 12/01/2015 13:01, Pavel Dovgalyuk wrote:
> +    default:
> +    case QEMU_CLOCK_VIRTUAL:
> +        if ((replay_mode != REPLAY_MODE_NONE && !runstate_is_running())
> +            || !replay_checkpoint(run_all ? 2 : 3)) {
> +            return false;
> +        }
> +        break;

Please document the meaning of the numbers by making an enum.  Why do
you have to distinguish run_all?

Paolo
Pavel Dovgalyuk Jan. 13, 2015, 9:07 a.m. UTC | #2
> From: Paolo Bonzini [mailto:pbonzini@redhat.com]
> On 12/01/2015 13:01, Pavel Dovgalyuk wrote:
> > +    default:
> > +    case QEMU_CLOCK_VIRTUAL:
> > +        if ((replay_mode != REPLAY_MODE_NONE && !runstate_is_running())
> > +            || !replay_checkpoint(run_all ? 2 : 3)) {
> > +            return false;
> > +        }
> > +        break;
> 
> Please document the meaning of the numbers by making an enum.  

The numbers have no meaning. They just have to be distinct in different places.

> Why do you have to distinguish run_all?

It seems to be an artifact of earlier versions. I'll remove it.

Pavel Dovgalyuk
Pavel Dovgalyuk Jan. 13, 2015, 9:15 a.m. UTC | #3
> From: Pavel Dovgaluk [mailto:Pavel.Dovgaluk@ispras.ru]
> > From: Paolo Bonzini [mailto:pbonzini@redhat.com]
> > On 12/01/2015 13:01, Pavel Dovgalyuk wrote:
> > > +    default:
> > > +    case QEMU_CLOCK_VIRTUAL:
> > > +        if ((replay_mode != REPLAY_MODE_NONE && !runstate_is_running())
> > > +            || !replay_checkpoint(run_all ? 2 : 3)) {
> > > +            return false;
> > > +        }
> > > +        break;
> >
> > Please document the meaning of the numbers by making an enum.
> 
> The numbers have no meaning. They just have to be distinct in different places.
> 
> > Why do you have to distinguish run_all?
> 
> It seems to be early versions artifact. I'll remove it.

Sorry, I missed one thing.
run_all is used to distinguish timers processed in AIO, via calls to the timerlistgroup_run_timers function,
from timers processed in the main loop, via qemu_clock_run_all_timers.
We need to distinguish them to preserve the sequence of events.
This matters when we use checkpointing while recording the execution.

Pavel Dovgalyuk
Paolo Bonzini Jan. 13, 2015, 9:40 a.m. UTC | #4
On 13/01/2015 10:15, Pavel Dovgaluk wrote:
> The numbers have no meaning. They just have to be distinct in different places.

This is easier to achieve if you give a name to each place.

> Sorry, missed one thing.
> run_all is used to distinguish timers processed in AIO by calling of timerlistgroup_run_timers function
> and in main loop by calling qemu_clock_run_all_timers.

Should you instead distinguish which TimerListGroup is being run?  Then
main loop would checkpoint once for every TimerListGroup (which means
twice).

> We need to distinguish that to secure the sequence of the events.
> It makes sense when we use checkpointing while recording the execution.

I see.

Paolo
Pavel Dovgalyuk Jan. 13, 2015, 2:26 p.m. UTC | #5
> From: Paolo Bonzini [mailto:pbonzini@redhat.com]
> On 13/01/2015 10:15, Pavel Dovgaluk wrote:
> > The numbers have no meaning. They just have to be distinct in different places.
> 
> This is easier to achieve if you give a name to each place.
> 
> > Sorry, missed one thing.
> > run_all is used to distinguish timers processed in AIO by calling of
> timerlistgroup_run_timers function
> > and in main loop by calling qemu_clock_run_all_timers.
> 
> Should you instead distinguish which TimerListGroup is being run?  Then
> main loop would checkpoint once for every TimerListGroup (which means
> twice).

Then I'll have to introduce some kind of deterministic ID for them,
and a new asynchronous event to be saved instead of common checkpoints,
because these calls can be invoked in any order.
But I have no idea how to make such a deterministic ID.

Pavel Dovgalyuk
Paolo Bonzini Jan. 13, 2015, 2:52 p.m. UTC | #6
On 13/01/2015 15:26, Pavel Dovgaluk wrote:
>> > Should you instead distinguish which TimerListGroup is being run?  Then
>> > main loop would checkpoint once for every TimerListGroup (which means
>> > twice).
> Then I'll have to introduce some kind of deterministic ID for them
> and new asynchronous event to be saved instead of common checkpoints.
> Because these calls can be invoked in any order.
> But I have no idea of how to make such a deterministic ID.

There is currently one TimerListGroup per iothread object, and two more.
 You can start by not supporting iothreads, or you can give a
deterministic ID to iothreads.

Paolo
Paolo Bonzini Jan. 13, 2015, 2:53 p.m. UTC | #7
On 13/01/2015 15:26, Pavel Dovgaluk wrote:
>> > Should you instead distinguish which TimerListGroup is being run?  Then
>> > main loop would checkpoint once for every TimerListGroup (which means
>> > twice).
> Then I'll have to introduce some kind of deterministic ID for them
> and new asynchronous event to be saved instead of common checkpoints.
> Because these calls can be invoked in any order.
> But I have no idea of how to make such a deterministic ID.

There is currently one TimerListGroup per iothread object, and two more.
 You can start by not supporting iothreads, or you can give a
deterministic ID to iothreads.  Thread pools are similar to
TimerListGroups, except there is only one global pool instead of two.

Paolo
Pavel Dovgalyuk Jan. 22, 2015, 8:50 a.m. UTC | #8
> From: Paolo Bonzini [mailto:pbonzini@redhat.com]
> On 13/01/2015 10:15, Pavel Dovgaluk wrote:
> > The numbers have no meaning. They just have to be distinct in different places.
> 
> This is easier to achieve if you give a name to each place.
> 
> > Sorry, missed one thing.
> > run_all is used to distinguish timers processed in AIO by calling of
> timerlistgroup_run_timers function
> > and in main loop by calling qemu_clock_run_all_timers.
> 
> Should you instead distinguish which TimerListGroup is being run?  Then
> main loop would checkpoint once for every TimerListGroup (which means
> twice).

I checked that. Only qemu_clock_run_all_timers needs to be distinguished from all the other calls.
Distinguishing the TimerListGroups does not work - replay hangs.

Pavel Dovgalyuk

Patch

diff --git a/block.c b/block.c
index 4165d42..35f389d 100644
--- a/block.c
+++ b/block.c
@@ -1966,6 +1966,11 @@  void bdrv_drain_all(void)
     BlockDriverState *bs;
 
     while (busy) {
+        if (!replay_checkpoint(8)) {
+            /* Do not wait any more; during replay we stopped
+               somewhere in the middle of execution */
+            return;
+        }
         busy = false;
 
         QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
@@ -1976,6 +1981,12 @@  void bdrv_drain_all(void)
             aio_context_release(aio_context);
         }
     }
+    if (replay_mode == REPLAY_MODE_PLAY) {
+        /* Skip checkpoints from the log */
+        while (replay_checkpoint(8)) {
+            /* Nothing */
+        }
+    }
 }
 
 /* make a BlockDriverState anonymous by removing from bdrv_state and
diff --git a/cpus.c b/cpus.c
index 01d89aa..907a4bb 100644
--- a/cpus.c
+++ b/cpus.c
@@ -388,7 +388,7 @@  void qtest_clock_warp(int64_t dest)
         timers_state.qemu_icount_bias += warp;
         seqlock_write_unlock(&timers_state.vm_clock_seqlock);
 
-        qemu_clock_run_timers(QEMU_CLOCK_VIRTUAL);
+        qemu_clock_run_timers(QEMU_CLOCK_VIRTUAL, false);
         clock = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
     }
     qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
@@ -408,6 +408,11 @@  void qemu_clock_warp(QEMUClockType type)
         return;
     }
 
+    /* warp clock deterministically in record/replay mode */
+    if (!replay_checkpoint(4)) {
+        return;
+    }
+
     /*
      * If the CPUs have been sleeping, advance QEMU_CLOCK_VIRTUAL timer now.
      * This ensures that the deadline for the timer is computed correctly below.
diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index a8bc9eb..19a79b8 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -240,13 +240,14 @@  void qemu_clock_unregister_reset_notifier(QEMUClockType type,
 /**
  * qemu_clock_run_timers:
  * @type: clock on which to operate
+ * @run_all: true, when called from qemu_clock_run_all_timers
  *
  * Run all the timers associated with the default timer list
  * of a clock.
  *
  * Returns: true if any timer ran.
  */
-bool qemu_clock_run_timers(QEMUClockType type);
+bool qemu_clock_run_timers(QEMUClockType type, bool run_all);
 
 /**
  * qemu_clock_run_all_timers:
@@ -337,12 +338,13 @@  QEMUClockType timerlist_get_clock(QEMUTimerList *timer_list);
 /**
  * timerlist_run_timers:
  * @timer_list: the timer list to use
+ * @run_all: true, when called from qemu_clock_run_all_timers
  *
  * Call all expired timers associated with the timer list.
  *
  * Returns: true if any timer expired
  */
-bool timerlist_run_timers(QEMUTimerList *timer_list);
+bool timerlist_run_timers(QEMUTimerList *timer_list, bool run_all);
 
 /**
  * timerlist_notify:
diff --git a/main-loop.c b/main-loop.c
index 981bcb5..d6e93c3 100644
--- a/main-loop.c
+++ b/main-loop.c
@@ -497,6 +497,11 @@  int main_loop_wait(int nonblocking)
     slirp_pollfds_poll(gpollfds, (ret < 0));
 #endif
 
+    /* The CPU thread can wait forever for an event
+       after missing the clock warp */
+    if (replay_mode == REPLAY_MODE_PLAY) {
+        qemu_clock_warp(QEMU_CLOCK_VIRTUAL);
+    }
     qemu_clock_run_all_timers();
 
     return ret;
diff --git a/qemu-timer.c b/qemu-timer.c
index 19b82f6..e8b61d7 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -465,7 +465,7 @@  bool timer_expired(QEMUTimer *timer_head, int64_t current_time)
     return timer_expired_ns(timer_head, current_time * timer_head->scale);
 }
 
-bool timerlist_run_timers(QEMUTimerList *timer_list)
+bool timerlist_run_timers(QEMUTimerList *timer_list, bool run_all)
 {
     QEMUTimer *ts;
     int64_t current_time;
@@ -473,6 +473,29 @@  bool timerlist_run_timers(QEMUTimerList *timer_list)
     QEMUTimerCB *cb;
     void *opaque;
 
+    switch (timer_list->clock->type) {
+    case QEMU_CLOCK_REALTIME:
+        break;
+    default:
+    case QEMU_CLOCK_VIRTUAL:
+        if ((replay_mode != REPLAY_MODE_NONE && !runstate_is_running())
+            || !replay_checkpoint(run_all ? 2 : 3)) {
+            return false;
+        }
+        break;
+    case QEMU_CLOCK_HOST:
+        if ((replay_mode != REPLAY_MODE_NONE && !runstate_is_running())
+            || !replay_checkpoint(run_all ? 5 : 6)) {
+            return false;
+        }
+        break;
+    case QEMU_CLOCK_VIRTUAL_RT:
+        if ((replay_mode != REPLAY_MODE_NONE && !runstate_is_running())
+            || !replay_checkpoint(run_all ? 10 : 11)) {
+            return false;
+        }
+        break;
+    }
+
     qemu_event_reset(&timer_list->timers_done_ev);
     if (!timer_list->clock->enabled) {
         goto out;
@@ -505,9 +528,9 @@  out:
     return progress;
 }
 
-bool qemu_clock_run_timers(QEMUClockType type)
+bool qemu_clock_run_timers(QEMUClockType type, bool run_all)
 {
-    return timerlist_run_timers(main_loop_tlg.tl[type]);
+    return timerlist_run_timers(main_loop_tlg.tl[type], run_all);
 }
 
 void timerlistgroup_init(QEMUTimerListGroup *tlg,
@@ -532,7 +555,7 @@  bool timerlistgroup_run_timers(QEMUTimerListGroup *tlg)
     QEMUClockType type;
     bool progress = false;
     for (type = 0; type < QEMU_CLOCK_MAX; type++) {
-        progress |= timerlist_run_timers(tlg->tl[type]);
+        progress |= timerlist_run_timers(tlg->tl[type], false);
     }
     return progress;
 }
@@ -541,11 +564,18 @@  int64_t timerlistgroup_deadline_ns(QEMUTimerListGroup *tlg)
 {
     int64_t deadline = -1;
     QEMUClockType type;
+    bool play = replay_mode == REPLAY_MODE_PLAY;
     for (type = 0; type < QEMU_CLOCK_MAX; type++) {
         if (qemu_clock_use_for_deadline(tlg->tl[type]->clock->type)) {
-            deadline = qemu_soonest_timeout(deadline,
-                                            timerlist_deadline_ns(
-                                                tlg->tl[type]));
+            if (!play || tlg->tl[type]->clock->type == QEMU_CLOCK_REALTIME) {
+                deadline = qemu_soonest_timeout(deadline,
+                                                timerlist_deadline_ns(
+                                                    tlg->tl[type]));
+            } else {
+                /* Read the clock from the replay file and do not
+                   calculate the deadline based on the virtual clock. */
+                qemu_clock_get_ns(tlg->tl[type]->clock->type);
+            }
         }
     }
     return deadline;
@@ -615,7 +645,7 @@  bool qemu_clock_run_all_timers(void)
     QEMUClockType type;
 
     for (type = 0; type < QEMU_CLOCK_MAX; type++) {
-        progress |= qemu_clock_run_timers(type);
+        progress |= qemu_clock_run_timers(type, true);
     }
 
     return progress;
diff --git a/replay/replay-internal.h b/replay/replay-internal.h
index ec6973d..5dad566 100755
--- a/replay/replay-internal.h
+++ b/replay/replay-internal.h
@@ -32,6 +32,9 @@ 
 #define EVENT_CLOCK                 64
 /* some of the greater codes are reserved for clocks */
 
+/* for checkpoint event */
+#define EVENT_CHECKPOINT            96
+
 /* Asynchronous events IDs */
 
 #define REPLAY_ASYNC_COUNT             0
diff --git a/replay/replay.c b/replay/replay.c
index 7a0b7b1..27176c3 100755
--- a/replay/replay.c
+++ b/replay/replay.c
@@ -158,3 +158,31 @@  void replay_shutdown_request(void)
         replay_put_event(EVENT_SHUTDOWN);
     }
 }
+
+/* Used checkpoints: 2 3 4 5 6 7 8 9 10 11 */
+int replay_checkpoint(unsigned int checkpoint)
+{
+    replay_save_instructions();
+
+    if (replay_file) {
+        if (replay_mode == REPLAY_MODE_PLAY) {
+            if (!skip_async_events(EVENT_CHECKPOINT + checkpoint)) {
+                if (replay_data_kind == EVENT_ASYNC_OPT) {
+                    replay_read_events(checkpoint);
+                    replay_fetch_data_kind();
+                    return replay_data_kind != EVENT_ASYNC_OPT;
+                }
+                return 0;
+            }
+            replay_has_unread_data = 0;
+            replay_read_events(checkpoint);
+            replay_fetch_data_kind();
+            return replay_data_kind != EVENT_ASYNC_OPT;
+        } else if (replay_mode == REPLAY_MODE_RECORD) {
+            replay_put_event(EVENT_CHECKPOINT + checkpoint);
+            replay_save_events(checkpoint);
+        }
+    }
+
+    return 1;
+}
diff --git a/replay/replay.h b/replay/replay.h
index 00c9906..6961751 100755
--- a/replay/replay.h
+++ b/replay/replay.h
@@ -82,6 +82,12 @@  void replay_read_tm(struct tm *tm);
 
 /*! Called when qemu shutdown is requested. */
 void replay_shutdown_request(void);
+/*! Should be called at checkpoints in the execution.
+    These checkpoints are skipped if they were not reached.
+    Saves the checkpoint in RECORD mode and validates it in PLAY mode.
+    Returns 0 in PLAY mode if the checkpoint was not found.
+    Returns 1 in all other cases. */
+int replay_checkpoint(unsigned int checkpoint);
 
 /* Asynchronous events queue */
 
diff --git a/stubs/replay.c b/stubs/replay.c
index 397d4f2..3bb2d90 100755
--- a/stubs/replay.c
+++ b/stubs/replay.c
@@ -1,4 +1,5 @@ 
 #include "replay/replay.h"
+#include "sysemu/sysemu.h"
 
 ReplayMode replay_mode;
 
@@ -15,3 +16,13 @@  int64_t replay_read_clock(unsigned int kind)
 {
     return 0;
 }
+
+int replay_checkpoint(unsigned int checkpoint)
+{
+    return 0;
+}
+
+int runstate_is_running(void)
+{
+    return 0;
+}
diff --git a/vl.c b/vl.c
index e37a016..663e4db 100644
--- a/vl.c
+++ b/vl.c
@@ -1768,7 +1768,8 @@  static bool main_loop_should_exit(void)
             return true;
         }
     }
-    if (qemu_reset_requested()) {
+    if (qemu_reset_requested_get() && replay_checkpoint(7)) {
+        qemu_reset_requested();
         pause_all_vcpus();
         cpu_synchronize_all_states();
         qemu_system_reset(VMRESET_REPORT);