
[v7,04/19] replay: don't drain/flush bdrv queue while RR is working

Message ID 20181010133357.24538.19180.stgit@pasha-VirtualBox
State New
Series Fixing record/replay and adding reverse debugging

Commit Message

Pavel Dovgalyuk Oct. 10, 2018, 1:33 p.m. UTC
In record/replay mode the bdrv queue is controlled by the replay mechanism.
It does not allow saving or loading snapshots while the bdrv queue is not
empty. Stopping the VM is not blocked by a non-empty queue, but flushing
the queue is still impossible there, because it may cause deadlocks in
replay mode.
This patch disables bdrv_drain_all and bdrv_flush_all in record/replay
mode.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
---
 block/io.c |   22 ++++++++++++++++++++++
 cpus.c     |    2 --
 2 files changed, 22 insertions(+), 2 deletions(-)

Comments

Kevin Wolf Nov. 28, 2018, 3:45 p.m. UTC | #1
Am 10.10.2018 um 15:33 hat Pavel Dovgalyuk geschrieben:
> In record/replay mode bdrv queue is controlled by replay mechanism.
> It does not allow saving or loading the snapshots
> when bdrv queue is not empty. Stopping the VM is not blocked by nonempty
> queue, but flushing the queue is still impossible there,
> because it may cause deadlocks in replay mode.
> This patch disables bdrv_drain_all and bdrv_flush_all in
> record/replay mode.
> 
> Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>

Disabling bdrv_flush_all() shouldn't make a big difference in replay
mode, so I agree with that.

But does the bdrv_drain_all() change mean that bdrv_drain_all_begin()
can return while there are still requests in flight? This sounds like
something that could break some existing code, because the meaning of
the function is that if it returns success, no requests are in flight
any more.

What would move the request forward in the replay case once it has been
created? Does this also involve some user interaction? Or could we process
the next event in a drain callback of blkreplay or something?

Kevin
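
For illustration, the drain-callback idea raised at the end of the message
above might look roughly like the sketch below. This is only a sketch under
assumptions: blkreplay implements no such callback in this series, and
replay_flush_pending_block_events() is a hypothetical helper.

    /* Hypothetical sketch only -- not part of this series.  The idea is to
     * let blkreplay itself push queued replay events forward when a drain
     * begins, instead of having bdrv_drain_all_begin() return early. */
    static void coroutine_fn blkreplay_co_drain_begin(BlockDriverState *bs)
    {
        /* replay_flush_pending_block_events() is an assumed helper that
         * processes queued block events until this BDS has no requests
         * left in flight. */
        replay_flush_pending_block_events(bs);
    }

    static BlockDriver bdrv_blkreplay = {
        .format_name         = "blkreplay",
        /* ... existing callbacks ... */
        .bdrv_co_drain_begin = blkreplay_co_drain_begin,
    };

Whether this is workable is exactly what the rest of the thread questions:
processing events from a drain callback would move replay forward, which
conflicts with the step-by-step debugging requirement raised in the reply.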
Pavel Dovgalyuk Nov. 30, 2018, 7:55 a.m. UTC | #2
> From: Kevin Wolf [mailto:kwolf@redhat.com]
> Am 10.10.2018 um 15:33 hat Pavel Dovgalyuk geschrieben:
> > In record/replay mode bdrv queue is controlled by replay mechanism.
> > It does not allow saving or loading the snapshots
> > when bdrv queue is not empty. Stopping the VM is not blocked by nonempty
> > queue, but flushing the queue is still impossible there,
> > because it may cause deadlocks in replay mode.
> > This patch disables bdrv_drain_all and bdrv_flush_all in
> > record/replay mode.
> >
> > Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
> 
> Disabling bdrv_flush_all() shouldn't make a big difference in replay
> mode, so I agree with that.
> 
> But does the bdrv_drain_all() change mean that bdrv_drain_all_begin()
> can return while there are still requests in flight? This sounds like
> something that could break some existing code, because the meaning of
> the function is that if it returns success, no requests are in flight
> any more.
> 
> What would move the request forward in the replay case once it has been
> created? Does this also involve some user interaction? Or may we process
> the next event in a drain callback of blkreplay or something?

You are right - there are some operations that can't be performed at an
arbitrary point during replay, because of unfinished block requests.

One of these is creating a VM snapshot. I've added a check in another
patch which prevents snapshotting while the event queue (which holds
the block requests) is not empty.

There could also be other such cases; those will be fixed as we run into them.

Moving replay forward until the queue is empty is not a good idea, because
step-by-step debugging should remain available and work without skipping
instructions.

Pavel Dovgalyuk
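
For illustration, the snapshot check described above might look roughly like
the sketch below. The helper name is an assumption; the real check lives in
another patch of the series, not in this one.

    /* Sketch only: refuse to take a snapshot while the replay event queue
     * (which holds the queued block requests) is not empty.
     * replay_has_events() is an assumed helper reporting exactly that. */
    if (replay_events_enabled() && replay_has_events()) {
        error_setg(errp, "Record/replay: can't create a snapshot "
                   "while block requests are still queued");
        return false;   /* exact return convention depends on the caller */
    }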
Kevin Wolf Nov. 30, 2018, 10:01 a.m. UTC | #3
Am 30.11.2018 um 08:55 hat Pavel Dovgalyuk geschrieben:
> > From: Kevin Wolf [mailto:kwolf@redhat.com]
> > Am 10.10.2018 um 15:33 hat Pavel Dovgalyuk geschrieben:
> > > In record/replay mode bdrv queue is controlled by replay mechanism.
> > > It does not allow saving or loading the snapshots
> > > when bdrv queue is not empty. Stopping the VM is not blocked by nonempty
> > > queue, but flushing the queue is still impossible there,
> > > because it may cause deadlocks in replay mode.
> > > This patch disables bdrv_drain_all and bdrv_flush_all in
> > > record/replay mode.
> > >
> > > Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
> > 
> > Disabling bdrv_flush_all() shouldn't make a big difference in replay
> > mode, so I agree with that.
> > 
> > But does the bdrv_drain_all() change mean that bdrv_drain_all_begin()
> > can return while there are still requests in flight? This sounds like
> > something that could break some existing code, because the meaning of
> > the function is that if it returns success, no requests are in flight
> > any more.
> > 
> > What would move the request forward in the replay case once it has been
> > created? Does this also involve some user interaction? Or may we process
> > the next event in a drain callback of blkreplay or something?
> 
> You are right - there are some operation that can't be performed at any
> time during the replay, because of unfinished block requests.
> 
> One of these is creating the VM snapshot. I've added a check in another
> patch, which prevents snapshotting while the event queue (which holds
> the block requests) is not empty.
> 
> There could also be other things, that will be fixed when we try to use it.
> 
> Moving replay forward until the queue is empty is not a good idea, because
> step-by-step debugging should be available and work without skipping the instructions.

So if the idea is that in replay mode, bdrv_drain_all_begin() is only
called when there are no requests in flight anyway (because callers
catch this situation earlier), can we make this an assertion that the
block device is already quiesced?

Skipping the drain without knowing that there are no requests in flight
feels simply wrong, and I'm almost sure it will result in bugs. On the
other hand, if we know that no requests are in flight, there's no real
point in skipping it.

So I think I still don't understand the situation where skipping a drain
is the correct solution.

Kevin
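
For comparison, the assertion-based variant suggested above might look
roughly like this in bdrv_drain_all_begin(). A sketch only: replay_has_events()
is an assumed helper meaning "the replay queue still holds block requests".

    /* Sketch of the alternative discussed above, replacing the early
     * return added by this patch: callers are expected to have ensured
     * the queue is empty, so assert it instead of silently skipping. */
    if (replay_events_enabled()) {
        assert(!replay_has_events());
        return;
    }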
Pavel Dovgalyuk Nov. 30, 2018, 10:51 a.m. UTC | #4
> From: Kevin Wolf [mailto:kwolf@redhat.com]
> Am 30.11.2018 um 08:55 hat Pavel Dovgalyuk geschrieben:
> > > From: Kevin Wolf [mailto:kwolf@redhat.com]
> > > Am 10.10.2018 um 15:33 hat Pavel Dovgalyuk geschrieben:
> > > > In record/replay mode bdrv queue is controlled by replay mechanism.
> > > > It does not allow saving or loading the snapshots
> > > > when bdrv queue is not empty. Stopping the VM is not blocked by nonempty
> > > > queue, but flushing the queue is still impossible there,
> > > > because it may cause deadlocks in replay mode.
> > > > This patch disables bdrv_drain_all and bdrv_flush_all in
> > > > record/replay mode.
> > > >
> > > > Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
> > >
> > > Disabling bdrv_flush_all() shouldn't make a big difference in replay
> > > mode, so I agree with that.
> > >
> > > But does the bdrv_drain_all() change mean that bdrv_drain_all_begin()
> > > can return while there are still requests in flight? This sounds like
> > > something that could break some existing code, because the meaning of
> > > the function is that if it returns success, no requests are in flight
> > > any more.
> > >
> > > What would move the request forward in the replay case once it has been
> > > created? Does this also involve some user interaction? Or may we process
> > > the next event in a drain callback of blkreplay or something?
> >
> > You are right - there are some operation that can't be performed at any
> > time during the replay, because of unfinished block requests.
> >
> > One of these is creating the VM snapshot. I've added a check in another
> > patch, which prevents snapshotting while the event queue (which holds
> > the block requests) is not empty.
> >
> > There could also be other things, that will be fixed when we try to use it.
> >
> > Moving replay forward until the queue is empty is not a good idea, because
> > step-by-step debugging should be available and work without skipping the instructions.
> 
> So if the idea is that in replay mode, bdrv_drain_all_begin() is only
> called when there are no requests in flight anyway (because callers
> catch this situation earler), can we make this an assertion that the
> block device is already quiesced?

Maybe its behavior changed in the latest version?
We observed that bdrv_drain_all hangs when there is a block request in
the queue. Therefore we decided that stopping with a non-empty queue
is better than skipping some steps.

> Skipping drain without knowing that there are no requests in flight
> feels simply wrong, and I'm almost sure it will result in bugs. On the
> other hand, if we know that no requests are in flight, there's no real
> point in skipping.

What bugs can happen?

Pavel Dovgalyuk
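
For illustration, the hang Pavel observes can be pictured roughly as below.
This is a simplified stand-in for the AIO_WAIT_WHILE() mechanism mentioned in
the first hunk of the patch, not actual QEMU code; requests_in_flight() is a
placeholder.

    /* Simplified picture of the deadlock: the drain loop waits for
     * in-flight requests, but a blkreplay request only completes when the
     * corresponding replay event is processed, and replay does not advance
     * while we sit in this loop -- so it never terminates. */
    while (requests_in_flight()) {              /* stand-in for AIO_WAIT_WHILE() */
        aio_poll(qemu_get_aio_context(), true); /* nothing ever completes */
    }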

Patch

diff --git a/block/io.c b/block/io.c
index bd9d688..96fb245 100644
--- a/block/io.c
+++ b/block/io.c
@@ -32,6 +32,7 @@ 
 #include "qemu/cutils.h"
 #include "qapi/error.h"
 #include "qemu/error-report.h"
+#include "sysemu/replay.h"
 
 #define NOT_DONE 0x7fffffff /* used while emulated sync operation in progress */
 
@@ -538,6 +539,13 @@  void bdrv_drain_all_begin(void)
         return;
     }
 
+    /* bdrv queue is managed by record/replay,
+       waiting for finishing the I/O requests may
+       be infinite */
+    if (replay_events_enabled()) {
+        return;
+    }
+
     /* AIO_WAIT_WHILE() with a NULL context can only be called from the main
      * loop AioContext, so make sure we're in the main context. */
     assert(qemu_get_current_aio_context() == qemu_get_aio_context());
@@ -566,6 +574,13 @@  void bdrv_drain_all_end(void)
 {
     BlockDriverState *bs = NULL;
 
+    /* bdrv queue is managed by record/replay,
+       waiting for finishing the I/O requests may
+       be endless */
+    if (replay_events_enabled()) {
+        return;
+    }
+
     while ((bs = bdrv_next_all_states(bs))) {
         AioContext *aio_context = bdrv_get_aio_context(bs);
 
@@ -1997,6 +2012,13 @@  int bdrv_flush_all(void)
     BlockDriverState *bs = NULL;
     int result = 0;
 
+    /* bdrv queue is managed by record/replay,
+       creating new flush request for stopping
+       the VM may break the determinism */
+    if (replay_events_enabled()) {
+        return result;
+    }
+
     for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
         AioContext *aio_context = bdrv_get_aio_context(bs);
         int ret;
diff --git a/cpus.c b/cpus.c
index 1c741bc..d60ddf1 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1078,7 +1078,6 @@  static int do_vm_stop(RunState state, bool send_stop)
     }
 
     bdrv_drain_all();
-    replay_disable_events();
     ret = bdrv_flush_all();
 
     return ret;
@@ -2135,7 +2134,6 @@  int vm_prepare_start(void)
     /* We are sending this now, but the CPUs will be resumed shortly later */
     qapi_event_send_resume();
 
-    replay_enable_events();
     cpu_enable_ticks();
     runstate_set(RUN_STATE_RUNNING);
     vm_state_notify(1, RUN_STATE_RUNNING);