diff mbox

[v2,4/4] migration: use bdrv_drain_all_begin/end() instead bdrv_drain_all()

Message ID 20170519103247.4405-5-stefanha@redhat.com
State New
Headers show

Commit Message

Stefan Hajnoczi May 19, 2017, 10:32 a.m. UTC
blk/bdrv_drain_all() only takes effect for a single instant and then
resumes block jobs, guest devices, and other external clients like the
NBD server.  This can be handy when performing a synchronous drain
before terminating the program, for example.

Monitor commands usually need to quiesce I/O across an entire code
region so blk/bdrv_drain_all() is not suitable.  They must use
bdrv_drain_all_begin/end() to mark the region.  This prevents new I/O
requests from slipping in or worse - block jobs completing and modifying
the graph.

I audited other blk/bdrv_drain_all() callers but did not find anything
that needs a similar fix.  This patch fixes the savevm/loadvm commands.
Although I haven't encountered a read world issue this makes the code
safer.

Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 migration/savevm.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Comments

Kevin Wolf May 22, 2017, 12:17 p.m. UTC | #1
Am 19.05.2017 um 12:32 hat Stefan Hajnoczi geschrieben:
> blk/bdrv_drain_all() only takes effect for a single instant and then
> resumes block jobs, guest devices, and other external clients like the
> NBD server.  This can be handy when performing a synchronous drain
> before terminating the program, for example.
> 
> Monitor commands usually need to quiesce I/O across an entire code
> region so blk/bdrv_drain_all() is not suitable.  They must use
> bdrv_drain_all_begin/end() to mark the region.  This prevents new I/O
> requests from slipping in or worse - block jobs completing and modifying
> the graph.
> 
> I audited other blk/bdrv_drain_all() callers but did not find anything
> that needs a similar fix.  This patch fixes the savevm/loadvm commands.
> Although I haven't encountered a read world issue this makes the code
> safer.
> 
> Suggested-by: Kevin Wolf <kwolf@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

> @@ -2279,7 +2284,7 @@ int load_vmstate(const char *name, Error **errp)
>      }
>  
>      /* Flush all IO requests so they don't interfere with the new state.  */
> -    bdrv_drain_all();
> +    bdrv_drain_all_begin();
>  
>      ret = bdrv_all_goto_snapshot(name, &bs);
>      if (ret < 0) {
> @@ -2303,6 +2308,8 @@ int load_vmstate(const char *name, Error **errp)
>      qemu_fclose(f);
>      aio_context_release(aio_context);
>  
> +    bdrv_drain_all_end();
> +
>      migration_incoming_state_destroy();
>      if (ret < 0) {
>          error_setg(errp, "Error %d while loading VM state", ret);

There are a few error returns between these two places where a
bdrv_drain_all_end() is missing.

Kevin
Stefan Hajnoczi May 22, 2017, 1:39 p.m. UTC | #2
On Mon, May 22, 2017 at 02:17:35PM +0200, Kevin Wolf wrote:
> Am 19.05.2017 um 12:32 hat Stefan Hajnoczi geschrieben:
> > blk/bdrv_drain_all() only takes effect for a single instant and then
> > resumes block jobs, guest devices, and other external clients like the
> > NBD server.  This can be handy when performing a synchronous drain
> > before terminating the program, for example.
> > 
> > Monitor commands usually need to quiesce I/O across an entire code
> > region so blk/bdrv_drain_all() is not suitable.  They must use
> > bdrv_drain_all_begin/end() to mark the region.  This prevents new I/O
> > requests from slipping in or worse - block jobs completing and modifying
> > the graph.
> > 
> > I audited other blk/bdrv_drain_all() callers but did not find anything
> > that needs a similar fix.  This patch fixes the savevm/loadvm commands.
> > Although I haven't encountered a read world issue this makes the code
> > safer.
> > 
> > Suggested-by: Kevin Wolf <kwolf@redhat.com>
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> 
> > @@ -2279,7 +2284,7 @@ int load_vmstate(const char *name, Error **errp)
> >      }
> >  
> >      /* Flush all IO requests so they don't interfere with the new state.  */
> > -    bdrv_drain_all();
> > +    bdrv_drain_all_begin();
> >  
> >      ret = bdrv_all_goto_snapshot(name, &bs);
> >      if (ret < 0) {
> > @@ -2303,6 +2308,8 @@ int load_vmstate(const char *name, Error **errp)
> >      qemu_fclose(f);
> >      aio_context_release(aio_context);
> >  
> > +    bdrv_drain_all_end();
> > +
> >      migration_incoming_state_destroy();
> >      if (ret < 0) {
> >          error_setg(errp, "Error %d while loading VM state", ret);
> 
> There are a few error returns between these two places where a
> bdrv_drain_all_end() is missing.

Thank you, will fix in the next revision.

Stefan
diff mbox

Patch

diff --git a/migration/savevm.c b/migration/savevm.c
index 3ca319f..f134e48 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2113,6 +2113,8 @@  int save_vmstate(const char *name, Error **errp)
     }
     vm_stop(RUN_STATE_SAVE_VM);
 
+    bdrv_drain_all_begin();
+
     aio_context_acquire(aio_context);
 
     memset(sn, 0, sizeof(*sn));
@@ -2171,6 +2173,9 @@  int save_vmstate(const char *name, Error **errp)
     if (aio_context) {
         aio_context_release(aio_context);
     }
+
+    bdrv_drain_all_end();
+
     if (saved_vm_running) {
         vm_start();
     }
@@ -2279,7 +2284,7 @@  int load_vmstate(const char *name, Error **errp)
     }
 
     /* Flush all IO requests so they don't interfere with the new state.  */
-    bdrv_drain_all();
+    bdrv_drain_all_begin();
 
     ret = bdrv_all_goto_snapshot(name, &bs);
     if (ret < 0) {
@@ -2303,6 +2308,8 @@  int load_vmstate(const char *name, Error **errp)
     qemu_fclose(f);
     aio_context_release(aio_context);
 
+    bdrv_drain_all_end();
+
     migration_incoming_state_destroy();
     if (ret < 0) {
         error_setg(errp, "Error %d while loading VM state", ret);