Patchwork [3/5] block: allow migration to work with image files (v3)

login
register
mail settings
Submitter Anthony Liguori
Date Nov. 14, 2011, 9:09 p.m.
Message ID <1321304987-23041-3-git-send-email-aliguori@us.ibm.com>
Download mbox | patch
Permalink /patch/125610/
State New
Headers show

Comments

Anthony Liguori - Nov. 14, 2011, 9:09 p.m.
Image files have two types of data: immutable data that describes things like
image size, backing files, etc. and mutable data that includes offset and
reference count tables.

Today, image formats aggressively cache mutable data to improve performance.  In
some cases, this happens before a guest even starts.  When dealing with live
migration, since a file is open on two machines, the caching of meta data can
lead to data corruption.

This patch addresses this by introducing a mechanism to invalidate any cached
mutable data a block driver may have which is then used by the live migration
code.

NB, this still requires coherent shared storage.  Addressing migration without
coherent shared storage (i.e. NFS) requires additional work.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
---
v2 -> v3
 - remove bdrv_invalidate_cache_all() in vm_stop.
v1 -> v2
 - rebase to latest master
---
 block.c     |   16 ++++++++++++++++
 block.h     |    4 ++++
 block_int.h |    5 +++++
 migration.c |    3 +++
 4 files changed, 28 insertions(+), 0 deletions(-)

Patch

diff --git a/block.c b/block.c
index 86910b0..d015887 100644
--- a/block.c
+++ b/block.c
@@ -2839,6 +2839,22 @@  int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
     }
 }
 
+void bdrv_invalidate_cache(BlockDriverState *bs)
+{
+    if (bs->drv && bs->drv->bdrv_invalidate_cache) {
+        bs->drv->bdrv_invalidate_cache(bs);
+    }
+}
+
+void bdrv_invalidate_cache_all(void)
+{
+    BlockDriverState *bs;
+
+    QTAILQ_FOREACH(bs, &bdrv_states, list) {
+        bdrv_invalidate_cache(bs);
+    }
+}
+
 int bdrv_flush(BlockDriverState *bs)
 {
     Coroutine *co;
diff --git a/block.h b/block.h
index 051a25d..a826059 100644
--- a/block.h
+++ b/block.h
@@ -197,6 +197,10 @@  BlockDriverAIOCB *bdrv_aio_ioctl(BlockDriverState *bs,
         unsigned long int req, void *buf,
         BlockDriverCompletionFunc *cb, void *opaque);
 
+/* Invalidate any cached metadata used by image formats */
+void bdrv_invalidate_cache(BlockDriverState *bs);
+void bdrv_invalidate_cache_all(void);
+
 /* Ensure contents are flushed to disk.  */
 int bdrv_flush(BlockDriverState *bs);
 int coroutine_fn bdrv_co_flush(BlockDriverState *bs);
diff --git a/block_int.h b/block_int.h
index 1ec4921..77c0187 100644
--- a/block_int.h
+++ b/block_int.h
@@ -88,6 +88,11 @@  struct BlockDriver {
         int64_t sector_num, int nb_sectors);
 
     /*
+     * Invalidate any cached meta-data.
+     */
+    void (*bdrv_invalidate_cache)(BlockDriverState *bs);
+
+    /*
      * Flushes all data that was already written to the OS all the way down to
      * the disk (for example raw-posix calls fsync()).
      */
diff --git a/migration.c b/migration.c
index 6764d3a..8280d71 100644
--- a/migration.c
+++ b/migration.c
@@ -89,6 +89,9 @@  void process_incoming_migration(QEMUFile *f)
     qemu_announce_self();
     DPRINTF("successfully loaded vm state\n");
 
+    /* Make sure all file formats flush their mutable metadata */
+    bdrv_invalidate_cache_all();
+
     if (autostart) {
         vm_start();
     } else {