qcow2: make cache=unsafe usable

Message ID 1315387463-14623-1-git-send-email-avi@redhat.com
State New

Commit Message

Avi Kivity Sept. 7, 2011, 9:24 a.m. UTC
Currently cache=unsafe is unsafe to the point of unusability - the
caches are never written to disk except on exit so anything except
an orderly exit -- including live migration -- leaves the disk image
corrupted.

Fix by interpreting flush requests and doing everything except flushing
the underlying file.  The contents of the metadata cache are transferred
to the host pagecache, so that qemu aborts keep the disk in a consistent
state, and live migration (on the same host, or if using a coherent
filesystem) works.

Signed-off-by: Avi Kivity <avi@redhat.com>
---

Untested - is this the right approach?

 block/qcow2.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

Comments

Alexander Graf Sept. 7, 2011, 9:47 a.m. UTC | #1
On 07.09.2011, at 11:24, Avi Kivity wrote:

> Currently cache=unsafe is unsafe to the point of unusability - the
> caches are never written to disk except on exit so anything except
> an orderly exit -- including live migration -- leaves the disk image
> corrupted.
> 
> Fix by interpreting flush requests and doing everything except flushing
> the underlying file.  The contents of the metadata cache are transferred
> to the host pagecache, so that qemu aborts keep the disk in a consistent
> state, and live migration (on the same host, or if using a coherent
> filesystem) works.

Yes, I've seen breakage with cache=unsafe and qcow2 myself. Thus semantically, the patch seems very reasonable to me. However, I'll leave it to Kevin to decide if it's a good idea to just unset random flags in open() or if we want to have something more expressive there :)


Alex

Patch

diff --git a/block/qcow2.c b/block/qcow2.c
index bfff6cd..7ecd096 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -275,6 +275,13 @@  static int qcow2_open(BlockDriverState *bs, int flags)
         ret = -EINVAL;
         goto fail;
     }
+    /*
+     * Request flush callbacks so that we can write metadata to the host
+     * pagecache.  Flushes to bs->file will still be ignored.  This keeps
+     * metadata consistent in host pagecache, so we're safe wrt unexpected
+     * exits, but avoids slow disk flushes (and is vulnerable to host crashes)
+     */
+    bs->open_flags &= ~BDRV_O_NO_FLUSH;
 
     /* Initialise locks */
     qemu_co_mutex_init(&s->lock);
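
For readers unfamiliar with the distinction the commit message relies on, here is a rough standalone sketch of the idea. It is not QEMU code. ToyCacheEntry, toy_cache_flush() and toy_demo() are hypothetical names invented for illustration, and the no_flush parameter only loosely mirrors what BDRV_O_NO_FLUSH is described as doing above. The sketch assumes a POSIX host: a dirty entry is handed to the host page cache with pwrite(), and the fsync() that would force it to stable storage is optionally skipped. A second descriptor on the same host can then read the data back even though nothing was flushed to disk.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for one cached metadata block (not QEMU's cache). */
typedef struct {
    char data[64];
    off_t offset;
    bool dirty;
} ToyCacheEntry;

/*
 * Push a dirty entry into the host page cache with pwrite().
 * If no_flush is true, skip the fsync() that would force it to disk.
 * Returns 0 on success, -1 on error.
 */
static int toy_cache_flush(int fd, ToyCacheEntry *e, bool no_flush)
{
    if (e->dirty) {
        ssize_t n = pwrite(fd, e->data, sizeof(e->data), e->offset);
        if (n != (ssize_t)sizeof(e->data)) {
            return -1;
        }
        e->dirty = false;
    }
    if (!no_flush && fsync(fd) != 0) {
        return -1;
    }
    return 0;
}

/*
 * Write one entry without fsync(), then read it back through a second
 * descriptor. Returns 0 if the reader sees the data, -1 otherwise.
 */
static int toy_demo(const char *path)
{
    ToyCacheEntry e = { .offset = 0, .dirty = true };
    char readback[sizeof(e.data)] = { 0 };
    int ret = -1;

    snprintf(e.data, sizeof(e.data), "toy-metadata-update");

    int wfd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0600);
    if (wfd < 0) {
        return -1;
    }
    if (toy_cache_flush(wfd, &e, true) == 0 && !e.dirty) {
        int rfd = open(path, O_RDONLY);
        if (rfd >= 0) {
            ssize_t n = pread(rfd, readback, sizeof(readback), 0);
            if (n == (ssize_t)sizeof(readback) &&
                memcmp(readback, e.data, sizeof(readback)) == 0) {
                ret = 0;
            }
            close(rfd);
        }
    }
    close(wfd);
    unlink(path);
    return ret;
}
```

Calling toy_demo() with a scratch file path from a small main() should return 0 on a typical POSIX system. The point of the sketch is only that data written with pwrite() is visible to other openers on the same host without fsync(); it says nothing about what survives a host crash, which is the remaining exposure the patch comment mentions.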