Patchwork qcow2: make cache=unsafe usable

Submitter Kevin Wolf
Date Sept. 7, 2011, 9:56 a.m.
Message ID <4E673FC6.60007@redhat.com>
Permalink /patch/113733/
State New

Comments

Kevin Wolf - Sept. 7, 2011, 9:56 a.m.
On 07.09.2011 11:24, Avi Kivity wrote:
> Currently cache=unsafe is unsafe to the point of unusability - the
> caches are never written to disk except on exit so anything except
> an orderly exit -- including live migration -- leaves the disk image
> corrupted.
> 
> Fix by interpreting flush requests and doing everything except flushing
> the underlying file.  The contents of the metadata cache are transferred
> to the host pagecache, so that qemu aborts keep the disk in a consistent
> state, and live migration (on the same host, or if using a coherent
> filesystem) works.
> 
> Signed-off-by: Avi Kivity <avi@redhat.com>
> ---
> 
> Untested - is this the right approach?

Hm, could work, even though I don't like it very much. The alternative
approach would be something like this (not only untested, but won't even
compile):


@@ -839,6 +843,11 @@ static int raw_create(const char *filename, QEMUOptionParameter *options)
 static int raw_flush(BlockDriverState *bs)
 {
     BDRVRawState *s = bs->opaque;
+
+    if (bs->open_flags & BDRV_O_NO_FLUSH) {
+        return 0;
+    }
+
     return qemu_fdatasync(s->fd);
 }

Avi Kivity - Sept. 7, 2011, 10:07 a.m.
On 09/07/2011 12:56 PM, Kevin Wolf wrote:
> On 07.09.2011 11:24, Avi Kivity wrote:
> >  Currently cache=unsafe is unsafe to the point of unusability - the
> >  caches are never written to disk except on exit so anything except
> >  an orderly exit -- including live migration -- leaves the disk image
> >  corrupted.
> >
> >  Fix by interpreting flush requests and doing everything except flushing
> >  the underlying file.  The contents of the metadata cache are transferred
> >  to the host pagecache, so that qemu aborts keep the disk in a consistent
> >  state, and live migration (on the same host, or if using a coherent
> >  filesystem) works.
> >
> >  Signed-off-by: Avi Kivity <avi@redhat.com>
> >  ---
> >
> >  Untested - is this the right approach?
>
> Hm, could work, even though I don't like it very much. The alternative
> approach would be something like this

I think that your version is better - it fixes all the layered format 
drivers at once (even though qcow2 is the only one that needs fixing).
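
To illustrate why moving the check helps layered formats, here is a minimal
sketch, not QEMU code. Every name in it (SketchImage, SKETCH_O_NO_FLUSH, the
sketch_* functions) is hypothetical, and it assumes the format driver's flush
writes its metadata cache to the underlying file before asking that file to
sync. With the check in the generic layer, the format driver is never reached
and the cache stays dirty; with the check only in the raw driver, the cache
reaches the host pagecache and only the fdatasync is skipped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not QEMU's types. */
#define SKETCH_O_NO_FLUSH 0x1

typedef struct SketchImage {
    int open_flags;
    int cached_metadata;    /* dirty entries held in the format driver's cache */
    int pagecache_metadata; /* entries already written to the host pagecache */
    int fdatasync_calls;    /* times the raw layer actually synced to disk */
} SketchImage;

/* Raw (protocol) layer: the only place that skips the expensive sync. */
static int sketch_raw_flush(SketchImage *img)
{
    assert(img != NULL);
    if (img->open_flags & SKETCH_O_NO_FLUSH) {
        return 0;
    }
    img->fdatasync_calls++;
    return 0;
}

/* Format layer: write the metadata cache back, then flush the file below. */
static int sketch_format_flush(SketchImage *img)
{
    assert(img != NULL);
    img->pagecache_metadata += img->cached_metadata;
    img->cached_metadata = 0;
    return sketch_raw_flush(img);
}

/* Generic layer with the check: returns before the format driver runs. */
static int sketch_flush_old(SketchImage *img)
{
    if (img->open_flags & SKETCH_O_NO_FLUSH) {
        return 0;
    }
    return sketch_format_flush(img);
}

/* Generic layer without the check: always reaches the format driver. */
static int sketch_flush_new(SketchImage *img)
{
    return sketch_format_flush(img);
}
```

In this model, a flush through sketch_flush_old() on an image opened with
SKETCH_O_NO_FLUSH leaves cached_metadata untouched, while sketch_flush_new()
empties it into pagecache_metadata without incrementing fdatasync_calls.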

Patch

diff --git a/block.c b/block.c
index a8c789a..1aa5967 100644
--- a/block.c
+++ b/block.c
@@ -1723,10 +1723,6 @@ const char *bdrv_get_device_name(BlockDriverState *bs)

 int bdrv_flush(BlockDriverState *bs)
 {
-    if (bs->open_flags & BDRV_O_NO_FLUSH) {
-        return 0;
-    }
-
     if (bs->drv && bdrv_has_async_flush(bs->drv) && qemu_in_coroutine()) {
         return bdrv_co_flush_em(bs);
     }
@@ -2624,10 +2620,6 @@ BlockDriverAIOCB *bdrv_aio_flush(BlockDriverState *bs,

     trace_bdrv_aio_flush(bs, opaque);

-    if (bs->open_flags & BDRV_O_NO_FLUSH) {
-        return bdrv_aio_noop_em(bs, cb, opaque);
-    }
-
     if (!drv)
         return NULL;
     return drv->bdrv_aio_flush(bs, cb, opaque);
diff --git a/block/raw-posix.c b/block/raw-posix.c
index bcf50b2..bb0c0c5 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -629,6 +629,10 @@ static BlockDriverAIOCB *raw_aio_flush(BlockDriverState *bs,
 {
     BDRVRawState *s = bs->opaque;

+    if (bs->open_flags & BDRV_O_NO_FLUSH) {
+        return bdrv_aio_noop_em(bs, cb, opaque);
+    }
+
     if (fd_open(bs) < 0)
         return NULL;