Patchwork [1/3] Add blkmirror block driver

login
register
mail settings
Submitter Federico Simoncelli
Date Feb. 22, 2012, 5:13 p.m.
Message ID <1329930815-7995-2-git-send-email-fsimonce@redhat.com>
Download mbox | patch
Permalink /patch/142515/
State New
Headers show

Comments

Federico Simoncelli - Feb. 22, 2012, 5:13 p.m.
From: Marcelo Tosatti <mtosatti@redhat.com>

Mirrored writes are used by live block copy.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
---
 Makefile.objs      |    2 +-
 block/blkmirror.c  |  282 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 docs/blkmirror.txt |   15 +++
 3 files changed, 298 insertions(+), 1 deletions(-)
 create mode 100644 block/blkmirror.c
 create mode 100644 docs/blkmirror.txt
Stefan Hajnoczi - Feb. 23, 2012, 4:14 p.m.
On Wed, Feb 22, 2012 at 5:13 PM, Federico Simoncelli
<fsimonce@redhat.com> wrote:
> From: Marcelo Tosatti <mtosatti@redhat.com>
>
> Mirrored writes are used by live block copy.

I think the right approach is to create a single blkmirror driver that
also includes blkverify functionality.  The code is basically the same
except blkverify also compares reads - just use a flag to
enable/disable that behavior.

Feel free to rename the blkverify driver to blkmirror if you wish.

By the way, this code does not build against qemu.git/master.  It uses
interfaces which have been dropped.

Stefan
Stefan Hajnoczi - Feb. 23, 2012, 4:18 p.m.
On Thu, Feb 23, 2012 at 4:14 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> By the way, this code does not build against qemu.git/master.  It uses
> interfaces which have been dropped.

Ah, I see you changed that in Patch 2/3.  Please don't do that, it's
hard to review and breaks git-bisect.

Stefan
Federico Simoncelli - Feb. 23, 2012, 4:18 p.m.
----- Original Message -----
> From: "Stefan Hajnoczi" <stefanha@gmail.com>
> To: "Federico Simoncelli" <fsimonce@redhat.com>
> Cc: qemu-devel@nongnu.org, kwolf@redhat.com, mtosatti@redhat.com
> Sent: Thursday, February 23, 2012 5:14:09 PM
> Subject: Re: [Qemu-devel] [PATCH 1/3] Add blkmirror block driver
> 
> On Wed, Feb 22, 2012 at 5:13 PM, Federico Simoncelli
> <fsimonce@redhat.com> wrote:
> > From: Marcelo Tosatti <mtosatti@redhat.com>
> >
> > Mirrored writes are used by live block copy.
> 
> I think the right approach is to create a single blkmirror driver
> that
> also includes blkverify functionality.  The code is basically the
> same
> except blkverify also compares reads - just use a flag to
> enable/disable that behavior.
> 
> Feel free to rename the blkverify driver to blkmirror if you wish.
> 
> By the way, this code does not build against qemu.git/master.  It
> uses
> interfaces which have been dropped.

Yes you also need:

[PATCH 2/3] Update the blkmirror block driver

Which was sent separately to facilitate the review for people who
already reviewed this.
Federico Simoncelli - Feb. 23, 2012, 4:20 p.m.
----- Original Message -----
> From: "Stefan Hajnoczi" <stefanha@gmail.com>
> To: "Federico Simoncelli" <fsimonce@redhat.com>
> Cc: qemu-devel@nongnu.org, kwolf@redhat.com, mtosatti@redhat.com
> Sent: Thursday, February 23, 2012 5:18:41 PM
> Subject: Re: [Qemu-devel] [PATCH 1/3] Add blkmirror block driver
> 
> On Thu, Feb 23, 2012 at 4:14 PM, Stefan Hajnoczi <stefanha@gmail.com>
> wrote:
> > By the way, this code does not build against qemu.git/master.  It
> > uses
> > interfaces which have been dropped.
> 
> Ah, I see you changed that in Patch 2/3.  Please don't do that, it's
> hard to review and breaks git-bisect.

Squashing is easy (anyone could do it at the commit phase), reviewing
is hard (if you hide your changes in someone else's code).
Stefan Hajnoczi - Feb. 23, 2012, 4:28 p.m.
On Thu, Feb 23, 2012 at 4:20 PM, Federico Simoncelli
<fsimonce@redhat.com> wrote:
> ----- Original Message -----
>> From: "Stefan Hajnoczi" <stefanha@gmail.com>
>> To: "Federico Simoncelli" <fsimonce@redhat.com>
>> Cc: qemu-devel@nongnu.org, kwolf@redhat.com, mtosatti@redhat.com
>> Sent: Thursday, February 23, 2012 5:18:41 PM
>> Subject: Re: [Qemu-devel] [PATCH 1/3] Add blkmirror block driver
>>
>> On Thu, Feb 23, 2012 at 4:14 PM, Stefan Hajnoczi <stefanha@gmail.com>
>> wrote:
>> > By the way, this code does not build against qemu.git/master.  It
>> > uses
>> > interfaces which have been dropped.
>>
>> Ah, I see you changed that in Patch 2/3.  Please don't do that, it's
>> hard to review and breaks git-bisect.
>
> Squashing is easy (anyone could do it at the commit phase), reviewing
> is hard (if you hide your changes in someone else's code).

I don't care if you or Marcelo wrote it, I want to see a coherent piece of code.

The way to do it is to merge the g_malloc() and BlockDriver interface
changes into this path.  Then you have not snuck any significant
changes into Marcelo's code.

Keep the BDRV_O_NO_BACKING separate.

Stefan
Federico Simoncelli - Feb. 23, 2012, 4:51 p.m.
----- Original Message -----
> From: "Stefan Hajnoczi" <stefanha@gmail.com>
> To: "Federico Simoncelli" <fsimonce@redhat.com>
> Cc: qemu-devel@nongnu.org, kwolf@redhat.com, mtosatti@redhat.com
> Sent: Thursday, February 23, 2012 5:28:23 PM
> Subject: Re: [Qemu-devel] [PATCH 1/3] Add blkmirror block driver
> 
> On Thu, Feb 23, 2012 at 4:20 PM, Federico Simoncelli
> <fsimonce@redhat.com> wrote:
> > ----- Original Message -----
> >> From: "Stefan Hajnoczi" <stefanha@gmail.com>
> >> To: "Federico Simoncelli" <fsimonce@redhat.com>
> >> Cc: qemu-devel@nongnu.org, kwolf@redhat.com, mtosatti@redhat.com
> >> Sent: Thursday, February 23, 2012 5:18:41 PM
> >> Subject: Re: [Qemu-devel] [PATCH 1/3] Add blkmirror block driver
> >>
> >> On Thu, Feb 23, 2012 at 4:14 PM, Stefan Hajnoczi
> >> <stefanha@gmail.com>
> >> wrote:
> >> > By the way, this code does not build against qemu.git/master.
> >> >  It
> >> > uses
> >> > interfaces which have been dropped.
> >>
> >> Ah, I see you changed that in Patch 2/3.  Please don't do that,
> >> it's
> >> hard to review and breaks git-bisect.
> >
> > Squashing is easy (anyone could do it at the commit phase),
> > reviewing
> > is hard (if you hide your changes in someone else's code).
> 
> I don't care if you or Marcelo wrote it, I want to see a coherent
> piece of code.
> 
> The way to do it is to merge the g_malloc() and BlockDriver interface
> changes into this path.  Then you have not snuck any significant
> changes into Marcelo's code.

Well for example I'd have snuck the qemu_vmalloc/qemu_vfree mistake.

> Keep the BDRV_O_NO_BACKING separate.

Ok, thanks.
Stefan Hajnoczi - Feb. 27, 2012, 9:23 a.m.
On Thu, Feb 23, 2012 at 4:14 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> On Wed, Feb 22, 2012 at 5:13 PM, Federico Simoncelli
> <fsimonce@redhat.com> wrote:
>> From: Marcelo Tosatti <mtosatti@redhat.com>
>>
>> Mirrored writes are used by live block copy.
>
> I think the right approach is to create a single blkmirror driver that
> also includes blkverify functionality.  The code is basically the same
> except blkverify also compares reads - just use a flag to
> enable/disable that behavior.
>
> Feel free to rename the blkverify driver to blkmirror if you wish.

Federico: Any response to this?  I still think blkmirror and blkverify
do essentially the same thing and should be unified.

Stefan
Paolo Bonzini - Feb. 27, 2012, 11:37 a.m.
On 02/27/2012 10:23 AM, Stefan Hajnoczi wrote:
> > I think the right approach is to create a single blkmirror driver that
> > also includes blkverify functionality.  The code is basically the same
> > except blkverify also compares reads - just use a flag to
> > enable/disable that behavior.
> >
> > Feel free to rename the blkverify driver to blkmirror if you wish.
>
> Federico: Any response to this?  I still think blkmirror and blkverify
> do essentially the same thing and should be unified.

Once non-incremental mode is added, I suspect blkmirror will diverge
from blkverify significantly.  In particular, we would need to track
where have writes been done in the destination.  We also would need to
hooks for block/stream.c, or perhaps a completely separate
implementation of streaming.

Also, blkverify doesn't support cancellation.  I know we do quite poorly
in this area, but I'd rather not make it worse...

Paolo
Stefan Hajnoczi - Feb. 27, 2012, 11:42 a.m.
On Mon, Feb 27, 2012 at 11:37 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 02/27/2012 10:23 AM, Stefan Hajnoczi wrote:
>> > I think the right approach is to create a single blkmirror driver that
>> > also includes blkverify functionality.  The code is basically the same
>> > except blkverify also compares reads - just use a flag to
>> > enable/disable that behavior.
>> >
>> > Feel free to rename the blkverify driver to blkmirror if you wish.
>>
>> Federico: Any response to this?  I still think blkmirror and blkverify
>> do essentially the same thing and should be unified.
>
> Once non-incremental mode is added, I suspect blkmirror will diverge
> from blkverify significantly.  In particular, we would need to track
> where have writes been done in the destination.  We also would need to
> hooks for block/stream.c, or perhaps a completely separate
> implementation of streaming.

I'm not familiar with non-incremental mode, could you please explain
it or link to it?

> Also, blkverify doesn't support cancellation.  I know we do quite poorly
> in this area, but I'd rather not make it worse...

That's why I suggested unifying them - take the best of both approaches.

Stefan
Paolo Bonzini - Feb. 27, 2012, 11:48 a.m.
On 02/27/2012 12:42 PM, Stefan Hajnoczi wrote:
>> >
>> > Once non-incremental mode is added, I suspect blkmirror will diverge
>> > from blkverify significantly.  In particular, we would need to track
>> > where have writes been done in the destination.  We also would need to
>> > hooks for block/stream.c, or perhaps a completely separate
>> > implementation of streaming.
> I'm not familiar with non-incremental mode, could you please explain
> it or link to it?

Non-incremental mode is "pre-copy" migration.  It would stream in the
background from the source to the destination.  In this case:

- you need to differentiate streaming writes from other writes.  When
streaming, you do not want to issue spurious writes to the source;

- you can skip streaming anything that has been written to the
destination already.  This means that you need: 1) a bitmap similar to
is_allocated; 2) to widen writes to a cluster; 3) serialization similar
to copy-on-read.

I'm not yet sure how much of this will be generalized in the block layer
and block/stream.c, and how much will be specific to blkmirror, but in
general none of this applies to blkverify.

Paolo
Stefan Hajnoczi - Feb. 27, 2012, 1:09 p.m.
On Mon, Feb 27, 2012 at 11:48 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 02/27/2012 12:42 PM, Stefan Hajnoczi wrote:
>>> >
>>> > Once non-incremental mode is added, I suspect blkmirror will diverge
>>> > from blkverify significantly.  In particular, we would need to track
>>> > where have writes been done in the destination.  We also would need to
>>> > hooks for block/stream.c, or perhaps a completely separate
>>> > implementation of streaming.
>> I'm not familiar with non-incremental mode, could you please explain
>> it or link to it?
>
> Non-incremental mode is "pre-copy" migration.  It would stream in the
> background from the source to the destination.  In this case:
>
> - you need to differentiate streaming writes from other writes.  When
> streaming, you do not want to issue spurious writes to the source;
>
> - you can skip streaming anything that has been written to the
> destination already.  This means that you need: 1) a bitmap similar to
> is_allocated; 2) to widen writes to a cluster; 3) serialization similar
> to copy-on-read.
>
> I'm not yet sure how much of this will be generalized in the block layer
> and block/stream.c, and how much will be specific to blkmirror, but in
> general none of this applies to blkverify.

Agreed but I'm not sure it has to do with blkmirror either.  blkmirror
and blkverify perform the same function except that blkverify mirrors
reads too and checks that they match.

Who knows when and if pre-copy will be (re)implemented, it's not a
good argument to say let's duplicate block mirroring because we're not
sure about the design of a feature feature yet which might want to
hook in here.

Stefan
Paolo Bonzini - Feb. 27, 2012, 1:47 p.m.
On 02/27/2012 02:09 PM, Stefan Hajnoczi wrote:
> > Non-incremental mode is "pre-copy" migration.  It would stream in the
> > background from the source to the destination.  In this case:
> >
> > - you need to differentiate streaming writes from other writes.  When
> > streaming, you do not want to issue spurious writes to the source;
> >
> > - you can skip streaming anything that has been written to the
> > destination already.  This means that you need: 1) a bitmap similar to
> > is_allocated; 2) to widen writes to a cluster; 3) serialization similar
> > to copy-on-read.
> >
> > I'm not yet sure how much of this will be generalized in the block layer
> > and block/stream.c, and how much will be specific to blkmirror, but in
> > general none of this applies to blkverify.
> 
> Agreed but I'm not sure it has to do with blkmirror either.  blkmirror
> and blkverify perform the same function except that blkverify mirrors
> reads too and checks that they match.
> 
> Who knows when and if pre-copy will be (re)implemented, it's not a
> good argument to say let's duplicate block mirroring because we're not
> sure about the design of a feature feature yet which might want to
> hook in here.

It's not a duplicate, it just happens that two very simple drivers share
1 operation out of 4 (open, read, write, flush).  There are other
differences, for example:

1) blkverify hardcodes raw for one format and auto-detects the other.
blkmirror needs to have a specified format for both, and I don't think
starting another bikeshedding discussion on blkverify backwards
compatibility is a good idea;

2) blkverify doesn't flush the raw file;

3) blkverify in the future could add callbacks to create snapshots or
load/save imgstate, and forward them to the test file; this doesn't make
sense for blkmirror.

Paolo
Stefan Hajnoczi - Feb. 27, 2012, 2:49 p.m.
On Mon, Feb 27, 2012 at 1:47 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 02/27/2012 02:09 PM, Stefan Hajnoczi wrote:
>> > Non-incremental mode is "pre-copy" migration.  It would stream in the
>> > background from the source to the destination.  In this case:
>> >
>> > - you need to differentiate streaming writes from other writes.  When
>> > streaming, you do not want to issue spurious writes to the source;
>> >
>> > - you can skip streaming anything that has been written to the
>> > destination already.  This means that you need: 1) a bitmap similar to
>> > is_allocated; 2) to widen writes to a cluster; 3) serialization similar
>> > to copy-on-read.
>> >
>> > I'm not yet sure how much of this will be generalized in the block layer
>> > and block/stream.c, and how much will be specific to blkmirror, but in
>> > general none of this applies to blkverify.
>>
>> Agreed but I'm not sure it has to do with blkmirror either.  blkmirror
>> and blkverify perform the same function except that blkverify mirrors
>> reads too and checks that they match.
>>
>> Who knows when and if pre-copy will be (re)implemented, it's not a
>> good argument to say let's duplicate block mirroring because we're not
>> sure about the design of a feature feature yet which might want to
>> hook in here.
>
> It's not a duplicate, it just happens that two very simple drivers share
> 1 operation out of 4 (open, read, write, flush).  There are other
> differences, for example:
>
> 1) blkverify hardcodes raw for one format and auto-detects the other.
> blkmirror needs to have a specified format for both, and I don't think
> starting another bikeshedding discussion on blkverify backwards
> compatibility is a good idea;

Backwards compatibility isn't needed here, it's a debugging tool and
some distros even disable it.

Allowing another format for the reference image is a feature that
would be nice to have.  It allows direct qcow2 to vmdk comparison, for
example, if we ever hit a bug that benefits from this type of
comparison.

> 2) blkverify doesn't flush the raw file;

It's fine to flush the reference file.  This is an accidental difference :).

> 3) blkverify in the future could add callbacks to create snapshots or
> load/save imgstate, and forward them to the test file; this doesn't make
> sense for blkmirror.

I guess what I'm saying is, if we need to copy-paste in order to fork
them in the future that's fine, but why maintain duplicates in the
mean-time?  Please make the codebase nice today.  We can always extend
it in the future, we're not forced to keep them unified forever if it
turns out we want to split them.  But as it stands they are
essentially the same thing.

Stefan
Stefan Hajnoczi - Feb. 27, 2012, 2:59 p.m.
On Mon, Feb 27, 2012 at 2:49 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> On Mon, Feb 27, 2012 at 1:47 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> On 02/27/2012 02:09 PM, Stefan Hajnoczi wrote:
>> 3) blkverify in the future could add callbacks to create snapshots or
>> load/save imgstate, and forward them to the test file; this doesn't make
>> sense for blkmirror.
>
> I guess what I'm saying is, if we need to copy-paste in order to fork
> them in the future that's fine, but why maintain duplicates in the
> mean-time?  Please make the codebase nice today.  We can always extend
> it in the future, we're not forced to keep them unified forever if it
> turns out we want to split them.  But as it stands they are
> essentially the same thing.

BTW, I feel that I may just be missing context on what the plans are
around mirroring and block migration.  Is there a plan or discussions
that haven't hit qemu-devel (maybe they were done in ovirt or
internally)?

Stefan
Paolo Bonzini - Feb. 27, 2012, 3:08 p.m.
On 02/27/2012 03:59 PM, Stefan Hajnoczi wrote:
>> > I guess what I'm saying is, if we need to copy-paste in order to fork
>> > them in the future that's fine, but why maintain duplicates in the
>> > mean-time?  Please make the codebase nice today.  We can always extend
>> > it in the future, we're not forced to keep them unified forever if it
>> > turns out we want to split them.  But as it stands they are
>> > essentially the same thing.
> BTW, I feel that I may just be missing context on what the plans are
> around mirroring and block migration.  Is there a plan or discussions
> that haven't hit qemu-devel (maybe they were done in ovirt or
> internally)?

Not that I know of.  Me, I'm not on any oVirt list. :)

Paolo

Patch

diff --git a/Makefile.objs b/Makefile.objs
index 391e524..cd65e1b 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -34,7 +34,7 @@  block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vv
 block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o
 block-nested-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
 block-nested-y += qed-check.o
-block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o
+block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o blkmirror.o
 block-nested-y += stream.o
 block-nested-$(CONFIG_WIN32) += raw-win32.o
 block-nested-$(CONFIG_POSIX) += raw-posix.o
diff --git a/block/blkmirror.c b/block/blkmirror.c
new file mode 100644
index 0000000..1c02710
--- /dev/null
+++ b/block/blkmirror.c
@@ -0,0 +1,282 @@ 
+/*
+ * Block driver for mirrored writes.
+ *
+ * Copyright (C) 2011 Red Hat, Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include <stdarg.h>
+#include "block_int.h"
+
+typedef struct {
+    BlockDriverState *bs[2];
+} BdrvMirrorState;
+
+typedef struct DupAIOCB DupAIOCB;
+
+typedef struct SingleAIOCB {
+    BlockDriverAIOCB *aiocb;
+    int finished;
+    DupAIOCB *parent;
+} SingleAIOCB;
+
+struct DupAIOCB {
+    BlockDriverAIOCB common;
+    int count;
+
+    BlockDriverCompletionFunc *cb;
+    SingleAIOCB aios[2];
+    int ret;
+};
+
+/* Valid blkmirror filenames look like
+ * blkmirror:path/to/image1:path/to/image2 */
+static int blkmirror_open(BlockDriverState *bs, const char *filename, int flags)
+{
+    BdrvMirrorState *m = bs->opaque;
+    int ret, escape, i, n;
+    char *filename2;
+
+    /* Parse the blkmirror: prefix */
+    if (strncmp(filename, "blkmirror:", strlen("blkmirror:"))) {
+        return -EINVAL;
+    }
+    filename += strlen("blkmirror:");
+
+    /* Parse the raw image filename */
+    filename2 = qemu_malloc(strlen(filename)+1);
+    escape = 0;
+    for (i = n = 0; i < strlen(filename); i++) {
+        if (!escape && filename[i] == ':') {
+            break;
+        }
+        if (!escape && filename[i] == '\\') {
+            escape = 1;
+        } else {
+            escape = 0;
+        }
+
+        if (!escape) {
+            filename2[n++] = filename[i];
+        }
+    }
+    filename2[n] = '\0';
+
+    m->bs[0] = bdrv_new("");
+    if (m->bs[0] == NULL) {
+        free(filename2);
+        return -ENOMEM;
+    }
+    ret = bdrv_open(m->bs[0], filename2, flags, NULL);
+    free(filename2);
+    if (ret < 0) {
+        return ret;
+    }
+    filename += i + 1;
+
+    m->bs[1] = bdrv_new("");
+    if (m->bs[1] == NULL) {
+        bdrv_delete(m->bs[0]);
+        return -ENOMEM;
+    }
+    ret = bdrv_open(m->bs[1], filename, flags, NULL);
+    if (ret < 0) {
+        bdrv_delete(m->bs[0]);
+        return ret;
+    }
+
+    return 0;
+}
+
+static void blkmirror_close(BlockDriverState *bs)
+{
+    BdrvMirrorState *m = bs->opaque;
+    int i;
+
+    for (i = 0; i < 2; i++) {
+        bdrv_delete(m->bs[i]);
+        m->bs[i] = NULL;
+    }
+}
+
+static int blkmirror_flush(BlockDriverState *bs)
+{
+    BdrvMirrorState *m = bs->opaque;
+    int ret;
+
+    ret = bdrv_flush(m->bs[0]);
+    if (ret < 0) {
+        return ret;
+    }
+
+    return bdrv_flush(m->bs[1]);
+}
+
+static int64_t blkmirror_getlength(BlockDriverState *bs)
+{
+    BdrvMirrorState *m = bs->opaque;
+
+    return bdrv_getlength(m->bs[0]);
+}
+
+static BlockDriverAIOCB *blkmirror_aio_readv(BlockDriverState *bs,
+                                             int64_t sector_num,
+                                             QEMUIOVector *qiov,
+                                             int nb_sectors,
+                                             BlockDriverCompletionFunc *cb,
+                                             void *opaque)
+{
+    BdrvMirrorState *m = bs->opaque;
+    return bdrv_aio_readv(m->bs[0], sector_num, qiov, nb_sectors, cb, opaque);
+}
+
+static void dup_aio_cancel(BlockDriverAIOCB *blockacb)
+{
+    DupAIOCB *acb = container_of(blockacb, DupAIOCB, common);
+    int i;
+
+    for (i = 0 ; i < 2; i++) {
+        if (!acb->aios[i].finished) {
+            bdrv_aio_cancel(acb->aios[i].aiocb);
+        }
+    }
+    qemu_aio_release(acb);
+}
+
+static AIOPool dup_aio_pool = {
+    .aiocb_size         = sizeof(DupAIOCB),
+    .cancel             = dup_aio_cancel,
+};
+
+static void blkmirror_aio_cb(void *opaque, int ret)
+{
+    SingleAIOCB *scb = opaque;
+    DupAIOCB *dcb = scb->parent;
+
+    scb->finished = 1;
+    dcb->count--;
+    assert(dcb->count >= 0);
+    if (ret < 0) {
+        dcb->ret = ret;
+    }
+    if (dcb->count == 0) {
+        dcb->common.cb(dcb->common.opaque, dcb->ret);
+        qemu_aio_release(dcb);
+    }
+}
+
+static DupAIOCB *dup_aio_get(BlockDriverState *bs,
+                             BlockDriverCompletionFunc *cb,
+                             void *opaque)
+{
+    DupAIOCB *dcb;
+    int i;
+
+    dcb = qemu_aio_get(&dup_aio_pool, bs, cb, opaque);
+    if (!dcb) {
+        return NULL;
+    }
+    dcb->count = 2;
+    for (i = 0; i < 2; i++) {
+        dcb->aios[i].parent = dcb;
+        dcb->aios[i].finished = 0;
+    }
+    dcb->ret = 0;
+
+    return dcb;
+}
+
+static BlockDriverAIOCB *blkmirror_aio_writev(BlockDriverState *bs,
+                                              int64_t sector_num,
+                                              QEMUIOVector *qiov,
+                                              int nb_sectors,
+                                              BlockDriverCompletionFunc *cb,
+                                              void *opaque)
+{
+    BdrvMirrorState *m = bs->opaque;
+    DupAIOCB *dcb = dup_aio_get(bs, cb, opaque);
+    int i;
+
+    for (i = 0; i < 2; i++) {
+        dcb->aios[i].aiocb = bdrv_aio_writev(m->bs[i], sector_num, qiov,
+                                             nb_sectors, &blkmirror_aio_cb,
+                                             &dcb->aios[i]);
+        if (!dcb->aios[i].aiocb) {
+            int a;
+
+            for (a = 0; a < i; a++) {
+                bdrv_aio_cancel(dcb->aios[i].aiocb);
+            }
+            qemu_aio_release(dcb);
+            return NULL;
+        }
+    }
+
+    return &dcb->common;
+}
+
+static BlockDriverAIOCB *blkmirror_aio_flush(BlockDriverState *bs,
+                                             BlockDriverCompletionFunc *cb,
+                                             void *opaque)
+{
+    BdrvMirrorState *m = bs->opaque;
+    DupAIOCB *dcb = dup_aio_get(bs, cb, opaque);
+    int i;
+
+    for (i = 0; i < 2; i++) {
+        dcb->aios[i].aiocb = bdrv_aio_flush(m->bs[i], &blkmirror_aio_cb,
+                                            &dcb->aios[i]);
+        if (!dcb->aios[i].aiocb) {
+            int a;
+
+            for (a = 0; a < i; a++) {
+                bdrv_aio_cancel(dcb->aios[i].aiocb);
+            }
+            qemu_aio_release(dcb);
+            return NULL;
+        }
+    }
+
+    return &dcb->common;
+}
+
+static int blkmirror_discard(BlockDriverState *bs, int64_t sector_num,
+                             int nb_sectors)
+{
+    BdrvMirrorState *m = bs->opaque;
+    int ret;
+
+    ret = bdrv_discard(m->bs[0], sector_num, nb_sectors);
+    if (ret < 0) {
+        return ret;
+    }
+
+    return bdrv_discard(m->bs[1], sector_num, nb_sectors);
+}
+
+
+static BlockDriver bdrv_blkmirror = {
+    .format_name        = "blkmirror",
+    .protocol_name      = "blkmirror",
+    .instance_size      = sizeof(BdrvMirrorState),
+
+    .bdrv_getlength     = blkmirror_getlength,
+
+    .bdrv_file_open     = blkmirror_open,
+    .bdrv_close         = blkmirror_close,
+    .bdrv_flush         = blkmirror_flush,
+    .bdrv_discard       = blkmirror_discard,
+
+    .bdrv_aio_readv     = blkmirror_aio_readv,
+    .bdrv_aio_writev    = blkmirror_aio_writev,
+    .bdrv_aio_flush     = blkmirror_aio_flush,
+};
+
+static void bdrv_blkmirror_init(void)
+{
+    bdrv_register(&bdrv_blkmirror);
+}
+
+block_init(bdrv_blkmirror_init);
diff --git a/docs/blkmirror.txt b/docs/blkmirror.txt
new file mode 100644
index 0000000..202be7d
--- /dev/null
+++ b/docs/blkmirror.txt
@@ -0,0 +1,15 @@ 
+Block mirror driver
+-------------------
+
+This driver will mirror writes to two distinct images.
+Its used internally by live block copy.
+
+Format
+------
+
+blkmirror:/image1.img:/image2.img
+
+'\' (backslash) can be used to escape colon processing
+as a separator, in the first image filename.
+
+