diff mbox

[2/2,v2] Add the blockdev-reopen and blockdev-migrate commands

Message ID 1330102144-14491-2-git-send-email-fsimonce@redhat.com
State New
Headers show

Commit Message

Federico Simoncelli Feb. 24, 2012, 4:49 p.m. UTC
Signed-off-by: Federico Simoncelli <fsimonce@redhat.com>
---
 blockdev.c       |  107 ++++++++++++++++++++++++++++++++++++++++++++++++------
 hmp-commands.hx  |   38 +++++++++++++++++++
 hmp.c            |   24 ++++++++++++
 hmp.h            |    2 +
 qapi-schema.json |   54 +++++++++++++++++++++++++++
 5 files changed, 213 insertions(+), 12 deletions(-)

Comments

Eric Blake Feb. 24, 2012, 5:46 p.m. UTC | #1
On 02/24/2012 09:49 AM, Federico Simoncelli wrote:
> Signed-off-by: Federico Simoncelli <fsimonce@redhat.com>

Pretty sparse on the commit message.

> +++ b/hmp-commands.hx
> @@ -886,6 +886,44 @@ Snapshot device, using snapshot file as target if provided
>  ETEXI
>  
>      {
> +        .name       = "drive_reopen",
> +        .args_type  = "device:B,new-image-file:s?,format:s?",
> +        .params     = "device [new-image-file] [format]",
> +        .help       = "Assigns a new image file to a device.\n\t\t\t"
> +                      "The image will be opened using the format\n\t\t\t"
> +                      "specified or 'qcow2' by default.\n\t\t\t"
> +                      "This command is deprecated.",
> +        .mhandler.cmd = hmp_drive_reopen,

Libvirt will just use the QMP version, so I'm not reviewing the HMP
version very closely.

> +++ b/qapi-schema.json
> @@ -1136,6 +1136,60 @@
>    'data': { 'device': 'str', 'snapshot-file': 'str', '*format': 'str' } }
>  
>  ##
> +# @drive-reopen
> +#
> +# Assigns a new image file to a device.
> +#
> +# @device: the name of the device for which we are chainging the image file.

s/chainging/changing/

> +#
> +# @new-image-file: the target of the new image. If the file doesn't exists the
> +#                  command will fail.

I think you need to be more explicit that @new-image-file MUST have
identical contents as the current image file, for this to be useful, and
that qemu does not validate whether the new image met those conditions.
 Possible ways to achieve this:

1. Freeze all guest I/O, so that you know the current image file is
stable, copy the contents to new-image-file, drive-reopen, then unfreeze
guest I/O

2. Create an empty qcow2 file with backing image of the current image
file.  drive-reopen will then see the new image file as containing the
same contents as the previous image file.

3. Create the new image file via mirroring (blockdev-migrate with
incremental and new-image-file), and reopen to the mirror

> +#
> +# @format: #optional the format of the new image, default is 'qcow2'.
> +#
> +# Returns: nothing on success
> +#          If @device is not a valid block device, DeviceNotFound
> +#          If @new-image-file can't be opened, OpenFileFailed
> +#          If @format is invalid, InvalidBlockFormat
> +#
> +# Notes: This command is deprecated.

Generally, a deprecation note should list the preferred replacement; but
we don't have a replacement, so I don't see this note adding any
information.

> +#
> +# Since 1.1
> +##
> +
> +{ 'command': 'drive-reopen',
> +  'data': { 'device': 'str', 'new-image-file': 'str', '*format': 'str' } }
> +
> +##
> +# @blockdev-migrate
> +#
> +# Migrates the device to a new destination.
> +#
> +# @device: the name of the block device to migrate.
> +#
> +# @incremental: if true only the new writes are sent to the destination.
> +#               This method is particularly useful if used in conjunction
> +#               with new-image-file allowing the current image to be
> +#               transferred to the destination by an external manager.
> +#
> +# @destination: the destination of the block migration.

Where do you list what format the destination is?  Shouldn't this have
an optional format, defaulting to qcow2?  Does qemu create the
destination file, or must it already be existing?

> +#
> +# @new-image-file: #optional an existing image file that will replace
> +#                  the current one in the device.

Where do you list what format the new-image-file is?  Shouldn't this
have an optional format, defaulting to qcow2?  Does qemu create the
new-image-file, or can one already be existing?

I know that this patch is only implementing the case where incremental
is true and new-image-file is provided; but I'm not quite sure what
semantics are intended if incremental is false.  Is that still a case
where this sets up mirroring (writes go to two images) but additionally
the contents from the current image are (asynchronously) streamed into
the destination?  I guess I'm not sure what the intended semantics of
the modes that we aren't implementing, or why we don't just go with a
simpler command that only exposes the semantics that we are implementing.

> +#
> +# Returns: nothing on success
> +#          If @device is not a valid block device, DeviceNotFound
> +#          If @mode is not a valid migration mode, InvalidParameterValue
> +#          If @destination can't be opened, OpenFileFailed
> +#          If @new-image-file can't be opened, OpenFileFailed
> +#
> +# Since 1.1
> +##
> +{ 'command': 'blockdev-migrate',
> +  'data': { 'device': 'str', 'incremental' : 'bool',
> +            'destination': 'str', '*new-image-file': 'str' } }
> +
> +##
>  # @human-monitor-command:
>  #
>  # Execute a command on the human monitor and return the output.
Paolo Bonzini Feb. 24, 2012, 6:57 p.m. UTC | #2
On 02/24/2012 06:46 PM, Eric Blake wrote:
> I think you need to be more explicit that @new-image-file MUST have
> identical contents as the current image file, for this to be useful, and
> that qemu does not validate whether the new image met those conditions.
>  Possible ways to achieve this:

Not necessarily, you could always do this on a paused machine.

>> > +# @destination: the destination of the block migration.
> Where do you list what format the destination is?  Shouldn't this have
> an optional format, defaulting to qcow2?  Does qemu create the
> destination file, or must it already be existing?

I think no default, raw makes as much or more sense in the
non-incremental case.  Anyway either autodetect, or no default.

>> > +#
>> > +# @new-image-file: #optional an existing image file that will replace
>> > +#                  the current one in the device.
> Where do you list what format the new-image-file is?  Shouldn't this
> have an optional format, defaulting to qcow2?  Does qemu create the
> new-image-file, or can one already be existing?

qemu does not create the file now, but in the future we may add a flag
to create a snapshot.  I think no default is better here too, or autodetect.

> I know that this patch is only implementing the case where incremental
> is true and new-image-file is provided; but I'm not quite sure what
> semantics are intended if incremental is false.  Is that still a case
> where this sets up mirroring (writes go to two images) but additionally
> the contents from the current image are (asynchronously) streamed into
> the destination?

Yes.  The image should already be there also in this case, and
new-image-file will usually be omitted.

Paolo
Luiz Capitulino Feb. 24, 2012, 7:01 p.m. UTC | #3
On Fri, 24 Feb 2012 16:49:04 +0000
Federico Simoncelli <fsimonce@redhat.com> wrote:

> Signed-off-by: Federico Simoncelli <fsimonce@redhat.com>

How does this relate to migrate -b? Should it be deprecated?

Btw, would be nice to have a 0/2 intro email describing the feature and changelog
info.

> ---
>  blockdev.c       |  107 ++++++++++++++++++++++++++++++++++++++++++++++++------
>  hmp-commands.hx  |   38 +++++++++++++++++++
>  hmp.c            |   24 ++++++++++++
>  hmp.h            |    2 +
>  qapi-schema.json |   54 +++++++++++++++++++++++++++
>  5 files changed, 213 insertions(+), 12 deletions(-)
> 
> diff --git a/blockdev.c b/blockdev.c
> index 2c132a3..f51bd6f 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -646,9 +646,8 @@ void do_commit(Monitor *mon, const QDict *qdict)
>      }
>  }
>  
> -void qmp_blockdev_snapshot_sync(const char *device, const char *snapshot_file,
> -                                bool has_format, const char *format,
> -                                Error **errp)
> +static void change_blockdev_image(const char *device, const char *new_image_file,
> +                                  const char *format, bool create, Error **errp)
>  {
>      BlockDriverState *bs;
>      BlockDriver *drv, *old_drv, *proto_drv;
> @@ -671,7 +670,7 @@ void qmp_blockdev_snapshot_sync(const char *device, const char *snapshot_file,
>      old_drv = bs->drv;
>      flags = bs->open_flags;
>  
> -    if (!has_format) {
> +    if (!format) {
>          format = "qcow2";
>      }
>  
> @@ -681,24 +680,26 @@ void qmp_blockdev_snapshot_sync(const char *device, const char *snapshot_file,
>          return;
>      }
>  
> -    proto_drv = bdrv_find_protocol(snapshot_file);
> +    proto_drv = bdrv_find_protocol(new_image_file);
>      if (!proto_drv) {
>          error_set(errp, QERR_INVALID_BLOCK_FORMAT, format);
>          return;
>      }
>  
> -    ret = bdrv_img_create(snapshot_file, format, bs->filename,
> -                          bs->drv->format_name, NULL, -1, flags);
> -    if (ret) {
> -        error_set(errp, QERR_UNDEFINED_ERROR);
> -        return;
> +    if (create) {
> +        ret = bdrv_img_create(new_image_file, format, bs->filename,
> +                              bs->drv->format_name, NULL, -1, flags);
> +        if (ret) {
> +            error_set(errp, QERR_UNDEFINED_ERROR);
> +            return;
> +        }
>      }
>  
>      bdrv_drain_all();
>      bdrv_flush(bs);
>  
>      bdrv_close(bs);
> -    ret = bdrv_open(bs, snapshot_file, flags, drv);
> +    ret = bdrv_open(bs, new_image_file, flags, drv);
>      /*
>       * If reopening the image file we just created fails, fall back
>       * and try to re-open the original image. If that fails too, we
> @@ -709,11 +710,93 @@ void qmp_blockdev_snapshot_sync(const char *device, const char *snapshot_file,
>          if (ret != 0) {
>              error_set(errp, QERR_OPEN_FILE_FAILED, old_filename);
>          } else {
> -            error_set(errp, QERR_OPEN_FILE_FAILED, snapshot_file);
> +            error_set(errp, QERR_OPEN_FILE_FAILED, new_image_file);
>          }
>      }
>  }
>  
> +void qmp_drive_reopen(const char *device, const char *new_image_file,
> +                         bool has_format, const char *format, Error **errp)
> +{
> +    change_blockdev_image(device, new_image_file,
> +                          has_format ? format : NULL, false, errp);
> +}
> +
> +void qmp_blockdev_snapshot_sync(const char *device, const char *snapshot_file,
> +                                bool has_format, const char *format,
> +                                Error **errp)
> +{
> +    change_blockdev_image(device, snapshot_file,
> +                          has_format ? format : NULL, true, errp);
> +}
> +
> +static void qmp_blockdev_mirror(const char *device, const char *destination,
> +                                const char *new_image_file, Error **errp)
> +{

Minor: this is a not a qmp function, please drop the qmp_ prefix then.

> +    BlockDriver *drv;
> +    int i, j, escape;
> +    char new_filename[2048], *filename;

I'd use PATH_MAX for new_filename's size.

> +
> +    drv = bdrv_find_format("blkmirror");
> +    if (!drv) {
> +        error_set(errp, QERR_INVALID_BLOCK_FORMAT, "blkmirror");
> +        return;
> +    }
> +
> +    escape = 0;
> +    for (i = 0, j = 0; j < strlen(new_image_file); j++) {
> + loop:
> +        if (!(i < sizeof(new_filename) - 2)) {
> +            error_set(errp, QERR_INVALID_PARAMETER_VALUE,
> +                      "new-image-file", "shorter filename");
> +            return;
> +        }
> +
> +        if (new_image_file[j] == ':' || new_image_file[j] == '\\') {
> +            if (!escape) {
> +                new_filename[i++] = '\\', escape = 1;
> +                goto loop;
> +            } else {
> +                escape = 0;
> +            }
> +        }
> +
> +        new_filename[i++] = new_image_file[j];
> +    }

IMO, you could require the filename arguments to be escaped already.

> +
> +    filename = g_strdup_printf("blkmirror:%s:%s", new_filename, destination);
> +    change_blockdev_image(device, filename, "blkmirror", false, errp);
> +    g_free(filename);
> +}
> +
> +void qmp_blockdev_migrate(const char *device, bool incremental,
> +                          const char *destination, bool has_new_image_file,
> +                          const char *new_image_file, Error **errp)
> +{
> +    BlockDriverState *bs;
> +
> +    bs = bdrv_find(device);
> +    if (!bs) {
> +        error_set(errp, QERR_DEVICE_NOT_FOUND, device);
> +        return;
> +    }
> +    if (bdrv_in_use(bs)) {
> +        error_set(errp, QERR_DEVICE_IN_USE, device);
> +        return;
> +    }
> +
> +    if (incremental) {
> +        if (!has_new_image_file) {
> +            error_set(errp, QERR_INVALID_PARAMETER_VALUE,
> +                      "incremental", "a new image file");
> +        } else {
> +            qmp_blockdev_mirror(device, destination, new_image_file, errp);
> +        }
> +    } else {
> +        error_set(errp, QERR_NOT_SUPPORTED);
> +    }

The command documentation says that 'incremental' and 'new_image_file' are
optionals, but the code makes them required. Why?

> +}
> +
>  static void eject_device(BlockDriverState *bs, int force, Error **errp)
>  {
>      if (bdrv_in_use(bs)) {
> diff --git a/hmp-commands.hx b/hmp-commands.hx
> index 573b823..4aacf2a 100644
> --- a/hmp-commands.hx
> +++ b/hmp-commands.hx
> @@ -886,6 +886,44 @@ Snapshot device, using snapshot file as target if provided
>  ETEXI
>  
>      {
> +        .name       = "drive_reopen",
> +        .args_type  = "device:B,new-image-file:s?,format:s?",
> +        .params     = "device [new-image-file] [format]",
> +        .help       = "Assigns a new image file to a device.\n\t\t\t"
> +                      "The image will be opened using the format\n\t\t\t"
> +                      "specified or 'qcow2' by default.\n\t\t\t"
> +                      "This command is deprecated.",
> +        .mhandler.cmd = hmp_drive_reopen,
> +    },
> +
> +STEXI
> +@item drive_reopen
> +@findex drive_reopen
> +Assigns a new image file to a device.
> +ETEXI
> +
> +    {
> +        .name       = "blkdev_migrate",
> +        .args_type  = "incremental:-i,device:B,destination:s,new-image-file:s?",
> +        .params     = "device mode destination [new-image-file]",
> +        .help       = "Migrates the device to a new destination.\n\t\t\t"
> +                      "If the incremental (-i) option is used then\n\t\t\t"
> +                      "only the new writes are sent to the destination.\n\t\t\t"
> +                      "This method is particularly useful if used in\n\t\t\t"
> +                      "conjunction with new-image-file (a new image for\n\t\t\t"
> +                      "the device) allowing the current image to be\n\t\t\t"
> +                      "transferred to the destination by an external\n\t\t\t"
> +                      "manager.",
> +        .mhandler.cmd = hmp_blkdev_migrate,
> +    },
> +
> +STEXI
> +@item blkdev_migrate
> +@findex blkdev_migrate
> +Migrates the device to a new destination.
> +ETEXI
> +
> +    {
>          .name       = "drive_add",
>          .args_type  = "pci_addr:s,opts:s",
>          .params     = "[[<domain>:]<bus>:]<slot>\n"
> diff --git a/hmp.c b/hmp.c
> index 8ff8c94..282d3e5 100644
> --- a/hmp.c
> +++ b/hmp.c
> @@ -701,6 +701,30 @@ void hmp_snapshot_blkdev(Monitor *mon, const QDict *qdict)
>      hmp_handle_error(mon, &errp);
>  }
>  
> +void hmp_drive_reopen(Monitor *mon, const QDict *qdict)
> +{
> +    const char *device = qdict_get_str(qdict, "device");
> +    const char *filename = qdict_get_str(qdict, "new-image-file");
> +    const char *format = qdict_get_try_str(qdict, "format");
> +    Error *errp = NULL;
> +
> +    qmp_drive_reopen(device, filename, !!format, format, &errp);
> +    hmp_handle_error(mon, &errp);
> +}
> +
> +void hmp_blkdev_migrate(Monitor *mon, const QDict *qdict)
> +{
> +    bool incremental = qdict_get_try_bool(qdict, "incremental", 0);
> +    const char *device = qdict_get_str(qdict, "device");
> +    const char *destination = qdict_get_str(qdict, "destination");
> +    const char *new_image_file = qdict_get_try_str(qdict, "new-image-file");
> +    Error *errp = NULL;
> +
> +    qmp_blockdev_migrate(device, incremental, destination,
> +                         !!new_image_file, new_image_file, &errp);
> +    hmp_handle_error(mon, &errp);
> +}
> +
>  void hmp_migrate_cancel(Monitor *mon, const QDict *qdict)
>  {
>      qmp_migrate_cancel(NULL);
> diff --git a/hmp.h b/hmp.h
> index 18eecbd..3c66257 100644
> --- a/hmp.h
> +++ b/hmp.h
> @@ -47,6 +47,8 @@ void hmp_block_passwd(Monitor *mon, const QDict *qdict);
>  void hmp_balloon(Monitor *mon, const QDict *qdict);
>  void hmp_block_resize(Monitor *mon, const QDict *qdict);
>  void hmp_snapshot_blkdev(Monitor *mon, const QDict *qdict);
> +void hmp_drive_reopen(Monitor *mon, const QDict *qdict);
> +void hmp_blkdev_migrate(Monitor *mon, const QDict *qdict);
>  void hmp_migrate_cancel(Monitor *mon, const QDict *qdict);
>  void hmp_migrate_set_downtime(Monitor *mon, const QDict *qdict);
>  void hmp_migrate_set_speed(Monitor *mon, const QDict *qdict);
> diff --git a/qapi-schema.json b/qapi-schema.json
> index d02ee86..df1248c 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -1136,6 +1136,60 @@
>    'data': { 'device': 'str', 'snapshot-file': 'str', '*format': 'str' } }
>  
>  ##
> +# @drive-reopen
> +#
> +# Assigns a new image file to a device.
> +#
> +# @device: the name of the device for which we are chainging the image file.
> +#
> +# @new-image-file: the target of the new image. If the file doesn't exists the
> +#                  command will fail.
> +#
> +# @format: #optional the format of the new image, default is 'qcow2'.
> +#
> +# Returns: nothing on success
> +#          If @device is not a valid block device, DeviceNotFound
> +#          If @new-image-file can't be opened, OpenFileFailed
> +#          If @format is invalid, InvalidBlockFormat
> +#
> +# Notes: This command is deprecated.

As I said in the other thread, we need a replacement to deprecate something.

> +#
> +# Since 1.1
> +##
> +
> +{ 'command': 'drive-reopen',
> +  'data': { 'device': 'str', 'new-image-file': 'str', '*format': 'str' } }
> +
> +##
> +# @blockdev-migrate
> +#
> +# Migrates the device to a new destination.
> +#
> +# @device: the name of the block device to migrate.
> +#
> +# @incremental: if true only the new writes are sent to the destination.
> +#               This method is particularly useful if used in conjunction
> +#               with new-image-file allowing the current image to be
> +#               transferred to the destination by an external manager.
> +#
> +# @destination: the destination of the block migration.
> +#
> +# @new-image-file: #optional an existing image file that will replace
> +#                  the current one in the device.
> +#
> +# Returns: nothing on success
> +#          If @device is not a valid block device, DeviceNotFound
> +#          If @mode is not a valid migration mode, InvalidParameterValue
> +#          If @destination can't be opened, OpenFileFailed
> +#          If @new-image-file can't be opened, OpenFileFailed
> +#
> +# Since 1.1
> +##
> +{ 'command': 'blockdev-migrate',
> +  'data': { 'device': 'str', 'incremental' : 'bool',
> +            'destination': 'str', '*new-image-file': 'str' } }
> +
> +##
>  # @human-monitor-command:
>  #
>  # Execute a command on the human monitor and return the output.
Eric Blake Feb. 24, 2012, 7:37 p.m. UTC | #4
On 02/24/2012 11:57 AM, Paolo Bonzini wrote:
> On 02/24/2012 06:46 PM, Eric Blake wrote:
>> I think you need to be more explicit that @new-image-file MUST have
>> identical contents as the current image file, for this to be useful, and
>> that qemu does not validate whether the new image met those conditions.
>>  Possible ways to achieve this:
> 
> Not necessarily, you could always do this on a paused machine.

You lost me; I think you snipped too much.  Was it about this statement
in my original?

>> 1. Freeze all guest I/O, so that you know the current image file is
>> stable,

To which I have a counter-question: Is pausing a machine guaranteed to
flush all I/O out to the image prior to returning control to the
controlling app, or do we have to rely on the qemu-ga agent command to
actually freeze I/O within the guest?

Overall, I was arguing that from a usability perspective, if
new-image-file does not have the same contents (from the guest's
perspective) as the current image, then when you resume the machine, you
will be injecting arbitrary disk changes into the guest's disk, probably
with disastrous results.  I'm not arguing that new-image-file has to be
identical to the current image (in fact, converting between raw or qcow2
via drive-reopen seems like great use cases for this new command).  So
documenting this distinction (guest data must be consistent across the
reopen), as part of the contract of the monitor command, would be useful.

> 
>>>> +# @destination: the destination of the block migration.
>> Where do you list what format the destination is?  Shouldn't this have
>> an optional format, defaulting to qcow2?  Does qemu create the
>> destination file, or must it already be existing?
> 
> I think no default, raw makes as much or more sense in the
> non-incremental case.  Anyway either autodetect, or no default.

So if there is no default, then how do I specify the format?  Libvirt
_cannot_ rely on autodetect, ever since we had to solve a CVE where
relying on autodetect would allow the guest to cause libvirt to mislabel
files on the host.

>> I know that this patch is only implementing the case where incremental
>> is true and new-image-file is provided; but I'm not quite sure what
>> semantics are intended if incremental is false.  Is that still a case
>> where this sets up mirroring (writes go to two images) but additionally
>> the contents from the current image are (asynchronously) streamed into
>> the destination?
> 
> Yes.  The image should already be there also in this case, and
> new-image-file will usually be omitted.

If the image to be opened must already exist, then explicitly state that
in the command documentation.
Eric Blake Feb. 24, 2012, 7:40 p.m. UTC | #5
On 02/24/2012 12:01 PM, Luiz Capitulino wrote:

>> +    BlockDriver *drv;
>> +    int i, j, escape;
>> +    char new_filename[2048], *filename;
> 
> I'd use PATH_MAX for new_filename's size.

PATH_MAX need not be defined (and on Hurd, it intentionally is not
defined); or might be so huge as to be useless.  But 2048 is smaller
than typical PATH_MAX of 4096, so you may still want to consider a
compromise.

>> +
>> +        new_filename[i++] = new_image_file[j];
>> +    }
> 
> IMO, you could require the filename arguments to be escaped already.

And if you do that, then the escaping should be consistent for both
arguments.  Libvirt is okay with pre-escaping file names before passing
them to qemu.
Luiz Capitulino Feb. 24, 2012, 8:26 p.m. UTC | #6
On Fri, 24 Feb 2012 12:40:17 -0700
Eric Blake <eblake@redhat.com> wrote:

> On 02/24/2012 12:01 PM, Luiz Capitulino wrote:
> 
> >> +    BlockDriver *drv;
> >> +    int i, j, escape;
> >> +    char new_filename[2048], *filename;
> > 
> > I'd use PATH_MAX for new_filename's size.
> 
> PATH_MAX need not be defined (and on Hurd, it intentionally is not
> defined); or might be so huge as to be useless.

Aren't those extreme cases? PATH_MAX is a standard define and is used in
QEMU in several places. If it's not good here, it shouldn't be good anywhere.
Paolo Bonzini Feb. 24, 2012, 8:32 p.m. UTC | #7
On 02/24/2012 08:01 PM, Luiz Capitulino wrote:
> IMO, you could require the filename arguments to be escaped already.

The monitor command was introduced exactly to avoid having to worry
about details such as escaping.  JSON is already supposed to take care
of those.

That said, I think Eric is right that we need to specify the formats, so
we'll need another iteration, and we probably need to rename the command
to drive-migrate too.

Paolo
Luiz Capitulino Feb. 24, 2012, 8:36 p.m. UTC | #8
On Fri, 24 Feb 2012 21:32:30 +0100
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 02/24/2012 08:01 PM, Luiz Capitulino wrote:
> > IMO, you could require the filename arguments to be escaped already.
> 
> The monitor command was introduced exactly to avoid having to worry
> about details such as escaping.  JSON is already supposed to take care
> of those.

Then the escaping should be done by the parser, not by the command
implementation.

> 
> That said, I think Eric is right that we need to specify the formats, so
> we'll need another iteration, and we probably need to rename the command
> to drive-migrate too.
> 
> Paolo
>
Paolo Bonzini Feb. 24, 2012, 9:05 p.m. UTC | #9
On 02/24/2012 09:36 PM, Luiz Capitulino wrote:
> > > IMO, you could require the filename arguments to be escaped already.
> > 
> > The monitor command was introduced exactly to avoid having to worry
> > about details such as escaping.  JSON is already supposed to take care
> > of those.
> 
> Then the escaping should be done by the parser, not by the command
> implementation.

This command is passing its arguments to the block layer, which needs
its own escaping.  It's a wart of the block layer and not supposed to
percolate outside QEMU.

Paolo
Eric Blake Feb. 24, 2012, 10:30 p.m. UTC | #10
On 02/24/2012 12:01 PM, Luiz Capitulino wrote:
> On Fri, 24 Feb 2012 16:49:04 +0000
> Federico Simoncelli <fsimonce@redhat.com> wrote:
> 
>> Signed-off-by: Federico Simoncelli <fsimonce@redhat.com>
> 
> How does this relate to migrate -b? Should it be deprecated?
> 
> Btw, would be nice to have a 0/2 intro email describing the feature and changelog
> info.

Another question - how does this interact with Jeff Cody's work to add
blockdev-group-snapshot-sync to add atomic group snapshots?  Do you ever
envision the need to migrate and/or reopen multiple disks at once, in
which case, we would want to be setting things up to work on a group in
the same way that Jeff's patches for live snapshot are headed?
Eric Blake Feb. 24, 2012, 10:46 p.m. UTC | #11
On 02/24/2012 01:26 PM, Luiz Capitulino wrote:
> On Fri, 24 Feb 2012 12:40:17 -0700
> Eric Blake <eblake@redhat.com> wrote:
> 
>> On 02/24/2012 12:01 PM, Luiz Capitulino wrote:
>>
>>>> +    BlockDriver *drv;
>>>> +    int i, j, escape;
>>>> +    char new_filename[2048], *filename;
>>>
>>> I'd use PATH_MAX for new_filename's size.
>>
>> PATH_MAX need not be defined (and on Hurd, it intentionally is not
>> defined); or might be so huge as to be useless.
> 
> Aren't those extreme cases? PATH_MAX is a standard define and is used in
> QEMU in several places. If it's not good here, it shouldn't be good anywhere.

PATH_MAX is specifically declared in POSIX to be defined if there is a
limit, or undefined if there is no limit.  There is no limit in GNU
Hurd, so PATH_MAX is undefined there, and you will get a compile error
(then again, no one has ported qemu to Hurd).

Here's what gnulib has recommended:

https://lists.gnu.org/archive/html/bug-gnulib/2011-06/msg00328.html

> A package like coreutils can also do
>   #ifndef PATH_MAX
>   # define PATH_MAX 8192
>   #endif
> in its system.h.
> 
> Looking at both uses of PATH_MAX in coreutils (src/pwd.c:88 and
> src/remove.c:186) the value of PATH_MAX is capped by 8192 or 16384 anyway.
> So, on systems like GNU/Hurd, where filenames can have arbitrary size, you
> are calling pathconf for no real purpose.
> 
> To me, this confirms that a generic pathmax.h (like the one in gnulib)
> should only define PATH_MAX when it makes sense - like POSIX says -,
> and that the handling of the GNU/Hurd case should be done on a case-by-case
> basis:
>   - Either a package-wide handling, or a per-file handling.
>   - Either a fallback value of 8192, or a fallback value of
>     pathconf ("/", _PC_PATH_MAX), or just a #ifdef test.

Other mails in that thread are also an interesting read.

In short, use of PATH_MAX should only ever be used to optimize routines
to the common case; in which case, you can pick your own cap for
PATH_MAX if the implementation did not provide one or reduce the
implementation's 8k PATH_MAX down to something like 2048 that you can
safely fit on the stack for the common case before malloc'ing for the
larger strings.  But using it as a bounds to a statically-sized object
is a recipe for artificially limiting software; if you are okay with
introducing that artificial limit, then go for it; but if you want to be
truly portable, it is best to never use PATH_MAX as an array bounds, and
to write fallback code paths to handle the cases where user input
exceeds PATH_MAX but can still be handled without error by the system
you are running on.
Paolo Bonzini Feb. 25, 2012, 6:47 a.m. UTC | #12
On 02/24/2012 11:30 PM, Eric Blake wrote:
> On 02/24/2012 12:01 PM, Luiz Capitulino wrote:
>> On Fri, 24 Feb 2012 16:49:04 +0000
>> Federico Simoncelli <fsimonce@redhat.com> wrote:
>>
>>> Signed-off-by: Federico Simoncelli <fsimonce@redhat.com>
>>
>> How does this relate to migrate -b? Should it be deprecated?
>>
>> Btw, would be nice to have a 0/2 intro email describing the feature and changelog
>> info.
> 
> Another question - how does this interact with Jeff Cody's work to add
> blockdev-group-snapshot-sync to add atomic group snapshots?  Do you ever
> envision the need to migrate and/or reopen multiple disks at once, in
> which case, we would want to be setting things up to work on a group in
> the same way that Jeff's patches for live snapshot are headed?

I was thinking that perhaps we needed something like
blockdev-group-start, blockdev-group-commit, blockdev-group-abort
instead of Jeff's group snapshot, but I was afraid of proposing it. :)

Paolo
Federico Simoncelli Feb. 27, 2012, 11:29 a.m. UTC | #13
----- Original Message -----
> From: "Luiz Capitulino" <lcapitulino@redhat.com>
> To: "Federico Simoncelli" <fsimonce@redhat.com>
> Cc: qemu-devel@nongnu.org, mtosatti@redhat.com, pbonzini@redhat.com, kwolf@redhat.com, armbru@redhat.com
> Sent: Friday, February 24, 2012 8:01:43 PM
> Subject: Re: [PATCH 2/2 v2] Add the blockdev-reopen and blockdev-migrate commands
> 
> On Fri, 24 Feb 2012 16:49:04 +0000
> Federico Simoncelli <fsimonce@redhat.com> wrote:
> 
> > Signed-off-by: Federico Simoncelli <fsimonce@redhat.com>
> 
> Btw, would be nice to have a 0/2 intro email describing the feature
> and changelog
> info.

Yes the 0/2 (actually 0/3) was sent at the beginning of the thread so you might
have missed it because you were added later on but I'm sure you can still find
it in the archives.

> 
> > +    BlockDriver *drv;
> > +    int i, j, escape;
> > +    char new_filename[2048], *filename;
> 
> I'd use PATH_MAX for new_filename's size.

Maybe PATH_MAX * 2 + 1? (To handle the case where all the characters should be
escaped).

> > +    escape = 0;
> > +    for (i = 0, j = 0; j < strlen(new_image_file); j++) {
> > + loop:
> > +        if (!(i < sizeof(new_filename) - 2)) {
> > +            error_set(errp, QERR_INVALID_PARAMETER_VALUE,
> > +                      "new-image-file", "shorter filename");
> > +            return;
> > +        }
> > +
> > +        if (new_image_file[j] == ':' || new_image_file[j] == '\\')
> > {
> > +            if (!escape) {
> > +                new_filename[i++] = '\\', escape = 1;
> > +                goto loop;
> > +            } else {
> > +                escape = 0;
> > +            }
> > +        }
> > +
> > +        new_filename[i++] = new_image_file[j];
> > +    }
> 
> IMO, you could require the filename arguments to be escaped already.

Can you be more explicit, who should escape it?
The only thing that comes to my mind right now is requiring the input of
blockdev-migrate already escaped but that would expose an internal format.
(I'd not recommend it).

> > +void qmp_blockdev_migrate(const char *device, bool incremental,
> > +                          const char *destination, bool
> > has_new_image_file,
> > +                          const char *new_image_file, Error
> > **errp)
> > +{
> > +    BlockDriverState *bs;
> > +
> > +    bs = bdrv_find(device);
> > +    if (!bs) {
> > +        error_set(errp, QERR_DEVICE_NOT_FOUND, device);
> > +        return;
> > +    }
> > +    if (bdrv_in_use(bs)) {
> > +        error_set(errp, QERR_DEVICE_IN_USE, device);
> > +        return;
> > +    }
> > +
> > +    if (incremental) {
> > +        if (!has_new_image_file) {
> > +            error_set(errp, QERR_INVALID_PARAMETER_VALUE,
> > +                      "incremental", "a new image file");
> > +        } else {
> > +            qmp_blockdev_mirror(device, destination,
> > new_image_file, errp);
> > +        }
> > +    } else {
> > +        error_set(errp, QERR_NOT_SUPPORTED);
> > +    }
> 
> The command documentation says that 'incremental' and
> 'new_image_file' are
> optionals, but the code makes them required. Why?

If I didn't make any mistake in the code I'm just enforcing that when
you specify "incremental" you also need a new image.
There are still other valid cases where they are optional.
Luiz Capitulino Feb. 27, 2012, 12:12 p.m. UTC | #14
On Mon, 27 Feb 2012 06:29:39 -0500 (EST)
Federico Simoncelli <fsimonce@redhat.com> wrote:

> ----- Original Message -----
> > From: "Luiz Capitulino" <lcapitulino@redhat.com>
> > To: "Federico Simoncelli" <fsimonce@redhat.com>
> > Cc: qemu-devel@nongnu.org, mtosatti@redhat.com, pbonzini@redhat.com, kwolf@redhat.com, armbru@redhat.com
> > Sent: Friday, February 24, 2012 8:01:43 PM
> > Subject: Re: [PATCH 2/2 v2] Add the blockdev-reopen and blockdev-migrate commands
> > 
> > On Fri, 24 Feb 2012 16:49:04 +0000
> > Federico Simoncelli <fsimonce@redhat.com> wrote:
> > 
> > > Signed-off-by: Federico Simoncelli <fsimonce@redhat.com>
> > 
> > Btw, would be nice to have a 0/2 intro email describing the feature
> > and changelog
> > info.
> 
> Yes the 0/2 (actually 0/3) was sent at the beginning of the thread so you might
> have missed it because you were added later on but I'm sure you can still find
> it in the archives.

I only found v1 iirc, but this is not important right now, as you're going to
post v3 right? And that's going to have the intro :)

> > 
> > > +    BlockDriver *drv;
> > > +    int i, j, escape;
> > > +    char new_filename[2048], *filename;
> > 
> > I'd use PATH_MAX for new_filename's size.
> 
> Maybe PATH_MAX * 2 + 1? (To handle the case where all the characters should be
> escaped).

Well, I was discussing this with Eric and he thinks that a value of 4096
should be fine. I personally prefer using PATH_MAX because I don't believe
I'm better at choosing a random value for this vs. using what the system
provides me.

Feel free to choose what you think is the best for this case.

> > > +    escape = 0;
> > > +    for (i = 0, j = 0; j < strlen(new_image_file); j++) {
> > > + loop:
> > > +        if (!(i < sizeof(new_filename) - 2)) {
> > > +            error_set(errp, QERR_INVALID_PARAMETER_VALUE,
> > > +                      "new-image-file", "shorter filename");
> > > +            return;
> > > +        }
> > > +
> > > +        if (new_image_file[j] == ':' || new_image_file[j] == '\\')
> > > {
> > > +            if (!escape) {
> > > +                new_filename[i++] = '\\', escape = 1;
> > > +                goto loop;
> > > +            } else {
> > > +                escape = 0;
> > > +            }
> > > +        }
> > > +
> > > +        new_filename[i++] = new_image_file[j];
> > > +    }
> > 
> > IMO, you could require the filename arguments to be escaped already.
> 
> Can you be more explicit, who should escape it?

Paolo thinks this should be done by the block layer, fine with me.

> The only thing that comes to my mind right now is requiring the input of
> blockdev-migrate already escaped but that would expose an internal format.
> (I'd not recommend it).
> 
> > > +void qmp_blockdev_migrate(const char *device, bool incremental,
> > > +                          const char *destination, bool
> > > has_new_image_file,
> > > +                          const char *new_image_file, Error
> > > **errp)
> > > +{
> > > +    BlockDriverState *bs;
> > > +
> > > +    bs = bdrv_find(device);
> > > +    if (!bs) {
> > > +        error_set(errp, QERR_DEVICE_NOT_FOUND, device);
> > > +        return;
> > > +    }
> > > +    if (bdrv_in_use(bs)) {
> > > +        error_set(errp, QERR_DEVICE_IN_USE, device);
> > > +        return;
> > > +    }
> > > +
> > > +    if (incremental) {
> > > +        if (!has_new_image_file) {
> > > +            error_set(errp, QERR_INVALID_PARAMETER_VALUE,
> > > +                      "incremental", "a new image file");
> > > +        } else {
> > > +            qmp_blockdev_mirror(device, destination,
> > > new_image_file, errp);
> > > +        }
> > > +    } else {
> > > +        error_set(errp, QERR_NOT_SUPPORTED);
> > > +    }
> > 
> > The command documentation says that 'incremental' and
> > 'new_image_file' are
> > optionals, but the code makes them required. Why?
> 
> If I didn't make any mistake in the code I'm just enforcing that when
> you specify "incremental" you also need a new image.
> There are still other valid cases where they are optional.

Which operation will be performed if 'incremental' is not passed? If
it's passed, which operation will be performed if 'new_image_file' is not?
Paolo Bonzini Feb. 27, 2012, 12:49 p.m. UTC | #15
On 02/27/2012 01:12 PM, Luiz Capitulino wrote:
> > If I didn't make any mistake in the code I'm just enforcing that when
> > you specify "incremental" you also need a new image.
> > There are still other valid cases where they are optional.
> 
> Which operation will be performed if 'incremental' is not passed? If
> it's passed, which operation will be performed if 'new_image_file' is not?

There are four cases:

1) incremental=false, new_image_file not passed:
   right now fail; in the future:
   - create an image on dest with no backing file;
   - writes will be mirrored to current_image_file and dest
   - start streaming from current_image_file to dest

2) incremental=false, new_image_file passed:
   right now fail; in the future:
   - create an image on dest with no backing file;
   - live-snapshot based on current_image_file to new_image_file;
   - writes will be mirrored to new_image_file and dest
   - start streaming from current_image_file to dest

3) incremental=true, new_image_file not passed:
   - error

4) incremental=true, new_image_file passed:
   - no images will be created
   - writes will be mirrored to new_image_file and dest

Paolo
Luiz Capitulino Feb. 27, 2012, 1:06 p.m. UTC | #16
On Mon, 27 Feb 2012 13:49:17 +0100
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 02/27/2012 01:12 PM, Luiz Capitulino wrote:
> > > If I didn't make any mistake in the code I'm just enforcing that when
> > > you specify "incremental" you also need a new image.
> > > There are still other valid cases where they are optional.
> > 
> > Which operation will be performed if 'incremental' is not passed? If
> > it's passed, which operation will be performed if 'new_image_file' is not?
> 
> There are four cases:
> 
> 1) incremental=false, new_image_file not passed:
>    right now fail; in the future:
>    - create an image on dest with no backing file;
>    - writes will be mirrored to current_image_file and dest
>    - start streaming from current_image_file to dest
> 
> 2) incremental=false, new_image_file passed:
>    right now fail; in the future:
>    - create an image on dest with no backing file;
>    - live-snapshot based on current_image_file to new_image_file;
>    - writes will be mirrored to new_image_file and dest
>    - start streaming from current_image_file to dest
>
> 
> 3) incremental=true, new_image_file not passed:
>    - error
> 
> 4) incremental=true, new_image_file passed:
>    - no images will be created
>    - writes will be mirrored to new_image_file and dest

IMHO, this is asking for two commands, where cases 1 & 2 is one of them
and cases 3 & 4 is the other one. Note how 'incremental' goes away and
'new_image_file' really becomes an optional.
Paolo Bonzini Feb. 27, 2012, 2:39 p.m. UTC | #17
On 02/27/2012 02:06 PM, Luiz Capitulino wrote:
> IMHO, this is asking for two commands, where cases 1 & 2 is one of them
> and cases 3 & 4 is the other one. Note how 'incremental' goes away and
> 'new_image_file' really becomes an optional.

I really would have no idea on naming except perhaps "drive_migrate" and
"funny_drive_migrate_for_ovirt".  But actually I must admit that it's a
rat's nest.

First, there's no reason why live-snapshotting to new_image_file
shouldn't be handled within QEMU.  That would make the above table much
more orthogonal.  However, oVirt likes to create its own snapshots.

Perhaps we do need to rethink this together with group snapshots.

There are four kinds of operations that we do on block devices:
(a) create an image.  This is part of what blockdev-snapshot does.
(b) switch a block device to a new image.  drive-reopen does this.
(c) add mirroring to a new destination.
(d) activate streaming from a base image

drive-migrate does (b) and (c), will do (a) and (d) when we add
pre-copy, and should do (a) right now if Federico wasn't an oVirt
developer. :)

Thinking more about it, the commands we _do_ need are:
- start a transaction
- switch to a new image
- add mirroring to a new destination
- commit a transaction
- rollback a transaction
- query transaction errors

Creating an image can be done outside a transaction for now because we
only support live external snapshots.  Streaming can also be started
outside a transaction, because it doesn't need to be started atomically.

If we have the above elementary commands, blockdev-snapshot-sync becomes
sugar for this:

    (create image)
    blockdev-start-transaction if no active transaction
    drive-reopen
    blockdev-commit-transaction if no active transaction

Jeff's group snapshot can be realized as this:

    blockdev-begin-transaction
    blockdev-snapshot-sync
    ...
    blockdev-snapshot-sync
    blockdev-commit-transaction

And for drive-migrate, let's look at the above 3 cases:

> > 1) incremental=false, new_image_file not passed:
> >    right now fail; in the future:
> >    - create an image on dest with no backing file;
> >    - writes will be mirrored to current_image_file and dest
> >    - start streaming from current_image_file to dest

This is a new command "drive-mirror device dest", which does:

    (create image for dest)
    blockdev-begin-transaction if no active transaction
    drive-mirror device dest
    blockdev-commit-transaction if no active transaction

The command does this:

- mirror writes to current_image_file and dest
- start streaming from current_image_file to dest

The second part can be suppressed with a boolean argument.

> > 2) incremental=false, new_image_file passed:
> >    right now fail; in the future:
> >    - create an image on dest with no backing file;
> >    - live-snapshot based on current_image_file to new_image_file;
> >    - writes will be mirrored to new_image_file and dest
> >    - start streaming from current_image_file to dest

Atomicity is not needed here, so the user can simply issue:

    blockdev-snapshot-sync device new-image-file
    drive-mirror device dest

> > 4) incremental=true, new_image_file passed:
> >    - no images will be created
> >    - writes will be mirrored to new_image_file and dest

No need to provide this from within QEMU, because libvirt/oVirt can do
the dance using elementary operations:

    blockdev-begin-transaction
    drive-reopen device new-image-file
    drive-mirror streaming=false device dest
    blockdev-commit-transaction

No strange optional arguments, no proliferation of commands, etc.  The
only downside is that if someone tries to do (4) without transactions
(or without stopping the VM) they'll get corruption because atomicity is
required.

Paolo
Anthony Liguori Feb. 27, 2012, 2:46 p.m. UTC | #18
On 02/27/2012 08:39 AM, Paolo Bonzini wrote:
> On 02/27/2012 02:06 PM, Luiz Capitulino wrote:
>> IMHO, this is asking for two commands, where cases 1&  2 is one of them
>> and cases 3&  4 is the other one. Note how 'incremental' goes away and
>> 'new_image_file' really becomes an optional.
>
> I really would have no idea on naming except perhaps "drive_migrate" and
> "funny_drive_migrate_for_ovirt".  But actually I must admit that it's a
> rat's nest.
>
> First, there's no reason why live-snapshotting to new_image_file
> shouldn't be handled within QEMU.  That would make the above table much
> more orthogonal.  However, oVirt likes to create its own snapshots.
>
> Perhaps we do need to rethink this together with group snapshots.
>
> There are four kinds of operations that we do on block devices:
> (a) create an image.  This is part of what blockdev-snapshot does.
> (b) switch a block device to a new image.  drive-reopen does this.
> (c) add mirroring to a new destination.
> (d) activate streaming from a base image
>
> drive-migrate does (b) and (c), will do (a) and (d) when we add
> pre-copy, and should do (a) right now if Federico wasn't an oVirt
> developer. :)
>
> Thinking more about it, the commands we _do_ need are:
> - start a transaction
> - switch to a new image
> - add mirroring to a new destination
> - commit a transaction
> - rollback a transaction
> - query transaction errors

I think a better way to think of this is as a batch submission.  It would be 
relatively easy to model in QMP too (just have a batch-command that has a list 
of commands as it's argument).

The difference between batch submission and a transaction is atomic rollback. 
But I don't think atomic rollback is really needed here.

Regards,

Anthony Liguori
Paolo Bonzini Feb. 27, 2012, 2:54 p.m. UTC | #19
On 02/27/2012 03:46 PM, Anthony Liguori wrote:
> I think a better way to think of this is as a batch submission.  It
> would be relatively easy to model in QMP too (just have a batch-command
> that has a list of commands as it's argument).
> 
> The difference between batch submission and a transaction is atomic
> rollback. But I don't think atomic rollback is really needed here.

A transaction enforces atomicity at the block layer level.  It's
different from batch commands in two ways:

* bdrv_drain_all/bdrv_flush needs to be called at the beginning of the
commit.  This may not be the case with batch commands.

* with batch commands, atomicity happens by chance because VCPUs cannot
send I/O while the monitor is holding the global mutex.

Paolo
Anthony Liguori Feb. 27, 2012, 2:59 p.m. UTC | #20
On 02/27/2012 08:54 AM, Paolo Bonzini wrote:
> On 02/27/2012 03:46 PM, Anthony Liguori wrote:
>> I think a better way to think of this is as a batch submission.  It
>> would be relatively easy to model in QMP too (just have a batch-command
>> that has a list of commands as it's argument).
>>
>> The difference between batch submission and a transaction is atomic
>> rollback. But I don't think atomic rollback is really needed here.
>
> A transaction enforces atomicity at the block layer level.  It's
> different from batch commands in two ways:
>
> * bdrv_drain_all/bdrv_flush needs to be called at the beginning of the
> commit.  This may not be the case with batch commands.
>
> * with batch commands, atomicity happens by chance because VCPUs cannot
> send I/O while the monitor is holding the global mutex.

If the commands are designed correctly, then this works well.  The problem is 
that the current commands are not designed well.  For instance, multi-snapshot 
could look like:

block-freeze ide0-hd0
block-freeze ide1-hd1
block-reopen ide0-hd0 my-new-file0.qcow2
block-reopen ide1-hd1 my-new-file1.qcow2
block-unfreeze ide1-hd1
block-unfreeze ide1-hd0

This would work regardless of whether the commands were implemented 
asynchronously within QEMU too.

Regards,

Anthony Liguori

>
> Paolo
Paolo Bonzini Feb. 27, 2012, 3:03 p.m. UTC | #21
On 02/27/2012 03:59 PM, Anthony Liguori wrote:
> The problem is that the current commands are not designed well.  For
> instance, multi-snapshot could look like:
> 
> block-freeze ide0-hd0
> block-freeze ide1-hd1
> block-reopen ide0-hd0 my-new-file0.qcow2
> block-reopen ide1-hd1 my-new-file1.qcow2
> block-unfreeze ide1-hd1
> block-unfreeze ide1-hd0
> 
> This would work regardless of whether the commands were implemented
> asynchronously within QEMU too.

This looks good, too.  Positive: maps well to fsfreeze/thaw with help
from the guest agent.  Negative: you have to specify the devices three
times.  Overall, I think I like it.

However, you need to add freeze/unfreeze capabilities to the block
layer.  Not hard, but one more thing to do.

Paolo
Anthony Liguori Feb. 27, 2012, 3:06 p.m. UTC | #22
On 02/27/2012 09:03 AM, Paolo Bonzini wrote:
> On 02/27/2012 03:59 PM, Anthony Liguori wrote:
>> The problem is that the current commands are not designed well.  For
>> instance, multi-snapshot could look like:
>>
>> block-freeze ide0-hd0
>> block-freeze ide1-hd1
>> block-reopen ide0-hd0 my-new-file0.qcow2
>> block-reopen ide1-hd1 my-new-file1.qcow2
>> block-unfreeze ide1-hd1
>> block-unfreeze ide1-hd0
>>
>> This would work regardless of whether the commands were implemented
>> asynchronously within QEMU too.
>
> This looks good, too.  Positive: maps well to fsfreeze/thaw with help
> from the guest agent.  Negative: you have to specify the devices three
> times.  Overall, I think I like it.
>
> However, you need to add freeze/unfreeze capabilities to the block
> layer.  Not hard, but one more thing to do.

Right.  But it also generalizes to other QMP operations which is potentially 
interesting.

And providing mechanisms like this gives more flexibility to management tools to 
implement interesting features without constantly chancing new QMP commands.

Regards,

Anthony Liguori

>
> Paolo
Kevin Wolf Feb. 27, 2012, 3:17 p.m. UTC | #23
Am 27.02.2012 15:59, schrieb Anthony Liguori:
> On 02/27/2012 08:54 AM, Paolo Bonzini wrote:
>> On 02/27/2012 03:46 PM, Anthony Liguori wrote:
>>> I think a better way to think of this is as a batch submission.  It
>>> would be relatively easy to model in QMP too (just have a batch-command
>>> that has a list of commands as it's argument).
>>>
>>> The difference between batch submission and a transaction is atomic
>>> rollback. But I don't think atomic rollback is really needed here.
>>
>> A transaction enforces atomicity at the block layer level.  It's
>> different from batch commands in two ways:
>>
>> * bdrv_drain_all/bdrv_flush needs to be called at the beginning of the
>> commit.  This may not be the case with batch commands.
>>
>> * with batch commands, atomicity happens by chance because VCPUs cannot
>> send I/O while the monitor is holding the global mutex.
> 
> If the commands are designed correctly, then this works well.  The problem is 
> that the current commands are not designed well.  For instance, multi-snapshot 
> could look like:
> 
> block-freeze ide0-hd0
> block-freeze ide1-hd1
> block-reopen ide0-hd0 my-new-file0.qcow2
> block-reopen ide1-hd1 my-new-file1.qcow2
> block-unfreeze ide1-hd1
> block-unfreeze ide1-hd0
> 
> This would work regardless of whether the commands were implemented 
> asynchronously within QEMU too.

What if block-reopen ide1-hd1 fails? In case of failure, you want both
drives to point to their old image, not the first one to the new image
and the second one to the old image.

Kevin
Anthony Liguori Feb. 27, 2012, 3:24 p.m. UTC | #24
On 02/27/2012 09:17 AM, Kevin Wolf wrote:
> Am 27.02.2012 15:59, schrieb Anthony Liguori:
>> On 02/27/2012 08:54 AM, Paolo Bonzini wrote:
>>> On 02/27/2012 03:46 PM, Anthony Liguori wrote:
>>>> I think a better way to think of this is as a batch submission.  It
>>>> would be relatively easy to model in QMP too (just have a batch-command
>>>> that has a list of commands as it's argument).
>>>>
>>>> The difference between batch submission and a transaction is atomic
>>>> rollback. But I don't think atomic rollback is really needed here.
>>>
>>> A transaction enforces atomicity at the block layer level.  It's
>>> different from batch commands in two ways:
>>>
>>> * bdrv_drain_all/bdrv_flush needs to be called at the beginning of the
>>> commit.  This may not be the case with batch commands.
>>>
>>> * with batch commands, atomicity happens by chance because VCPUs cannot
>>> send I/O while the monitor is holding the global mutex.
>>
>> If the commands are designed correctly, then this works well.  The problem is
>> that the current commands are not designed well.  For instance, multi-snapshot
>> could look like:
>>
>> block-freeze ide0-hd0
>> block-freeze ide1-hd1
>> block-reopen ide0-hd0 my-new-file0.qcow2
>> block-reopen ide1-hd1 my-new-file1.qcow2
>> block-unfreeze ide1-hd1
>> block-unfreeze ide1-hd0
>>
>> This would work regardless of whether the commands were implemented
>> asynchronously within QEMU too.
>
> What if block-reopen ide1-hd1 fails? In case of failure, you want both
> drives to point to their old image, not the first one to the new image
> and the second one to the old image.

Then you get an error with the block devices still frozen.  You can execute 
another command to reopen back to the old image to roll back the transaction.

Pushing the rollback logic to the client does make the client interface a bit 
more complicated and adds latency to the error path but it's much easier than 
building a complex transaction infrastructure.

And there are examples of this in the wild too.  LDAP uses a similar mechanism.

Regards,

Anthony Liguori

>
> Kevin
Federico Simoncelli Feb. 27, 2012, 4:33 p.m. UTC | #25
----- Original Message -----
> From: "Paolo Bonzini" <pbonzini@redhat.com>
> To: "Luiz Capitulino" <lcapitulino@redhat.com>
> Cc: "Federico Simoncelli" <fsimonce@redhat.com>, kwolf@redhat.com, mtosatti@redhat.com, qemu-devel@nongnu.org,
> armbru@redhat.com, "Jeff Cody" <jcody@redhat.com>
> Sent: Monday, February 27, 2012 3:39:33 PM
> Subject: drive transactions (was Re: [PATCH 2/2 v2] Add the blockdev-reopen and blockdev-migrate commands)
> 
> > > 4) incremental=true, new_image_file passed:
> > >    - no images will be created
> > >    - writes will be mirrored to new_image_file and dest
> 
> No need to provide this from within QEMU, because libvirt/oVirt can
> do
> the dance using elementary operations:
> 
>     blockdev-begin-transaction
>     drive-reopen device new-image-file
>     drive-mirror streaming=false device dest
>     blockdev-commit-transaction
> 
> No strange optional arguments, no proliferation of commands, etc.
>  The
> only downside is that if someone tries to do (4) without transactions
> (or without stopping the VM) they'll get corruption because atomicity
> is
> required.

I'm all for the modularity of the commands (I suggested it since the beginning),
but all this infrastructure goes way beyond what I'd need for oVirt now.

When I submitted my patches we knew that my work wasn't the definitive solution
(eg: the future implementation of -blockdev). So I'd suggest to try and settle
with something that is good enough and that is not preventing a future improvement.

What about having a (temporary) flag in drive-mirror to accept also a new-image-file
until we will have the optimal solution?
Paolo Bonzini Feb. 27, 2012, 4:41 p.m. UTC | #26
On 02/27/2012 05:33 PM, Federico Simoncelli wrote:
>> > 
>> >     blockdev-begin-transaction
>> >     drive-reopen device new-image-file
>> >     drive-mirror streaming=false device dest
>> >     blockdev-commit-transaction
>> > 
>> > No strange optional arguments, no proliferation of commands, etc.
>> >  The
>> > only downside is that if someone tries to do (4) without transactions
>> > (or without stopping the VM) they'll get corruption because atomicity
>> > is
>> > required.
> I'm all for the modularity of the commands (I suggested it since the beginning),
> but all this infrastructure goes way beyond what I'd need for oVirt now.
> 
> When I submitted my patches we knew that my work wasn't the definitive solution
> (eg: the future implementation of -blockdev). So I'd suggest to try and settle
> with something that is good enough and that is not preventing a future improvement.
> 
> What about having a (temporary) flag in drive-mirror to accept also a new-image-file
> until we will have the optimal solution?

What if libvirt could simply try blockdev-freeze and, if it's not there,
try passing a new-image-file argument.  Add the new-image-file
downstream if you have time constraints, and leave it out upstream.  Too
ugly?

Paolo
Anthony Liguori Feb. 27, 2012, 4:42 p.m. UTC | #27
On 02/27/2012 10:33 AM, Federico Simoncelli wrote:
> I'm all for the modularity of the commands (I suggested it since the beginning),
> but all this infrastructure goes way beyond what I'd need for oVirt now.
>
> When I submitted my patches we knew that my work wasn't the definitive solution
> (eg: the future implementation of -blockdev). So I'd suggest to try and settle
> with something that is good enough and that is not preventing a future improvement.
>
> What about having a (temporary) flag in drive-mirror to accept also a new-image-file
> until we will have the optimal solution?

Unless there are extenuating circumstances (like the absence of core 
infrastructure in QEMU), then we should not add commands that we know are not 
the right command.

We have to support our commands forever.

Regards,

Anthony Liguori

>
Federico Simoncelli Feb. 27, 2012, 4:50 p.m. UTC | #28
----- Original Message -----
> From: "Anthony Liguori" <anthony@codemonkey.ws>
> To: "Federico Simoncelli" <fsimonce@redhat.com>
> Cc: "Paolo Bonzini" <pbonzini@redhat.com>, kwolf@redhat.com, armbru@redhat.com, "Jeff Cody" <jcody@redhat.com>,
> mtosatti@redhat.com, qemu-devel@nongnu.org, "Luiz Capitulino" <lcapitulino@redhat.com>
> Sent: Monday, February 27, 2012 5:42:31 PM
> Subject: Re: [Qemu-devel] drive transactions (was Re: [PATCH 2/2 v2] Add the blockdev-reopen and blockdev-migrate
> commands)
> 
> On 02/27/2012 10:33 AM, Federico Simoncelli wrote:
> > I'm all for the modularity of the commands (I suggested it since
> > the beginning),
> > but all this infrastructure goes way beyond what I'd need for oVirt
> > now.
> >
> > When I submitted my patches we knew that my work wasn't the
> > definitive solution
> > (eg: the future implementation of -blockdev). So I'd suggest to try
> > and settle
> > with something that is good enough and that is not preventing a
> > future improvement.
> >
> > What about having a (temporary) flag in drive-mirror to accept also
> > a new-image-file
> > until we will have the optimal solution?
> 
> Unless there are extenuating circumstances (like the absence of core
> infrastructure in QEMU), then we should not add commands that we know
> are not
> the right command.

So are you in favor or against my suggestion? It looks like this is exactly
the case where the core infrastructure (transactions) is missing.
Paolo Bonzini Feb. 27, 2012, 4:51 p.m. UTC | #29
On 02/27/2012 04:24 PM, Anthony Liguori wrote:
> 
> Then you get an error with the block devices still frozen.  You can
> execute another command to reopen back to the old image to roll back the
> transaction.
> 
> Pushing the rollback logic to the client does make the client interface
> a bit more complicated and adds latency to the error path but it's much
> easier than building a complex transaction infrastructure.
> 
> And there are examples of this in the wild too.  LDAP uses a similar
> mechanism.

Actually, have you seen Jeff's atomic snapshot patch?  It really
implements all that is needed to do transactions, and gets 100% ok
error-recovery unlike the existing blockdev_snapshot_sync.  It really
looks like we can do better than client-side error management, and there
is not that much complexity at all.

Jeff could rework his patches to work with transaction begin/commit
commands, and Federico can then add drive-reopen and drive-migrate on top.

Paolo
Anthony Liguori Feb. 27, 2012, 4:53 p.m. UTC | #30
>> On 02/27/2012 10:33 AM, Federico Simoncelli wrote:
>>> I'm all for the modularity of the commands (I suggested it since
>>> the beginning),
>>> but all this infrastructure goes way beyond what I'd need for oVirt
>>> now.
>>>
>>> When I submitted my patches we knew that my work wasn't the
>>> definitive solution
>>> (eg: the future implementation of -blockdev). So I'd suggest to try
>>> and settle
>>> with something that is good enough and that is not preventing a
>>> future improvement.
>>>
>>> What about having a (temporary) flag in drive-mirror to accept also
>>> a new-image-file
>>> until we will have the optimal solution?
>>
>> Unless there are extenuating circumstances (like the absence of core
>> infrastructure in QEMU), then we should not add commands that we know
>> are not
>> the right command.
>
> So are you in favor or against my suggestion?

Against.  I think we have everything we need.

> It looks like this is exactly
> the case where the core infrastructure (transactions) is missing.

Batch requests are incredibly easy to add.  I'm stuck in meetings for the next 
couple days but I'm sure Luiz throw it together in no time at all.

Regards,

Anthony Liguori

>
Paolo Bonzini Feb. 27, 2012, 4:54 p.m. UTC | #31
On 02/27/2012 05:53 PM, Anthony Liguori wrote:
>> It looks like this is exactly
>> the case where the core infrastructure (transactions) is missing.
> 
> Batch requests are incredibly easy to add.  I'm stuck in meetings for
> the next couple days but I'm sure Luiz throw it together in no time at all.

Batch requests are unnecessary.  They should be a convenience, not a
correctness feature.

Paolo
Anthony Liguori Feb. 27, 2012, 4:58 p.m. UTC | #32
On 02/27/2012 10:51 AM, Paolo Bonzini wrote:
> On 02/27/2012 04:24 PM, Anthony Liguori wrote:
>>
>> Then you get an error with the block devices still frozen.  You can
>> execute another command to reopen back to the old image to roll back the
>> transaction.
>>
>> Pushing the rollback logic to the client does make the client interface
>> a bit more complicated and adds latency to the error path but it's much
>> easier than building a complex transaction infrastructure.
>>
>> And there are examples of this in the wild too.  LDAP uses a similar
>> mechanism.
>
> Actually, have you seen Jeff's atomic snapshot patch?  It really
> implements all that is needed to do transactions, and gets 100% ok
> error-recovery unlike the existing blockdev_snapshot_sync.  It really
> looks like we can do better than client-side error management, and there
> is not that much complexity at all.
>
> Jeff could rework his patches to work with transaction begin/commit
> commands, and Federico can then add drive-reopen and drive-migrate on top.

Yes, maybe I lack imagination but I fail to see how it generalizes easily/nicely.

 From what I can tell, all of the rollback logic is very specific to the 
commands being used, right?

Regards,

Anthony Liguori

>
> Paolo
Anthony Liguori Feb. 27, 2012, 4:59 p.m. UTC | #33
On 02/27/2012 10:54 AM, Paolo Bonzini wrote:
> On 02/27/2012 05:53 PM, Anthony Liguori wrote:
>>> It looks like this is exactly
>>> the case where the core infrastructure (transactions) is missing.
>>
>> Batch requests are incredibly easy to add.  I'm stuck in meetings for
>> the next couple days but I'm sure Luiz throw it together in no time at all.
>
> Batch requests are unnecessary.  They should be a convenience, not a
> correctness feature.

I think this discussion has gotten a bit unwieldy as I'm having trouble keeping 
up with the various proposals.  Can we move to something more formal and written 
on the wiki?

Can we start with a clear description of what the various requirements are?

Regards,

Anthony Liguroi

>
> Paolo
>
Paolo Bonzini Feb. 27, 2012, 5:06 p.m. UTC | #34
On 02/27/2012 05:58 PM, Anthony Liguori wrote:
>>
>> Jeff could rework his patches to work with transaction begin/commit
>> commands, and Federico can then add drive-reopen and drive-migrate on
>> top.
> 
> Yes, maybe I lack imagination but I fail to see how it generalizes
> easily/nicely.

> From what I can tell, all of the rollback logic is very specific to the
> commands being used, right?

The rollback logic is just "close the new devices".

The commit logic is specific to the commands being used, but reopen
should be easier than snapshot (and basically the same except for
backing_hd handling).

Migrate is really syntactic sugar around reopen, so no surprises there.

Paolo
Luiz Capitulino Feb. 27, 2012, 5:37 p.m. UTC | #35
On Mon, 27 Feb 2012 10:42:31 -0600
Anthony Liguori <anthony@codemonkey.ws> wrote:

> On 02/27/2012 10:33 AM, Federico Simoncelli wrote:
> > I'm all for the modularity of the commands (I suggested it since the beginning),
> > but all this infrastructure goes way beyond what I'd need for oVirt now.
> >
> > When I submitted my patches we knew that my work wasn't the definitive solution
> > (eg: the future implementation of -blockdev). So I'd suggest to try and settle
> > with something that is good enough and that is not preventing a future improvement.
> >
> > What about having a (temporary) flag in drive-mirror to accept also a new-image-file
> > until we will have the optimal solution?
> 
> Unless there are extenuating circumstances (like the absence of core 
> infrastructure in QEMU), then we should not add commands that we know are not 
> the right command.
> 
> We have to support our commands forever.

Agreed. Worst case we have vendor extension support that downstream can use to
add its own commands.
diff mbox

Patch

diff --git a/blockdev.c b/blockdev.c
index 2c132a3..f51bd6f 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -646,9 +646,8 @@  void do_commit(Monitor *mon, const QDict *qdict)
     }
 }
 
-void qmp_blockdev_snapshot_sync(const char *device, const char *snapshot_file,
-                                bool has_format, const char *format,
-                                Error **errp)
+static void change_blockdev_image(const char *device, const char *new_image_file,
+                                  const char *format, bool create, Error **errp)
 {
     BlockDriverState *bs;
     BlockDriver *drv, *old_drv, *proto_drv;
@@ -671,7 +670,7 @@  void qmp_blockdev_snapshot_sync(const char *device, const char *snapshot_file,
     old_drv = bs->drv;
     flags = bs->open_flags;
 
-    if (!has_format) {
+    if (!format) {
         format = "qcow2";
     }
 
@@ -681,24 +680,26 @@  void qmp_blockdev_snapshot_sync(const char *device, const char *snapshot_file,
         return;
     }
 
-    proto_drv = bdrv_find_protocol(snapshot_file);
+    proto_drv = bdrv_find_protocol(new_image_file);
     if (!proto_drv) {
         error_set(errp, QERR_INVALID_BLOCK_FORMAT, format);
         return;
     }
 
-    ret = bdrv_img_create(snapshot_file, format, bs->filename,
-                          bs->drv->format_name, NULL, -1, flags);
-    if (ret) {
-        error_set(errp, QERR_UNDEFINED_ERROR);
-        return;
+    if (create) {
+        ret = bdrv_img_create(new_image_file, format, bs->filename,
+                              bs->drv->format_name, NULL, -1, flags);
+        if (ret) {
+            error_set(errp, QERR_UNDEFINED_ERROR);
+            return;
+        }
     }
 
     bdrv_drain_all();
     bdrv_flush(bs);
 
     bdrv_close(bs);
-    ret = bdrv_open(bs, snapshot_file, flags, drv);
+    ret = bdrv_open(bs, new_image_file, flags, drv);
     /*
      * If reopening the image file we just created fails, fall back
      * and try to re-open the original image. If that fails too, we
@@ -709,11 +710,93 @@  void qmp_blockdev_snapshot_sync(const char *device, const char *snapshot_file,
         if (ret != 0) {
             error_set(errp, QERR_OPEN_FILE_FAILED, old_filename);
         } else {
-            error_set(errp, QERR_OPEN_FILE_FAILED, snapshot_file);
+            error_set(errp, QERR_OPEN_FILE_FAILED, new_image_file);
         }
     }
 }
 
+void qmp_drive_reopen(const char *device, const char *new_image_file,
+                         bool has_format, const char *format, Error **errp)
+{
+    change_blockdev_image(device, new_image_file,
+                          has_format ? format : NULL, false, errp);
+}
+
+void qmp_blockdev_snapshot_sync(const char *device, const char *snapshot_file,
+                                bool has_format, const char *format,
+                                Error **errp)
+{
+    change_blockdev_image(device, snapshot_file,
+                          has_format ? format : NULL, true, errp);
+}
+
+static void qmp_blockdev_mirror(const char *device, const char *destination,
+                                const char *new_image_file, Error **errp)
+{
+    BlockDriver *drv;
+    int i, j, escape;
+    char new_filename[2048], *filename;
+
+    drv = bdrv_find_format("blkmirror");
+    if (!drv) {
+        error_set(errp, QERR_INVALID_BLOCK_FORMAT, "blkmirror");
+        return;
+    }
+
+    escape = 0;
+    for (i = 0, j = 0; j < strlen(new_image_file); j++) {
+ loop:
+        if (!(i < sizeof(new_filename) - 2)) {
+            error_set(errp, QERR_INVALID_PARAMETER_VALUE,
+                      "new-image-file", "shorter filename");
+            return;
+        }
+
+        if (new_image_file[j] == ':' || new_image_file[j] == '\\') {
+            if (!escape) {
+                new_filename[i++] = '\\', escape = 1;
+                goto loop;
+            } else {
+                escape = 0;
+            }
+        }
+
+        new_filename[i++] = new_image_file[j];
+    }
+
+    filename = g_strdup_printf("blkmirror:%s:%s", new_filename, destination);
+    change_blockdev_image(device, filename, "blkmirror", false, errp);
+    g_free(filename);
+}
+
+void qmp_blockdev_migrate(const char *device, bool incremental,
+                          const char *destination, bool has_new_image_file,
+                          const char *new_image_file, Error **errp)
+{
+    BlockDriverState *bs;
+
+    bs = bdrv_find(device);
+    if (!bs) {
+        error_set(errp, QERR_DEVICE_NOT_FOUND, device);
+        return;
+    }
+    if (bdrv_in_use(bs)) {
+        error_set(errp, QERR_DEVICE_IN_USE, device);
+        return;
+    }
+
+    if (incremental) {
+        if (!has_new_image_file) {
+            error_set(errp, QERR_INVALID_PARAMETER_VALUE,
+                      "incremental", "a new image file");
+        } else {
+            qmp_blockdev_mirror(device, destination, new_image_file, errp);
+        }
+    } else {
+        error_set(errp, QERR_NOT_SUPPORTED);
+    }
+}
+
 static void eject_device(BlockDriverState *bs, int force, Error **errp)
 {
     if (bdrv_in_use(bs)) {
diff --git a/hmp-commands.hx b/hmp-commands.hx
index 573b823..4aacf2a 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -886,6 +886,44 @@  Snapshot device, using snapshot file as target if provided
 ETEXI
 
     {
+        .name       = "drive_reopen",
+        .args_type  = "device:B,new-image-file:s?,format:s?",
+        .params     = "device [new-image-file] [format]",
+        .help       = "Assigns a new image file to a device.\n\t\t\t"
+                      "The image will be opened using the format\n\t\t\t"
+                      "specified or 'qcow2' by default.\n\t\t\t"
+                      "This command is deprecated.",
+        .mhandler.cmd = hmp_drive_reopen,
+    },
+
+STEXI
+@item drive_reopen
+@findex drive_reopen
+Assigns a new image file to a device.
+ETEXI
+
+    {
+        .name       = "blkdev_migrate",
+        .args_type  = "incremental:-i,device:B,destination:s,new-image-file:s?",
+        .params     = "device mode destination [new-image-file]",
+        .help       = "Migrates the device to a new destination.\n\t\t\t"
+                      "If the incremental (-i) option is used then\n\t\t\t"
+                      "only the new writes are sent to the destination.\n\t\t\t"
+                      "This method is particularly useful if used in\n\t\t\t"
+                      "conjunction with new-image-file (a new image for\n\t\t\t"
+                      "the device) allowing the current image to be\n\t\t\t"
+                      "transferred to the destination by an external\n\t\t\t"
+                      "manager.",
+        .mhandler.cmd = hmp_blkdev_migrate,
+    },
+
+STEXI
+@item blkdev_migrate
+@findex blkdev_migrate
+Migrates the device to a new destination.
+ETEXI
+
+    {
         .name       = "drive_add",
         .args_type  = "pci_addr:s,opts:s",
         .params     = "[[<domain>:]<bus>:]<slot>\n"
diff --git a/hmp.c b/hmp.c
index 8ff8c94..282d3e5 100644
--- a/hmp.c
+++ b/hmp.c
@@ -701,6 +701,30 @@  void hmp_snapshot_blkdev(Monitor *mon, const QDict *qdict)
     hmp_handle_error(mon, &errp);
 }
 
+void hmp_drive_reopen(Monitor *mon, const QDict *qdict)
+{
+    const char *device = qdict_get_str(qdict, "device");
+    const char *filename = qdict_get_str(qdict, "new-image-file");
+    const char *format = qdict_get_try_str(qdict, "format");
+    Error *errp = NULL;
+
+    qmp_drive_reopen(device, filename, !!format, format, &errp);
+    hmp_handle_error(mon, &errp);
+}
+
+void hmp_blkdev_migrate(Monitor *mon, const QDict *qdict)
+{
+    bool incremental = qdict_get_try_bool(qdict, "incremental", 0);
+    const char *device = qdict_get_str(qdict, "device");
+    const char *destination = qdict_get_str(qdict, "destination");
+    const char *new_image_file = qdict_get_try_str(qdict, "new-image-file");
+    Error *errp = NULL;
+
+    qmp_blockdev_migrate(device, incremental, destination,
+                         !!new_image_file, new_image_file, &errp);
+    hmp_handle_error(mon, &errp);
+}
+
 void hmp_migrate_cancel(Monitor *mon, const QDict *qdict)
 {
     qmp_migrate_cancel(NULL);
diff --git a/hmp.h b/hmp.h
index 18eecbd..3c66257 100644
--- a/hmp.h
+++ b/hmp.h
@@ -47,6 +47,8 @@  void hmp_block_passwd(Monitor *mon, const QDict *qdict);
 void hmp_balloon(Monitor *mon, const QDict *qdict);
 void hmp_block_resize(Monitor *mon, const QDict *qdict);
 void hmp_snapshot_blkdev(Monitor *mon, const QDict *qdict);
+void hmp_drive_reopen(Monitor *mon, const QDict *qdict);
+void hmp_blkdev_migrate(Monitor *mon, const QDict *qdict);
 void hmp_migrate_cancel(Monitor *mon, const QDict *qdict);
 void hmp_migrate_set_downtime(Monitor *mon, const QDict *qdict);
 void hmp_migrate_set_speed(Monitor *mon, const QDict *qdict);
diff --git a/qapi-schema.json b/qapi-schema.json
index d02ee86..df1248c 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1136,6 +1136,60 @@ 
   'data': { 'device': 'str', 'snapshot-file': 'str', '*format': 'str' } }
 
 ##
+# @drive-reopen
+#
+# Assigns a new image file to a device.
+#
+# @device: the name of the device for which we are chainging the image file.
+#
+# @new-image-file: the target of the new image. If the file doesn't exists the
+#                  command will fail.
+#
+# @format: #optional the format of the new image, default is 'qcow2'.
+#
+# Returns: nothing on success
+#          If @device is not a valid block device, DeviceNotFound
+#          If @new-image-file can't be opened, OpenFileFailed
+#          If @format is invalid, InvalidBlockFormat
+#
+# Notes: This command is deprecated.
+#
+# Since 1.1
+##
+
+{ 'command': 'drive-reopen',
+  'data': { 'device': 'str', 'new-image-file': 'str', '*format': 'str' } }
+
+##
+# @blockdev-migrate
+#
+# Migrates the device to a new destination.
+#
+# @device: the name of the block device to migrate.
+#
+# @incremental: if true only the new writes are sent to the destination.
+#               This method is particularly useful if used in conjunction
+#               with new-image-file allowing the current image to be
+#               transferred to the destination by an external manager.
+#
+# @destination: the destination of the block migration.
+#
+# @new-image-file: #optional an existing image file that will replace
+#                  the current one in the device.
+#
+# Returns: nothing on success
+#          If @device is not a valid block device, DeviceNotFound
+#          If @mode is not a valid migration mode, InvalidParameterValue
+#          If @destination can't be opened, OpenFileFailed
+#          If @new-image-file can't be opened, OpenFileFailed
+#
+# Since 1.1
+##
+{ 'command': 'blockdev-migrate',
+  'data': { 'device': 'str', 'incremental' : 'bool',
+            'destination': 'str', '*new-image-file': 'str' } }
+
+##
 # @human-monitor-command:
 #
 # Execute a command on the human monitor and return the output.