From patchwork Wed Jul 27 13:44:48 2011
X-Patchwork-Submitter: Stefan Hajnoczi
X-Patchwork-Id: 107072
From: Stefan Hajnoczi
Date: Wed, 27 Jul 2011 14:44:48 +0100
Message-Id: <1311774295-8696-9-git-send-email-stefanha@linux.vnet.ibm.com>
In-Reply-To: <1311774295-8696-1-git-send-email-stefanha@linux.vnet.ibm.com>
References: <1311774295-8696-1-git-send-email-stefanha@linux.vnet.ibm.com>
Cc: Kevin Wolf, Anthony Liguori, Stefan Hajnoczi, Adam Litke
Subject: [Qemu-devel] [PATCH 08/15] qmp: add block_stream command

For leaf images with copy-on-read semantics, the stream command allows the user
to populate the image file by copying data from the backing file while the
guest is running. Once all blocks have been streamed, the dependency on the
original backing file is removed. Therefore, stream commands can be used to
implement post-copy live block migration and rapid deployment.

The command synopsis is:

block_stream
------------

Copy data from a backing file into a block device.

The block streaming operation is performed in the background until the entire
backing file has been copied. This command returns immediately once streaming
has started. The status of ongoing block streaming operations can be checked
with query-block-jobs. The operation can be stopped before it has completed
using the block_job_cancel command.

If a base file is specified then sectors are not copied from that base file and
its backing chain. When streaming completes the image file will have the base
file as its backing file. This can be used to stream a subset of the backing
file chain instead of flattening the entire image.

On successful completion the image file is updated to drop the backing file.

Arguments:

- device: device name (json-string)
- base: common backing file (json-string, optional)

Errors:

DeviceInUse: streaming is already active on this device
DeviceNotFound: device name is invalid
NotSupported: image streaming is not supported by this device

Events:

On completion the BLOCK_JOB_COMPLETED event is raised with the following
fields:

- type: job type ("stream" for image streaming, json-string)
- device: device name (json-string)
- end: maximum progress value (json-int)
- position: current progress value (json-int)
- speed: rate limit, bytes per second (json-int)
- error: error message (json-string, only on error)

The completion event is raised both on success and on failure. On
success position is equal to end. On failure position and end can be
used to indicate at which point the operation failed.

On failure the error field contains a human-readable error message. There are
no semantics other than that streaming has failed and clients should not try
to interpret the error string.

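To make the completion-event rules above concrete, here is a minimal
client-side sketch in C. It is illustrative only and not part of this patch:
the BlockJobCompleted struct and the stream_result() and stream_percent()
helpers are hypothetical names, and a real client would fill the struct from
the parsed QMP JSON. Field names follow the documentation above (position,
end); the event built in blockdev.c in this patch uses 'offset' and 'len', so
a client may need to map those instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical client-side view of a BLOCK_JOB_COMPLETED event.
 * Not a QEMU type; field names follow the command documentation. */
typedef struct {
    const char *type;     /* job type, "stream" for image streaming */
    const char *device;   /* device name */
    int64_t end;          /* maximum progress value */
    int64_t position;     /* current progress value */
    int64_t speed;        /* rate limit, bytes per second */
    const char *error;    /* NULL when the event carries no error field */
} BlockJobCompleted;

typedef enum {
    STREAM_OK,            /* no error field and position == end */
    STREAM_FAILED,        /* error field present */
    STREAM_INCONSISTENT   /* no error field but position != end */
} StreamResult;

/* Classify the event. Only the presence of the error field is checked;
 * the error string itself is never interpreted. */
StreamResult stream_result(const BlockJobCompleted *ev)
{
    if (ev->error != NULL) {
        return STREAM_FAILED;
    }
    return ev->position == ev->end ? STREAM_OK : STREAM_INCONSISTENT;
}

/* Approximate progress in percent, truncated toward zero.
 * Returns 0 when end is not positive, to avoid dividing by zero. */
int stream_percent(const BlockJobCompleted *ev)
{
    if (ev->end <= 0) {
        return 0;
    }
    return (int)((double)ev->position / (double)ev->end * 100.0);
}
```

On failure a client could log stream_percent() to show how far streaming got,
and pass the error string through to the user unmodified.
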
Examples:

-> { "execute": "block_stream", "arguments": { "device": "virtio0" } }
<- { "return": {} }

Signed-off-by: Adam Litke
Signed-off-by: Stefan Hajnoczi
---
 blockdev.c      |  133 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 blockdev.h      |    1 +
 hmp-commands.hx |   14 ++++++
 monitor.c       |    3 +
 monitor.h       |    1 +
 qerror.c        |    8 +++
 qerror.h        |    6 +++
 qmp-commands.hx |   64 ++++++++++++++++++++++++++
 8 files changed, 230 insertions(+), 0 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index b337732..cd5e49c 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -16,6 +16,7 @@
 #include "sysemu.h"
 #include "hw/qdev.h"
 #include "block_int.h"
+#include "qjson.h"
 
 static QTAILQ_HEAD(drivelist, DriveInfo) drives = QTAILQ_HEAD_INITIALIZER(drives);
 
@@ -50,6 +51,131 @@ static const int if_max_devs[IF_COUNT] = {
     [IF_SCSI] = 7,
 };
 
+typedef struct StreamState {
+    int64_t offset; /* current position in block device */
+    BlockDriverState *bs;
+    QEMUTimer *timer;
+    QLIST_ENTRY(StreamState) list;
+} StreamState;
+
+static QLIST_HEAD(, StreamState) block_streams =
+    QLIST_HEAD_INITIALIZER(block_streams);
+
+static QObject *stream_get_qobject(StreamState *s)
+{
+    const char *name = bdrv_get_device_name(s->bs);
+    int64_t len = bdrv_getlength(s->bs);
+
+    return qobject_from_jsonf("{ 'device': %s, 'type': 'stream', "
+                              "'offset': %" PRId64 ", 'len': %" PRId64 ", "
+                              "'speed': %" PRId64 " }",
+                              name, s->offset, len, (int64_t)0);
+}
+
+static void stream_mon_event(StreamState *s, int ret)
+{
+    QObject *data = stream_get_qobject(s);
+
+    if (ret < 0) {
+        QDict *qdict = qobject_to_qdict(data);
+
+        qdict_put(qdict, "error", qstring_from_str(strerror(-ret)));
+    }
+
+    monitor_protocol_event(QEVENT_BLOCK_JOB_COMPLETED, data);
+    qobject_decref(data);
+}
+
+static void stream_free(StreamState *s)
+{
+    QLIST_REMOVE(s, list);
+
+    qemu_del_timer(s->timer);
+    qemu_free_timer(s->timer);
+    qemu_free(s);
+}
+
+static void stream_complete(StreamState *s, int ret)
+{
+    stream_mon_event(s, ret);
+    stream_free(s);
+}
+
+static void stream_cb(void *opaque, int nb_sectors)
+{
+    StreamState *s = opaque;
+
+    if (nb_sectors < 0) {
+        stream_complete(s, nb_sectors);
+        return;
+    }
+
+    s->offset += nb_sectors * BDRV_SECTOR_SIZE;
+
+    if (s->offset == bdrv_getlength(s->bs)) {
+        bdrv_change_backing_file(s->bs, NULL, NULL);
+        stream_complete(s, 0);
+    } else {
+        qemu_mod_timer(s->timer, qemu_get_clock_ns(rt_clock));
+    }
+}
+
+/* We can't call bdrv_aio_stream() directly from the callback because that
+ * makes qemu_aio_flush() not complete until the streaming is completed.
+ * By delaying with a timer, we give qemu_aio_flush() a chance to complete.
+ */
+static void stream_next_iteration(void *opaque)
+{
+    StreamState *s = opaque;
+
+    bdrv_aio_copy_backing(s->bs, s->offset / BDRV_SECTOR_SIZE, stream_cb, s);
+}
+
+static StreamState *stream_find(const char *device)
+{
+    StreamState *s;
+
+    QLIST_FOREACH(s, &block_streams, list) {
+        if (strcmp(bdrv_get_device_name(s->bs), device) == 0) {
+            return s;
+        }
+    }
+    return NULL;
+}
+
+static StreamState *stream_start(const char *device)
+{
+    StreamState *s;
+    BlockDriverAIOCB *acb;
+    BlockDriverState *bs;
+
+    s = stream_find(device);
+    if (s) {
+        qerror_report(QERR_DEVICE_IN_USE, device);
+        return NULL;
+    }
+
+    bs = bdrv_find(device);
+    if (!bs) {
+        qerror_report(QERR_DEVICE_NOT_FOUND, device);
+        return NULL;
+    }
+
+    s = qemu_mallocz(sizeof(*s));
+    s->bs = bs;
+    s->timer = qemu_new_timer_ns(rt_clock, stream_next_iteration, s);
+    QLIST_INSERT_HEAD(&block_streams, s, list);
+
+    acb = bdrv_aio_copy_backing(s->bs, s->offset / BDRV_SECTOR_SIZE,
+                                stream_cb, s);
+    if (acb == NULL) {
+        stream_free(s);
+        qerror_report(QERR_NOT_SUPPORTED);
+        return NULL;
+    }
+    return s;
+}
+
 /*
  * We automatically delete the drive when a device using it gets
  * unplugged. Questionable feature, but we can't just drop it.
@@ -650,6 +776,13 @@ out:
     return ret;
 }
 
+int do_block_stream(Monitor *mon, const QDict *params, QObject **ret_data)
+{
+    const char *device = qdict_get_str(params, "device");
+
+    return stream_start(device) ? 0 : -1;
+}
+
 static int eject_device(Monitor *mon, BlockDriverState *bs, int force)
 {
     if (!force) {
diff --git a/blockdev.h b/blockdev.h
index 3587786..f475aa8 100644
--- a/blockdev.h
+++ b/blockdev.h
@@ -65,5 +65,6 @@ int do_change_block(Monitor *mon, const char *device,
 int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data);
 int do_snapshot_blkdev(Monitor *mon, const QDict *qdict, QObject **ret_data);
 int do_block_resize(Monitor *mon, const QDict *qdict, QObject **ret_data);
+int do_block_stream(Monitor *mon, const QDict *params, QObject **ret_data);
 
 #endif
diff --git a/hmp-commands.hx b/hmp-commands.hx
index cbaa9a0..9bf1025 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -38,6 +38,20 @@ Commit changes to the disk images (if -snapshot is used) or backing files.
 ETEXI
 
     {
+        .name = "block_stream",
+        .args_type = "device:B",
+        .params = "device",
+        .help = "Copy data from a backing file into a block device",
+        .mhandler.cmd_new = do_block_stream,
+    },
+
+STEXI
+@item block_stream
+@findex block_stream
+Copy data from a backing file into a block device.
+ETEXI
+
+    {
         .name = "q|quit",
         .args_type = "",
         .params = "",
diff --git a/monitor.c b/monitor.c
index 718935b..700b534 100644
--- a/monitor.c
+++ b/monitor.c
@@ -468,6 +468,9 @@ void monitor_protocol_event(MonitorEvent event, QObject *data)
         case QEVENT_SPICE_DISCONNECTED:
             event_name = "SPICE_DISCONNECTED";
             break;
+        case QEVENT_BLOCK_JOB_COMPLETED:
+            event_name = "BLOCK_JOB_COMPLETED";
+            break;
         default:
             abort();
             break;
diff --git a/monitor.h b/monitor.h
index 4f2d328..135c927 100644
--- a/monitor.h
+++ b/monitor.h
@@ -35,6 +35,7 @@ typedef enum MonitorEvent {
     QEVENT_SPICE_CONNECTED,
     QEVENT_SPICE_INITIALIZED,
     QEVENT_SPICE_DISCONNECTED,
+    QEVENT_BLOCK_JOB_COMPLETED,
     QEVENT_MAX,
 } MonitorEvent;
 
diff --git a/qerror.c b/qerror.c
index 69c1bc9..c5bd197 100644
--- a/qerror.c
+++ b/qerror.c
@@ -162,6 +162,10 @@ static const QErrorStringTable qerror_table[] = {
         .desc = "No '%(bus)' bus found for device '%(device)'",
     },
     {
+        .error_fmt = QERR_NOT_SUPPORTED,
+        .desc = "Operation is not supported",
+    },
+    {
         .error_fmt = QERR_OPEN_FILE_FAILED,
         .desc = "Could not open '%(filename)'",
     },
@@ -230,6 +234,10 @@ static const QErrorStringTable qerror_table[] = {
         .error_fmt = QERR_QGA_COMMAND_FAILED,
         .desc = "Guest agent command failed, error was '%(message)'",
     },
+    {
+        .error_fmt = QERR_STREAMING_ERROR,
+        .desc = "An error occurred during streaming: %(msg)",
+    },
     {}
 };
 
diff --git a/qerror.h b/qerror.h
index 8058456..ffe3190 100644
--- a/qerror.h
+++ b/qerror.h
@@ -139,6 +139,9 @@ QError *qobject_to_qerror(const QObject *obj);
 #define QERR_NO_BUS_FOR_DEVICE \
     "{ 'class': 'NoBusForDevice', 'data': { 'device': %s, 'bus': %s } }"
 
+#define QERR_NOT_SUPPORTED \
+    "{ 'class': 'NotSupported', 'data': {} }"
+
 #define QERR_OPEN_FILE_FAILED \
     "{ 'class': 'OpenFileFailed', 'data': { 'filename': %s } }"
 
@@ -193,4 +196,7 @@ QError *qobject_to_qerror(const QObject *obj);
 #define QERR_QGA_COMMAND_FAILED \
     "{ 'class': 'QgaCommandFailed', 'data': { 'message': %s } }"
 
+#define QERR_STREAMING_ERROR \
+    "{ 'class': 'StreamingError', 'data': { 'msg': %s } }"
+
 #endif /* QERROR_H */
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 54e313c..80402c7 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -945,6 +945,70 @@ Example:
 <- { "return": {} }
 
 EQMP
+
+    {
+        .name = "block_stream",
+        .args_type = "device:B",
+        .params = "device",
+        .help = "Copy data from a backing file into a block device",
+        .mhandler.cmd_new = do_block_stream,
+    },
+
+SQMP
+
+Copy data from a backing file into a block device.
+
+The block streaming operation is performed in the background until the entire
+backing file has been copied. This command returns immediately once streaming
+has started. The status of ongoing block streaming operations can be checked
+with query-block-jobs. The operation can be stopped before it has completed
+using the block_job_cancel command.
+
+If a base file is specified then sectors are not copied from that base file and
+its backing chain. When streaming completes the image file will have the base
+file as its backing file. This can be used to stream a subset of the backing
+file chain instead of flattening the entire image.
+
+On successful completion the image file is updated to drop the backing file.
+
+Arguments:
+
+- device: device name (json-string)
+- base: common backing file (json-string, optional)
+
+Errors:
+
+DeviceInUse: streaming is already active on this device
+DeviceNotFound: device name is invalid
+NotSupported: image streaming is not supported by this device
+
+Events:
+
+On completion the BLOCK_JOB_COMPLETED event is raised with the following
+fields:
+
+- type: job type ("stream" for image streaming, json-string)
+- device: device name (json-string)
+- end: maximum progress value (json-int)
+- position: current progress value (json-int)
+- speed: rate limit, bytes per second (json-int)
+- error: error message (json-string, only on error)
+
+The completion event is raised both on success and on failure. On
+success position is equal to end. On failure position and end can be
+used to indicate at which point the operation failed.
+
+On failure the error field contains a human-readable error message. There are
+no semantics other than that streaming has failed and clients should not try
+to interpret the error string.
+
+Examples:
+
+-> { "execute": "block_stream", "arguments": { "device": "virtio0" } }
+<- { "return": {} }
+
+EQMP
+
     {
         .name = "qmp_capabilities",
         .args_type = "",