From patchwork Tue Jul 24 11:04:21 2012 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 172875 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (Client did not present a certificate) by ozlabs.org (Postfix) with ESMTPS id 9D4B42C0090 for ; Tue, 24 Jul 2012 22:43:30 +1000 (EST) Received: from localhost ([::1]:45185 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SteSW-0000zq-Pp for incoming@patchwork.ozlabs.org; Tue, 24 Jul 2012 08:43:28 -0400 Received: from eggs.gnu.org ([208.118.235.92]:57100) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Stcx3-0000H8-Qa for qemu-devel@nongnu.org; Tue, 24 Jul 2012 07:07:00 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Stcwx-0005w3-5e for qemu-devel@nongnu.org; Tue, 24 Jul 2012 07:06:53 -0400 Received: from mail-gg0-f173.google.com ([209.85.161.173]:58807) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Stcwx-0005ql-0X for qemu-devel@nongnu.org; Tue, 24 Jul 2012 07:06:47 -0400 Received: by mail-gg0-f173.google.com with SMTP id p1so6580043ggn.4 for ; Tue, 24 Jul 2012 04:06:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:from:to:cc:subject:date:message-id:x-mailer:in-reply-to :references; bh=CNFFo2YTXPoXoEr1jIDdn6XsNSB2c71K02PX1ONEVqM=; b=redOhZrDZ3Wzs+czGw1do5U0mBSNX9hVydxZC92wEM9yTayWS3BMLiJrrBCZtsBsu4 tWufRlNqomtbepDXqDZG5ZIk59yY3tMQzMyZRV0F8h6MsTb+cqHPH7VbNzjJjfCuqLhh wFuvw+AT8cR08K1xbQ57fSgg00iegc3oeTVZetI9ZeUHy4hzJM5GH08+lvnhJOpU+6zb j904HxFaAozSQ3dXWWMp23iKLP8gghCrF0OGloK/q0txG9l02m9oMoNbHmwkbo1vPiSI WMkRasmUXzAXaimBCEQ2nniAzjxEKr9tp6Rr9KDAZHv30t0ev2Uta6OMtWherBJUZgGn I9NA== Received: by 10.43.134.134 with SMTP id ic6mr14619149icc.26.1343128006640; Tue, 24 Jul 2012 04:06:46 -0700 (PDT) Received: from yakj.usersys.redhat.com (93-34-189-113.ip51.fastwebnet.it. [93.34.189.113]) by mx.google.com with ESMTPS id if4sm1752561igc.10.2012.07.24.04.06.43 (version=TLSv1/SSLv3 cipher=OTHER); Tue, 24 Jul 2012 04:06:45 -0700 (PDT) From: Paolo Bonzini To: qemu-devel@nongnu.org Date: Tue, 24 Jul 2012 13:04:21 +0200 Message-Id: <1343127865-16608-44-git-send-email-pbonzini@redhat.com> X-Mailer: git-send-email 1.7.10.4 In-Reply-To: <1343127865-16608-1-git-send-email-pbonzini@redhat.com> References: <1343127865-16608-1-git-send-email-pbonzini@redhat.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 209.85.161.173 Cc: kwolf@redhat.com, jcody@redhat.com, eblake@redhat.com, stefanha@linux.vnet.ibm.com Subject: [Qemu-devel] [PATCH 43/47] mirror: allow customizing the granularity X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org The desired granularity may be very different depending on the kind of operation (e.g. continous replication vs. collapse-to-raw) and whether the VM is expected to perform lots of I/O while mirroring is in progress. Allow the user to customize it. Signed-off-by: Paolo Bonzini --- block/mirror.c | 38 ++++++++++++++++++++------------------ block_int.h | 3 ++- blockdev.c | 15 ++++++++++++++- hmp.c | 2 +- qapi-schema.json | 7 ++++++- qmp-commands.hx | 5 ++++- 6 files changed, 47 insertions(+), 23 deletions(-) diff --git a/block/mirror.c b/block/mirror.c index 48ee963..81a600b 100644 --- a/block/mirror.c +++ b/block/mirror.c @@ -17,9 +17,6 @@ #include "qemu/ratelimit.h" #include "bitmap.h" -#define BLOCK_SIZE (1 << 20) -#define BDRV_SECTORS_PER_DIRTY_CHUNK (BLOCK_SIZE >> BDRV_SECTOR_BITS) - #define SLICE_TIME 100000000ULL /* ns */ typedef struct MirrorBlockJob { @@ -31,6 +28,7 @@ typedef struct MirrorBlockJob { bool synced; bool complete; int64_t sector_num; + int64_t granularity; size_t buf_size; unsigned long *cow_bitmap; HBitmapIter hbi; @@ -43,7 +41,7 @@ static int coroutine_fn mirror_iteration(MirrorBlockJob *s, BlockDriverState *source = s->common.bs; BlockDriverState *target = s->target; QEMUIOVector qiov; - int ret, nb_sectors; + int ret, nb_sectors, nb_sectors_chunk; int64_t end, sector_num, cluster_num; struct iovec iov; @@ -59,19 +57,19 @@ static int coroutine_fn mirror_iteration(MirrorBlockJob *s, * is very large, we need to do COW ourselves. The first time a cluster is * copied, copy it entirely. * - * Because both BDRV_SECTORS_PER_DIRTY_CHUNK and the cluster size are - * powers of two, the number of sectors to copy cannot exceed one cluster. + * Because both the granularity and the cluster size are powers of two, the + * number of sectors to copy cannot exceed one cluster. */ sector_num = s->sector_num; - nb_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK; - cluster_num = sector_num / BDRV_SECTORS_PER_DIRTY_CHUNK; + nb_sectors_chunk = nb_sectors = s->granularity >> BDRV_SECTOR_BITS; + cluster_num = sector_num / nb_sectors_chunk; if (s->cow_bitmap && !test_bit(cluster_num, s->cow_bitmap)) { trace_mirror_cow(s, sector_num); bdrv_round_to_clusters(s->target, - sector_num, BDRV_SECTORS_PER_DIRTY_CHUNK, + sector_num, nb_sectors_chunk, §or_num, &nb_sectors); - bitmap_set(s->cow_bitmap, sector_num / BDRV_SECTORS_PER_DIRTY_CHUNK, - nb_sectors / BDRV_SECTORS_PER_DIRTY_CHUNK); + bitmap_set(s->cow_bitmap, sector_num / nb_sectors_chunk, + nb_sectors / nb_sectors_chunk); } end = s->common.len >> BDRV_SECTOR_BITS; @@ -110,7 +108,7 @@ static void coroutine_fn mirror_run(void *opaque) { MirrorBlockJob *s = opaque; BlockDriverState *bs = s->common.bs; - int64_t sector_num, end, length; + int64_t sector_num, end, nb_sectors_chunk, length; BlockDriverInfo bdi; char backing_filename[1024]; int ret = 0; @@ -136,20 +134,21 @@ static void coroutine_fn mirror_run(void *opaque) bdrv_get_info(s->target, &bdi); if (s->buf_size < bdi.cluster_size) { s->buf_size = bdi.cluster_size; - length = (bdrv_getlength(bs) + BLOCK_SIZE - 1) / BLOCK_SIZE; + length = (bdrv_getlength(bs) + s->granularity - 1) / s->granularity; s->cow_bitmap = bitmap_new(length); } } end = s->common.len >> BDRV_SECTOR_BITS; s->buf = qemu_blockalign(bs, s->buf_size); + nb_sectors_chunk = s->granularity >> BDRV_SECTOR_BITS; if (s->mode == MIRROR_SYNC_MODE_FULL || s->mode == MIRROR_SYNC_MODE_TOP) { /* First part, loop on the sectors and initialize the dirty bitmap. */ BlockDriverState *base; base = s->mode == MIRROR_SYNC_MODE_FULL ? NULL : bs->backing_hd; for (sector_num = 0; sector_num < end; ) { - int64_t next = (sector_num | (BDRV_SECTORS_PER_DIRTY_CHUNK - 1)) + 1; + int64_t next = (sector_num | (nb_sectors_chunk - 1)) + 1; ret = bdrv_co_is_allocated_above(bs, base, sector_num, next - sector_num, &n); @@ -224,7 +223,7 @@ static void coroutine_fn mirror_run(void *opaque) s->common.offset = (end - cnt) * BDRV_SECTOR_SIZE; if (s->common.speed) { - delay_ns = ratelimit_calculate_delay(&s->limit, BDRV_SECTORS_PER_DIRTY_CHUNK); + delay_ns = ratelimit_calculate_delay(&s->limit, nb_sectors_chunk); } else { delay_ns = 0; } @@ -323,7 +322,7 @@ static BlockJobType mirror_job_type = { }; void mirror_start(BlockDriverState *bs, BlockDriverState *target, - int64_t speed, MirrorSyncMode mode, + int64_t speed, int64_t granularity, MirrorSyncMode mode, BlockdevOnError on_source_error, BlockdevOnError on_target_error, BlockDriverCompletionFunc *cb, @@ -331,6 +330,8 @@ void mirror_start(BlockDriverState *bs, BlockDriverState *target, { MirrorBlockJob *s; + assert((granularity & (granularity - 1)) == 0); + if ((on_source_error == BLOCKDEV_ON_ERROR_STOP || on_source_error == BLOCKDEV_ON_ERROR_ENOSPC) && !bdrv_iostatus_is_enabled(bs)) { @@ -347,9 +348,10 @@ void mirror_start(BlockDriverState *bs, BlockDriverState *target, s->on_target_error = on_target_error; s->target = target; s->mode = mode; - s->buf_size = BLOCK_SIZE; + s->granularity = granularity; + s->buf_size = granularity; - bdrv_set_dirty_tracking(bs, BDRV_SECTORS_PER_DIRTY_CHUNK); + bdrv_set_dirty_tracking(bs, granularity >> BDRV_SECTOR_BITS); bdrv_set_on_error(s->target, on_target_error, on_target_error); bdrv_iostatus_enable(s->target); s->common.co = qemu_coroutine_create(mirror_run); diff --git a/block_int.h b/block_int.h index 4cb3c5b..69bb769 100644 --- a/block_int.h +++ b/block_int.h @@ -310,6 +310,7 @@ void stream_start(BlockDriverState *bs, BlockDriverState *base, * @bs: Block device to operate on. * @target: Block device to write to. * @speed: The maximum speed, in bytes per second, or 0 for unlimited. + * @granularity: The chosen granularity for the dirty bitmap. * @mode: Whether to collapse all images in the chain to the target. * @on_source_error: The action to take upon error reading from the source. * @on_target_error: The action to take upon error writing to the target. @@ -323,7 +324,7 @@ void stream_start(BlockDriverState *bs, BlockDriverState *base, * @bs will be switched to read from @target. */ void mirror_start(BlockDriverState *bs, BlockDriverState *target, - int64_t speed, MirrorSyncMode mode, + int64_t speed, int64_t granularity, MirrorSyncMode mode, BlockdevOnError on_source_error, BlockdevOnError on_target_error, BlockDriverCompletionFunc *cb, diff --git a/blockdev.c b/blockdev.c index e160610..13b6217 100644 --- a/blockdev.c +++ b/blockdev.c @@ -832,6 +832,7 @@ void qmp_drive_mirror(const char *device, const char *target, enum MirrorSyncMode sync, bool has_mode, enum NewImageMode mode, bool has_speed, int64_t speed, + bool has_granularity, int64_t granularity, bool has_on_source_error, BlockdevOnError on_source_error, bool has_on_target_error, BlockdevOnError on_target_error, Error **errp) @@ -857,6 +858,17 @@ void qmp_drive_mirror(const char *device, const char *target, if (!has_mode) { mode = NEW_IMAGE_MODE_ABSOLUTE_PATHS; } + if (!has_granularity) { + granularity = 65536; + } + if (granularity < 512 || granularity > 1048576 * 64) { + error_set(errp, QERR_INVALID_PARAMETER, device); + return; + } + if (granularity & (granularity - 1)) { + error_set(errp, QERR_INVALID_PARAMETER, device); + return; + } bs = bdrv_find(device); if (!bs) { @@ -936,7 +948,8 @@ void qmp_drive_mirror(const char *device, const char *target, return; } - mirror_start(bs, target_bs, speed, sync, on_source_error, on_target_error, + mirror_start(bs, target_bs, speed, granularity, sync, + on_source_error, on_target_error, block_job_cb, bs, &local_err); if (local_err != NULL) { bdrv_delete(target_bs); diff --git a/hmp.c b/hmp.c index b6bc263..4f95096 100644 --- a/hmp.c +++ b/hmp.c @@ -719,7 +719,7 @@ void hmp_drive_mirror(Monitor *mon, const QDict *qdict) qmp_drive_mirror(device, filename, !!format, format, full ? MIRROR_SYNC_MODE_FULL : MIRROR_SYNC_MODE_TOP, - true, mode, false, 0, + true, mode, false, 0, false, 0, false, 0, false, 0, &errp); hmp_handle_error(mon, &errp); } diff --git a/qapi-schema.json b/qapi-schema.json index 7a97fad..fb0ccc7 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -1389,6 +1389,9 @@ # (all the disk, only the sectors allocated in the topmost image, or # only new I/O). # +# @granularity: #optional granularity of the dirty bitmap, default is 64K. +# Must be a power of 2 between 512 and 64M. +# # @on-source-error: #optional the action to take on an error on the source, # default 'report'. 'stop' and 'enospc' can only be used # if the block device supports io-status (see BlockInfo). @@ -1401,6 +1404,7 @@ # If @device is not a valid block device, DeviceNotFound # If @target can't be opened, OpenFileFailed # If @format is invalid, InvalidBlockFormat +# If @granularity is not a power of 2, InvalidParameter # If @on_source_error is not supported, InvalidParameter # # Since 1.2 @@ -1408,7 +1412,8 @@ { 'command': 'drive-mirror', 'data': { 'device': 'str', 'target': 'str', '*format': 'str', 'sync': 'MirrorSyncMode', '*mode': 'NewImageMode', - '*speed': 'int', '*on-source-error': 'BlockdevOnError', + '*speed': 'int', '*granularity': 'int', + '*on-source-error': 'BlockdevOnError', '*on-target-error': 'BlockdevOnError' } } ## diff --git a/qmp-commands.hx b/qmp-commands.hx index 5081b01..89e12f5 100644 --- a/qmp-commands.hx +++ b/qmp-commands.hx @@ -839,7 +839,8 @@ EQMP { .name = "drive-mirror", .args_type = "sync:s,device:B,target:s,speed:i?,mode:s?,format:s?," - "on-source-error:s?,on-target-error:s?", + "on-source-error:s?,on-target-error:s?," + "granularity:i?", .mhandler.cmd_new = qmp_marshal_input_drive_mirror, }, @@ -863,6 +864,8 @@ Arguments: file/device (NewImageMode, optional, default 'absolute-paths') - "speed": maximum speed of the streaming job, in bytes per second (json-int) +- "granularity": granularity of the dirty bitmap (json-int, default 64k, + must be a power of two between 512 and 64M. - "sync": what parts of the disk image should be copied to the destination; possibilities include "full" for all the disk, "top" for only the sectors allocated in the topmost image, or "none" to only replicate new I/O