From patchwork Tue Jul 24 11:04:25 2012
X-Patchwork-Submitter: Paolo Bonzini
X-Patchwork-Id: 172852
From: Paolo Bonzini
To: qemu-devel@nongnu.org
Cc: kwolf@redhat.com, jcody@redhat.com, eblake@redhat.com, stefanha@linux.vnet.ibm.com
Date: Tue, 24 Jul 2012 13:04:25 +0200
Message-Id: <1343127865-16608-48-git-send-email-pbonzini@redhat.com>
In-Reply-To: <1343127865-16608-1-git-send-email-pbonzini@redhat.com>
References: <1343127865-16608-1-git-send-email-pbonzini@redhat.com>
Subject: [Qemu-devel] [PATCH 47/47] mirror: support arbitrarily-sized iterations

Yet another optimization is to extend the mirroring iteration to include
more adjacent dirty blocks.  This limits the number of I/O operations and
makes mirroring efficient even with a small granularity.

Most of the infrastructure is already in place; we only need to put a
loop around the computation of the origin and sector count of the
iteration.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/mirror.c | 100 ++++++++++++++++++++++++++++++++++++++------------------
 trace-events   |   1 +
 2 files changed, 69 insertions(+), 32 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 93e718f..87d97eb 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -127,7 +127,7 @@ static void coroutine_fn mirror_iteration(MirrorBlockJob *s)
 {
     BlockDriverState *source = s->common.bs;
     int nb_sectors, nb_sectors_chunk, nb_chunks;
-    int64_t end, sector_num, cluster_num, next_sector, hbitmap_next_sector;
+    int64_t end, sector_num, next_cluster, next_sector, hbitmap_next_sector;
     MirrorOp *op;
 
     s->sector_num = hbitmap_iter_next(&s->hbi);
@@ -139,47 +139,83 @@ static void coroutine_fn mirror_iteration(MirrorBlockJob *s)
     }
 
     hbitmap_next_sector = s->sector_num;
+    sector_num = s->sector_num;
+    nb_sectors_chunk = s->granularity >> BDRV_SECTOR_BITS;
+    end = s->common.len >> BDRV_SECTOR_BITS;
 
-    /* If we have no backing file yet in the destination, and the cluster size
-     * is very large, we need to do COW ourselves.  The first time a cluster is
-     * copied, copy it entirely.
+    /* Extend the QEMUIOVector to include all adjacent blocks that will
+     * be copied in this operation.
+     *
+     * We have to do this if we have no backing file yet in the destination,
+     * and the cluster size is very large.  Then we need to do COW ourselves.
+     * The first time a cluster is copied, copy it entirely.  Note that,
+     * because both the granularity and the cluster size are powers of two,
+     * the number of sectors to copy cannot exceed one cluster.
      *
-     * Because both the granularity and the cluster size are powers of two, the
-     * number of sectors to copy cannot exceed one cluster.
+     * We also want to extend the QEMUIOVector to include more adjacent
+     * dirty blocks if possible, to limit the number of I/O operations and
+     * run efficiently even with a small granularity.
      */
-    sector_num = s->sector_num;
-    nb_sectors_chunk = nb_sectors = s->granularity >> BDRV_SECTOR_BITS;
-    cluster_num = sector_num / nb_sectors_chunk;
-    if (s->cow_bitmap && !test_bit(cluster_num, s->cow_bitmap)) {
-        trace_mirror_cow(s, sector_num);
-        bdrv_round_to_clusters(s->target,
-                               sector_num, nb_sectors_chunk,
-                               &sector_num, &nb_sectors);
-
-        /* The rounding may make us copy sectors before the
-         * first dirty one.
-         */
-        cluster_num = sector_num / nb_sectors_chunk;
-    }
+    nb_chunks = 0;
+    nb_sectors = 0;
+    next_sector = sector_num;
+    next_cluster = sector_num / nb_sectors_chunk;
 
     /* Wait for I/O to this cluster (from a previous iteration) to be done. */
-    while (test_bit(cluster_num, s->in_flight_bitmap)) {
+    while (test_bit(next_cluster, s->in_flight_bitmap)) {
         trace_mirror_yield_in_flight(s, sector_num, s->in_flight);
         qemu_coroutine_yield();
     }
 
-    end = s->common.len >> BDRV_SECTOR_BITS;
-    nb_sectors = MIN(nb_sectors, end - sector_num);
-    nb_chunks = (nb_sectors + nb_sectors_chunk - 1) / nb_sectors_chunk;
-    while (s->buf_free_count < nb_chunks) {
-        trace_mirror_yield_buf_busy(s, nb_chunks, s->in_flight);
-        qemu_coroutine_yield();
-    }
+    do {
+        int added_sectors, added_chunks;
 
-    /* We have enough free space to copy these sectors. */
-    if (s->cow_bitmap) {
-        bitmap_set(s->cow_bitmap, cluster_num, nb_chunks);
-    }
+        if (!bdrv_get_dirty(source, next_sector) ||
+            test_bit(next_cluster, s->in_flight_bitmap)) {
+            assert(nb_sectors > 0);
+            break;
+        }
+
+        added_sectors = nb_sectors_chunk;
+        if (s->cow_bitmap && !test_bit(next_cluster, s->cow_bitmap)) {
+            bdrv_round_to_clusters(s->target,
+                                   next_sector, added_sectors,
+                                   &next_sector, &added_sectors);
+
+            /* On the first iteration, the rounding may make us copy
+             * sectors before the first dirty one.
+             */
+            if (next_sector < sector_num) {
+                assert(nb_sectors == 0);
+                sector_num = next_sector;
+                next_cluster = next_sector / nb_sectors_chunk;
+            }
+        }
+
+        added_sectors = MIN(added_sectors, end - (sector_num + nb_sectors));
+        added_chunks = (added_sectors + nb_sectors_chunk - 1) / nb_sectors_chunk;
+
+        /* When doing COW, it may happen that there are not enough free
+         * buffers to copy a full cluster.  Wait if that is the case.
+         */
+        while (nb_chunks == 0 && s->buf_free_count < added_chunks) {
+            trace_mirror_yield_buf_busy(s, nb_chunks, s->in_flight);
+            qemu_coroutine_yield();
+        }
+        if (s->buf_free_count < nb_chunks + added_chunks) {
+            trace_mirror_break_buf_busy(s, nb_chunks, s->in_flight);
+            break;
+        }
+
+        /* We have enough free space to copy these sectors. */
+        if (s->cow_bitmap) {
+            bitmap_set(s->cow_bitmap, next_cluster, added_chunks);
+        }
+        nb_sectors += added_sectors;
+        nb_chunks += added_chunks;
+        next_sector += added_sectors;
+        next_cluster += added_chunks;
+    } while (next_sector < end);
 
     /* Allocate a MirrorOp that is used as an AIO callback. */
     op = g_slice_new(MirrorOp);
diff --git a/trace-events b/trace-events
index 7ae11e9..cd387fa 100644
--- a/trace-events
+++ b/trace-events
@@ -87,6 +87,7 @@ mirror_iteration_done(void *s, int64_t sector_num, int nb_sectors) "s %p sector_
 mirror_yield(void *s, int64_t cnt, int buf_free_count, int in_flight) "s %p dirty count %"PRId64" free buffers %d in_flight %d"
 mirror_yield_in_flight(void *s, int64_t sector_num, int in_flight) "s %p sector_num %"PRId64" in_flight %d"
 mirror_yield_buf_busy(void *s, int nb_chunks, int in_flight) "s %p requested chunks %d in_flight %d"
+mirror_break_buf_busy(void *s, int nb_chunks, int in_flight) "s %p requested chunks %d in_flight %d"
 
 # blockdev.c
 qmp_block_job_cancel(void *job) "job %p"