From patchwork Tue Jul 24 11:04:25 2012
X-Patchwork-Submitter: Paolo Bonzini
X-Patchwork-Id: 172852
From: Paolo Bonzini
To: qemu-devel@nongnu.org
Cc: kwolf@redhat.com, jcody@redhat.com, eblake@redhat.com, stefanha@linux.vnet.ibm.com
Date: Tue, 24 Jul 2012 13:04:25 +0200
Message-Id: <1343127865-16608-48-git-send-email-pbonzini@redhat.com>
In-Reply-To: <1343127865-16608-1-git-send-email-pbonzini@redhat.com>
References: <1343127865-16608-1-git-send-email-pbonzini@redhat.com>
Subject: [Qemu-devel] [PATCH 47/47] mirror: support arbitrarily-sized iterations

Yet another optimization is to extend the mirroring iteration to include
more adjacent dirty blocks.  This limits the number of I/O operations and
makes mirroring efficient even with a small granularity.

Most of the infrastructure is already in place; we only need to put a
loop around the computation of the origin and sector count of the
iteration.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/mirror.c | 100 ++++++++++++++++++++++++++++++++++++++------------------
 trace-events   |   1 +
 2 files changed, 69 insertions(+), 32 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 93e718f..87d97eb 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -127,7 +127,7 @@ static void coroutine_fn mirror_iteration(MirrorBlockJob *s)
 {
     BlockDriverState *source = s->common.bs;
     int nb_sectors, nb_sectors_chunk, nb_chunks;
-    int64_t end, sector_num, cluster_num, next_sector, hbitmap_next_sector;
+    int64_t end, sector_num, next_cluster, next_sector, hbitmap_next_sector;
     MirrorOp *op;
 
     s->sector_num = hbitmap_iter_next(&s->hbi);
@@ -139,47 +139,83 @@ static void coroutine_fn mirror_iteration(MirrorBlockJob *s)
     }
 
     hbitmap_next_sector = s->sector_num;
+    sector_num = s->sector_num;
+    nb_sectors_chunk = s->granularity >> BDRV_SECTOR_BITS;
+    end = s->common.len >> BDRV_SECTOR_BITS;
 
-    /* If we have no backing file yet in the destination, and the cluster size
-     * is very large, we need to do COW ourselves.  The first time a cluster is
-     * copied, copy it entirely.
+    /* Extend the QEMUIOVector to include all adjacent blocks that will
+     * be copied in this operation.
+     *
+     * We have to do this if we have no backing file yet in the destination,
+     * and the cluster size is very large.  Then we need to do COW ourselves.
+     * The first time a cluster is copied, copy it entirely.  Note that,
+     * because both the granularity and the cluster size are powers of two,
+     * the number of sectors to copy cannot exceed one cluster.
      *
-     * Because both the granularity and the cluster size are powers of two, the
-     * number of sectors to copy cannot exceed one cluster.
+     * We also want to extend the QEMUIOVector to include more adjacent
+     * dirty blocks if possible, to limit the number of I/O operations and
+     * run efficiently even with a small granularity.
      */
-    sector_num = s->sector_num;
-    nb_sectors_chunk = nb_sectors = s->granularity >> BDRV_SECTOR_BITS;
-    cluster_num = sector_num / nb_sectors_chunk;
-    if (s->cow_bitmap && !test_bit(cluster_num, s->cow_bitmap)) {
-        trace_mirror_cow(s, sector_num);
-        bdrv_round_to_clusters(s->target,
-                               sector_num, nb_sectors_chunk,
-                               &sector_num, &nb_sectors);
-
-        /* The rounding may make us copy sectors before the
-         * first dirty one.
-         */
-        cluster_num = sector_num / nb_sectors_chunk;
-    }
+    nb_chunks = 0;
+    nb_sectors = 0;
+    next_sector = sector_num;
+    next_cluster = sector_num / nb_sectors_chunk;
 
     /* Wait for I/O to this cluster (from a previous iteration) to be done. */
-    while (test_bit(cluster_num, s->in_flight_bitmap)) {
+    while (test_bit(next_cluster, s->in_flight_bitmap)) {
         trace_mirror_yield_in_flight(s, sector_num, s->in_flight);
         qemu_coroutine_yield();
     }
 
-    end = s->common.len >> BDRV_SECTOR_BITS;
-    nb_sectors = MIN(nb_sectors, end - sector_num);
-    nb_chunks = (nb_sectors + nb_sectors_chunk - 1) / nb_sectors_chunk;
-    while (s->buf_free_count < nb_chunks) {
-        trace_mirror_yield_buf_busy(s, nb_chunks, s->in_flight);
-        qemu_coroutine_yield();
-    }
+    do {
+        int added_sectors, added_chunks;
 
-    /* We have enough free space to copy these sectors. */
-    if (s->cow_bitmap) {
-        bitmap_set(s->cow_bitmap, cluster_num, nb_chunks);
-    }
+        if (!bdrv_get_dirty(source, next_sector) ||
+            test_bit(next_cluster, s->in_flight_bitmap)) {
+            assert(nb_sectors > 0);
+            break;
+        }
+
+        added_sectors = nb_sectors_chunk;
+        if (s->cow_bitmap && !test_bit(next_cluster, s->cow_bitmap)) {
+            bdrv_round_to_clusters(s->target,
+                                   next_sector, added_sectors,
+                                   &next_sector, &added_sectors);
+
+            /* On the first iteration, the rounding may make us copy
+             * sectors before the first dirty one.
+             */
+            if (next_sector < sector_num) {
+                assert(nb_sectors == 0);
+                sector_num = next_sector;
+                next_cluster = next_sector / nb_sectors_chunk;
+            }
+        }
+
+        added_sectors = MIN(added_sectors, end - (sector_num + nb_sectors));
+        added_chunks = (added_sectors + nb_sectors_chunk - 1) / nb_sectors_chunk;
+
+        /* When doing COW, it may happen that there are not enough free
+         * buffers to copy a full cluster.  Wait if that is the case.
+         */
+        while (nb_chunks == 0 && s->buf_free_count < added_chunks) {
+            trace_mirror_yield_buf_busy(s, nb_chunks, s->in_flight);
+            qemu_coroutine_yield();
+        }
+        if (s->buf_free_count < nb_chunks + added_chunks) {
+            trace_mirror_break_buf_busy(s, nb_chunks, s->in_flight);
+            break;
+        }
+
+        /* We have enough free space to copy these sectors. */
+        if (s->cow_bitmap) {
+            bitmap_set(s->cow_bitmap, next_cluster, added_chunks);
+        }
+        nb_sectors += added_sectors;
+        nb_chunks += added_chunks;
+        next_sector += added_sectors;
+        next_cluster += added_chunks;
+    } while (next_sector < end);
 
     /* Allocate a MirrorOp that is used as an AIO callback. */
     op = g_slice_new(MirrorOp);
diff --git a/trace-events b/trace-events
index 7ae11e9..cd387fa 100644
--- a/trace-events
+++ b/trace-events
@@ -87,6 +87,7 @@ mirror_iteration_done(void *s, int64_t sector_num, int nb_sectors) "s %p sector_
 mirror_yield(void *s, int64_t cnt, int buf_free_count, int in_flight) "s %p dirty count %"PRId64" free buffers %d in_flight %d"
 mirror_yield_in_flight(void *s, int64_t sector_num, int in_flight) "s %p sector_num %"PRId64" in_flight %d"
 mirror_yield_buf_busy(void *s, int nb_chunks, int in_flight) "s %p requested chunks %d in_flight %d"
+mirror_break_buf_busy(void *s, int nb_chunks, int in_flight) "s %p requested chunks %d in_flight %d"
 
 # blockdev.c
 qmp_block_job_cancel(void *job) "job %p"