[19/19] qcow2: Gather clusters in a looping loop

Message ID

1364232620-5293-20-git-send-email-kwolf@redhat.com

State

New

Headers

From: Kevin Wolf <kwolf@redhat.com>
To: qemu-devel@nongnu.org
Date: Mon, 25 Mar 2013 18:30:20 +0100
Message-Id: <1364232620-5293-20-git-send-email-kwolf@redhat.com>
In-Reply-To: <1364232620-5293-1-git-send-email-kwolf@redhat.com>
References: <1364232620-5293-1-git-send-email-kwolf@redhat.com>
Cc: kwolf@redhat.com, stefanha@redhat.com
Subject: [Qemu-devel] [PATCH 19/19] qcow2: Gather clusters in a looping loop
Precedence: list
Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org
Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org

Commit Message

Kevin Wolf March 25, 2013, 5:30 p.m. UTC

Instead of checking just once, in exactly this order, whether there are
dependencies, non-COW clusters and new allocations, this starts looping
around these. This way we can, for example, gather non-COW clusters after
new allocations as long as the host cluster offsets stay contiguous.

Once handle_dependencies() is extended so that COW areas of in-flight
allocations can be overwritten, this allows us to continue gathering
other clusters (we wouldn't be able to do that without this change
because we would have missed a possible second dependency in one of the
next clusters).

This means that in the typical sequential write case, we can combine the
COW overwrite of one cluster with the allocation of the next cluster as
soon as something like Delayed COW actually gets implemented. It is only
by avoiding splitting requests this way that Delayed COW actually starts
improving performance noticeably.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2-cluster.c      | 59 +++++++++++++++++++++++-----------------------
 tests/qemu-iotests/044.out |  2 +-
 2 files changed, 31 insertions(+), 30 deletions(-)
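
As a rough illustration of the gathering loop described in the commit message, the following toy model (not QEMU code) walks guest clusters one at a time and keeps extending a single host-contiguous run, whether the next cluster is already allocated or needs a new allocation. All names here (ToyImage, ToyCluster, toy_alloc, toy_gather) are hypothetical. The model ignores COW, dependencies and L2 tables entirely, and it assumes host cluster 0 is reserved so that 0 can serve as a "not set" marker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_HOST_CLUSTERS 16

/* Hypothetical guest cluster: either mapped to a host cluster or not yet. */
typedef struct {
    bool   allocated;   /* false: needs a new allocation */
    size_t host;        /* host cluster index, valid only if allocated */
} ToyCluster;

/* Hypothetical image: one "in use" flag per host cluster. Index 0 is
 * assumed to be reserved, so 0 doubles as "no host cluster". */
typedef struct {
    bool used[TOY_HOST_CLUSTERS];
} ToyImage;

/* Allocate host cluster 'want', or any free one if want == 0.
 * Returns the host cluster index, or 0 if the allocation failed. */
size_t toy_alloc(ToyImage *img, size_t want)
{
    if (want != 0) {
        if (want < TOY_HOST_CLUSTERS && !img->used[want]) {
            img->used[want] = true;
            return want;
        }
        return 0;
    }
    for (size_t i = 1; i < TOY_HOST_CLUSTERS; i++) {
        if (!img->used[i]) {
            img->used[i] = true;
            return i;
        }
    }
    return 0;
}

/* Gather guest clusters c[0..n) into one host-contiguous run.
 * Returns the number of clusters gathered; *first_host receives the
 * host cluster index at which the run starts (0 if nothing was gathered). */
size_t toy_gather(ToyImage *img, ToyCluster *c, size_t n, size_t *first_host)
{
    size_t idx = 0;
    size_t next = 0;    /* host index the next cluster must have, 0 = unset */

    *first_host = 0;
    while (idx < n) {
        size_t host;

        if (c[idx].allocated) {
            /* Existing cluster: usable only if it continues the run */
            if (next != 0 && c[idx].host != next) {
                break;
            }
            host = c[idx].host;
        } else {
            /* New allocation, which must land right behind the run */
            host = toy_alloc(img, next);
            if (host == 0) {
                break;
            }
            c[idx].allocated = true;
            c[idx].host = host;
        }

        if (*first_host == 0) {
            *first_host = host;
        }
        next = host + 1;
        idx++;
    }
    return idx;
}
```

With host clusters 0, 1 and 3 in use and the guest clusters being unallocated, mapped to host cluster 3, and unallocated, this model gathers all three into one run starting at host cluster 2. A single pass that handled existing clusters first and new allocations second would have stopped after the first allocation.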

Comments

Kevin Wolf March 25, 2013, 6:48 p.m. UTC | #1

On 25.03.2013 at 18:30, Kevin Wolf wrote:
> Instead of checking just once, in exactly this order, whether there are
> dependencies, non-COW clusters and new allocations, this starts looping
> around these. This way we can, for example, gather non-COW clusters after
> new allocations as long as the host cluster offsets stay contiguous.
> 
> Once handle_dependencies() is extended so that COW areas of in-flight
> allocations can be overwritten, this allows us to continue gathering
> other clusters (we wouldn't be able to do that without this change
> because we would have missed a possible second dependency in one of the
> next clusters).
> 
> This means that in the typical sequential write case, we can combine the
> COW overwrite of one cluster with the allocation of the next cluster as
> soon as something like Delayed COW actually gets implemented. It is only
> by avoiding splitting requests this way that Delayed COW actually starts
> improving performance noticeably.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Self NAK.

> @@ -1159,9 +1178,12 @@ again:
>           *         the right synchronisation between the in-flight request and
>           *         the new one.
>           */
> -        cur_bytes = remaining;
>          ret = handle_dependencies(bs, start, &cur_bytes);
>          if (ret == -EAGAIN) {
> +            /* Currently handle_dependencies() doesn't yield if we already had
> +             * an allocation. If it did, we would have to clean up the L2Meta
> +             * structs before starting over. */
> +            assert(*m == NULL);

This assertion doesn't actually hold true. Reordering patches is
dangerous.

I also noticed that I must have introduced a performance regression
somewhere in the series. This should serve as extra motivation for
reviewers - there is actually something to find!

Kevin

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 39a574f..0e3a389 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1015,16 +1015,16 @@  static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
         nb_clusters = count_cow_clusters(s, nb_clusters, l2_table, l2_index);
     }
 
+    /* This function is only called when there were no non-COW clusters, so if
+     * we can't find any unallocated or COW clusters either, something is
+     * wrong with our code. */
+    assert(nb_clusters > 0);
+
     ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
     if (ret < 0) {
         return ret;
     }
 
-    if (nb_clusters == 0) {
-        *bytes = 0;
-        return 0;
-    }
-
     /* Allocate, if necessary at a given offset in the image file */
     alloc_cluster_offset = *host_offset;
     ret = do_alloc_cluster_offset(bs, guest_offset, &alloc_cluster_offset,
@@ -1139,8 +1139,27 @@  again:
     remaining = (n_end - n_start) << BDRV_SECTOR_BITS;
     cluster_offset = 0;
     *host_offset = 0;
+    cur_bytes = 0;
+    *m = NULL;
 
     while (true) {
+
+        if (!*host_offset) {
+            *host_offset = start_of_cluster(s, cluster_offset);
+        }
+
+        assert(remaining >= cur_bytes);
+
+        start           += cur_bytes;
+        remaining       -= cur_bytes;
+        cluster_offset  += cur_bytes;
+
+        if (remaining == 0) {
+            break;
+        }
+
+        cur_bytes = remaining;
+
         /*
          * Now start gathering as many contiguous clusters as possible:
          *
@@ -1159,9 +1178,12 @@  again:
          *         the right synchronisation between the in-flight request and
          *         the new one.
          */
-        cur_bytes = remaining;
         ret = handle_dependencies(bs, start, &cur_bytes);
         if (ret == -EAGAIN) {
+            /* Currently handle_dependencies() doesn't yield if we already had
+             * an allocation. If it did, we would have to clean up the L2Meta
+             * structs before starting over. */
+            assert(*m == NULL);
             goto again;
         } else if (ret < 0) {
             return ret;
@@ -1178,24 +1200,11 @@  again:
         if (ret < 0) {
             return ret;
         } else if (ret) {
-            if (!*host_offset) {
-                *host_offset = cluster_offset;
-            }
-
-            start           += cur_bytes;
-            remaining       -= cur_bytes;
-            cluster_offset  += cur_bytes;
-
-            cur_bytes = remaining;
+            continue;
         } else if (cur_bytes == 0) {
             break;
         }
 
-        /* If there is something left to allocate, do that now */
-        if (remaining == 0) {
-            break;
-        }
-
         /*
          * 3. If the request still hasn't completed, allocate new clusters,
          *    considering any cluster_offset of steps 1c or 2.
@@ -1204,15 +1213,7 @@  again:
         if (ret < 0) {
             return ret;
         } else if (ret) {
-            if (!*host_offset) {
-                *host_offset = cluster_offset;
-            }
-
-            start           += cur_bytes;
-            remaining       -= cur_bytes;
-            cluster_offset  += cur_bytes;
-
-            break;
+            continue;
         } else {
             assert(cur_bytes == 0);
             break;
diff --git a/tests/qemu-iotests/044.out b/tests/qemu-iotests/044.out
index 34c25c7..5c5aa92 100644
--- a/tests/qemu-iotests/044.out
+++ b/tests/qemu-iotests/044.out
@@ -1,6 +1,6 @@ 
 No errors were found on the image.
 7292415/33554432 = 21.73% allocated, 0.00% fragmented, 0.00% compressed clusters
-Image end offset: 4296447488
+Image end offset: 4296448000
 .
 ----------------------------------------------------------------------
 Ran 1 tests
