[v2,3/3] qcow2: handle_dependencies(): relax conflict detection

Message ID 20210824101517.59802-4-vsementsov@virtuozzo.com
State New
Series qcow2: relax subclusters allocation dependencies

Commit Message

Vladimir Sementsov-Ogievskiy Aug. 24, 2021, 10:15 a.m. UTC
Parallel writes to different subclusters of the same cluster neither
conflict with nor depend on each other when the cluster itself is
already allocated. So, drop this unnecessary dependency.
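
As a rough illustration of the relaxed check (a minimal sketch, not the
code in block/qcow2-cluster.c): the InFlightWrite type and the must_wait()
helper below are hypothetical simplifications, and the sketch assumes the
caller has already established that the two requests touch the same
cluster.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an in-flight allocating write. */
typedef struct {
    uint64_t cow_start;      /* first guest byte of the request's COW area */
    uint64_t cow_end;        /* one past the last guest byte of that area */
    bool keep_old_clusters;  /* true if the cluster is already allocated */
} InFlightWrite;

/*
 * Decide whether a new write to [start, end) has to wait for 'old'.
 * Assumption: both requests are already known to intersect at cluster
 * granularity, so only the subcluster-level question remains.
 */
static bool must_wait(const InFlightWrite *old, uint64_t start, uint64_t end)
{
    assert(start < end);

    if (old->keep_old_clusters &&
        (end <= old->cow_start || start >= old->cow_end)) {
        /* Cluster is allocated and the byte ranges are disjoint. */
        return false;
    }

    /* Either the cluster is being newly allocated or the ranges overlap. */
    return true;
}
```

For example, with an in-flight write covering [20480, 22528) in an
already allocated cluster, must_wait() returns false for a new write to
[40960, 43008) and true for a new write to [21504, 23552). If
keep_old_clusters is false, it returns true in both cases.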

Performance measurement:
First, prepare the build/qemu-img-old and build/qemu-img-new binaries.

cd scripts/simplebench
./img_bench_templater.py

Paste the following into the stdin of the running script:

qemu_img=../../build/qemu-img-{old|new}
$qemu_img create -f qcow2 -o extended_l2=on /ssd/x.qcow2 1G
$qemu_img bench -c 100000 -d 8 [-s 2K|-s 2K -o 512|-s $((1024*2+512))] \
        -w -t none -n /ssd/x.qcow2

The result:

All results are in seconds

------------------  ---------  ---------
                    old        new
-s 2K               6.7 ± 15%  6.2 ± 12%
                                 -7%
-s 2K -o 512        13 ± 3%    11 ± 5%
                                 -16%
-s $((1024*2+512))  9.5 ± 4%   8.4
                                 -12%
------------------  ---------  ---------

So small writes are now more independent, which helps keep the I/O
queue deeper and improves performance.

The output of iotest 271 becomes racy for the case of three allocations
in one cluster: the second and third requests no longer depend on each
other, so their writes may finish in either order. Both still depend on
the first request. Filter out the offsets of the second and third writes
to cover both possible outputs.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2-cluster.c      | 11 +++++++++++
 tests/qemu-iotests/271     |  5 ++++-
 tests/qemu-iotests/271.out |  4 ++--
 3 files changed, 17 insertions(+), 3 deletions(-)

Comments

Eric Blake Aug. 25, 2021, 2:16 p.m. UTC | #1
On Tue, Aug 24, 2021 at 01:15:17PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> There is no conflict and no dependency if we have parallel writes to
> different subclusters of one cluster when the cluster itself is already
> allocated. So, relax extra dependency.
> 
...
> So small writes are more independent now and that helps to keep deeper
> io queue which improves performance.
> 
> 271 iotest output becomes racy for three allocation in one cluster.
> Second and third writes may finish in different order. Second and
> third requests don't depend on each other any more. Still they both
> depend on first request anyway. Filter out second and third write
> offsets to cover both possible outputs.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block/qcow2-cluster.c      | 11 +++++++++++
>  tests/qemu-iotests/271     |  5 ++++-
>  tests/qemu-iotests/271.out |  4 ++--
>  3 files changed, 17 insertions(+), 3 deletions(-)
> 

> +++ b/tests/qemu-iotests/271
> @@ -893,7 +893,10 @@ EOF
>  }
>  
>  _make_test_img -o extended_l2=on 1M
> -_concurrent_io     | $QEMU_IO | _filter_qemu_io
> +# Second an third writes in _concurrent_io() are independent and may finish in

and

> +# different order. So, filter offset out to match both possible variants.
> +_concurrent_io     | $QEMU_IO | _filter_qemu_io | \
> +    $SED -e 's/\(20480\|40960\)/OFFSET/'
>  _concurrent_verify | $QEMU_IO | _filter_qemu_io
>

Reviewed-by: Eric Blake <eblake@redhat.com>
Hanna Czenczek Sept. 13, 2021, 2:51 p.m. UTC | #2
On 24.08.21 12:15, Vladimir Sementsov-Ogievskiy wrote:
> There is no conflict and no dependency if we have parallel writes to
> different subclusters of one cluster when the cluster itself is already
> allocated. So, relax extra dependency.
>
> Measure performance:
> First, prepare build/qemu-img-old and build/qemu-img-new images.
>
> cd scripts/simplebench
> ./img_bench_templater.py
>
> Paste the following to stdin of running script:
>
> qemu_img=../../build/qemu-img-{old|new}
> $qemu_img create -f qcow2 -o extended_l2=on /ssd/x.qcow2 1G
> $qemu_img bench -c 100000 -d 8 [-s 2K|-s 2K -o 512|-s $((1024*2+512))] \
>          -w -t none -n /ssd/x.qcow2
>
> The result:
>
> All results are in seconds
>
> ------------------  ---------  ---------
>                      old        new
> -s 2K               6.7 ± 15%  6.2 ± 12%
>                                   -7%
> -s 2K -o 512        13 ± 3%    11 ± 5%
>                                   -16%
> -s $((1024*2+512))  9.5 ± 4%   8.4
>                                   -12%
> ------------------  ---------  ---------
>
> So small writes are more independent now and that helps to keep deeper
> io queue which improves performance.
>
> 271 iotest output becomes racy for three allocation in one cluster.
> Second and third writes may finish in different order. Second and
> third requests don't depend on each other any more. Still they both
> depend on first request anyway. Filter out second and third write
> offsets to cover both possible outputs.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   block/qcow2-cluster.c      | 11 +++++++++++
>   tests/qemu-iotests/271     |  5 ++++-
>   tests/qemu-iotests/271.out |  4 ++--
>   3 files changed, 17 insertions(+), 3 deletions(-)

[...]

> diff --git a/tests/qemu-iotests/271 b/tests/qemu-iotests/271
> index 599b849cc6..d9d391955e 100755
> --- a/tests/qemu-iotests/271
> +++ b/tests/qemu-iotests/271
> @@ -893,7 +893,10 @@ EOF
>   }
>   
>   _make_test_img -o extended_l2=on 1M
> -_concurrent_io     | $QEMU_IO | _filter_qemu_io
> +# Second an third writes in _concurrent_io() are independent and may finish in

s/ an / and /

With that fixed:

Reviewed-by: Hanna Reitz <hreitz@redhat.com>

> +# different order. So, filter offset out to match both possible variants.
> +_concurrent_io     | $QEMU_IO | _filter_qemu_io | \
> +    $SED -e 's/\(20480\|40960\)/OFFSET/'
>   _concurrent_verify | $QEMU_IO | _filter_qemu_io
>   
>   # success, all done
Patch

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 9917e5c28c..c1c43a891b 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1403,6 +1403,17 @@  static int handle_dependencies(BlockDriverState *bs, uint64_t guest_offset,
             continue;
         }
 
+        if (old_alloc->keep_old_clusters &&
+            (end <= l2meta_cow_start(old_alloc) ||
+             start >= l2meta_cow_end(old_alloc)))
+        {
+            /*
+             * Clusters intersect but COW areas don't. And cluster itself is
+             * already allocated. So, there is no actual conflict.
+             */
+            continue;
+        }
+
         /* Conflict */
 
         if (start < old_start) {
diff --git a/tests/qemu-iotests/271 b/tests/qemu-iotests/271
index 599b849cc6..d9d391955e 100755
--- a/tests/qemu-iotests/271
+++ b/tests/qemu-iotests/271
@@ -893,7 +893,10 @@  EOF
 }
 
 _make_test_img -o extended_l2=on 1M
-_concurrent_io     | $QEMU_IO | _filter_qemu_io
+# Second an third writes in _concurrent_io() are independent and may finish in
+# different order. So, filter offset out to match both possible variants.
+_concurrent_io     | $QEMU_IO | _filter_qemu_io | \
+    $SED -e 's/\(20480\|40960\)/OFFSET/'
 _concurrent_verify | $QEMU_IO | _filter_qemu_io
 
 # success, all done
diff --git a/tests/qemu-iotests/271.out b/tests/qemu-iotests/271.out
index 81043ba4d7..5be780de76 100644
--- a/tests/qemu-iotests/271.out
+++ b/tests/qemu-iotests/271.out
@@ -719,8 +719,8 @@  blkdebug: Suspended request 'A'
 blkdebug: Resuming request 'A'
 wrote 2048/2048 bytes at offset 30720
 2 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-wrote 2048/2048 bytes at offset 20480
+wrote 2048/2048 bytes at offset OFFSET
 2 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-wrote 2048/2048 bytes at offset 40960
+wrote 2048/2048 bytes at offset OFFSET
 2 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 *** done