diff mbox series

[v3] qcow2: fix preallocation with metadata on bare block device

Message ID 1526053004-12268-1-git-send-email-ivanren@tencent.com
State New
Headers show
Series [v3] qcow2: fix preallocation with metadata on bare block device | expand

Commit Message

Ivan Ren May 11, 2018, 3:36 p.m. UTC
Create a qcow2 directly on bare block device with
"-o preallocation=metadata" option. When read this qcow2, it will
return pre-existing data on block device. This patch add
QCOW_OFLAG_ZERO flag (supported in qcow_version >= 3) for
preallocated l2 entry to avoid this problem.

Signed-off-by: Ivan Ren <ivanren@tencent.com>
---
Changes in v2:
- always pass QCOW_OFLAG_ZERO when preallocate metadta
Changes in v3:
- limit this feature only on qcow_version >= 3
---
 block/qcow2-cluster.c |  5 +++--
 block/qcow2.c         | 12 +++++++++---
 block/qcow2.h         |  3 ++-
 3 files changed, 14 insertions(+), 6 deletions(-)

Comments

Kevin Wolf May 11, 2018, 5:29 p.m. UTC | #1
Am 11.05.2018 um 17:36 hat Ivan Ren geschrieben:
> Create a qcow2 directly on bare block device with
> "-o preallocation=metadata" option. When read this qcow2, it will
> return pre-existing data on block device. This patch add
> QCOW_OFLAG_ZERO flag (supported in qcow_version >= 3) for
> preallocated l2 entry to avoid this problem.
> 
> Signed-off-by: Ivan Ren <ivanren@tencent.com>

Doesn't this defeat the purpose of preallocation? Usually, the idea with
preallocation is that you don't need to update any metadata on the first
write, but if you set QCOW_OFLAG_ZERO, we do need a metadata update
again.

So what's the advantage compared to not preallocating at all?

Kevin
Ivan Ren May 13, 2018, 1:37 p.m. UTC | #2
> Doesn't this defeat the purpose of preallocation? Usually, the idea with
> preallocation is that you don't need to update any metadata on the first
> write, but if you set QCOW_OFLAG_ZERO, we do need a metadata update
> again.
>
> So what's the advantage compared to not preallocating at all?

Yes, when do this, the qcow2_alloc_cluster_offset will call handle_alloc
again, and will lead metada update. Good news is that in handle_alloc the
preallocated zero cluster offset will be reuse. In this case, the
preallocated metata will be keeped except some flags, and the cluster
offset is fixed.

On Sat, May 12, 2018 at 1:29 AM Kevin Wolf <kwolf@redhat.com> wrote:

> Am 11.05.2018 um 17:36 hat Ivan Ren geschrieben:
> > Create a qcow2 directly on bare block device with
> > "-o preallocation=metadata" option. When read this qcow2, it will
> > return pre-existing data on block device. This patch add
> > QCOW_OFLAG_ZERO flag (supported in qcow_version >= 3) for
> > preallocated l2 entry to avoid this problem.
> >
> > Signed-off-by: Ivan Ren <ivanren@tencent.com>
>
> Doesn't this defeat the purpose of preallocation? Usually, the idea with
> preallocation is that you don't need to update any metadata on the first
> write, but if you set QCOW_OFLAG_ZERO, we do need a metadata update
> again.
>
> So what's the advantage compared to not preallocating at all?
>
> Kevin
>
Kevin Wolf May 14, 2018, 6:23 a.m. UTC | #3
Am 13.05.2018 um 15:37 hat Ivan Ren geschrieben:
> > Doesn't this defeat the purpose of preallocation? Usually, the idea with
> > preallocation is that you don't need to update any metadata on the first
> > write, but if you set QCOW_OFLAG_ZERO, we do need a metadata update
> > again.
> >
> > So what's the advantage compared to not preallocating at all?
> 
> Yes, when do this, the qcow2_alloc_cluster_offset will call handle_alloc
> again, and will lead metada update. Good news is that in handle_alloc the
> preallocated zero cluster offset will be reuse. In this case, the
> preallocated metata will be keeped except some flags, and the cluster
> offset is fixed.

Oh, yes, there is no doubt that the result will be correct. My point is
that people aren't usually interested so much in the physical layout of
the clusters, but more about the fact that no metadata updates and no
COW is necessary when you write to a cluster for the first time (i.e.
because preallocation brings some performance improvements).

Do you have a use case where the layout is more important? If there
are good reasons for either option, maybe we need two different
preallocation modes.

Kevin
Ivan Ren May 14, 2018, 2:30 p.m. UTC | #4
> Oh, yes, there is no doubt that the result will be correct. My point is
> that people aren't usually interested so much in the physical layout of
> the clusters, but more about the fact that no metadata updates and no
> COW is necessary when you write to a cluster for the first time (i.e.
> because preallocation brings some performance improvements).
>
> Do you have a use case where the layout is more important? If there
> are good reasons for either option, maybe we need two different
> preallocation modes.

Yes, in my opinion, there are two advantages for preallocated metadata:
1. No metadata update.
2. The qcow2 cluster offset keep the same continuity with the guest offset,
and this is good for performance.

The second point is good if underlying media is bare block device without
filesystem, it means the continuous blocks in guest also keep continuity on
host.
The second point may not always be true if the qcow2 underlying media is a
regular file, in this case, the offset in qcow2 metadat is just the file's
offset but not the underlying block device.

Thanks.
diff mbox series

Patch

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 1aee726..b9e0ceb 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -919,7 +919,8 @@  fail:
     return ret;
 }
 
-int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
+int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m,
+                                uint64_t flags)
 {
     BDRVQcow2State *s = bs->opaque;
     int i, j = 0, l2_index, ret;
@@ -969,7 +970,7 @@  int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
         }
 
         l2_slice[l2_index + i] = cpu_to_be64((cluster_offset +
-                    (i << s->cluster_bits)) | QCOW_OFLAG_COPIED);
+                    (i << s->cluster_bits)) | QCOW_OFLAG_COPIED | flags);
      }
 
 
diff --git a/block/qcow2.c b/block/qcow2.c
index 2f36e63..ee862b0 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2044,7 +2044,7 @@  static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
         while (l2meta != NULL) {
             QCowL2Meta *next;
 
-            ret = qcow2_alloc_cluster_link_l2(bs, l2meta);
+            ret = qcow2_alloc_cluster_link_l2(bs, l2meta, 0);
             if (ret < 0) {
                 goto fail;
             }
@@ -2552,7 +2552,13 @@  static void coroutine_fn preallocate_co(void *opaque)
         while (meta) {
             QCowL2Meta *next = meta->next;
 
-            ret = qcow2_alloc_cluster_link_l2(bs, meta);
+            if (s->qcow_version >= 3) {
+                /* add QCOW_OFLAG_ZERO to avoid pre-existing data be read */
+                ret = qcow2_alloc_cluster_link_l2(bs, meta, QCOW_OFLAG_ZERO);
+            } else {
+                ret = qcow2_alloc_cluster_link_l2(bs, meta, 0);
+            }
+
             if (ret < 0) {
                 qcow2_free_any_clusters(bs, meta->alloc_offset,
                                         meta->nb_clusters, QCOW2_DISCARD_NEVER);
@@ -3458,7 +3464,7 @@  static int qcow2_truncate(BlockDriverState *bs, int64_t offset,
             };
             qemu_co_queue_init(&allocation.dependent_requests);
 
-            ret = qcow2_alloc_cluster_link_l2(bs, &allocation);
+            ret = qcow2_alloc_cluster_link_l2(bs, &allocation, 0);
             if (ret < 0) {
                 error_setg_errno(errp, -ret, "Failed to update L2 tables");
                 qcow2_free_clusters(bs, host_offset,
diff --git a/block/qcow2.h b/block/qcow2.h
index adf5c39..9a59602 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -617,7 +617,8 @@  uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
                                          uint64_t offset,
                                          int compressed_size);
 
-int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m);
+int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m,
+                                uint64_t flags);
 int qcow2_cluster_discard(BlockDriverState *bs, uint64_t offset,
                           uint64_t bytes, enum qcow2_discard_type type,
                           bool full_discard);