Message ID | 20190715104508.7568-2-mreitz@redhat.com |
---|---|
State | New |
Series | block: Fix three .bdrv_has_zero_init()s |

On 15.07.2019 at 12:45, Max Reitz wrote:
> If a qcow2 file is preallocated, it can no longer guarantee that it
> initially appears as filled with zeroes.
>
> So implement .bdrv_has_zero_init() by checking whether the file is
> preallocated; if so, forward the call to the underlying storage node,
> except for when it is encrypted: Encrypted preallocated images always
> return effectively random data, so .bdrv_has_zero_init() must always
> return 0 for them.
>
> Reported-by: Stefano Garzarella <sgarzare@redhat.com>
> Signed-off-by: Max Reitz <mreitz@redhat.com>

Hm... This patch only really works directly after image creation (which
is indeed where .bdrv_has_zero_init is used). Why do we have to have a
full qcow2_is_zero() that loops over the whole image just to find out
whether it's preallocated? Wouldn't looking at a single data cluster be
enough?

Kevin
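
To illustrate the single-cluster question above, here is a minimal toy sketch in plain C. It is not QEMU code: `ClusterType` and `first_cluster_has_data()` are hypothetical names, and the sketch assumes that right after creation an image is either fully preallocated or has no data clusters at all, which is the premise of the question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the cluster types the patch switches on. */
typedef enum {
    CLUSTER_UNALLOCATED,
    CLUSTER_ZERO,
    CLUSTER_NORMAL,
    CLUSTER_COMPRESSED,
} ClusterType;

/*
 * Hypothetical helper: decide from the first cluster alone whether a
 * freshly created image was preallocated. Assumption: this is only
 * called directly after creation, when either every cluster holds
 * data or none does.
 */
static bool first_cluster_has_data(const ClusterType *clusters, size_t n)
{
    assert(clusters != NULL || n == 0);

    if (n == 0) {
        return false; /* empty image: nothing can be allocated */
    }
    return clusters[0] == CLUSTER_NORMAL ||
           clusters[0] == CLUSTER_COMPRESSED;
}
```

The check is O(1) instead of a walk over every L2 entry, but it gives wrong answers as soon as the image has been written to after creation.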

On 16.07.19 18:54, Kevin Wolf wrote:
> On 15.07.2019 at 12:45, Max Reitz wrote:
>> If a qcow2 file is preallocated, it can no longer guarantee that it
>> initially appears as filled with zeroes.
>>
>> So implement .bdrv_has_zero_init() by checking whether the file is
>> preallocated; if so, forward the call to the underlying storage node,
>> except for when it is encrypted: Encrypted preallocated images always
>> return effectively random data, so .bdrv_has_zero_init() must always
>> return 0 for them.
>>
>> Reported-by: Stefano Garzarella <sgarzare@redhat.com>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>
> Hm... This patch only really works directly after image creation (which
> is indeed where .bdrv_has_zero_init is used). Why do we have to have a
> full qcow2_is_zero() that loops over the whole image just to find out
> whether it's preallocated? Wouldn't looking at a single data cluster be
> enough?

Hm. I would like to agree (because you’re right), but now I see that
the callers of bdrv_has_zero_init() don’t necessarily hold to that
convention.

For example, qemu-img convert has the -n flag, but that doesn’t stop it
from invoking bdrv_has_zero_init(). Which is a bug, of course.

$ ./qemu-img create -f qcow2 src.qcow2 64M
$ ./qemu-img create -f qcow2 dest.qcow2 64M
$ ./qemu-io -c 'write -P 42 0 64M' dest.qcow2
$ ./qemu-img convert -n src.qcow2 dest.qcow2
$ ./qemu-img compare src.qcow2 dest.qcow2
Content mismatch at offset 0!

Aw, man, why does this keep happening... :-/

OK, so qemu-img convert -n is easy to fix. But there are more callers:

mirror: Uses this function to inquire whether it needs to zero the
target before actually doing something useful. There is no guarantee
that the target is a new image. Well, it just isn’t with mode=existing
or blockdev-mirror.

parallels: Whether to write zeroes to newly added image areas. That
actually sounds correct, because those new areas cannot point to any
data yet. Well, maybe not correct, because bdrv_has_zero_init() is not
the same as “when this image grows, new areas will be zero”, but at
least bdrv_has_zero_init() will return false if the latter is false.

vhdx: Similarly to parallels, it uses this information to check whether
it needs to zero new areas when growing an image file.

raw/vmdk/vpc: Just passing through info from their storage child.

Hm, OK. So mirror and qemu-img need fixing. That sounds possible.

Max
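
The convention discussed in this mail (only trust bdrv_has_zero_init() for a target that was just created) can be sketched as a caller-side guard. This is a toy model under stated assumptions, not the actual fix for qemu-img or mirror: `can_skip_zeroing()` and its parameters are hypothetical names, and the real callers would derive `target_just_created` from things like the -n flag or mode=existing.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical caller-side guard, not QEMU code.
 *
 * target_just_created: true only if this caller created the target
 *                      image itself and nothing has written to it.
 * has_zero_init:       the 0/1 answer of the driver's has_zero_init
 *                      callback for that target.
 *
 * Returns true if the caller may skip writing zeroes to the target.
 */
static bool can_skip_zeroing(bool target_just_created, int has_zero_init)
{
    assert(has_zero_init == 0 || has_zero_init == 1);

    if (!target_just_created) {
        /* Pre-existing target: its contents are unknown. */
        return false;
    }
    return has_zero_init == 1;
}
```

In the reproducer above, dest.qcow2 already exists and holds data, so a guard of this shape would force the zeroing (or full copy) path regardless of what the driver reports.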

diff --git a/block/qcow2.c b/block/qcow2.c
index 039bdc2f7e..730fd53890 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -4631,6 +4631,94 @@ static ImageInfoSpecific *qcow2_get_specific_info(BlockDriverState *bs,
     return spec_info;
 }

+/*
+ * Return 1 if the file only contains zero and unallocated clusters.
+ * Return 0 if it contains compressed or normal clusters.
+ * Return -errno on error.
+ */
+static int qcow2_is_zero(BlockDriverState *bs)
+{
+    BDRVQcow2State *s = bs->opaque;
+    int l1_i;
+    int ret;
+
+    if (bs->backing) {
+        return 0;
+    }
+
+    for (l1_i = 0; l1_i < s->l1_size; l1_i++) {
+        uint64_t l2_offset = s->l1_table[l1_i] & L1E_OFFSET_MASK;
+        int slice_start_i;
+
+        if (!l2_offset) {
+            continue;
+        }
+
+        for (slice_start_i = 0; slice_start_i < s->l2_size;
+             slice_start_i += s->l2_slice_size)
+        {
+            uint64_t *l2_slice;
+            int l2_slice_i;
+
+            ret = qcow2_cache_get(bs, s->l2_table_cache,
+                                  l2_offset + slice_start_i * sizeof(uint64_t),
+                                  (void **)&l2_slice);
+            if (ret < 0) {
+                return ret;
+            }
+
+            for (l2_slice_i = 0; l2_slice_i < s->l2_slice_size; l2_slice_i++) {
+                uint64_t l2_entry = be64_to_cpu(l2_slice[l2_slice_i]);
+
+                switch (qcow2_get_cluster_type(bs, l2_entry)) {
+                case QCOW2_CLUSTER_UNALLOCATED:
+                case QCOW2_CLUSTER_ZERO_PLAIN:
+                case QCOW2_CLUSTER_ZERO_ALLOC:
+                    break;
+
+                case QCOW2_CLUSTER_NORMAL:
+                case QCOW2_CLUSTER_COMPRESSED:
+                    qcow2_cache_put(s->l2_table_cache, (void **)&l2_slice);
+                    return 0;
+
+                default:
+                    abort();
+                }
+            }
+
+            qcow2_cache_put(s->l2_table_cache, (void **)&l2_slice);
+        }
+    }
+
+    return 1;
+}
+
+static int qcow2_has_zero_init(BlockDriverState *bs)
+{
+    BDRVQcow2State *s = bs->opaque;
+    int ret;
+
+    if (qemu_in_coroutine()) {
+        qemu_co_mutex_lock(&s->lock);
+    }
+    /* Check preallocation status */
+    ret = qcow2_is_zero(bs);
+    if (qemu_in_coroutine()) {
+        qemu_co_mutex_unlock(&s->lock);
+    }
+    if (ret < 0) {
+        return 0;
+    }
+
+    if (ret == 1) {
+        return 1;
+    } else if (bs->encrypted) {
+        return 0;
+    } else {
+        return bdrv_has_zero_init(s->data_file->bs);
+    }
+}
+
 static int qcow2_save_vmstate(BlockDriverState *bs, QEMUIOVector *qiov,
                               int64_t pos)
 {
@@ -5186,7 +5274,7 @@ BlockDriver bdrv_qcow2 = {
     .bdrv_child_perm = bdrv_format_default_perms,
     .bdrv_co_create_opts = qcow2_co_create_opts,
     .bdrv_co_create = qcow2_co_create,
-    .bdrv_has_zero_init = bdrv_has_zero_init_1,
+    .bdrv_has_zero_init = qcow2_has_zero_init,
     .bdrv_co_block_status = qcow2_co_block_status,

     .bdrv_co_preadv = qcow2_co_preadv,

If a qcow2 file is preallocated, it can no longer guarantee that it
initially appears as filled with zeroes.

So implement .bdrv_has_zero_init() by checking whether the file is
preallocated; if so, forward the call to the underlying storage node,
except for when it is encrypted: Encrypted preallocated images always
return effectively random data, so .bdrv_has_zero_init() must always
return 0 for them.

Reported-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/qcow2.c | 90 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 89 insertions(+), 1 deletion(-)
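
The decision described in the commit message can be written out as a small truth-table model. This is a sketch in standalone C that mirrors the final if/else chain of qcow2_has_zero_init() in the patch; `model_has_zero_init()` and its parameters are hypothetical stand-ins for the QEMU state, not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model of the patch's decision, not QEMU code.
 *
 * is_zero:               result of the "only zero/unallocated clusters"
 *                        scan: 1, 0, or a negative errno on failure.
 * encrypted:             whether the image is encrypted.
 * storage_has_zero_init: 0/1 answer of the underlying storage node.
 */
static int model_has_zero_init(int is_zero, bool encrypted,
                               int storage_has_zero_init)
{
    assert(storage_has_zero_init == 0 || storage_has_zero_init == 1);

    if (is_zero < 0) {
        return 0; /* scan failed: make no promise */
    }
    if (is_zero == 1) {
        return 1; /* not preallocated: reads as zeroes */
    }
    if (encrypted) {
        return 0; /* preallocated + encrypted: effectively random data */
    }
    return storage_has_zero_init; /* preallocated: ask the storage node */
}
```

Note that the encrypted check only matters once data clusters exist; an encrypted image without any data clusters still reports 1 in this model, as it does in the patch.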