[v4,3/5] block/qcow2: refactor qcow2_co_preadv_part

Message ID 20190816153015.447957-4-vsementsov@virtuozzo.com
State New

Commit Message

Vladimir Sementsov-Ogievskiy Aug. 16, 2019, 3:30 p.m. UTC
A later patch will run the partial requests issued by the iterations of
qcow2_co_preadv in parallel for performance reasons. To prepare for
this, move the part that may be parallelized into a separate function
(qcow2_co_preadv_task).

While at it, also move the reading of encrypted clusters into its own
function, as is already done for compressed reads.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
---
 qapi/block-core.json |   2 +-
 block/qcow2.c        | 205 +++++++++++++++++++++++--------------------
 2 files changed, 111 insertions(+), 96 deletions(-)
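
The split described in the commit message can be sketched roughly as follows.
This is a simplified model and not the QEMU code: ToyImage, ClusterType,
read_task and read_part are hypothetical names, the "backing file" is assumed
to read as 0xff, and coroutines, locking and I/O vectors are left out. The
point is only the shape: the loop resolves the cluster type and zero-fills
cheap cases inline, while everything that needs real I/O goes through one
task function that takes all its inputs as arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the qcow2 cluster types (not QEMU's enum). */
typedef enum {
    CLUSTER_UNALLOCATED,
    CLUSTER_ZERO,
    CLUSTER_NORMAL,
} ClusterType;

enum { CLUSTER_SIZE = 4, NUM_CLUSTERS = 3 };

/* Toy image: a per-cluster type table plus in-memory cluster data. */
typedef struct {
    ClusterType type[NUM_CLUSTERS];
    unsigned char data[NUM_CLUSTERS][CLUSTER_SIZE];
    int has_backing;
} ToyImage;

/*
 * One chunk of the request. It depends only on its arguments, which is
 * what would make it a candidate for running in parallel later.
 */
static int read_task(const ToyImage *img, ClusterType type,
                     size_t offset, size_t bytes, unsigned char *dst)
{
    switch (type) {
    case CLUSTER_NORMAL:
        memcpy(dst,
               &img->data[offset / CLUSTER_SIZE][offset % CLUSTER_SIZE],
               bytes);
        return 0;
    case CLUSTER_UNALLOCATED:
        /* Without a backing file this case is handled by the caller. */
        assert(img->has_backing);
        memset(dst, 0xff, bytes); /* assumed backing-file content */
        return 0;
    default:
        return -1; /* zero clusters never reach the task */
    }
}

/* The loop: split the request at cluster boundaries and dispatch. */
static int read_part(const ToyImage *img, size_t offset, size_t bytes,
                     unsigned char *dst)
{
    size_t total = (size_t)CLUSTER_SIZE * NUM_CLUSTERS;

    if (offset > total || bytes > total - offset) {
        return -1;
    }

    while (bytes != 0) {
        size_t cur = CLUSTER_SIZE - offset % CLUSTER_SIZE;
        ClusterType type = img->type[offset / CLUSTER_SIZE];

        if (cur > bytes) {
            cur = bytes;
        }

        if (type == CLUSTER_ZERO ||
            (type == CLUSTER_UNALLOCATED && !img->has_backing)) {
            memset(dst, 0, cur); /* no I/O needed, stays in the loop */
        } else {
            int ret = read_task(img, type, offset, cur, dst);
            if (ret < 0) {
                return ret;
            }
        }

        bytes -= cur;
        offset += cur;
        dst += cur;
    }

    return 0;
}
```

Because the loop no longer owns a shared buffer or a cleanup label, each
iteration can return early on error, which matches the removal of the
fail: label in the patch.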

Comments

Kevin Wolf Sept. 13, 2019, 10:01 a.m. UTC | #1
On 16.08.2019 at 17:30, Vladimir Sementsov-Ogievskiy wrote:
> Further patch will run partial requests of iterations of
> qcow2_co_preadv in parallel for performance reasons. To prepare for
> this, separate part which may be parallelized into separate function
> (qcow2_co_preadv_task).
> 
> While being here, also separate encrypted clusters reading to own
> function, like it is done for compressed reading.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Reviewed-by: Max Reitz <mreitz@redhat.com>
> ---
>  qapi/block-core.json |   2 +-
>  block/qcow2.c        | 205 +++++++++++++++++++++++--------------------
>  2 files changed, 111 insertions(+), 96 deletions(-)
> 
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 0d43d4f37c..dd80aa11db 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -3266,7 +3266,7 @@
>              'pwritev_rmw_tail', 'pwritev_rmw_after_tail', 'pwritev',
>              'pwritev_zero', 'pwritev_done', 'empty_image_prepare',
>              'l1_shrink_write_table', 'l1_shrink_free_l2_clusters',
> -            'cor_write', 'cluster_alloc_space', 'none'] }
> +            'cor_write', 'cluster_alloc_space', 'none', 'read_encrypted'] }

What's the point of this new blkdebug event?

Obviously, read_aio for an encrypted image must mean a read of encrypted
data. The same image can never trigger both read_aio and
read_encrypted, so why do we need to distinguish them as two different
events?

Kevin
Vladimir Sementsov-Ogievskiy Sept. 13, 2019, 10:53 a.m. UTC | #2
13.09.2019 13:01, Kevin Wolf wrote:
> On 16.08.2019 at 17:30, Vladimir Sementsov-Ogievskiy wrote:
>> Further patch will run partial requests of iterations of
>> qcow2_co_preadv in parallel for performance reasons. To prepare for
>> this, separate part which may be parallelized into separate function
>> (qcow2_co_preadv_task).
>>
>> While being here, also separate encrypted clusters reading to own
>> function, like it is done for compressed reading.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> Reviewed-by: Max Reitz <mreitz@redhat.com>
>> ---
>>   qapi/block-core.json |   2 +-
>>   block/qcow2.c        | 205 +++++++++++++++++++++++--------------------
>>   2 files changed, 111 insertions(+), 96 deletions(-)
>>
>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>> index 0d43d4f37c..dd80aa11db 100644
>> --- a/qapi/block-core.json
>> +++ b/qapi/block-core.json
>> @@ -3266,7 +3266,7 @@
>>               'pwritev_rmw_tail', 'pwritev_rmw_after_tail', 'pwritev',
>>               'pwritev_zero', 'pwritev_done', 'empty_image_prepare',
>>               'l1_shrink_write_table', 'l1_shrink_free_l2_clusters',
>> -            'cor_write', 'cluster_alloc_space', 'none'] }
>> +            'cor_write', 'cluster_alloc_space', 'none', 'read_encrypted'] }
> 
> What's the point of this new blkdebug event?
> 
> Obviously, read_aio for an encrypted image must mean a read of encrypted
> data. The same image can never trigger both read_aio and
> read_encrypted, so why do we need to distinguish them as two different
> events?
> 

It seems I just did it by looking at qcow2_co_preadv_compressed.

Anyway, I think you are right, so I don't mind if Max drops this new event
and uses read_aio in his branch, or I can resend the series or send a follow-up,
whichever you prefer.
Max Reitz Sept. 13, 2019, 11:06 a.m. UTC | #3
On 13.09.19 12:53, Vladimir Sementsov-Ogievskiy wrote:
> 13.09.2019 13:01, Kevin Wolf wrote:
>> On 16.08.2019 at 17:30, Vladimir Sementsov-Ogievskiy wrote:
>>> Further patch will run partial requests of iterations of
>>> qcow2_co_preadv in parallel for performance reasons. To prepare for
>>> this, separate part which may be parallelized into separate function
>>> (qcow2_co_preadv_task).
>>>
>>> While being here, also separate encrypted clusters reading to own
>>> function, like it is done for compressed reading.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> Reviewed-by: Max Reitz <mreitz@redhat.com>
>>> ---
>>>   qapi/block-core.json |   2 +-
>>>   block/qcow2.c        | 205 +++++++++++++++++++++++--------------------
>>>   2 files changed, 111 insertions(+), 96 deletions(-)
>>>
>>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>>> index 0d43d4f37c..dd80aa11db 100644
>>> --- a/qapi/block-core.json
>>> +++ b/qapi/block-core.json
>>> @@ -3266,7 +3266,7 @@
>>>               'pwritev_rmw_tail', 'pwritev_rmw_after_tail', 'pwritev',
>>>               'pwritev_zero', 'pwritev_done', 'empty_image_prepare',
>>>               'l1_shrink_write_table', 'l1_shrink_free_l2_clusters',
>>> -            'cor_write', 'cluster_alloc_space', 'none'] }
>>> +            'cor_write', 'cluster_alloc_space', 'none', 'read_encrypted'] }
>>
>> What's the point of this new blkdebug event?
>>
>> Obviously, read_aio for an encrypted image must mean a read of encrypted
>> data. The same image can never trigger both read_aio and
>> read_encrypted, so why do we need to distinguish them as two different
>> events?
>>
> 
> Seems I just done it looking at qcow2_co_preadv_compressed..
> 
> Anyway, I think you are right, so, I don't mind if Max drops this new event
> and use read_aio in his branch, or I can resend the series or send a follow-up,
> whichever you prefer.

Should I squash this in?

diff --git a/qapi/block-core.json b/qapi/block-core.json
index d9ae73a43c..e6edd641f1 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3264,7 +3264,7 @@
             'pwritev_rmw_tail', 'pwritev_rmw_after_tail', 'pwritev',
             'pwritev_zero', 'pwritev_done', 'empty_image_prepare',
             'l1_shrink_write_table', 'l1_shrink_free_l2_clusters',
-            'cor_write', 'cluster_alloc_space', 'none', 'read_encrypted'] }
+            'cor_write', 'cluster_alloc_space', 'none'] }

 ##
 # @BlkdebugIOType:
diff --git a/block/qcow2.c b/block/qcow2.c
index b5fe014b20..c07ce84d54 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2001,7 +2001,7 @@ qcow2_co_preadv_encrypted(BlockDriverState *bs,
         return -ENOMEM;
     }

-    BLKDBG_EVENT(bs->file, BLKDBG_READ_ENCRYPTED);
+    BLKDBG_EVENT(bs->file, BLKDBG_READ_AIO);
     ret = bdrv_co_pread(s->data_file,
                         file_cluster_offset + offset_into_cluster(s, offset),
                         bytes, buf, 0);
Kevin Wolf Sept. 13, 2019, 11:34 a.m. UTC | #4
On 13.09.2019 at 13:06, Max Reitz wrote:
> On 13.09.19 12:53, Vladimir Sementsov-Ogievskiy wrote:
> > 13.09.2019 13:01, Kevin Wolf wrote:
> >> On 16.08.2019 at 17:30, Vladimir Sementsov-Ogievskiy wrote:
> >>> Further patch will run partial requests of iterations of
> >>> qcow2_co_preadv in parallel for performance reasons. To prepare for
> >>> this, separate part which may be parallelized into separate function
> >>> (qcow2_co_preadv_task).
> >>>
> >>> While being here, also separate encrypted clusters reading to own
> >>> function, like it is done for compressed reading.
> >>>
> >>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> >>> Reviewed-by: Max Reitz <mreitz@redhat.com>
> >>> ---
> >>>   qapi/block-core.json |   2 +-
> >>>   block/qcow2.c        | 205 +++++++++++++++++++++++--------------------
> >>>   2 files changed, 111 insertions(+), 96 deletions(-)
> >>>
> >>> diff --git a/qapi/block-core.json b/qapi/block-core.json
> >>> index 0d43d4f37c..dd80aa11db 100644
> >>> --- a/qapi/block-core.json
> >>> +++ b/qapi/block-core.json
> >>> @@ -3266,7 +3266,7 @@
> >>>               'pwritev_rmw_tail', 'pwritev_rmw_after_tail', 'pwritev',
> >>>               'pwritev_zero', 'pwritev_done', 'empty_image_prepare',
> >>>               'l1_shrink_write_table', 'l1_shrink_free_l2_clusters',
> >>> -            'cor_write', 'cluster_alloc_space', 'none'] }
> >>> +            'cor_write', 'cluster_alloc_space', 'none', 'read_encrypted'] }
> >>
> >> What's the point of this new blkdebug event?
> >>
> >> Obviously, read_aio for an encrypted image must mean a read of encrypted
> >> data. The same image can never trigger both read_aio and
> >> read_encrypted, so why do we need to distinguish them as two different
> >> events?
> >>
> > 
> > Seems I just done it looking at qcow2_co_preadv_compressed..
> > 
> > Anyway, I think you are right, so, I don't mind if Max drops this new event
> > and use read_aio in his branch, or I can resend the series or send a follow-up,
> > whichever you prefer.
> 
> Should I squash this in?

Looks good to me.
Max Reitz Sept. 13, 2019, 11:55 a.m. UTC | #5
On 13.09.19 13:34, Kevin Wolf wrote:
> On 13.09.2019 at 13:06, Max Reitz wrote:
>> On 13.09.19 12:53, Vladimir Sementsov-Ogievskiy wrote:
>>> 13.09.2019 13:01, Kevin Wolf wrote:
>>>> On 16.08.2019 at 17:30, Vladimir Sementsov-Ogievskiy wrote:
>>>>> Further patch will run partial requests of iterations of
>>>>> qcow2_co_preadv in parallel for performance reasons. To prepare for
>>>>> this, separate part which may be parallelized into separate function
>>>>> (qcow2_co_preadv_task).
>>>>>
>>>>> While being here, also separate encrypted clusters reading to own
>>>>> function, like it is done for compressed reading.
>>>>>
>>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>>> Reviewed-by: Max Reitz <mreitz@redhat.com>
>>>>> ---
>>>>>   qapi/block-core.json |   2 +-
>>>>>   block/qcow2.c        | 205 +++++++++++++++++++++++--------------------
>>>>>   2 files changed, 111 insertions(+), 96 deletions(-)
>>>>>
>>>>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>>>>> index 0d43d4f37c..dd80aa11db 100644
>>>>> --- a/qapi/block-core.json
>>>>> +++ b/qapi/block-core.json
>>>>> @@ -3266,7 +3266,7 @@
>>>>>               'pwritev_rmw_tail', 'pwritev_rmw_after_tail', 'pwritev',
>>>>>               'pwritev_zero', 'pwritev_done', 'empty_image_prepare',
>>>>>               'l1_shrink_write_table', 'l1_shrink_free_l2_clusters',
>>>>> -            'cor_write', 'cluster_alloc_space', 'none'] }
>>>>> +            'cor_write', 'cluster_alloc_space', 'none', 'read_encrypted'] }
>>>>
>>>> What's the point of this new blkdebug event?
>>>>
>>>> Obviously, read_aio for an encrypted image must mean a read of encrypted
>>>> data. The same image can never trigger both read_aio and
>>>> read_encrypted, so why do we need to distinguish them as two different
>>>> events?
>>>>
>>>
>>> Seems I just done it looking at qcow2_co_preadv_compressed..
>>>
>>> Anyway, I think you are right, so, I don't mind if Max drops this new event
>>> and use read_aio in his branch, or I can resend the series or send a follow-up,
>>> whichever you prefer.
>>
>> Should I squash this in?
> 
> Looks good to me.

OK, done.

Max

Patch

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 0d43d4f37c..dd80aa11db 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3266,7 +3266,7 @@ 
             'pwritev_rmw_tail', 'pwritev_rmw_after_tail', 'pwritev',
             'pwritev_zero', 'pwritev_done', 'empty_image_prepare',
             'l1_shrink_write_table', 'l1_shrink_free_l2_clusters',
-            'cor_write', 'cluster_alloc_space', 'none'] }
+            'cor_write', 'cluster_alloc_space', 'none', 'read_encrypted'] }
 
 ##
 # @BlkdebugIOType:
diff --git a/block/qcow2.c b/block/qcow2.c
index 93ab7edcea..89afb4272e 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1967,17 +1967,114 @@  out:
     return ret;
 }
 
+static coroutine_fn int
+qcow2_co_preadv_encrypted(BlockDriverState *bs,
+                           uint64_t file_cluster_offset,
+                           uint64_t offset,
+                           uint64_t bytes,
+                           QEMUIOVector *qiov,
+                           uint64_t qiov_offset)
+{
+    int ret;
+    BDRVQcow2State *s = bs->opaque;
+    uint8_t *buf;
+
+    assert(bs->encrypted && s->crypto);
+    assert(bytes <= QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size);
+
+    /*
+     * For encrypted images, read everything into a temporary
+     * contiguous buffer on which the AES functions can work.
+     * Also, decryption in a separate buffer is better as it
+     * prevents the guest from learning information about the
+     * encrypted nature of the virtual disk.
+     */
+
+    buf = qemu_try_blockalign(s->data_file->bs, bytes);
+    if (buf == NULL) {
+        return -ENOMEM;
+    }
+
+    BLKDBG_EVENT(bs->file, BLKDBG_READ_ENCRYPTED);
+    ret = bdrv_co_pread(s->data_file,
+                        file_cluster_offset + offset_into_cluster(s, offset),
+                        bytes, buf, 0);
+    if (ret < 0) {
+        goto fail;
+    }
+
+    assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
+    assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
+    if (qcow2_co_decrypt(bs, file_cluster_offset, offset, buf, bytes) < 0) {
+        ret = -EIO;
+        goto fail;
+    }
+    qemu_iovec_from_buf(qiov, qiov_offset, buf, bytes);
+
+fail:
+    qemu_vfree(buf);
+
+    return ret;
+}
+
+static coroutine_fn int qcow2_co_preadv_task(BlockDriverState *bs,
+                                             QCow2ClusterType cluster_type,
+                                             uint64_t file_cluster_offset,
+                                             uint64_t offset, uint64_t bytes,
+                                             QEMUIOVector *qiov,
+                                             size_t qiov_offset)
+{
+    BDRVQcow2State *s = bs->opaque;
+    int offset_in_cluster = offset_into_cluster(s, offset);
+
+    switch (cluster_type) {
+    case QCOW2_CLUSTER_ZERO_PLAIN:
+    case QCOW2_CLUSTER_ZERO_ALLOC:
+        /* Both zero types are handled in qcow2_co_preadv_part */
+        g_assert_not_reached();
+
+    case QCOW2_CLUSTER_UNALLOCATED:
+        assert(bs->backing); /* otherwise handled in qcow2_co_preadv_part */
+
+        BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO);
+        return bdrv_co_preadv_part(bs->backing, offset, bytes,
+                                   qiov, qiov_offset, 0);
+
+    case QCOW2_CLUSTER_COMPRESSED:
+        return qcow2_co_preadv_compressed(bs, file_cluster_offset,
+                                          offset, bytes, qiov, qiov_offset);
+
+    case QCOW2_CLUSTER_NORMAL:
+        if ((file_cluster_offset & 511) != 0) {
+            return -EIO;
+        }
+
+        if (bs->encrypted) {
+            return qcow2_co_preadv_encrypted(bs, file_cluster_offset,
+                                             offset, bytes, qiov, qiov_offset);
+        }
+
+        BLKDBG_EVENT(bs->file, BLKDBG_READ_AIO);
+        return bdrv_co_preadv_part(s->data_file,
+                                   file_cluster_offset + offset_in_cluster,
+                                   bytes, qiov, qiov_offset, 0);
+
+    default:
+        g_assert_not_reached();
+    }
+
+    g_assert_not_reached();
+}
+
 static coroutine_fn int qcow2_co_preadv_part(BlockDriverState *bs,
                                              uint64_t offset, uint64_t bytes,
                                              QEMUIOVector *qiov,
                                              size_t qiov_offset, int flags)
 {
     BDRVQcow2State *s = bs->opaque;
-    int offset_in_cluster;
     int ret;
     unsigned int cur_bytes; /* number of bytes in current iteration */
     uint64_t cluster_offset = 0;
-    uint8_t *cluster_data = NULL;
 
     while (bytes != 0) {
 
@@ -1992,111 +2089,29 @@  static coroutine_fn int qcow2_co_preadv_part(BlockDriverState *bs,
         ret = qcow2_get_cluster_offset(bs, offset, &cur_bytes, &cluster_offset);
         qemu_co_mutex_unlock(&s->lock);
         if (ret < 0) {
-            goto fail;
+            return ret;
         }
 
-        offset_in_cluster = offset_into_cluster(s, offset);
-
-        switch (ret) {
-        case QCOW2_CLUSTER_UNALLOCATED:
-
-            if (bs->backing) {
-                BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO);
-                ret = bdrv_co_preadv_part(bs->backing, offset, cur_bytes,
-                                          qiov, qiov_offset, 0);
-                if (ret < 0) {
-                    goto fail;
-                }
-            } else {
-                /* Note: in this case, no need to wait */
-                qemu_iovec_memset(qiov, qiov_offset, 0, cur_bytes);
-            }
-            break;
-
-        case QCOW2_CLUSTER_ZERO_PLAIN:
-        case QCOW2_CLUSTER_ZERO_ALLOC:
+        if (ret == QCOW2_CLUSTER_ZERO_PLAIN ||
+            ret == QCOW2_CLUSTER_ZERO_ALLOC ||
+            (ret == QCOW2_CLUSTER_UNALLOCATED && !bs->backing))
+        {
             qemu_iovec_memset(qiov, qiov_offset, 0, cur_bytes);
-            break;
-
-        case QCOW2_CLUSTER_COMPRESSED:
-            ret = qcow2_co_preadv_compressed(bs, cluster_offset,
-                                             offset, cur_bytes,
-                                             qiov, qiov_offset);
+        } else {
+            ret = qcow2_co_preadv_task(bs, ret,
+                                       cluster_offset, offset, cur_bytes,
+                                       qiov, qiov_offset);
             if (ret < 0) {
-                goto fail;
-            }
-
-            break;
-
-        case QCOW2_CLUSTER_NORMAL:
-            if ((cluster_offset & 511) != 0) {
-                ret = -EIO;
-                goto fail;
-            }
-
-            if (bs->encrypted) {
-                assert(s->crypto);
-
-                /*
-                 * For encrypted images, read everything into a temporary
-                 * contiguous buffer on which the AES functions can work.
-                 */
-                if (!cluster_data) {
-                    cluster_data =
-                        qemu_try_blockalign(s->data_file->bs,
-                                            QCOW_MAX_CRYPT_CLUSTERS
-                                            * s->cluster_size);
-                    if (cluster_data == NULL) {
-                        ret = -ENOMEM;
-                        goto fail;
-                    }
-                }
-
-                assert(cur_bytes <= QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size);
-
-                BLKDBG_EVENT(bs->file, BLKDBG_READ_AIO);
-                ret = bdrv_co_pread(s->data_file,
-                                    cluster_offset + offset_in_cluster,
-                                    cur_bytes, cluster_data, 0);
-                if (ret < 0) {
-                    goto fail;
-                }
-
-                assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
-                assert((cur_bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
-                if (qcow2_co_decrypt(bs, cluster_offset, offset,
-                                     cluster_data, cur_bytes) < 0) {
-                    ret = -EIO;
-                    goto fail;
-                }
-                qemu_iovec_from_buf(qiov, qiov_offset, cluster_data, cur_bytes);
-            } else {
-                BLKDBG_EVENT(bs->file, BLKDBG_READ_AIO);
-                ret = bdrv_co_preadv_part(s->data_file,
-                                          cluster_offset + offset_in_cluster,
-                                          cur_bytes, qiov, qiov_offset, 0);
-                if (ret < 0) {
-                    goto fail;
-                }
+                return ret;
             }
-            break;
-
-        default:
-            g_assert_not_reached();
-            ret = -EIO;
-            goto fail;
         }
 
         bytes -= cur_bytes;
         offset += cur_bytes;
         qiov_offset += cur_bytes;
     }
-    ret = 0;
-
-fail:
-    qemu_vfree(cluster_data);
 
-    return ret;
+    return 0;
 }
 
 /* Check if it's possible to merge a write request with the writing of
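
The encrypted read path added by this patch (qcow2_co_preadv_encrypted)
follows a bounce-buffer pattern: allocate a temporary contiguous buffer per
call, read the ciphertext into it, decrypt in place, copy the plaintext to
the destination, and free the buffer on every exit path. A minimal sketch of
that pattern, under loud assumptions: ToyFile, toy_pread, toy_decrypt and
read_encrypted are hypothetical names, XOR with a one-byte key stands in for
the real cipher, and plain malloc replaces QEMU's aligned allocator.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory stand-in for the data file. */
typedef struct {
    const uint8_t *bytes;
    size_t len;
} ToyFile;

static int toy_pread(const ToyFile *f, size_t offset,
                     uint8_t *buf, size_t bytes)
{
    if (offset > f->len || bytes > f->len - offset) {
        return -EIO;
    }
    memcpy(buf, f->bytes + offset, bytes);
    return 0;
}

/* XOR is NOT a real cipher; it only marks where decryption happens. */
static void toy_decrypt(uint8_t *buf, size_t bytes, uint8_t key)
{
    for (size_t i = 0; i < bytes; i++) {
        buf[i] ^= key;
    }
}

static int read_encrypted(const ToyFile *f, size_t offset, size_t bytes,
                          uint8_t key, uint8_t *dst)
{
    int ret;
    /* Per-call buffer: no state shared with other in-flight reads. */
    uint8_t *buf = malloc(bytes ? bytes : 1);

    if (buf == NULL) {
        return -ENOMEM;
    }

    ret = toy_pread(f, offset, buf, bytes);
    if (ret < 0) {
        goto out;
    }

    /* Decrypt in the private buffer, so dst never holds ciphertext. */
    toy_decrypt(buf, bytes, key);
    memcpy(dst, buf, bytes);

out:
    free(buf); /* single cleanup point for success and failure */
    return ret;
}
```

Allocating per call rather than reusing one buffer across loop iterations
costs an allocation each time, but it removes the shared cluster_data
pointer that the old loop carried, which is what lets the task function
stand on its own.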