diff mbox series

[3/7] block/qcow2: use compressed write cache

Message ID 20210129165030.640169-4-vsementsov@virtuozzo.com
State New
Headers show
Series qcow2: compressed write cache | expand

Commit Message

Vladimir Sementsov-Ogievskiy Jan. 29, 2021, 4:50 p.m. UTC
Introduce a new option: compressed-cache-size, with default to 64
clusters (to be not less than 64 default max-workers for backup job).

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 qapi/block-core.json   |  8 +++-
 block/qcow2.h          |  4 ++
 block/qcow2-refcount.c | 13 +++++++
 block/qcow2.c          | 87 ++++++++++++++++++++++++++++++++++++++++--
 4 files changed, 108 insertions(+), 4 deletions(-)

Comments

Max Reitz Feb. 10, 2021, 5:11 p.m. UTC | #1
On 29.01.21 17:50, Vladimir Sementsov-Ogievskiy wrote:
> Introduce a new option: compressed-cache-size, with default to 64
> clusters (to be not less than 64 default max-workers for backup job).
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   qapi/block-core.json   |  8 +++-
>   block/qcow2.h          |  4 ++
>   block/qcow2-refcount.c | 13 +++++++
>   block/qcow2.c          | 87 ++++++++++++++++++++++++++++++++++++++++--
>   4 files changed, 108 insertions(+), 4 deletions(-)
> 
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 9f555d5c1d..e0be6657f3 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -3202,6 +3202,11 @@
>   #             an image, the data file name is loaded from the image
>   #             file. (since 4.0)
>   #
> +# @compressed-cache-size: The maximum size of compressed write cache in
> +#                         bytes. If positive must be not less than
> +#                         cluster size. 0 disables the feature. Default
> +#                         is 64 * cluster_size. (since 6.0)

Do we need this, really?  If you don’t use compression, the cache won’t 
use any memory, right?  Do you plan on using this option?

I’d just set it to a sane default.

OTOH, “a sane default” poses two questions, namely whether 64 * 
cluster_size is reasonable – with subclusters, the cluster size may be 
rather high, so 64 * cluster_size may well be like 128 MB.  Are 64 
clusters really necessary for a reasonable performance?

Second, I think I could live with a rather high default if clusters are 
flushed as soon as they are full.  OTOH, as I briefly touched on, in 
practice, I suppose compressed images are just written to constantly, so 
even if clusters are flushed as soon as they are full, the cache will 
still remain full all the time.


Different topic: Why is the cache disableable?  I thought there are no 
downsides?

(Not being able to disable it would make the code simpler, hence me asking.)

Max
Vladimir Sementsov-Ogievskiy Feb. 11, 2021, 12:53 p.m. UTC | #2
10.02.2021 20:11, Max Reitz wrote:
> On 29.01.21 17:50, Vladimir Sementsov-Ogievskiy wrote:
>> Introduce a new option: compressed-cache-size, with default to 64
>> clusters (to be not less than 64 default max-workers for backup job).
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   qapi/block-core.json   |  8 +++-
>>   block/qcow2.h          |  4 ++
>>   block/qcow2-refcount.c | 13 +++++++
>>   block/qcow2.c          | 87 ++++++++++++++++++++++++++++++++++++++++--
>>   4 files changed, 108 insertions(+), 4 deletions(-)
>>
>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>> index 9f555d5c1d..e0be6657f3 100644
>> --- a/qapi/block-core.json
>> +++ b/qapi/block-core.json
>> @@ -3202,6 +3202,11 @@
>>   #             an image, the data file name is loaded from the image
>>   #             file. (since 4.0)
>>   #
>> +# @compressed-cache-size: The maximum size of compressed write cache in
>> +#                         bytes. If positive must be not less than
>> +#                         cluster size. 0 disables the feature. Default
>> +#                         is 64 * cluster_size. (since 6.0)
> 
> Do we need this, really?  If you don’t use compression, the cache won’t use any memory, right?  Do you plan on using this option?
> 
> I’d just set it to a sane default.

OK for me

> 
> OTOH, “a sane default” poses two questions, namely whether 64 * cluster_size is reasonable – with subclusters, the cluster size may be rather high, so 64 * cluster_size may well be like 128 MB.  Are 64 clusters really necessary for a reasonable performance?
> 
> Second, I think I could live with a rather high default if clusters are flushed as soon as they are full.  OTOH, as I briefly touched on, in practice, I suppose compressed images are just written to constantly, so even if clusters are flushed as soon as they are full, the cache will still remain full all the time.
> 
> 
> Different topic: Why is the cache disableable?  I thought there are no downsides?
> 

to compare performance for example..
Max Reitz Feb. 18, 2021, 4:02 p.m. UTC | #3
On 11.02.21 13:53, Vladimir Sementsov-Ogievskiy wrote:
> 10.02.2021 20:11, Max Reitz wrote:
>> On 29.01.21 17:50, Vladimir Sementsov-Ogievskiy wrote:
>>> Introduce a new option: compressed-cache-size, with default to 64
>>> clusters (to be not less than 64 default max-workers for backup job).
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> ---
>>>   qapi/block-core.json   |  8 +++-
>>>   block/qcow2.h          |  4 ++
>>>   block/qcow2-refcount.c | 13 +++++++
>>>   block/qcow2.c          | 87 ++++++++++++++++++++++++++++++++++++++++--
>>>   4 files changed, 108 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>>> index 9f555d5c1d..e0be6657f3 100644
>>> --- a/qapi/block-core.json
>>> +++ b/qapi/block-core.json
>>> @@ -3202,6 +3202,11 @@
>>>   #             an image, the data file name is loaded from the image
>>>   #             file. (since 4.0)
>>>   #
>>> +# @compressed-cache-size: The maximum size of compressed write cache in
>>> +#                         bytes. If positive must be not less than
>>> +#                         cluster size. 0 disables the feature. Default
>>> +#                         is 64 * cluster_size. (since 6.0)
>>
>> Do we need this, really?  If you don’t use compression, the cache 
>> won’t use any memory, right?  Do you plan on using this option?
>>
>> I’d just set it to a sane default.
> 
> OK for me
> 
>>
>> OTOH, “a sane default” poses two questions, namely whether 64 * 
>> cluster_size is reasonable – with subclusters, the cluster size may be 
>> rather high, so 64 * cluster_size may well be like 128 MB.  Are 64 
>> clusters really necessary for a reasonable performance?
>>
>> Second, I think I could live with a rather high default if clusters 
>> are flushed as soon as they are full.  OTOH, as I briefly touched on, 
>> in practice, I suppose compressed images are just written to 
>> constantly, so even if clusters are flushed as soon as they are full, 
>> the cache will still remain full all the time.
>>
>>
>> Different topic: Why is the cache disableable?  I thought there are no 
>> downsides?
>>
> 
> to compare performance for example..

Well :D
Doesn’t seem like a reason to expose it to the outside, though, I don’t 
know.

Max
diff mbox series

Patch

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 9f555d5c1d..e0be6657f3 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3202,6 +3202,11 @@ 
 #             an image, the data file name is loaded from the image
 #             file. (since 4.0)
 #
+# @compressed-cache-size: The maximum size of compressed write cache in
+#                         bytes. If positive must be not less than
+#                         cluster size. 0 disables the feature. Default
+#                         is 64 * cluster_size. (since 6.0)
+#
 # Since: 2.9
 ##
 { 'struct': 'BlockdevOptionsQcow2',
@@ -3217,7 +3222,8 @@ 
             '*refcount-cache-size': 'int',
             '*cache-clean-interval': 'int',
             '*encrypt': 'BlockdevQcow2Encryption',
-            '*data-file': 'BlockdevRef' } }
+            '*data-file': 'BlockdevRef',
+            '*compressed-cache-size': 'int' } }
 
 ##
 # @SshHostKeyCheckMode:
diff --git a/block/qcow2.h b/block/qcow2.h
index fbdedf89fa..b86aa06006 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -150,6 +150,7 @@ 
 #define QCOW2_OPT_L2_CACHE_ENTRY_SIZE "l2-cache-entry-size"
 #define QCOW2_OPT_REFCOUNT_CACHE_SIZE "refcount-cache-size"
 #define QCOW2_OPT_CACHE_CLEAN_INTERVAL "cache-clean-interval"
+#define QCOW2_OPT_COMPRESSED_CACHE_SIZE "compressed-cache-size"
 
 typedef struct QCowHeader {
     uint32_t magic;
@@ -422,6 +423,9 @@  typedef struct BDRVQcow2State {
      * is to convert the image with the desired compression type set.
      */
     Qcow2CompressionType compression_type;
+
+    uint64_t compressed_cache_size;
+    Qcow2CompressedWriteCache *compressed_cache;
 } BDRVQcow2State;
 
 typedef struct Qcow2COWRegion {
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 8e649b008e..5720591460 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -899,6 +899,11 @@  static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
                 qcow2_cache_discard(s->l2_table_cache, table);
             }
 
+            if (s->compressed_cache) {
+                qcow2_compressed_cache_co_discard(s->compressed_cache,
+                                                  cluster_offset);
+            }
+
             if (s->discard_passthrough[type]) {
                 update_refcount_discard(bs, cluster_offset, s->cluster_size);
             }
@@ -1110,6 +1115,14 @@  int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
             }
 
             if (!offset || ROUND_UP(offset, s->cluster_size) != new_cluster) {
+                if (offset && s->compressed_cache) {
+                    /*
+                     * Previous cluster is unfinished, but we'll not write more
+                     * data to it. We should inform compressed cache.
+                     */
+                    qcow2_compressed_cache_co_set_cluster_end(
+                            s->compressed_cache, offset);
+                }
                 offset = new_cluster;
                 free_in_cluster = s->cluster_size;
             } else {
diff --git a/block/qcow2.c b/block/qcow2.c
index 5d94f45be9..3997d8c143 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -808,6 +808,13 @@  static QemuOptsList qcow2_runtime_opts = {
             .type = QEMU_OPT_NUMBER,
             .help = "Clean unused cache entries after this time (in seconds)",
         },
+        {
+            .name = QCOW2_OPT_COMPRESSED_CACHE_SIZE,
+            .type = QEMU_OPT_NUMBER,
+            .help = "The maximum size of compressed write cache in bytes. If "
+                    "positive must be not less than cluster size. 0 disables "
+                    "the feature. Default is 64 * cluster_size",
+        },
         BLOCK_CRYPTO_OPT_DEF_KEY_SECRET("encrypt.",
             "ID of secret providing qcow2 AES key or LUKS passphrase"),
         { /* end of list */ }
@@ -969,6 +976,38 @@  typedef struct Qcow2ReopenState {
     QCryptoBlockOpenOptions *crypto_opts; /* Disk encryption runtime options */
 } Qcow2ReopenState;
 
+static int qcow2_update_compressed_cache_option(BlockDriverState *bs,
+                                                QemuOpts *opts, Error **errp)
+{
+    BDRVQcow2State *s = bs->opaque;
+    uint64_t new_size;
+    int ret;
+
+    new_size = qemu_opt_get_size(opts, QCOW2_OPT_COMPRESSED_CACHE_SIZE,
+                                 64 * s->cluster_size);
+    if (new_size > 0 && new_size < s->cluster_size) {
+        error_setg(errp, "compressed cache size is too small");
+        return -EINVAL;
+    }
+
+    if (s->compressed_cache && !new_size) {
+        ret = qcow2_compressed_cache_stop_flush(bs, s->compressed_cache);
+        if (ret < 0) {
+            error_report("Failed to flush the compressed write cache: %s",
+                         strerror(-ret));
+            return ret;
+        }
+        qcow2_compressed_cache_free(s->compressed_cache);
+        s->compressed_cache = NULL;
+    } else if (s->compressed_cache) {
+        qcow2_compressed_cache_set_size(s->compressed_cache, new_size);
+    }
+
+    s->compressed_cache_size = new_size;
+
+    return 0;
+}
+
 static int qcow2_update_options_prepare(BlockDriverState *bs,
                                         Qcow2ReopenState *r,
                                         QDict *options, int flags,
@@ -994,6 +1033,11 @@  static int qcow2_update_options_prepare(BlockDriverState *bs,
         goto fail;
     }
 
+    ret = qcow2_update_compressed_cache_option(bs, opts, errp);
+    if (ret < 0) {
+        goto fail;
+    }
+
     /* get L2 table/refcount block cache size from command line options */
     read_cache_sizes(bs, opts, &l2_cache_size, &l2_cache_entry_size,
                      &refcount_cache_size, &local_err);
@@ -2660,6 +2704,17 @@  static int qcow2_inactivate(BlockDriverState *bs)
                           bdrv_get_device_or_node_name(bs));
     }
 
+    if (s->compressed_cache) {
+        ret = qcow2_compressed_cache_stop_flush(bs, s->compressed_cache);
+        if (ret < 0) {
+            result = ret;
+            error_report("Failed to flush the compressed write cache: %s",
+                         strerror(-ret));
+        }
+        qcow2_compressed_cache_free(s->compressed_cache);
+        s->compressed_cache = NULL;
+    }
+
     ret = qcow2_cache_flush(bs, s->l2_table_cache);
     if (ret) {
         result = ret;
@@ -2692,6 +2747,8 @@  static void qcow2_close(BlockDriverState *bs)
         qcow2_inactivate(bs);
     }
 
+    assert(!s->compressed_cache);
+
     cache_clean_timer_del(bs);
     qcow2_cache_destroy(s->l2_table_cache);
     qcow2_cache_destroy(s->refcount_block_cache);
@@ -4539,8 +4596,14 @@  qcow2_co_pwritev_compressed_task(BlockDriverState *bs,
         goto fail;
     }
 
-    BLKDBG_EVENT(s->data_file, BLKDBG_WRITE_COMPRESSED);
-    ret = bdrv_co_pwrite(s->data_file, cluster_offset, out_len, out_buf, 0);
+    if (s->compressed_cache) {
+        ret = qcow2_compressed_cache_co_write(s->compressed_cache,
+                                              cluster_offset, out_len, out_buf);
+        out_buf = NULL;
+    } else {
+        BLKDBG_EVENT(s->data_file, BLKDBG_WRITE_COMPRESSED);
+        ret = bdrv_co_pwrite(s->data_file, cluster_offset, out_len, out_buf, 0);
+    }
     if (ret < 0) {
         goto fail;
     }
@@ -4601,6 +4664,12 @@  qcow2_co_pwritev_compressed_part(BlockDriverState *bs,
         return -EINVAL;
     }
 
+    if (!s->compressed_cache && s->compressed_cache_size) {
+        s->compressed_cache =
+            qcow2_compressed_cache_new(s->data_file, s->cluster_size,
+                                       s->compressed_cache_size);
+    }
+
     while (bytes && aio_task_pool_status(aio) == 0) {
         uint64_t chunk_size = MIN(bytes, s->cluster_size);
 
@@ -4656,7 +4725,12 @@  qcow2_co_preadv_compressed(BlockDriverState *bs,
     out_buf = qemu_blockalign(bs, s->cluster_size);
 
     BLKDBG_EVENT(bs->file, BLKDBG_READ_COMPRESSED);
-    ret = bdrv_co_pread(bs->file, coffset, csize, buf, 0);
+    if (s->compressed_cache) {
+        ret = qcow2_compressed_cache_co_read(s->compressed_cache,
+                                             coffset, csize, buf);
+    } else {
+        ret = bdrv_co_pread(bs->file, coffset, csize, buf, 0);
+    }
     if (ret < 0) {
         goto fail;
     }
@@ -4875,6 +4949,13 @@  static coroutine_fn int qcow2_co_flush_to_os(BlockDriverState *bs)
     BDRVQcow2State *s = bs->opaque;
     int ret;
 
+    if (s->compressed_cache) {
+        ret = qcow2_compressed_cache_flush(bs, s->compressed_cache);
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
     qemu_co_mutex_lock(&s->lock);
     ret = qcow2_write_caches(bs);
     qemu_co_mutex_unlock(&s->lock);