diff mbox series

[PULL,20/37] qcow2: Give the refcount cache the minimum possible size by default

Message ID 20180515154033.19899-21-kwolf@redhat.com
State New
Headers show
Series [PULL,01/37] block-backend: simplify blk_get_aio_context | expand

Commit Message

Kevin Wolf May 15, 2018, 3:40 p.m. UTC
From: Alberto Garcia <berto@igalia.com>

The L2 and refcount caches have default sizes that can be overridden
using the l2-cache-size and refcount-cache-size (an additional
parameter named cache-size sets the combined size of both caches).

Unless forced by one of the aforementioned parameters, QEMU will set
the unspecified sizes so that the L2 cache is 4 times larger than the
refcount cache.

This is based on the premise that the refcount metadata needs to be
only a fourth of the L2 metadata to cover the same amount of disk
space. This is incorrect for two reasons:

 a) The amount of disk covered by an L2 table depends solely on the
    cluster size, but in the case of a refcount block it depends on
    the cluster size *and* the width of each refcount entry.
    The 4/1 ratio is only valid with 16-bit entries (the default).

 b) When we talk about disk space and L2 tables we are talking about
    guest space (L2 tables map guest clusters to host clusters),
    whereas refcount blocks are used for host clusters (including
    L1/L2 tables and the refcount blocks themselves). On a fully
    populated (and uncompressed) qcow2 file, image size > virtual size
    so there are more refcount entries than L2 entries.

Problem (a) could be fixed by adjusting the algorithm to take into
account the refcount entry width. Problem (b) could be fixed by
increasing a bit the refcount cache size to account for the clusters
used for qcow2 metadata.

However this patch takes a completely different approach and instead
of keeping a ratio between both cache sizes it assigns as much as
possible to the L2 cache and the remainder to the refcount cache.

The reason is that L2 tables are used for every single I/O request
from the guest and the effect of increasing the cache is significant
and clearly measurable. Refcount blocks are however only used for
cluster allocation and internal snapshots and in practice are accessed
sequentially in most cases, so the effect of increasing the cache is
negligible (even when doing random writes from the guest).

So, make the refcount cache as small as possible unless the user
explicitly asks for a larger one.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 9695182c2eb11b77cb319689a1ebaa4e7c9d6591.1523968389.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/qcow2.h              |  4 ----
 block/qcow2.c              | 31 +++++++++++++++++++------------
 tests/qemu-iotests/137.out |  2 +-
 3 files changed, 20 insertions(+), 17 deletions(-)

Comments

Peter Maydell May 25, 2018, 5:10 p.m. UTC | #1
On 15 May 2018 at 16:40, Kevin Wolf <kwolf@redhat.com> wrote:
> From: Alberto Garcia <berto@igalia.com>
>
> The L2 and refcount caches have default sizes that can be overridden
> using the l2-cache-size and refcount-cache-size (an additional
> parameter named cache-size sets the combined size of both caches).

Hi; Coverity raises a nit about this patch (CID 1391229):


> diff --git a/block/qcow2.c b/block/qcow2.c
> index 2f36e632f9..6d532470a8 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -802,23 +802,30 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts,
>          } else if (refcount_cache_size_set) {
>              *l2_cache_size = combined_cache_size - *refcount_cache_size;
>          } else {
> -            *refcount_cache_size = combined_cache_size
> -                                 / (DEFAULT_L2_REFCOUNT_SIZE_RATIO + 1);
> -            *l2_cache_size = combined_cache_size - *refcount_cache_size;
> +            uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
> +            uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
> +            uint64_t min_refcount_cache =
> +                (uint64_t) MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;

Here we have a (uint64_t) cast that ensures we're doing a 64x64 multiply
rather than a 32x32 one...

> +
> +            /* Assign as much memory as possible to the L2 cache, and
> +             * use the remainder for the refcount cache */
> +            if (combined_cache_size >= max_l2_cache + min_refcount_cache) {
> +                *l2_cache_size = max_l2_cache;
> +                *refcount_cache_size = combined_cache_size - *l2_cache_size;
> +            } else {
> +                *refcount_cache_size =
> +                    MIN(combined_cache_size, min_refcount_cache);
> +                *l2_cache_size = combined_cache_size - *refcount_cache_size;
> +            }
>          }
>      } else {
> -        if (!l2_cache_size_set && !refcount_cache_size_set) {
> +        if (!l2_cache_size_set) {
>              *l2_cache_size = MAX(DEFAULT_L2_CACHE_BYTE_SIZE,
>                                   (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
>                                   * s->cluster_size);
> -            *refcount_cache_size = *l2_cache_size
> -                                 / DEFAULT_L2_REFCOUNT_SIZE_RATIO;
> -        } else if (!l2_cache_size_set) {
> -            *l2_cache_size = *refcount_cache_size
> -                           * DEFAULT_L2_REFCOUNT_SIZE_RATIO;
> -        } else if (!refcount_cache_size_set) {
> -            *refcount_cache_size = *l2_cache_size
> -                                 / DEFAULT_L2_REFCOUNT_SIZE_RATIO;
> +        }
> +        if (!refcount_cache_size_set) {
> +            *refcount_cache_size = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;

...but in the else clause down here, we don't have the cast, and
Coverity complains that we evaluate a 32-bit result and then
put it in a 64-bit variable. Should this have the (uint64_t)
cast as well ?

>          }
>      }
>

thanks
-- PMM
Kevin Wolf May 28, 2018, 8:38 a.m. UTC | #2
Am 25.05.2018 um 19:10 hat Peter Maydell geschrieben:
> On 15 May 2018 at 16:40, Kevin Wolf <kwolf@redhat.com> wrote:
> > From: Alberto Garcia <berto@igalia.com>
> >
> > The L2 and refcount caches have default sizes that can be overridden
> > using the l2-cache-size and refcount-cache-size (an additional
> > parameter named cache-size sets the combined size of both caches).
> 
> Hi; Coverity raises a nit about this patch (CID 1391229):

CCing Berto as the patch author.

> > diff --git a/block/qcow2.c b/block/qcow2.c
> > index 2f36e632f9..6d532470a8 100644
> > --- a/block/qcow2.c
> > +++ b/block/qcow2.c
> > @@ -802,23 +802,30 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts,
> >          } else if (refcount_cache_size_set) {
> >              *l2_cache_size = combined_cache_size - *refcount_cache_size;
> >          } else {
> > -            *refcount_cache_size = combined_cache_size
> > -                                 / (DEFAULT_L2_REFCOUNT_SIZE_RATIO + 1);
> > -            *l2_cache_size = combined_cache_size - *refcount_cache_size;
> > +            uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
> > +            uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
> > +            uint64_t min_refcount_cache =
> > +                (uint64_t) MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
> 
> Here we have a (uint64_t) cast that ensures we're doing a 64x64 multiply
> rather than a 32x32 one...
> 
> > +
> > +            /* Assign as much memory as possible to the L2 cache, and
> > +             * use the remainder for the refcount cache */
> > +            if (combined_cache_size >= max_l2_cache + min_refcount_cache) {
> > +                *l2_cache_size = max_l2_cache;
> > +                *refcount_cache_size = combined_cache_size - *l2_cache_size;
> > +            } else {
> > +                *refcount_cache_size =
> > +                    MIN(combined_cache_size, min_refcount_cache);
> > +                *l2_cache_size = combined_cache_size - *refcount_cache_size;
> > +            }
> >          }
> >      } else {
> > -        if (!l2_cache_size_set && !refcount_cache_size_set) {
> > +        if (!l2_cache_size_set) {
> >              *l2_cache_size = MAX(DEFAULT_L2_CACHE_BYTE_SIZE,
> >                                   (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
> >                                   * s->cluster_size);
> > -            *refcount_cache_size = *l2_cache_size
> > -                                 / DEFAULT_L2_REFCOUNT_SIZE_RATIO;
> > -        } else if (!l2_cache_size_set) {
> > -            *l2_cache_size = *refcount_cache_size
> > -                           * DEFAULT_L2_REFCOUNT_SIZE_RATIO;
> > -        } else if (!refcount_cache_size_set) {
> > -            *refcount_cache_size = *l2_cache_size
> > -                                 / DEFAULT_L2_REFCOUNT_SIZE_RATIO;
> > +        }
> > +        if (!refcount_cache_size_set) {
> > +            *refcount_cache_size = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
> 
> ...but in the else clause down here, we don't have the cast, and
> Coverity complains that we evaluate a 32-bit result and then
> put it in a 64-bit variable. Should this have the (uint64_t)
> cast as well ?

It's a false positive, MIN_REFCOUNT_CACHE_SIZE is 4 and s->cluster_size
is at most 2 MB, so this will never overflow.

I guess we can change the code anyway to silence it?

Kevin
Alberto Garcia May 28, 2018, 8:58 a.m. UTC | #3
On Mon 28 May 2018 10:38:55 AM CEST, Kevin Wolf wrote:
>> > +        if (!refcount_cache_size_set) {
>> > +            *refcount_cache_size = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
>> 
>> ...but in the else clause down here, we don't have the cast, and
>> Coverity complains that we evaluate a 32-bit result and then put it
>> in a 64-bit variable. Should this have the (uint64_t) cast as well ?

I suppose that's not because we put a 32-bit result in a 64-bit
variable, but because it could theoretically overflow (if
s->cluster_size > INT32_MAX / 4).

> It's a false positive, MIN_REFCOUNT_CACHE_SIZE is 4 and
> s->cluster_size is at most 2 MB, so this will never overflow.
>
> I guess we can change the code anyway to silence it?

Same opinion here, no problem with adding a cast to silence the warning.

Berto
Peter Maydell May 28, 2018, 1:49 p.m. UTC | #4
On 28 May 2018 at 09:58, Alberto Garcia <berto@igalia.com> wrote:
> On Mon 28 May 2018 10:38:55 AM CEST, Kevin Wolf wrote:
>>> > +        if (!refcount_cache_size_set) {
>>> > +            *refcount_cache_size = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
>>>
>>> ...but in the else clause down here, we don't have the cast, and
>>> Coverity complains that we evaluate a 32-bit result and then put it
>>> in a 64-bit variable. Should this have the (uint64_t) cast as well ?
>
> I suppose that's not because we put a 32-bit result in a 64-bit
> variable, but because it could theoretically overflow (if
> s->cluster_size > INT32_MAX / 4).

Well, coverity warns because it's one of those things that
could be correct code, or could be the author tripping over
one of C's less-than-obvious traps. 32x32->32 multiplies are
just as susceptible to overflow, but the heuristic Coverity
uses is "calculated a 32-bit multiply result and put it in
a 64-bit variable", on the assumption that the type of the
destination implies that whatever you're calculating could
well be that big, and "truncate the result of my multiply
even though it would fit in the destination" is a bit
unexpected. Coverity's wrong quite a bit on this one, naturally,
but it's usually easier to shut it up on the wrong guesses
for the benefit of the times when it turns out to be right.

thanks
-- PMM
Alberto Garcia May 28, 2018, 1:58 p.m. UTC | #5
On Mon 28 May 2018 03:49:07 PM CEST, Peter Maydell wrote:
> On 28 May 2018 at 09:58, Alberto Garcia <berto@igalia.com> wrote:
>> On Mon 28 May 2018 10:38:55 AM CEST, Kevin Wolf wrote:
>>>> > +        if (!refcount_cache_size_set) {
>>>> > +            *refcount_cache_size = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
>>>>
>>>> ...but in the else clause down here, we don't have the cast, and
>>>> Coverity complains that we evaluate a 32-bit result and then put it
>>>> in a 64-bit variable. Should this have the (uint64_t) cast as well ?
>>
>> I suppose that's not because we put a 32-bit result in a 64-bit
>> variable, but because it could theoretically overflow (if
>> s->cluster_size > INT32_MAX / 4).
>
> Well, coverity warns because it's one of those things that could be
> correct code, or could be the author tripping over one of C's
> less-than-obvious traps. 32x32->32 multiplies are just as susceptible
> to overflow, but the heuristic Coverity uses is "calculated a 32-bit
> multiply result and put it in a 64-bit variable", on the assumption
> that the type of the destination implies that whatever you're
> calculating could well be that big, and "truncate the result of my
> multiply even though it would fit in the destination" is a bit
> unexpected. Coverity's wrong quite a bit on this one, naturally, but
> it's usually easier to shut it up on the wrong guesses for the benefit
> of the times when it turns out to be right.

Yes, it's a fair warning. I'll send a patch.

Berto
diff mbox series

Patch

diff --git a/block/qcow2.h b/block/qcow2.h
index adf5c3950f..01b5250415 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -77,10 +77,6 @@ 
 #define DEFAULT_L2_CACHE_CLUSTERS 8 /* clusters */
 #define DEFAULT_L2_CACHE_BYTE_SIZE 1048576 /* bytes */
 
-/* The refblock cache needs only a fourth of the L2 cache size to cover as many
- * clusters */
-#define DEFAULT_L2_REFCOUNT_SIZE_RATIO 4
-
 #define DEFAULT_CLUSTER_SIZE 65536
 
 
diff --git a/block/qcow2.c b/block/qcow2.c
index 2f36e632f9..6d532470a8 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -802,23 +802,30 @@  static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts,
         } else if (refcount_cache_size_set) {
             *l2_cache_size = combined_cache_size - *refcount_cache_size;
         } else {
-            *refcount_cache_size = combined_cache_size
-                                 / (DEFAULT_L2_REFCOUNT_SIZE_RATIO + 1);
-            *l2_cache_size = combined_cache_size - *refcount_cache_size;
+            uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
+            uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
+            uint64_t min_refcount_cache =
+                (uint64_t) MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
+
+            /* Assign as much memory as possible to the L2 cache, and
+             * use the remainder for the refcount cache */
+            if (combined_cache_size >= max_l2_cache + min_refcount_cache) {
+                *l2_cache_size = max_l2_cache;
+                *refcount_cache_size = combined_cache_size - *l2_cache_size;
+            } else {
+                *refcount_cache_size =
+                    MIN(combined_cache_size, min_refcount_cache);
+                *l2_cache_size = combined_cache_size - *refcount_cache_size;
+            }
         }
     } else {
-        if (!l2_cache_size_set && !refcount_cache_size_set) {
+        if (!l2_cache_size_set) {
             *l2_cache_size = MAX(DEFAULT_L2_CACHE_BYTE_SIZE,
                                  (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
                                  * s->cluster_size);
-            *refcount_cache_size = *l2_cache_size
-                                 / DEFAULT_L2_REFCOUNT_SIZE_RATIO;
-        } else if (!l2_cache_size_set) {
-            *l2_cache_size = *refcount_cache_size
-                           * DEFAULT_L2_REFCOUNT_SIZE_RATIO;
-        } else if (!refcount_cache_size_set) {
-            *refcount_cache_size = *l2_cache_size
-                                 / DEFAULT_L2_REFCOUNT_SIZE_RATIO;
+        }
+        if (!refcount_cache_size_set) {
+            *refcount_cache_size = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
         }
     }
 
diff --git a/tests/qemu-iotests/137.out b/tests/qemu-iotests/137.out
index e28e1eadba..96724a6c33 100644
--- a/tests/qemu-iotests/137.out
+++ b/tests/qemu-iotests/137.out
@@ -22,7 +22,7 @@  refcount-cache-size may not exceed cache-size
 L2 cache size too big
 L2 cache entry size must be a power of two between 512 and the cluster size (65536)
 L2 cache entry size must be a power of two between 512 and the cluster size (65536)
-L2 cache size too big
+Refcount cache size too big
 Conflicting values for qcow2 options 'overlap-check' ('constant') and 'overlap-check.template' ('all')
 Unsupported value 'blubb' for qcow2 option 'overlap-check'. Allowed are any of the following: none, constant, cached, all
 Unsupported value 'blubb' for qcow2 option 'overlap-check'. Allowed are any of the following: none, constant, cached, all