diff mbox

[1/3] qcow2: mark the memory as no longer needed after qcow2_cache_empty()

Message ID 3587410bf941dc495ec4e41201d1d53a21f7e6fb.1432891306.git.berto@igalia.com
State New
Headers show

Commit Message

Alberto Garcia May 29, 2015, 9:24 a.m. UTC
After having emptied the cache, the data in the cache tables is no
longer useful, so we can tell the kernel that we are done with it. In
Linux this frees the resources associated with it.

The effect of this can be seen in the HMP commit operation: it moves
data from the top to the base image (and fills both caches), then it
empties the top image. At this point the data in that cache is no
longer needed so it's just wasting memory.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
---
 block/qcow2-cache.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

Comments

Kevin Wolf June 2, 2015, 10:05 a.m. UTC | #1
Am 29.05.2015 um 11:24 hat Alberto Garcia geschrieben:
> After having emptied the cache, the data in the cache tables is no
> longer useful, so we can tell the kernel that we are done with it. In
> Linux this frees the resources associated with it.
> 
> The effect of this can be seen in the HMP commit operation: it moves
> data from the top to the base image (and fills both caches), then it
> empties the top image. At this point the data in that cache is no
> longer needed so it's just wasting memory.
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> Reviewed-by: Max Reitz <mreitz@redhat.com>
> ---
>  block/qcow2-cache.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
> index ed92a09..ed14a92 100644
> --- a/block/qcow2-cache.c
> +++ b/block/qcow2-cache.c
> @@ -22,8 +22,10 @@
>   * THE SOFTWARE.
>   */
>  
> +#include <sys/mman.h>
>  #include "block/block_int.h"
>  #include "qemu-common.h"
> +#include "qemu/osdep.h"
>  #include "qcow2.h"
>  #include "trace.h"

This breaks the mingw build:

/mnt/qemu/block/qcow2-cache.c:25:22: fatal error: sys/mman.h: No such file or directory
 #include <sys/mman.h>

> @@ -60,6 +62,22 @@ static inline int qcow2_cache_get_table_idx(BlockDriverState *bs,
>      return idx;
>  }
>  
> +static void qcow2_cache_table_release(BlockDriverState *bs, Qcow2Cache *c,
> +                                      int i, int num_tables)
> +{
> +#if QEMU_MADV_DONTNEED != QEMU_MADV_INVALID
> +    BDRVQcowState *s = bs->opaque;
> +    void *t = qcow2_cache_get_table_addr(bs, c, i);
> +    long align = sysconf(_SC_PAGESIZE);

It seems that getpagesize() is usually used in qemu.

> +    size_t mem_size = (size_t) s->cluster_size * num_tables;
> +    size_t offset = QEMU_ALIGN_UP((uintptr_t) t, align) - (uintptr_t) t;
> +    size_t length = QEMU_ALIGN_DOWN(mem_size - offset, align);

Instead of all the aligning here, shouldn't we just make sure that the
tables are created with the right alignment?

> +    if (length > 0) {
> +        qemu_madvise((uint8_t *) t + offset, length, QEMU_MADV_DONTNEED);
> +    }
> +#endif
> +}

Kevin
Alberto Garcia June 2, 2015, 10:50 a.m. UTC | #2
On Tue 02 Jun 2015 12:05:59 PM CEST, Kevin Wolf <kwolf@redhat.com> wrote:

>> +#include <sys/mman.h>
>>  #include "block/block_int.h"
>>  #include "qemu-common.h"
>> +#include "qemu/osdep.h"
>>  #include "qcow2.h"
>>  #include "trace.h"
>
> This breaks the mingw build:
>
> /mnt/qemu/block/qcow2-cache.c:25:22: fatal error: sys/mman.h: No such file or directory
>  #include <sys/mman.h>

Ok, I can use #if defined(CONFIG_MADVISE) || defined(CONFIG_POSIX_MADVISE)
as it's done in util/osdep.c for the same header.

>> +#if QEMU_MADV_DONTNEED != QEMU_MADV_INVALID
>> +    BDRVQcowState *s = bs->opaque;
>> +    void *t = qcow2_cache_get_table_addr(bs, c, i);
>> +    long align = sysconf(_SC_PAGESIZE);
>
> It seems that getpagesize() is usually used in qemu.

The getpagesize() manual page actually recommends using sysconf() for
portability reasons. But other than that I don't have a problem with
getpagesize() if it's the preferred choice in QEMU.

>> +    size_t mem_size = (size_t) s->cluster_size * num_tables;
>> +    size_t offset = QEMU_ALIGN_UP((uintptr_t) t, align) - (uintptr_t) t;
>> +    size_t length = QEMU_ALIGN_DOWN(mem_size - offset, align);
>
> Instead of all the aligning here, shouldn't we just make sure that the
> tables are created with the right alignment?

The tables should have the right alignment and their size should be a
multiple of the page size. The latter condition is not necessarily true
(a cluster can be smaller than one page) so I'd keep that code in any
case.

Berto
Kevin Wolf June 2, 2015, 10:56 a.m. UTC | #3
Am 02.06.2015 um 12:50 hat Alberto Garcia geschrieben:
> On Tue 02 Jun 2015 12:05:59 PM CEST, Kevin Wolf <kwolf@redhat.com> wrote:
> 
> >> +#include <sys/mman.h>
> >>  #include "block/block_int.h"
> >>  #include "qemu-common.h"
> >> +#include "qemu/osdep.h"
> >>  #include "qcow2.h"
> >>  #include "trace.h"
> >
> > This breaks the mingw build:
> >
> > /mnt/qemu/block/qcow2-cache.c:25:22: fatal error: sys/mman.h: No such file or directory
> >  #include <sys/mman.h>
> 
> Ok, I can use #if defined(CONFIG_MADVISE) || defined(CONFIG_POSIX_MADVISE)
> as it's done in util/osdep.c for the same header.
> 
> >> +#if QEMU_MADV_DONTNEED != QEMU_MADV_INVALID
> >> +    BDRVQcowState *s = bs->opaque;
> >> +    void *t = qcow2_cache_get_table_addr(bs, c, i);
> >> +    long align = sysconf(_SC_PAGESIZE);
> >
> > It seems that getpagesize() is usually used in qemu.
> 
> The getpagesize() manual page actually recommends using sysconf() for
> portability reasons. But other than that I don't have a problem with
> getpagesize() if it's the preferred choice in QEMU.

There is a Windows implementation for getpagesize() in qemu, so I guess
it's actually more portable in our context.

Now, as you'll #ifdef this code out for Windows, that probably doesn't
matter, so it's just about consistency.

> >> +    size_t mem_size = (size_t) s->cluster_size * num_tables;
> >> +    size_t offset = QEMU_ALIGN_UP((uintptr_t) t, align) - (uintptr_t) t;
> >> +    size_t length = QEMU_ALIGN_DOWN(mem_size - offset, align);
> >
> > Instead of all the aligning here, shouldn't we just make sure that the
> > tables are created with the right alignment?
> 
> The tables should have the right alignment and their size should be a
> multiple of the page size. The latter condition is not necessarily true
> (a cluster can be smaller than one page) so I'd keep that code in any
> case.

We could adjust the size to MAX(size, pagesize), but perhaps that wastes
a bit too much memory indeed.

Kevin
Alberto Garcia June 2, 2015, 11:08 a.m. UTC | #4
On Tue 02 Jun 2015 12:56:10 PM CEST, Kevin Wolf wrote:

>> > It seems that getpagesize() is usually used in qemu.
>> 
>> The getpagesize() manual page actually recommends using sysconf() for
>> portability reasons. But other than that I don't have a problem with
>> getpagesize() if it's the preferred choice in QEMU.
>
> There is a Windows implementation for getpagesize() in qemu, so I
> guess it's actually more portable in our context.

Ok, I'll change that.

> Now, as you'll #ifdef this code out for Windows, that probably doesn't
> matter, so it's just about consistency.

There's actually no need to #ifdef that code out, I only do it because I
know in advance that it's going to be a no-op, but qemu_madvise() itself
doesn't break or anything if the operation is not supported.

Berto
diff mbox

Patch

diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index ed92a09..ed14a92 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -22,8 +22,10 @@ 
  * THE SOFTWARE.
  */
 
+#include <sys/mman.h>
 #include "block/block_int.h"
 #include "qemu-common.h"
+#include "qemu/osdep.h"
 #include "qcow2.h"
 #include "trace.h"
 
@@ -60,6 +62,22 @@  static inline int qcow2_cache_get_table_idx(BlockDriverState *bs,
     return idx;
 }
 
+static void qcow2_cache_table_release(BlockDriverState *bs, Qcow2Cache *c,
+                                      int i, int num_tables)
+{
+#if QEMU_MADV_DONTNEED != QEMU_MADV_INVALID
+    BDRVQcowState *s = bs->opaque;
+    void *t = qcow2_cache_get_table_addr(bs, c, i);
+    long align = sysconf(_SC_PAGESIZE);
+    size_t mem_size = (size_t) s->cluster_size * num_tables;
+    size_t offset = QEMU_ALIGN_UP((uintptr_t) t, align) - (uintptr_t) t;
+    size_t length = QEMU_ALIGN_DOWN(mem_size - offset, align);
+    if (length > 0) {
+        qemu_madvise((uint8_t *) t + offset, length, QEMU_MADV_DONTNEED);
+    }
+#endif
+}
+
 Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables)
 {
     BDRVQcowState *s = bs->opaque;
@@ -237,6 +255,8 @@  int qcow2_cache_empty(BlockDriverState *bs, Qcow2Cache *c)
         c->entries[i].lru_counter = 0;
     }
 
+    qcow2_cache_table_release(bs, c, 0, c->size);
+
     c->lru_counter = 0;
 
     return 0;