Patchwork [v13,04/13] Add cache handling functions

Submitter Orit Wasserman
Date June 27, 2012, 10:34 a.m.
Message ID <1340793261-11400-5-git-send-email-owasserm@redhat.com>
Permalink /patch/167612/
State New

Comments

Orit Wasserman - June 27, 2012, 10:34 a.m.
Add LRU page cache mechanism.
The pages are accessed by their address.

Signed-off-by: Benoit Hudzia <benoit.hudzia@sap.com>
Signed-off-by: Petter Svard <petters@cs.umu.se>
Signed-off-by: Aidan Shribman <aidan.shribman@sap.com>
Signed-off-by: Orit Wasserman <owasserm@redhat.com>
---
 Makefile.objs        |    1 +
 cache.c              |  217 ++++++++++++++++++++++++++++++++++++++++++++++++++
 include/qemu/cache.h |   79 ++++++++++++++++++
 qemu-common.h        |   10 +++
 4 files changed, 307 insertions(+), 0 deletions(-)
 create mode 100644 cache.c
 create mode 100644 include/qemu/cache.h
Eric Blake - June 27, 2012, 4:55 p.m.
On 06/27/2012 04:34 AM, Orit Wasserman wrote:
> Add LRU page cache mechanism.
> The page are accessed by their address.
> 
> Signed-off-by: Benoit Hudzia <benoit.hudzia@sap.com>
> Signed-off-by: Petter Svard <petters@cs.umu.se>
> Signed-off-by: Aidan Shribman <aidan.shribman@sap.com>
> Signed-off-by: Orit Wasserman <owasserm@redhat.com>

> +++ b/cache.c

cache.c is a rather generic name; should it be page-cache.c instead to
reflect that it only caches memory pages?

> @@ -0,0 +1,217 @@
> +/*
> + * Page cache for qemu
> + * The cache is base on a hash on the page address

s/base on a hash on/based on a hash of/

> +
> +Cache *cache_init(int64_t num_pages, unsigned int page_size)
> +{

> +
> +    /* round down to the nearest power of 2 */
> +    if (!is_power_of_2(num_pages)) {
> +        num_pages = 1 << ffs(num_pages);

That's not how you round down.  For example, if I passed in 0x5, you end
up giving me 1 << ffs(5) == 1 << 1 == 2, but the correct answer should be 4.

http://graphics.stanford.edu/~seander/bithacks.html#IntegerLogObvious
and http://aggregate.org/MAGIC/#Leading%20Zero%20Count give some hints
about what you really want to be doing; offhand, I came up with this (it
works because you already rejected negative num_pages):

if (!is_power_of_2(num_pages)) {
    num_pages |= num_pages >> 1;
    num_pages |= num_pages >> 2;
    num_pages |= num_pages >> 4;
    num_pages |= num_pages >> 8;
    num_pages |= num_pages >> 16;
    num_pages |= num_pages >> 32;
    num_pages -= num_pages / 2;
}

> +    cache->page_cache = g_malloc((cache->max_num_items) *
> +                                 sizeof(CacheItem));
> +    if (!cache->page_cache) {
> +        DPRINTF("could not allocate cache\n");
> +        g_free(cache);
> +        return NULL;
> +    }
> +
> +    for (i = 0; i < cache->max_num_items; i++) {
> +        cache->page_cache[i].it_data = NULL;
> +        cache->page_cache[i].it_age = 0;

Does g_malloc leave memory uninitialized, or is it like calloc where it
zeros out the memory making these two assignments redundant?

> +
> +int cache_resize(Cache *cache, int64_t new_num_pages)
> +{
> +    Cache *new_cache;
> +    int i;
> +
> +    CacheItem *old_it, *new_it;
> +
> +    g_assert(cache);
> +
> +    /* same size */
> +    if (new_num_pages == cache->max_num_items) {
> +        return 0;
> +    }
> +
> +    /* cache was not inited */
> +    if (cache->page_cache == NULL) {
> +        return -1;
> +    }

Shouldn't these two conditions be swapped?  Non-init failure should take
precedence over no size change.

If new_num_pages is not a power of 2, but rounds down to the same as the
existing size, the size won't compare equal and you end up wasting a lot
of effort moving pages between the resulting identically sized caches.
I'd factor out your rounding-down code, and call it from multiple places
prior to checking for size equality.
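The suggested reordering and factoring can be sketched as follows; `round_down_pow2` and the `MiniCache` stand-in are hypothetical names for illustration, not the patch's actual code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper factored out of cache_init, as suggested above:
 * normalize the requested size before any equality comparison. */
static int64_t round_down_pow2(int64_t n)
{
    int64_t p = 1;
    while (p * 2 <= n) {
        p *= 2;
    }
    return p;
}

/* Minimal stand-in for the real Cache struct, just to show the
 * reordered checks in cache_resize. */
typedef struct {
    void *page_cache;
    int64_t max_num_items;
} MiniCache;

static int mini_cache_resize(MiniCache *cache, int64_t new_num_pages)
{
    /* non-init failure takes precedence over "no size change" */
    if (cache->page_cache == NULL) {
        return -1;
    }
    /* round first, so e.g. resizing a 4-item cache to 5 is a no-op
     * instead of a pointless full migration */
    new_num_pages = round_down_pow2(new_num_pages);
    if (new_num_pages == cache->max_num_items) {
        return 0;
    }
    /* ... allocation and page migration would go here ... */
    return 0;
}
```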

> +    /* move all data from old cache */
> +    for (i = 0; i < cache->max_num_items; i++) {
> +        old_it = &cache->page_cache[i];
> +        if (old_it->it_addr != -1) {
> +            /* check for collision , if there  is keep the first value */

s/collision , if there  is/collision, if there is,/

> +++ b/include/qemu/cache.h
> @@ -0,0 +1,79 @@
> +/*
> + * Page cache for qemu
> + * The cache is base on a hash on the page address

Same comments as for cache.c.
Orit Wasserman - June 27, 2012, 6:01 p.m.
On 06/27/2012 07:55 PM, Eric Blake wrote:
> On 06/27/2012 04:34 AM, Orit Wasserman wrote:
>> Add LRU page cache mechanism.
>> The page are accessed by their address.
>>
>> Signed-off-by: Benoit Hudzia <benoit.hudzia@sap.com>
>> Signed-off-by: Petter Svard <petters@cs.umu.se>
>> Signed-off-by: Aidan Shribman <aidan.shribman@sap.com>
>> Signed-off-by: Orit Wasserman <owasserm@redhat.com>
> 
>> +++ b/cache.c
> 
> cache.c is a rather generic name; should it be page-cache.c instead to
> reflect that it only caches memory pages?
ok
> 
>> @@ -0,0 +1,217 @@
>> +/*
>> + * Page cache for qemu
>> + * The cache is base on a hash on the page address
> 
> s/base on a hash on/based on a hash of/
> 
>> +
>> +Cache *cache_init(int64_t num_pages, unsigned int page_size)
>> +{
> 
>> +
>> +    /* round down to the nearest power of 2 */
>> +    if (!is_power_of_2(num_pages)) {
>> +        num_pages = 1 << ffs(num_pages);
> 
> That's not how you round down.  For example, if I passed in 0x5, you end
> up giving me 1 << ffs(5) == 1 << 1 == 2, but the correct answer should be 4.
> 
> http://graphics.stanford.edu/~seander/bithacks.html#IntegerLogObvious
> and http://aggregate.org/MAGIC/#Leading%20Zero%20Count give some hints
> about what you really want to be doing; offhand, I came up with this (it
> works because you already rejected negative num_pages):
> 
> if (!is_power_of_2(num_pages)) {
>     num_pages |= num_pages >> 1;
>     num_pages |= num_pages >> 2;
>     num_pages |= num_pages >> 4;
>     num_pages |= num_pages >> 8;
>     num_pages |= num_pages >> 16;
>     num_pages |= num_pages >> 32;
>     num_pages -= num_pages / 2;
> }
You are right; for some reason I decided it is the last set bit and not the first.
> 
>> +    cache->page_cache = g_malloc((cache->max_num_items) *
>> +                                 sizeof(CacheItem));
>> +    if (!cache->page_cache) {
>> +        DPRINTF("could not allocate cache\n");
>> +        g_free(cache);
>> +        return NULL;
>> +    }
>> +
>> +    for (i = 0; i < cache->max_num_items; i++) {
>> +        cache->page_cache[i].it_data = NULL;
>> +        cache->page_cache[i].it_age = 0;
> 
> Does g_malloc leave memory uninitialized, or is it like calloc where it
> zeros out the memory making these two assignments redundant?
> 
>> +
>> +int cache_resize(Cache *cache, int64_t new_num_pages)
>> +{
>> +    Cache *new_cache;
>> +    int i;
>> +
>> +    CacheItem *old_it, *new_it;
>> +
>> +    g_assert(cache);
>> +
>> +    /* same size */
>> +    if (new_num_pages == cache->max_num_items) {
>> +        return 0;
>> +    }
>> +
>> +    /* cache was not inited */
>> +    if (cache->page_cache == NULL) {
>> +        return -1;
>> +    }
> 
> Shouldn't these two conditions be swapped?  Non-init failure should take
> precedence over no size change.
> 
> If new_num_pages is not a power of 2, but rounds down to the same as the
> existing size, the size won't compare equal and you end up wasting a lot
> of effort moving pages between the resulting identically sized caches.
> I'd factor out your rounding-down code, and call it from multiple places
> prior to checking for size equality.
right
> 
>> +    /* move all data from old cache */
>> +    for (i = 0; i < cache->max_num_items; i++) {
>> +        old_it = &cache->page_cache[i];
>> +        if (old_it->it_addr != -1) {
>> +            /* check for collision , if there  is keep the first value */
> 
> s/collision , if there  is/collision, if there is,/
> 
>> +++ b/include/qemu/cache.h
>> @@ -0,0 +1,79 @@
>> +/*
>> + * Page cache for qemu
>> + * The cache is base on a hash on the page address
> 
> Same comments as for cache.c.
>
Blue Swirl - June 27, 2012, 6:06 p.m.
On Wed, Jun 27, 2012 at 10:34 AM, Orit Wasserman <owasserm@redhat.com> wrote:
> Add LRU page cache mechanism.
> The page are accessed by their address.
>
> Signed-off-by: Benoit Hudzia <benoit.hudzia@sap.com>
> Signed-off-by: Petter Svard <petters@cs.umu.se>
> Signed-off-by: Aidan Shribman <aidan.shribman@sap.com>
> Signed-off-by: Orit Wasserman <owasserm@redhat.com>
> ---
>  Makefile.objs        |    1 +
>  cache.c              |  217 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  include/qemu/cache.h |   79 ++++++++++++++++++
>  qemu-common.h        |   10 +++
>  4 files changed, 307 insertions(+), 0 deletions(-)
>  create mode 100644 cache.c
>  create mode 100644 include/qemu/cache.h
>
> diff --git a/Makefile.objs b/Makefile.objs
> index 625c4d5..d9c6859 100644
> --- a/Makefile.objs
> +++ b/Makefile.objs
> @@ -77,6 +77,7 @@ common-obj-y += qemu-char.o #aio.o
>  common-obj-y += block-migration.o iohandler.o
>  common-obj-y += pflib.o
>  common-obj-y += bitmap.o bitops.o
> +common-obj-y += cache.o
>
>  common-obj-$(CONFIG_POSIX) += migration-exec.o migration-unix.o migration-fd.o
>  common-obj-$(CONFIG_WIN32) += version.o
> diff --git a/cache.c b/cache.c
> new file mode 100644
> index 0000000..729fc64
> --- /dev/null
> +++ b/cache.c
> @@ -0,0 +1,217 @@
> +/*
> + * Page cache for qemu

QEMU

> + * The cache is base on a hash on the page address
> + *
> + * Copyright 2012 Red Hat, Inc. and/or its affiliates
> + *
> + * Authors:
> + *  Orit Wasserman  <owasserm@redhat.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#include <stdint.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <strings.h>
> +#include <string.h>
> +#include <sys/time.h>
> +#include <sys/types.h>
> +#include <stdbool.h>
> +#include <glib.h>
> +#include <strings.h>
> +
> +#include "qemu-common.h"
> +#include "qemu/cache.h"
> +
> +#ifdef DEBUG_CACHE
> +#define DPRINTF(fmt, ...) \
> +    do { fprintf(stdout, "cache: " fmt, ## __VA_ARGS__); } while (0)
> +#else
> +#define DPRINTF(fmt, ...) \
> +    do { } while (0)
> +#endif
> +
> +typedef struct CacheItem CacheItem;
> +
> +struct CacheItem {
> +    uint64_t it_addr;
> +    unsigned long it_age;
> +    uint8_t *it_data;
> +};
> +
> +struct Cache {
> +    CacheItem *page_cache;
> +    unsigned int page_size;
> +    int64_t max_num_items;
> +    uint64_t max_item_age;
> +    int64_t num_items;
> +};
> +
> +Cache *cache_init(int64_t num_pages, unsigned int page_size)
> +{
> +    int i;
> +
> +    Cache *cache = g_malloc(sizeof(Cache));
> +    if (!cache) {
> +        DPRINTF("Error allocation Cache\n");
> +        return NULL;
> +    }
> +
> +    if (num_pages <= 0) {
> +        DPRINTF("invalid number pages\n");
> +        return NULL;
> +    }
> +
> +    /* round down to the nearest power of 2 */
> +    if (!is_power_of_2(num_pages)) {
> +        num_pages = 1 << ffs(num_pages);
> +        DPRINTF("rounding down to %ld\n", num_pages);

num_pages is int64, so %ld would not work on 32 bit host. Please use
PRId64 and check other DPRINTFs if they have the same problem.

> +    }
> +    cache->page_size = page_size;
> +    cache->num_items = 0;
> +    cache->max_item_age = 0;
> +    cache->max_num_items = num_pages;
> +
> +    DPRINTF("Setting cache buckets to %lu\n", cache->max_num_items);
> +
> +    cache->page_cache = g_malloc((cache->max_num_items) *
> +                                 sizeof(CacheItem));
> +    if (!cache->page_cache) {

g_malloc() will exit if there's no memory.

> +        DPRINTF("could not allocate cache\n");
> +        g_free(cache);
> +        return NULL;
> +    }
> +
> +    for (i = 0; i < cache->max_num_items; i++) {
> +        cache->page_cache[i].it_data = NULL;
> +        cache->page_cache[i].it_age = 0;
> +        cache->page_cache[i].it_addr = -1;
> +    }
> +
> +    return cache;
> +}
> +
> +void cache_fini(Cache *cache)
> +{
> +    int i;
> +
> +    g_assert(cache);
> +    g_assert(cache->page_cache);
> +
> +    for (i = 0; i < cache->max_num_items; i++) {
> +        g_free(cache->page_cache[i].it_data);
> +        cache->page_cache[i].it_data = 0;
> +    }
> +
> +    g_free(cache->page_cache);
> +    cache->page_cache = NULL;
> +}
> +
> +static unsigned long cache_get_cache_pos(const Cache *cache, uint64_t address)
> +{
> +    unsigned long pos;
> +
> +    g_assert(cache->max_num_items);
> +    pos = (address/cache->page_size) & (cache->max_num_items - 1);
> +    return pos;
> +}
> +
> +bool cache_is_cached(const Cache *cache, uint64_t addr)
> +{
> +    unsigned long pos;
> +
> +    g_assert(cache);
> +    g_assert(cache->page_cache);
> +
> +    pos = cache_get_cache_pos(cache, addr);
> +
> +    return (cache->page_cache[pos].it_addr == addr);
> +}
> +
> +static CacheItem *cache_get_by_addr(const Cache *cache, uint64_t addr)
> +{
> +    unsigned long pos;
> +
> +    g_assert(cache);
> +    g_assert(cache->page_cache);
> +
> +    pos = cache_get_cache_pos(cache, addr);
> +
> +    return &cache->page_cache[pos];
> +}
> +
> +uint8_t *get_cached_data(const Cache *cache, uint64_t addr)
> +{
> +    return cache_get_by_addr(cache, addr)->it_data;
> +}
> +
> +void cache_insert(Cache *cache, unsigned long addr, uint8_t *pdata)
> +{
> +
> +    CacheItem *it = NULL;
> +
> +    g_assert(cache);
> +    g_assert(cache->page_cache);
> +
> +    /* actual update of entry */
> +    it = cache_get_by_addr(cache, addr);
> +
> +    if (!it->it_data) {
> +        cache->num_items++;
> +    }
> +
> +    it->it_data = pdata;
> +    it->it_age = ++cache->max_item_age;
> +    it->it_addr = addr;
> +}
> +
> +int cache_resize(Cache *cache, int64_t new_num_pages)
> +{
> +    Cache *new_cache;
> +    int i;
> +
> +    CacheItem *old_it, *new_it;
> +
> +    g_assert(cache);
> +
> +    /* same size */
> +    if (new_num_pages == cache->max_num_items) {
> +        return 0;
> +    }
> +
> +    /* cache was not inited */
> +    if (cache->page_cache == NULL) {
> +        return -1;
> +    }
> +
> +    new_cache = cache_init(new_num_pages, cache->page_size);
> +    if (!(new_cache)) {
> +        DPRINTF("Error creating new cache\n");
> +        return -1;
> +    }
> +
> +    /* move all data from old cache */
> +    for (i = 0; i < cache->max_num_items; i++) {
> +        old_it = &cache->page_cache[i];
> +        if (old_it->it_addr != -1) {
> +            /* check for collision , if there  is keep the first value */
> +            new_it = cache_get_by_addr(new_cache, old_it->it_addr);
> +            if (new_it->it_data) {
> +                g_free(old_it->it_data);
> +            } else {
> +                cache_insert(new_cache, old_it->it_addr, old_it->it_data);
> +            }
> +        }
> +    }
> +
> +    cache->page_cache = new_cache->page_cache;
> +    cache->max_num_items = new_cache->max_num_items;
> +    cache->num_items = new_cache->num_items;
> +
> +    g_free(new_cache);
> +
> +    return 0;
> +}
> diff --git a/include/qemu/cache.h b/include/qemu/cache.h
> new file mode 100644
> index 0000000..9ca2d0a
> --- /dev/null
> +++ b/include/qemu/cache.h
> @@ -0,0 +1,79 @@
> +/*
> + * Page cache for qemu
> + * The cache is base on a hash on the page address
> + *
> + * Copyright 2012 Red Hat, Inc. and/or its affiliates
> + *
> + * Authors:
> + *  Orit Wasserman  <owasserm@redhat.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#ifndef CACHE_H
> +#define CACHE_H
> +
> +/* Page cache for storing guest pages */
> +typedef struct Cache Cache;
> +
> +/**
> + * cache_init: Initialize the page cache
> + *
> + *
> + * Returns new allocated cache or NULL on error
> + *
> + * @cache pointer to the Cache struct
> + * @num_pages: cache maximal number of cached pages
> + * @page_size: cache page size
> + */
> +Cache *cache_init(int64_t num_pages, unsigned int page_size);
> +
> +/**
> + * cache_fini: free all cache resources
> + * @cache pointer to the Cache struct
> + */
> +void cache_fini(Cache *cache);
> +
> +/**
> + * cache_is_cached: Checks to see if the page is cached
> + *
> + * Returns %true if page is cached
> + *
> + * @cache pointer to the Cache struct
> + * @addr: page addr
> + */
> +bool cache_is_cached(const Cache *cache, uint64_t addr);
> +
> +/**
> + * get_cached_data: Get the data cached for an addr
> + *
> + * Returns pointer to the data cached or NULL if not cached
> + *
> + * @cache pointer to the Cache struct
> + * @addr: page addr
> + */
> +uint8_t *get_cached_data(const Cache *cache, uint64_t addr);
> +
> +/**
> + * cache_insert: insert the page into the cache. the previous value will be overwritten
> + *
> + * @cache pointer to the Cache struct
> + * @addr: page address
> + * @pdata: pointer to the page
> + */
> +void cache_insert(Cache *cache, uint64_t addr, uint8_t *pdata);
> +
> +/**
> + * cache_resize: resize the page cache. In case of size reduction the extra pages
> + * will be freed
> + *
> + * Returns -1 on error
> + *
> + * @cache pointer to the Cache struct
> + * @num_pages: new page cache size (in pages)
> + */
> +int cache_resize(Cache *cache, int64_t num_pages);
> +
> +#endif
> diff --git a/qemu-common.h b/qemu-common.h
> index c8c6b2a..45b1d97 100644
> --- a/qemu-common.h
> +++ b/qemu-common.h
> @@ -1,3 +1,4 @@
> +
>  /* Common header file that is included by all of qemu.  */
>  #ifndef QEMU_COMMON_H
>  #define QEMU_COMMON_H
> @@ -417,6 +418,15 @@ static inline uint64_t muldiv64(uint64_t a, uint32_t b, uint32_t c)
>  /* Round number up to multiple */
>  #define QEMU_ALIGN_UP(n, m) QEMU_ALIGN_DOWN((n) + (m) - 1, (m))
>
> +static inline bool is_power_of_2(int64_t value)
> +{
> +    if (!value) {
> +        return 0;
> +    }
> +
> +    return !(value & (value - 1));
> +}
> +
>  #include "module.h"
>
>  #endif
> --
> 1.7.7.6
>
Orit Wasserman - June 28, 2012, 8:41 a.m.
On 06/27/2012 07:55 PM, Eric Blake wrote:
> On 06/27/2012 04:34 AM, Orit Wasserman wrote:
>> Add LRU page cache mechanism.
>> The page are accessed by their address.
>>
>> Signed-off-by: Benoit Hudzia <benoit.hudzia@sap.com>
>> Signed-off-by: Petter Svard <petters@cs.umu.se>
>> Signed-off-by: Aidan Shribman <aidan.shribman@sap.com>
>> Signed-off-by: Orit Wasserman <owasserm@redhat.com>
> 
>> +++ b/cache.c
> 
> cache.c is a rather generic name; should it be page-cache.c instead to
> reflect that it only caches memory pages?
> 
>> @@ -0,0 +1,217 @@
>> +/*
>> + * Page cache for qemu
>> + * The cache is base on a hash on the page address
> 
> s/base on a hash on/based on a hash of/
> 
>> +
>> +Cache *cache_init(int64_t num_pages, unsigned int page_size)
>> +{
> 
>> +
>> +    /* round down to the nearest power of 2 */
>> +    if (!is_power_of_2(num_pages)) {
>> +        num_pages = 1 << ffs(num_pages);
> 
> That's not how you round down.  For example, if I passed in 0x5, you end
> up giving me 1 << ffs(5) == 1 << 1 == 2, but the correct answer should be 4.
> 
> http://graphics.stanford.edu/~seander/bithacks.html#IntegerLogObvious
> and http://aggregate.org/MAGIC/#Leading%20Zero%20Count give some hints
> about what you really want to be doing; offhand, I came up with this (it
> works because you already rejected negative num_pages):
> 
> if (!is_power_of_2(num_pages)) {
>     num_pages |= num_pages >> 1;
>     num_pages |= num_pages >> 2;
>     num_pages |= num_pages >> 4;
>     num_pages |= num_pages >> 8;
>     num_pages |= num_pages >> 16;
>     num_pages |= num_pages >> 32;
>     num_pages -= num_pages / 2;
> }
> 
>> +    cache->page_cache = g_malloc((cache->max_num_items) *
>> +                                 sizeof(CacheItem));
>> +    if (!cache->page_cache) {
>> +        DPRINTF("could not allocate cache\n");
>> +        g_free(cache);
>> +        return NULL;
>> +    }
>> +
>> +    for (i = 0; i < cache->max_num_items; i++) {
>> +        cache->page_cache[i].it_data = NULL;
>> +        cache->page_cache[i].it_age = 0;
> 
> Does g_malloc leave memory uninitialized, or is it like calloc where it
> zeros out the memory making these two assignments redundant?

g_malloc doesn't initialize memory, g_malloc0 does.

> 
>> +
>> +int cache_resize(Cache *cache, int64_t new_num_pages)
>> +{
>> +    Cache *new_cache;
>> +    int i;
>> +
>> +    CacheItem *old_it, *new_it;
>> +
>> +    g_assert(cache);
>> +
>> +    /* same size */
>> +    if (new_num_pages == cache->max_num_items) {
>> +        return 0;
>> +    }
>> +
>> +    /* cache was not inited */
>> +    if (cache->page_cache == NULL) {
>> +        return -1;
>> +    }
> 
> Shouldn't these two conditions be swapped?  Non-init failure should take
> precedence over no size change.
> 
> If new_num_pages is not a power of 2, but rounds down to the same as the
> existing size, the size won't compare equal and you end up wasting a lot
> of effort moving pages between the resulting identically sized caches.
> I'd factor out your rounding-down code, and call it from multiple places
> prior to checking for size equality.
> 
>> +    /* move all data from old cache */
>> +    for (i = 0; i < cache->max_num_items; i++) {
>> +        old_it = &cache->page_cache[i];
>> +        if (old_it->it_addr != -1) {
>> +            /* check for collision , if there  is keep the first value */
> 
> s/collision , if there  is/collision, if there is,/
> 
>> +++ b/include/qemu/cache.h
>> @@ -0,0 +1,79 @@
>> +/*
>> + * Page cache for qemu
>> + * The cache is base on a hash on the page address
> 
> Same comments as for cache.c.
>
Max Filippov - June 28, 2012, 12:06 p.m.
On Wed, Jun 27, 2012 at 8:55 PM, Eric Blake <eblake@redhat.com> wrote:
> On 06/27/2012 04:34 AM, Orit Wasserman wrote:

[...]

>> +
>> +    /* round down to the nearest power of 2 */
>> +    if (!is_power_of_2(num_pages)) {
>> +        num_pages = 1 << ffs(num_pages);
>
> That's not how you round down.  For example, if I passed in 0x5, you end
> up giving me 1 << ffs(5) == 1 << 1 == 2, but the correct answer should be 4.
>
> http://graphics.stanford.edu/~seander/bithacks.html#IntegerLogObvious
> and http://aggregate.org/MAGIC/#Leading%20Zero%20Count give some hints
> about what you really want to be doing; offhand, I came up with this (it
> works because you already rejected negative num_pages):
>
> if (!is_power_of_2(num_pages)) {
>    num_pages |= num_pages >> 1;
>    num_pages |= num_pages >> 2;
>    num_pages |= num_pages >> 4;
>    num_pages |= num_pages >> 8;
>    num_pages |= num_pages >> 16;
>    num_pages |= num_pages >> 32;
>    num_pages -= num_pages / 2;
> }

Or

if (!is_power_of_2(num_pages)) {
    num_pages = 0x8000000000000000ULL >> clz64(num_pages);
}

[...]
Eric Blake - June 28, 2012, 12:15 p.m.
On 06/28/2012 06:06 AM, Max Filippov wrote:
> On Wed, Jun 27, 2012 at 8:55 PM, Eric Blake <eblake@redhat.com> wrote:
>> On 06/27/2012 04:34 AM, Orit Wasserman wrote:
> 

>> if (!is_power_of_2(num_pages)) {
>>    num_pages |= num_pages >> 1;
>>    num_pages |= num_pages >> 2;
>>    num_pages |= num_pages >> 4;
>>    num_pages |= num_pages >> 8;
>>    num_pages |= num_pages >> 16;
>>    num_pages |= num_pages >> 32;
>>    num_pages -= num_pages / 2;
>> }
> 
> Or
> 
> if (!is_power_of_2(num_pages)) {
>     num_pages = 0x8000000000000000ULL >> clz64(num_pages);
> }

Indeed.  I knew about the gcc builtin for clz, but didn't suggest it
because it is not standardized, and didn't realize that qemu had already
wrapped it in a nice function (and one which is portable even when not
compiling with gcc).  It's a shame that libc provides ffs() but not
clz(), since the two operations are rather symmetric (just differing in
which direction they find the first bit) and both are useful in
bit-twiddling operations.

Patch

diff --git a/Makefile.objs b/Makefile.objs
index 625c4d5..d9c6859 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -77,6 +77,7 @@  common-obj-y += qemu-char.o #aio.o
 common-obj-y += block-migration.o iohandler.o
 common-obj-y += pflib.o
 common-obj-y += bitmap.o bitops.o
+common-obj-y += cache.o
 
 common-obj-$(CONFIG_POSIX) += migration-exec.o migration-unix.o migration-fd.o
 common-obj-$(CONFIG_WIN32) += version.o
diff --git a/cache.c b/cache.c
new file mode 100644
index 0000000..729fc64
--- /dev/null
+++ b/cache.c
@@ -0,0 +1,217 @@ 
+/*
+ * Page cache for qemu
+ * The cache is base on a hash on the page address
+ *
+ * Copyright 2012 Red Hat, Inc. and/or its affiliates
+ *
+ * Authors:
+ *  Orit Wasserman  <owasserm@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <strings.h>
+#include <string.h>
+#include <sys/time.h>
+#include <sys/types.h>
+#include <stdbool.h>
+#include <glib.h>
+#include <strings.h>
+
+#include "qemu-common.h"
+#include "qemu/cache.h"
+
+#ifdef DEBUG_CACHE
+#define DPRINTF(fmt, ...) \
+    do { fprintf(stdout, "cache: " fmt, ## __VA_ARGS__); } while (0)
+#else
+#define DPRINTF(fmt, ...) \
+    do { } while (0)
+#endif
+
+typedef struct CacheItem CacheItem;
+
+struct CacheItem {
+    uint64_t it_addr;
+    unsigned long it_age;
+    uint8_t *it_data;
+};
+
+struct Cache {
+    CacheItem *page_cache;
+    unsigned int page_size;
+    int64_t max_num_items;
+    uint64_t max_item_age;
+    int64_t num_items;
+};
+
+Cache *cache_init(int64_t num_pages, unsigned int page_size)
+{
+    int i;
+
+    Cache *cache = g_malloc(sizeof(Cache));
+    if (!cache) {
+        DPRINTF("Error allocation Cache\n");
+        return NULL;
+    }
+
+    if (num_pages <= 0) {
+        DPRINTF("invalid number pages\n");
+        return NULL;
+    }
+
+    /* round down to the nearest power of 2 */
+    if (!is_power_of_2(num_pages)) {
+        num_pages = 1 << ffs(num_pages);
+        DPRINTF("rounding down to %ld\n", num_pages);
+    }
+    cache->page_size = page_size;
+    cache->num_items = 0;
+    cache->max_item_age = 0;
+    cache->max_num_items = num_pages;
+
+    DPRINTF("Setting cache buckets to %lu\n", cache->max_num_items);
+
+    cache->page_cache = g_malloc((cache->max_num_items) *
+                                 sizeof(CacheItem));
+    if (!cache->page_cache) {
+        DPRINTF("could not allocate cache\n");
+        g_free(cache);
+        return NULL;
+    }
+
+    for (i = 0; i < cache->max_num_items; i++) {
+        cache->page_cache[i].it_data = NULL;
+        cache->page_cache[i].it_age = 0;
+        cache->page_cache[i].it_addr = -1;
+    }
+
+    return cache;
+}
+
+void cache_fini(Cache *cache)
+{
+    int i;
+
+    g_assert(cache);
+    g_assert(cache->page_cache);
+
+    for (i = 0; i < cache->max_num_items; i++) {
+        g_free(cache->page_cache[i].it_data);
+        cache->page_cache[i].it_data = 0;
+    }
+
+    g_free(cache->page_cache);
+    cache->page_cache = NULL;
+}
+
+static unsigned long cache_get_cache_pos(const Cache *cache, uint64_t address)
+{
+    unsigned long pos;
+
+    g_assert(cache->max_num_items);
+    pos = (address/cache->page_size) & (cache->max_num_items - 1);
+    return pos;
+}
+
+bool cache_is_cached(const Cache *cache, uint64_t addr)
+{
+    unsigned long pos;
+
+    g_assert(cache);
+    g_assert(cache->page_cache);
+
+    pos = cache_get_cache_pos(cache, addr);
+
+    return (cache->page_cache[pos].it_addr == addr);
+}
+
+static CacheItem *cache_get_by_addr(const Cache *cache, uint64_t addr)
+{
+    unsigned long pos;
+
+    g_assert(cache);
+    g_assert(cache->page_cache);
+
+    pos = cache_get_cache_pos(cache, addr);
+
+    return &cache->page_cache[pos];
+}
+
+uint8_t *get_cached_data(const Cache *cache, uint64_t addr)
+{
+    return cache_get_by_addr(cache, addr)->it_data;
+}
+
+void cache_insert(Cache *cache, unsigned long addr, uint8_t *pdata)
+{
+
+    CacheItem *it = NULL;
+
+    g_assert(cache);
+    g_assert(cache->page_cache);
+
+    /* actual update of entry */
+    it = cache_get_by_addr(cache, addr);
+
+    if (!it->it_data) {
+        cache->num_items++;
+    }
+
+    it->it_data = pdata;
+    it->it_age = ++cache->max_item_age;
+    it->it_addr = addr;
+}
+
+int cache_resize(Cache *cache, int64_t new_num_pages)
+{
+    Cache *new_cache;
+    int i;
+
+    CacheItem *old_it, *new_it;
+
+    g_assert(cache);
+
+    /* same size */
+    if (new_num_pages == cache->max_num_items) {
+        return 0;
+    }
+
+    /* cache was not inited */
+    if (cache->page_cache == NULL) {
+        return -1;
+    }
+
+    new_cache = cache_init(new_num_pages, cache->page_size);
+    if (!(new_cache)) {
+        DPRINTF("Error creating new cache\n");
+        return -1;
+    }
+
+    /* move all data from old cache */
+    for (i = 0; i < cache->max_num_items; i++) {
+        old_it = &cache->page_cache[i];
+        if (old_it->it_addr != -1) {
+            /* check for collision , if there  is keep the first value */
+            new_it = cache_get_by_addr(new_cache, old_it->it_addr);
+            if (new_it->it_data) {
+                g_free(old_it->it_data);
+            } else {
+                cache_insert(new_cache, old_it->it_addr, old_it->it_data);
+            }
+        }
+    }
+
+    cache->page_cache = new_cache->page_cache;
+    cache->max_num_items = new_cache->max_num_items;
+    cache->num_items = new_cache->num_items;
+
+    g_free(new_cache);
+
+    return 0;
+}
diff --git a/include/qemu/cache.h b/include/qemu/cache.h
new file mode 100644
index 0000000..9ca2d0a
--- /dev/null
+++ b/include/qemu/cache.h
@@ -0,0 +1,79 @@ 
+/*
+ * Page cache for qemu
+ * The cache is base on a hash on the page address
+ *
+ * Copyright 2012 Red Hat, Inc. and/or its affiliates
+ *
+ * Authors:
+ *  Orit Wasserman  <owasserm@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef CACHE_H
+#define CACHE_H
+
+/* Page cache for storing guest pages */
+typedef struct Cache Cache;
+
+/**
+ * cache_init: Initialize the page cache
+ *
+ *
+ * Returns new allocated cache or NULL on error
+ *
+ * @cache pointer to the Cache struct
+ * @num_pages: cache maximal number of cached pages
+ * @page_size: cache page size
+ */
+Cache *cache_init(int64_t num_pages, unsigned int page_size);
+
+/**
+ * cache_fini: free all cache resources
+ * @cache pointer to the Cache struct
+ */
+void cache_fini(Cache *cache);
+
+/**
+ * cache_is_cached: Checks to see if the page is cached
+ *
+ * Returns %true if page is cached
+ *
+ * @cache pointer to the Cache struct
+ * @addr: page addr
+ */
+bool cache_is_cached(const Cache *cache, uint64_t addr);
+
+/**
+ * get_cached_data: Get the data cached for an addr
+ *
+ * Returns pointer to the data cached or NULL if not cached
+ *
+ * @cache pointer to the Cache struct
+ * @addr: page addr
+ */
+uint8_t *get_cached_data(const Cache *cache, uint64_t addr);
+
+/**
+ * cache_insert: insert the page into the cache. the previous value will be overwritten
+ *
+ * @cache pointer to the Cache struct
+ * @addr: page address
+ * @pdata: pointer to the page
+ */
+void cache_insert(Cache *cache, uint64_t addr, uint8_t *pdata);
+
+/**
+ * cache_resize: resize the page cache. In case of size reduction the extra pages
+ * will be freed
+ *
+ * Returns -1 on error
+ *
+ * @cache pointer to the Cache struct
+ * @num_pages: new page cache size (in pages)
+ */
+int cache_resize(Cache *cache, int64_t num_pages);
+
+#endif
diff --git a/qemu-common.h b/qemu-common.h
index c8c6b2a..45b1d97 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -1,3 +1,4 @@ 
+
 /* Common header file that is included by all of qemu.  */
 #ifndef QEMU_COMMON_H
 #define QEMU_COMMON_H
@@ -417,6 +418,15 @@  static inline uint64_t muldiv64(uint64_t a, uint32_t b, uint32_t c)
 /* Round number up to multiple */
 #define QEMU_ALIGN_UP(n, m) QEMU_ALIGN_DOWN((n) + (m) - 1, (m))
 
+static inline bool is_power_of_2(int64_t value)
+{
+    if (!value) {
+        return 0;
+    }
+
+    return !(value & (value - 1));
+}
+
 #include "module.h"
 
 #endif