
migration: cache memory region ram ptr

Message ID 536E5495.6080401@kamp.de
State New

Commit Message

Peter Lieven May 10, 2014, 4:32 p.m. UTC
On 10.05.2014 17:33, Paolo Bonzini wrote:
> On 10/05/2014 12:51, Peter Lieven wrote:
>> we currently look up the ram ptr for each single page. Cache
>> the pointer while we operate on the same block.
>
> Why don't you instead cache the result in the MemoryRegion, so that memory_region_get_ram_ptr becomes a simple, inline field access?

This seems to work. I am wondering whether it has other side effects. Basic tests such as booting vServers and migration work. What about Xen?



Peter

>
> Paolo
>
>> Signed-off-by: Peter Lieven <pl@kamp.de>
>> ---
>>  arch_init.c |   23 ++++++++++++++++-------
>>  1 file changed, 16 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch_init.c b/arch_init.c
>> index 582b716..ce338aa 100644
>> --- a/arch_init.c
>> +++ b/arch_init.c
>> @@ -594,13 +594,19 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
>>                  ram_bulk_stage = false;
>>              }
>>          } else {
>> +            static uint8_t *ram_ptr;
>>              int ret;
>>              uint8_t *p;
>>              bool send_async = true;
>> -            int cont = (block == last_sent_block) ?
>> -                RAM_SAVE_FLAG_CONTINUE : 0;
>> +            int cont = 0;
>>
>> -            p = memory_region_get_ram_ptr(mr) + offset;
>> +            if (block != last_sent_block) {
>> +                ram_ptr = memory_region_get_ram_ptr(mr);
>> +            } else {
>> +                cont = RAM_SAVE_FLAG_CONTINUE;
>> +            }
>> +
>> +            p = ram_ptr + offset;
>>
>>              /* In doubt sent page as normal */
>>              bytes_sent = -1;
>> @@ -990,16 +996,17 @@ static inline void *host_from_stream_offset(QEMUFile *f,
>>                                              int flags)
>>  {
>>      static RAMBlock *block = NULL;
>> +    static uint8_t *ram_ptr;
>>      char id[256];
>>      uint8_t len;
>>
>>      if (flags & RAM_SAVE_FLAG_CONTINUE) {
>> -        if (!block) {
>> +        if (!block || !ram_ptr) {
>>              fprintf(stderr, "Ack, bad migration stream!\n");
>>              return NULL;
>>          }
>>
>> -        return memory_region_get_ram_ptr(block->mr) + offset;
>> +        return ram_ptr + offset;
>>      }
>>
>>      len = qemu_get_byte(f);
>> @@ -1007,8 +1014,10 @@ static inline void *host_from_stream_offset(QEMUFile *f,
>>      id[len] = 0;
>>
>>      QTAILQ_FOREACH(block, &ram_list.blocks, next) {
>> -        if (!strncmp(id, block->idstr, sizeof(id)))
>> -            return memory_region_get_ram_ptr(block->mr) + offset;
>> +        if (!strncmp(id, block->idstr, sizeof(id))) {
>> +            ram_ptr = memory_region_get_ram_ptr(block->mr);
>> +            return ram_ptr + offset;
>> +        }
>>      }
>>
>>      fprintf(stderr, "Can't find block %s!\n", id);
>>
>

Comments

Paolo Bonzini May 12, 2014, 6:09 a.m. UTC | #1
On 10/05/2014 18:32, Peter Lieven wrote:
> What about XEN?
>

You're right, Xen wouldn't work.  Your original patch would not break it 
just because Xen doesn't use migration (but the code would be broken).
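
For context, the problem is the Xen path in qemu_get_ram_ptr(). The sketch
below is paraphrased, not an exact copy of the code of that era:

/* Sketch: under Xen the host pointer may come from the Xen map cache,
 * which can return a mapping that only covers part of the block, so a
 * single pointer cached per MemoryRegion would be wrong there. */
void *qemu_get_ram_ptr(ram_addr_t addr)
{
    RAMBlock *block = qemu_get_ram_block(addr);

    if (xen_enabled() && block->host == NULL) {
        /* Map lazily via the Xen map cache; an unlocked mapping is
         * only guaranteed up to the end of the requested page. */
        if (block->offset == 0) {
            return xen_map_cache(addr, 0, 0);
        }
        block->host = xen_map_cache(block->offset, block->length, 1);
    }
    return block->host + (addr - block->offset);
}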

You would have to cache qemu_get_ram_block rather than qemu_get_ram_ptr, 
move RAMBlock to memory-internal.h, and split the RAMBlock + ram_addr_t 
=> void * conversion out of qemu_get_ram_ptr and into a separate 
function (to be used by memory_region_get_ram_ptr).
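
In code, that split could look roughly like the sketch below; ramblock_ptr()
and the mr->ram_block field are illustrative names, not the API that existed
at the time:

/* Hypothetical helper split out of qemu_get_ram_ptr(): it only does the
 * RAMBlock + offset-within-block => void * conversion.  The Xen
 * lazy-mapping logic would live here, so it still runs on every call. */
static inline void *ramblock_ptr(RAMBlock *block, ram_addr_t offset)
{
    return block->host + offset;
}

/* memory_region_get_ram_ptr() would then cache the block lookup rather
 * than the host pointer. */
void *memory_region_get_ram_ptr(MemoryRegion *mr)
{
    ram_addr_t addr = mr->ram_addr & TARGET_PAGE_MASK;

    assert(mr->terminates);

    if (!mr->ram_block) {
        mr->ram_block = qemu_get_ram_block(addr);
    }
    return ramblock_ptr(mr->ram_block, addr - mr->ram_block->offset);
}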

I'm not sure of the benefit of your patch though.  qemu_get_ram_block 
already has a 1-item cache, are you seeing a low hit rate there?  Or any 
other profiling that shows qemu_get_ram_ptr as hot?
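
The 1-item cache referred to here is the most-recently-used block kept in
ram_list; roughly (a paraphrase, not the exact code):

static RAMBlock *qemu_get_ram_block(ram_addr_t addr)
{
    RAMBlock *block;

    /* Check the most-recently-used block first, so repeated lookups
     * that hit the same block avoid walking the list. */
    block = ram_list.mru_block;
    if (block && addr - block->offset < block->length) {
        return block;
    }
    QTAILQ_FOREACH(block, &ram_list.blocks, next) {
        if (addr - block->offset < block->length) {
            ram_list.mru_block = block;
            return block;
        }
    }

    fprintf(stderr, "Bad ram offset %" PRIx64 "\n", (uint64_t)addr);
    abort();
}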

Paolo
Peter Lieven May 12, 2014, 8:16 a.m. UTC | #2
On 12.05.2014 08:09, Paolo Bonzini wrote:
> On 10/05/2014 18:32, Peter Lieven wrote:
>> What about XEN?
>>
>
> You're right, Xen wouldn't work.  Your original patch would not break it just because Xen doesn't use migration (but the code would be broken).
>
> You would have to cache qemu_get_ram_block rather than qemu_get_ram_ptr, move RAMBlock to memory-internal.h, and split the RAMBlock + ram_addr_t => void * conversion out of qemu_get_ram_ptr and into a separate function (to be used by memory_region_get_ram_ptr).
>
> I'm not sure of the benefit of your patch though.  qemu_get_ram_block already has a 1-item cache, are you seeing a low hit rate there?  Or any other profiling that shows qemu_get_ram_ptr as hot?

qemu_get_ram_ptr is hot only during migration. But the hit rate of the LRU cache seems to be good. I am wondering whether this is different when the migration
has difficulty converging, but you might be right that it should be negligible.

I ran some basic migration tests with and without the patch. The results with the cache might be slightly better, but the variance of the
results is high. I would have to run a significant number of tests to get more evidence.

Peter

>
> Paolo

Patch

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 1d55ad9..3003875 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -161,6 +161,7 @@  struct MemoryRegion {
     unsigned ioeventfd_nb;
     MemoryRegionIoeventfd *ioeventfds;
     NotifierList iommu_notify;
+    void *ram_ptr;
 };
 
 /**
diff --git a/memory.c b/memory.c
index 3f1df23..78d4032 100644
--- a/memory.c
+++ b/memory.c
@@ -862,6 +862,7 @@  void memory_region_init(MemoryRegion *mr,
     mr->ioeventfd_nb = 0;
     mr->ioeventfds = NULL;
     mr->flush_coalesced_mmio = false;
+    mr->ram_ptr = NULL;
 }
 
 static uint64_t unassigned_mem_read(void *opaque, hwaddr addr,
@@ -1249,7 +1250,12 @@  void *memory_region_get_ram_ptr(MemoryRegion *mr)
 
     assert(mr->terminates);
 
-    return qemu_get_ram_ptr(mr->ram_addr & TARGET_PAGE_MASK);
+    if (mr->ram_ptr) {
+        return mr->ram_ptr;
+    }
+
+    mr->ram_ptr = qemu_get_ram_ptr(mr->ram_addr & TARGET_PAGE_MASK);
+    return mr->ram_ptr;
 }
 
 static void memory_region_update_coalesced_range_as(MemoryRegion *mr, AddressSpace *as)