[RFC] optimize is_dup_page for zero pages

Message ID 35D32F2E-08C9-4CDC-9636-E8DE47A2A50C@dlhnet.de
State New

Commit Message

Peter Lieven March 12, 2013, 12:15 p.m.
On 12.03.2013 at 13:02, Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 12/03/2013 12:51, Peter Lieven wrote:
>>>> buffer_is_zero is used in somewhat special cases (block
>>>> streaming/copy-on-read) where throughput doesn't really matter, unlike
>>>> is_dup_page/find_zero_bit which are used in migration.  But you can use
>>>> similar code for is_dup_page and buffer_is_zero.
>> OK, I will prepare a patch series for review, for now without touching
>> is_dup_page(). You can decide later whether to use buffer_is_zero to check
>> for zero pages (maybe if the first x bits are zero).
>> Two comments on changing is_dup_page() to is_zero_page():
>> a) Would it make sense to only check for zero pages in the first (bulk) round?
> Interesting idea.  Benchmark it. :)

What approach would you use to test it? The result again depends on the load.
If no software running in the VM is zeroing out large areas of memory,
I would bet there is no need to look for dup pages.
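
As a rough illustration of what such a test could measure, the sketch below
counts the zero pages in a memory buffer, using a word-wise check that stops
at the first non-zero word. Running something like it against memory images
taken under different loads would show how often the check pays off. The names
page_is_zero, count_zero_pages and PAGE_SIZE_BYTES are hypothetical and not
taken from QEMU, and the 4096-byte page size is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE_BYTES 4096 /* assumed page size (hypothetical constant) */

/* Hypothetical helper: returns true if every byte of the page is zero.
 * It compares one 64-bit word at a time and bails out early, so a page
 * with non-zero data near its start is rejected after a single load. */
static bool page_is_zero(const uint8_t *page)
{
    size_t i;

    for (i = 0; i < PAGE_SIZE_BYTES; i += sizeof(uint64_t)) {
        uint64_t word;

        /* memcpy avoids any assumption about the alignment of 'page' */
        memcpy(&word, page + i, sizeof(word));
        if (word != 0) {
            return false;
        }
    }
    return true;
}

/* Hypothetical helper: counts zero pages in a buffer of 'npages' pages. */
static size_t count_zero_pages(const uint8_t *buf, size_t npages)
{
    size_t i, zero = 0;

    for (i = 0; i < npages; i++) {
        if (page_is_zero(buf + i * PAGE_SIZE_BYTES)) {
            zero++;
        }
    }
    return zero;
}
```

The ratio count_zero_pages() / npages per round is the number that would
decide whether the check is worth its cost after the bulk round.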

>> b) Would it make sense to not transfer zero pages at all in the first round?
> Perhaps yes, but I'm not sure how to efficiently implement it.  There
> really isn't a well-specified first round in the RAM migration code.  Of
> course you could have another bitmap for known-zero pages.

What about the flag in the diff below, which I used to limit XBZRLE to the non-bulk stage?


> But zero pages should be rare in real-world testcases, except for
> ballooned pages.  The OS should try to use free memory for caches.
>> The memory at the target should read as zero (not allocated) anyway.
> Paolo


diff --git a/arch_init.c b/arch_init.c
index 1b71912..d48b914 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -326,6 +326,7 @@  static ram_addr_t last_offset;
 static unsigned long *migration_bitmap;
 static uint64_t migration_dirty_pages;
 static uint32_t last_version;
+static bool ram_bulk_stage;
 
 static inline
 ram_addr_t migration_bitmap_find_and_reset_dirty(MemoryRegion *mr,
@@ -433,6 +434,7 @@  static int ram_save_block(QEMUFile *f, bool last_stage)
             if (!block) {
                 block = QTAILQ_FIRST(&ram_list.blocks);
                 complete_round = true;
+                ram_bulk_stage = false;
             }
         } else {
             uint8_t *p;
@@ -536,6 +538,7 @@  static void reset_ram_globals(void)
     last_sent_block = NULL;
     last_offset = 0;
     last_version = ram_list.version;
+    ram_bulk_stage = true;
 }
 
 #define MAX_WAIT 50 /* ms, half buffered_file limit */
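
For illustration only, a flag like the one added above could gate the
zero-page check so that it runs only during the first (bulk) round, as
suggested in point a). The following is a standalone sketch under that
assumption, not QEMU code: classify_page, finish_round, sketch_page_is_zero,
the page_action enum and SKETCH_PAGE_SIZE are hypothetical names, and only
ram_bulk_stage mirrors the variable from the diff.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096 /* assumed page size (hypothetical constant) */

/* Mirrors the flag from the diff: true until the first full pass ends. */
static bool ram_bulk_stage = true;

/* Hypothetical result type: what the sender would do with a page. */
enum page_action {
    SEND_ZERO_MARKER, /* page is all zero, send only a short marker */
    SEND_FULL_PAGE    /* send (or otherwise encode) the page contents */
};

/* Hypothetical helper: plain byte-wise zero check, kept simple on purpose. */
static bool sketch_page_is_zero(const uint8_t *page)
{
    size_t i;

    for (i = 0; i < SKETCH_PAGE_SIZE; i++) {
        if (page[i] != 0) {
            return false;
        }
    }
    return true;
}

/* Only pay for the zero check while still in the bulk stage. */
static enum page_action classify_page(const uint8_t *page)
{
    if (ram_bulk_stage && sketch_page_is_zero(page)) {
        return SEND_ZERO_MARKER;
    }
    return SEND_FULL_PAGE;
}

/* Stand-in for the point in ram_save_block() where the diff clears
 * the flag after wrapping around to the first RAM block. */
static void finish_round(void)
{
    ram_bulk_stage = false;
}
```

After finish_round() has been called, classify_page() skips the scan entirely,
so later rounds only pay for pages that are actually sent.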