| Message ID | 4B6BF06D.1090909@lab.ntt.co.jp |
|---|---|
| State | New |
> Sounds logical - do you have numbers on the improvement?

Sure. The patch showed an approximately 3-7x speedup when measured with
rdtsc. The test environment and detailed results are described below.

---
tmp = rdtsc();
/* function of original code */
t1 += rdtsc() - tmp;

tmp = rdtsc();
/* function of this patch */
t2 += rdtsc() - tmp;
---

Test Environment:
CPU: 4x Intel Xeon Quad Core 2.66GHz
Mem size: 6GB
kvm version: 2.6.31-17-server
qemu version: commit ed880109f74f0a4dd5b7ec09e6a2d9ba4903d9a5
Host OS: Ubuntu 9.10 (kernel 2.6.31)
Guest OS: Debian GNU/Linux lenny (kernel 2.6.26)
Guest Mem size: 512MB

We executed live migration three times. The data below shows how many times
the function was called (#called), the runtime of the original code (orig.),
the runtime of this patch (patch), and the speedup ratio (ratio) while live
migration ran.

Experimental results:

Test 1: The guest OS reads a 3GB file, which is bigger than memory.
#called  orig.(msec)  patch(msec)  ratio
    114         1.00         0.15   6.76
    132         1.57         0.25   6.26
     96         1.00         0.16   6.27

Test 2: The guest OS reads/writes a 3GB file, which is bigger than memory.
#called  orig.(msec)  patch(msec)  ratio
   2196         38.1         10.6   3.59
   2256         39.6         10.8   3.68
   2112         36.3         10.3   3.53

> Would be great if you could provide a version for upstream as well
> because it will likely replace this qemu-kvm code one day.

O.K., we'll prepare it. We'll also soon post a patch set to speed up
dirty-page checking in ram_save_block and ram_save_live.
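For context, rdtsc here refers to the x86 time-stamp counter. The thread does
not show the helper itself, so the following is only a minimal sketch of the
usual GCC-style implementation that the t1/t2 instrumentation above relies on:

```c
#include <stdint.h>

/* Read the x86 time-stamp counter (GCC/Clang inline assembly).
 * After RDTSC, EDX:EAX holds the 64-bit cycle count. */
static inline uint64_t rdtsc(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
}
```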
On 02/05/2010 12:18 PM, OHMURA Kei wrote:
> dirty-bitmap-traveling is carried out by byte size in qemu-kvm.c.
> But we think that dirty-bitmap-traveling by long size is faster than by byte
> size, especially when most of memory is not dirty.
>
> +static int kvm_get_dirty_pages_log_range_by_long(unsigned long start_addr,
> +                                                 unsigned char *bitmap,
> +                                                 unsigned long offset,
> +                                                 unsigned long mem_size)
> +{
> +    unsigned int i;
> +    unsigned int len;
> +    unsigned long *bitmap_ul = (unsigned long *)bitmap;
> +
> +    /* bitmap-traveling by long size is faster than by byte size
> +     * especially when most of memory is not dirty.
> +     * bitmap should be long-size aligned for traveling by long.
> +     */
> +    if (((unsigned long)bitmap & (TARGET_LONG_SIZE - 1)) == 0) {

Since we allocate the bitmap, we can be sure that it is aligned on a long
boundary (qemu_malloc() should guarantee that). So you can eliminate the
fallback.

> +        len = ((mem_size / TARGET_PAGE_SIZE) + TARGET_LONG_BITS - 1) /
> +            TARGET_LONG_BITS;
> +        for (i = 0; i < len; i++)
> +            if (bitmap_ul[i] != 0)
> +                kvm_get_dirty_pages_log_range_by_byte(i * TARGET_LONG_SIZE,
> +                    (i + 1) * TARGET_LONG_SIZE, bitmap, offset);

Better to just use the original loop here (since we don't need the function
as a fallback).

> +        /*
> +         * We will check the remaining dirty-bitmap,
> +         * when the mem_size is not a multiple of TARGET_LONG_SIZE.
> +         */
> +        if ((mem_size & (TARGET_LONG_SIZE - 1)) != 0) {
> +            len = ((mem_size / TARGET_PAGE_SIZE) + 7) / 8;
> +            kvm_get_dirty_pages_log_range_by_byte(i * TARGET_LONG_SIZE,
> +                len, bitmap, offset);
> +        }

The bitmap size is aligned as well (it is allocated using BITMAP_SIZE, which
rounds up to HOST_LONG_BITS), so this is unnecessary too.
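Folding those three comments in, the follow-up would presumably collapse back
into a single function that scans the bitmap one long at a time. The sketch
below is only an illustration of that suggested shape, not the actual
follow-up patch: it stands in TARGET_PAGE_SIZE and HOST_LONG_BITS with
assumed definitions, and leaves the per-page dirtying body (which the posted
diff below elides at its hunk boundary) as a placeholder.

```c
#define _GNU_SOURCE
#include <string.h>          /* ffsl() (GNU extension) */

/* Assumed stand-ins for the qemu-kvm macros used by the patch: */
#define TARGET_PAGE_SIZE 4096
#define HOST_LONG_BITS   (sizeof(unsigned long) * 8)

static int kvm_get_dirty_pages_log_range(unsigned long start_addr,
                                         unsigned long *bitmap,
                                         unsigned long offset,
                                         unsigned long mem_size)
{
    unsigned int i, j;
    unsigned long c, page_number;
    unsigned int len = ((mem_size / TARGET_PAGE_SIZE) + HOST_LONG_BITS - 1) /
        HOST_LONG_BITS;

    /* Scan one long at a time: when most memory is clean, a single
     * comparison skips HOST_LONG_BITS pages at once. */
    for (i = 0; i < len; i++) {
        c = bitmap[i];
        while (c != 0) {
            j = ffsl(c) - 1;      /* lowest set bit (1-based, hence -1) */
            c &= ~(1ul << j);     /* clear the handled bit */
            page_number = i * HOST_LONG_BITS + j;
            /* ...mark this page dirty, as in the per-page body the
             * posted diff elides at its hunk boundary... */
            (void)page_number; (void)start_addr; (void)offset; /* sketch */
        }
    }
    return 0;
}
```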
diff --git a/qemu-kvm.c b/qemu-kvm.c
index a305907..5459cdd 100644
--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ -2433,22 +2433,21 @@ int kvm_physical_memory_set_dirty_tracking(int enable)
 }
 
 /* get kvm's dirty pages bitmap and update qemu's */
-static int kvm_get_dirty_pages_log_range(unsigned long start_addr,
-                                         unsigned char *bitmap,
-                                         unsigned long offset,
-                                         unsigned long mem_size)
+static void kvm_get_dirty_pages_log_range_by_byte(unsigned int start,
+                                                  unsigned int end,
+                                                  unsigned char *bitmap,
+                                                  unsigned long offset)
 {
     unsigned int i, j, n = 0;
     unsigned char c;
     unsigned long page_number, addr, addr1;
     ram_addr_t ram_addr;
-    unsigned int len = ((mem_size / TARGET_PAGE_SIZE) + 7) / 8;
 
     /*
      * bitmap-traveling is faster than memory-traveling (for addr...)
      * especially when most of the memory is not dirty.
      */
-    for (i = 0; i < len; i++) {
+    for (i = start; i < end; i++) {
         c = bitmap[i];
         while (c > 0) {
             j = ffsl(c) - 1;
@@ -2461,13 +2460,49 @@ static int kvm_get_dirty_pages_log_range(unsigned long start_addr,
             n++;
         }
     }
+}
+
+static int kvm_get_dirty_pages_log_range_by_long(unsigned long start_addr,
+                                                 unsigned char *bitmap,
+                                                 unsigned long offset,
+                                                 unsigned long mem_size)
+{
+    unsigned int i;
+    unsigned int len;
+    unsigned long *bitmap_ul = (unsigned long *)bitmap;
+
+    /* bitmap-traveling by long size is faster than by byte size
+     * especially when most of memory is not dirty.
+     * bitmap should be long-size aligned for traveling by long.
+     */
+    if (((unsigned long)bitmap & (TARGET_LONG_SIZE - 1)) == 0) {
+        len = ((mem_size / TARGET_PAGE_SIZE) + TARGET_LONG_BITS - 1) /
+            TARGET_LONG_BITS;
+        for (i = 0; i < len; i++)
+            if (bitmap_ul[i] != 0)
+                kvm_get_dirty_pages_log_range_by_byte(i * TARGET_LONG_SIZE,
+                    (i + 1) * TARGET_LONG_SIZE, bitmap, offset);
+        /*
+         * We will check the remaining dirty-bitmap,
+         * when the mem_size is not a multiple of TARGET_LONG_SIZE.
+         */
+        if ((mem_size & (TARGET_LONG_SIZE - 1)) != 0) {
+            len = ((mem_size / TARGET_PAGE_SIZE) + 7) / 8;
+            kvm_get_dirty_pages_log_range_by_byte(i * TARGET_LONG_SIZE,
+                len, bitmap, offset);
+        }
+    } else { /* slow path: traveling by byte. */
+        len = ((mem_size / TARGET_PAGE_SIZE) + 7) / 8;
+        kvm_get_dirty_pages_log_range_by_byte(0, len, bitmap, offset);
+    }
+
     return 0;
 }
 
 static int kvm_get_dirty_bitmap_cb(unsigned long start, unsigned long len, void *bitmap,
                                    void *opaque)
 {
-    return kvm_get_dirty_pages_log_range(start, bitmap, start, len);
+    return kvm_get_dirty_pages_log_range_by_long(start, bitmap, start, len);
 }
 
 /*
dirty-bitmap-traveling is carried out by byte size in qemu-kvm.c.
But we think that dirty-bitmap-traveling by long size is faster than by byte
size, especially when most of memory is not dirty.

Signed-off-by: OHMURA Kei <ohmura.kei@lab.ntt.co.jp>
---
 qemu-kvm.c |   49 ++++++++++++++++++++++++++++++++++++++++++-------
 1 files changed, 42 insertions(+), 7 deletions(-)

--
1.6.3.3
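The core trick shared by both traversal variants is the ffsl()
extract-and-clear loop: each set bit yields one dirty page index, and clean
stretches cost only one comparison per word. A self-contained toy showing
that loop in isolation (the bitmap word value is hypothetical; ffsl() is the
GNU extension from <string.h>):

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>   /* ffsl() */

int main(void)
{
    /* Hypothetical bitmap word with pages 0, 5 and 12 dirty. */
    unsigned long c = (1ul << 0) | (1ul << 5) | (1ul << 12);

    while (c != 0) {
        int j = ffsl(c) - 1;   /* ffsl() is 1-based; it returns 0 if no bit is set */
        c &= ~(1ul << j);      /* clear the bit we just handled */
        printf("dirty page bit: %d\n", j);   /* prints 0, then 5, then 12 */
    }
    return 0;
}
```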