Message ID | 4B7130E9.7060809@lab.ntt.co.jp |
---|---|
State | New |
Headers | show |
On 02/09/2010 11:54 AM, OHMURA Kei wrote: > Thank you for your comments. We have implemented the code which applied your > comments. This is patch for qemu-kvm.c. > Please reuse the changelog when reposing a patch, this makes it easier for me to apply it. > @@ -2438,27 +2438,34 @@ static int kvm_get_dirty_pages_log_range(unsigned long start_addr, > unsigned long offset, > unsigned long mem_size) > { > - unsigned int i, j, n = 0; > + unsigned int i, j, k, start, end; > unsigned char c; > unsigned long page_number, addr, addr1; > ram_addr_t ram_addr; > - unsigned int len = ((mem_size / TARGET_PAGE_SIZE) + 7) / 8; > + unsigned int len = ((mem_size / TARGET_PAGE_SIZE) + TARGET_LONG_BITS - 1) / > + TARGET_LONG_BITS; > + unsigned long *bitmap_ul = (unsigned long *)bitmap; > > /* > * bitmap-traveling is faster than memory-traveling (for addr...) > * especially when most of the memory is not dirty. > */ > for (i = 0; i < len; i++) { > - c = bitmap[i]; > - while (c > 0) { > - j = ffsl(c) - 1; > - c &= ~(1u << j); > - page_number = i * 8 + j; > - addr1 = page_number * TARGET_PAGE_SIZE; > - addr = offset + addr1; > - ram_addr = cpu_get_physical_page_desc(addr); > - cpu_physical_memory_set_dirty(ram_addr); > - n++; > + if (bitmap_ul[i] != 0) { > + start = i * TARGET_LONG_SIZE; > + end = (i + 1) * TARGET_LONG_SIZE; > Should be a host long size, not guest. This will fail when running a 32-bit qemu-system-x86_64 binary. > + for (j = start; j < end; j++) { > + c = bitmap[j]; > + while (c > 0) { > + k = ffsl(c) - 1; > + c &= ~(1u << k); > + page_number = j * 8 + k; > + addr1 = page_number * TARGET_PAGE_SIZE; > + addr = offset + addr1; > + ram_addr = cpu_get_physical_page_desc(addr); > + cpu_physical_memory_set_dirty(ram_addr); > + } > + } > Instead of using a nested loop if bitmap_ul[i] != 0, it is possible to use just a single loop (while (c > 0)), and process a long's worth of data. The only trickery is with big endian hosts, where the conversion from bit number to page number is a bit complicated. If we do this, we can convert the bitmap's type to unsigned long throughout, and avoid the casts.
> Please reuse the changelog when reposing a patch, this makes it easier > for me to apply it. Thanks. Will follow it from next time. > Should be a host long size, not guest. This will fail when running a > 32-bit qemu-system-x86_64 binary. Sorry. That was our mistake. > Instead of using a nested loop if bitmap_ul[i] != 0, it is possible to > use just a single loop (while (c> 0)), and process a long's worth of data. > > The only trickery is with big endian hosts, where the conversion from > bit number to page number is a bit complicated. To convert the bitmap from big endian to little endian, le_bswap macro in bswap.h seems useful, which is now undefined. What do you think about this approach? This is an example bitmap-traveling code using le_bswap: /* * bitmap-traveling is faster than memory-traveling (for addr...) * especially when most of the memory is not dirty. */ for (i = 0; i < len; i++) { if (bitmap_ul[i] != 0) { c = le_bswap(bitmap_ul[i], HOST_LONG_BITS); while (c > 0) { j = ffsl(c) - 1; c &= ~(1ul << j); page_number = i * HOST_LONG_BITS + j; addr1 = page_number * TARGET_PAGE_SIZE; addr = offset + addr1; ram_addr = cpu_get_physical_page_desc(addr); cpu_physical_memory_set_dirty(ram_addr); } } }
On 02/10/2010 11:55 AM, OHMURA Kei wrote: > >> Instead of using a nested loop if bitmap_ul[i] != 0, it is possible to >> use just a single loop (while (c> 0)), and process a long's worth of data. >> >> The only trickery is with big endian hosts, where the conversion from >> bit number to page number is a bit complicated. >> > To convert the bitmap from big endian to little endian, le_bswap macro in > bswap.h seems useful, which is now undefined. What do you think about this > approach? > > This is an example bitmap-traveling code using le_bswap: > /* > * bitmap-traveling is faster than memory-traveling (for addr...) > * especially when most of the memory is not dirty. > */ > for (i = 0; i < len; i++) { > if (bitmap_ul[i] != 0) { > c = le_bswap(bitmap_ul[i], HOST_LONG_BITS); > while (c > 0) { > j = ffsl(c) - 1; > c &= ~(1ul << j); > page_number = i * HOST_LONG_BITS + j; > addr1 = page_number * TARGET_PAGE_SIZE; > addr = offset + addr1; > ram_addr = cpu_get_physical_page_desc(addr); > cpu_physical_memory_set_dirty(ram_addr); > } > } > } > Yes, that solves the problem very neatly.
diff --git a/qemu-kvm.c b/qemu-kvm.c index a305907..d7474ea 100644 --- a/qemu-kvm.c +++ b/qemu-kvm.c @@ -2438,27 +2438,34 @@ static int kvm_get_dirty_pages_log_range(unsigned long start_addr, unsigned long offset, unsigned long mem_size) { - unsigned int i, j, n = 0; + unsigned int i, j, k, start, end; unsigned char c; unsigned long page_number, addr, addr1; ram_addr_t ram_addr; - unsigned int len = ((mem_size / TARGET_PAGE_SIZE) + 7) / 8; + unsigned int len = ((mem_size / TARGET_PAGE_SIZE) + TARGET_LONG_BITS - 1) / + TARGET_LONG_BITS; + unsigned long *bitmap_ul = (unsigned long *)bitmap; /* * bitmap-traveling is faster than memory-traveling (for addr...) * especially when most of the memory is not dirty. */ for (i = 0; i < len; i++) { - c = bitmap[i]; - while (c > 0) { - j = ffsl(c) - 1; - c &= ~(1u << j); - page_number = i * 8 + j; - addr1 = page_number * TARGET_PAGE_SIZE; - addr = offset + addr1; - ram_addr = cpu_get_physical_page_desc(addr); - cpu_physical_memory_set_dirty(ram_addr); - n++; + if (bitmap_ul[i] != 0) { + start = i * TARGET_LONG_SIZE; + end = (i + 1) * TARGET_LONG_SIZE; + for (j = start; j < end; j++) { + c = bitmap[j]; + while (c > 0) { + k = ffsl(c) - 1; + c &= ~(1u << k); + page_number = j * 8 + k; + addr1 = page_number * TARGET_PAGE_SIZE; + addr = offset + addr1; + ram_addr = cpu_get_physical_page_desc(addr); + cpu_physical_memory_set_dirty(ram_addr); + } + } } } return 0;
Thank you for your comments. We have implemented the code which applied your comments. This is patch for qemu-kvm.c. Signed-off-by: OHMURA Kei <ohmura.kei@lab.ntt.co.jp> --- qemu-kvm.c | 31 +++++++++++++++++++------------ 1 files changed, 19 insertions(+), 12 deletions(-)