
migration: use XBZRLE only after bulk stage

Message ID 5130ADC0.9070402@dlhnet.de
State New

Commit Message

Peter Lieven March 1, 2013, 1:31 p.m. UTC
at the beginning of migration all pages are marked dirty and
in the first round a bulk migration of all pages is performed.

currently all these pages are copied to the page cache regardless
if there are frequently updated or not. this doesn't make sense
since most of these pages are never transferred again.

this patch changes the XBZRLE transfer to only be used after
the bulk stage has been completed. that means a page is added
to the page cache the second time it is transferred and XBZRLE
can benefit from the third time of transfer.

since the page cache is likely smaller than the number of pages
its also likely that in the second round the page is missing in the
cache due to collisions in the bulk phase.

on the other hand a lot of unneccssary mallocs, memdups and frees
are saved.

Signed-off-by: Peter Lieven <pl@kamp.de>
---
  arch_init.c |    5 ++++-
  1 file changed, 4 insertions(+), 1 deletion(-)
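
The reasoning above hinges on how the XBZRLE path interacts with the page cache: a page can only be delta-encoded against a copy cached on an earlier transfer, so skipping the cache during the bulk round shifts the whole scheme by one transfer of each page. A minimal, self-contained sketch of that behaviour follows; the names, sizes and the direct-mapped cache are illustrative assumptions, not the QEMU implementation.

  #include <stdint.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>

  #define PAGE_SIZE 4096

  static uint8_t *cache[16];                /* tiny direct-mapped page cache */

  static size_t send_page(uint64_t addr, const uint8_t *page, int bulk_stage)
  {
      uint8_t **slot = &cache[(addr / PAGE_SIZE) % 16];

      if (bulk_stage) {
          return PAGE_SIZE;                 /* bulk stage: full copy, no caching */
      }
      if (*slot == NULL) {                  /* first post-bulk transfer: prime the cache */
          *slot = malloc(PAGE_SIZE);
          if (!*slot) {
              abort();
          }
          memcpy(*slot, page, PAGE_SIZE);
          return PAGE_SIZE;                 /* still a full copy on the wire */
      }
      size_t delta = 0;                     /* later transfers: count changed bytes as a
                                             * stand-in for the XBZRLE delta size */
      for (size_t i = 0; i < PAGE_SIZE; i++) {
          delta += (*slot)[i] != page[i];
      }
      memcpy(*slot, page, PAGE_SIZE);       /* keep the cached copy current */
      return delta ? delta : 1;
  }

  int main(void)
  {
      static uint8_t page[PAGE_SIZE];

      printf("1st transfer (bulk):  %zu bytes\n", send_page(0, page, 1));
      printf("2nd transfer (prime): %zu bytes\n", send_page(0, page, 0));
      page[100] ^= 0xff;                    /* guest dirties a single byte */
      printf("3rd transfer (delta): %zu bytes\n", send_page(0, page, 0));
      return 0;
  }

For a page that is only ever sent once (the bulk stage), the cache insert would be a malloc and memdup that never pays off, which is exactly what the patch avoids.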

Comments

Eric Blake March 1, 2013, 1:52 p.m. UTC | #1
On 03/01/2013 06:31 AM, Peter Lieven wrote:
> at the beginning of migration all pages are marked dirty and
> in the first round a bulk migration of all pages is performed.
> 
> currently all these pages are copied to the page cache regardless
> if there are frequently updated or not. this doesn't make sense
> since most of these pages are never transferred again.
> 
> this patch changes the XBZRLE transfer to only be used after
> the bulk stage has been completed. that means a page is added
> to the page cache the second time it is transferred and XBZRLE
> can benefit from the third time of transfer.
> 
> since the page cache is likely smaller than the number of pages
> its also likely that in the second round the page is missing in the
> cache due to collisions in the bulk phase.
> 
> on the other hand a lot of unneccssary mallocs, memdups and frees

s/unneccssary/unnecessary/

> are saved.
> 
> Signed-off-by: Peter Lieven <pl@kamp.de>

Do you have any benchmark numbers?  At any rate, the explanation seems
sound, so a benchmark should show this.

> ---
>  arch_init.c |    5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)

Reviewed-by: Eric Blake <eblake@redhat.com>
Paolo Bonzini March 1, 2013, 1:53 p.m. UTC | #2
Il 01/03/2013 14:52, Eric Blake ha scritto:
> On 03/01/2013 06:31 AM, Peter Lieven wrote:
>> at the beginning of migration all pages are marked dirty and
>> in the first round a bulk migration of all pages is performed.
>>
>> currently all these pages are copied to the page cache regardless
>> if there are frequently updated or not. this doesn't make sense
>> since most of these pages are never transferred again.
>>
>> this patch changes the XBZRLE transfer to only be used after
>> the bulk stage has been completed. that means a page is added
>> to the page cache the second time it is transferred and XBZRLE
>> can benefit from the third time of transfer.
>>
>> since the page cache is likely smaller than the number of pages
>> its also likely that in the second round the page is missing in the
>> cache due to collisions in the bulk phase.
>>
>> on the other hand a lot of unneccssary mallocs, memdups and frees
> 
> s/unneccssary/unnecessary/
> 
>> are saved.
>>
>> Signed-off-by: Peter Lieven <pl@kamp.de>
> 
> Do you have any benchmark numbers?  At any rate, the explanation seems
> sound, so a benchmark should show this.

It probably would be much less of a problem with the pending patches to
move RAM migration out of the big QEMU lock.  However, the explanation
makes sense.

Paolo

>> ---
>>  arch_init.c |    5 ++++-
>>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> Reviewed-by: Eric Blake <eblake@redhat.com>
>
Peter Lieven March 1, 2013, 2:06 p.m. UTC | #3
On 01.03.2013 14:52, Eric Blake wrote:
> On 03/01/2013 06:31 AM, Peter Lieven wrote:
>> at the beginning of migration all pages are marked dirty and
>> in the first round a bulk migration of all pages is performed.
>>
>> currently all these pages are copied to the page cache regardless
>> if there are frequently updated or not. this doesn't make sense
>> since most of these pages are never transferred again.
>>
>> this patch changes the XBZRLE transfer to only be used after
>> the bulk stage has been completed. that means a page is added
>> to the page cache the second time it is transferred and XBZRLE
>> can benefit from the third time of transfer.
>>
>> since the page cache is likely smaller than the number of pages
>> its also likely that in the second round the page is missing in the
>> cache due to collisions in the bulk phase.
>>
>> on the other hand a lot of unneccssary mallocs, memdups and frees
>
> s/unneccssary/unnecessary/
>
>> are saved.
>>
>> Signed-off-by: Peter Lieven <pl@kamp.de>
>
> Do you have any benchmark numbers?  At any rate, the explanation seems
> sound, so a benchmark should show this.

Do you have a particular test pattern in mind? If there is nothing going on
in the VM XBZRLE will not be better than normal copy at all.

Otherwise you will have N xbzrle misses and 0 xbzrle pages without the patch
and 0 xbzrle misses and 0 xbzrle pages with the patch.

Peter

>
>> ---
>>   arch_init.c |    5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>
> Reviewed-by: Eric Blake <eblake@redhat.com>
>
Eric Blake March 1, 2013, 2:08 p.m. UTC | #4
On 03/01/2013 07:06 AM, Peter Lieven wrote:
>> Do you have any benchmark numbers?  At any rate, the explanation seems
>> sound, so a benchmark should show this.
> 
> Do you have a particular test pattern in mind? If there is nothing going on
> in the VM XBZRLE will not be better than normal copy at all.
> 
> Otherwise you will have N xbzrle misses and 0 xbzrle pages without the
> patch
> and 0 xbzrle misses and 0 xbzrle pages with the patch.

How about a migration of a guest running the synthetic r/w load
generator in docs/xbzrle.txt?
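
The load generator referred to here is a short C program in the QEMU docs that keeps dirtying guest memory while the migration runs, so that there are pages for XBZRLE to re-encode. A comparable sketch, run inside the guest, looks like this; the working-set size and write pattern are assumptions for illustration, not the exact program from docs/xbzrle.txt.

  #include <stdio.h>
  #include <stdlib.h>

  #define PAGE_SIZE 4096
  #define PAGES     ((256L * 1024 * 1024) / PAGE_SIZE)   /* ~256 MiB working set */

  int main(void)
  {
      unsigned char *buf = calloc(PAGES, PAGE_SIZE);

      if (!buf) {
          return 1;
      }
      for (;;) {                            /* run until the migration completes */
          for (long i = 0; i < PAGES; i++) {
              buf[i * PAGE_SIZE]++;         /* dirty one byte per page */
          }
          putchar('.');
          fflush(stdout);
      }
  }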
Peter Lieven March 1, 2013, 2:13 p.m. UTC | #5
On 01.03.2013 15:08, Eric Blake wrote:
> On 03/01/2013 07:06 AM, Peter Lieven wrote:
>>> Do you have any benchmark numbers?  At any rate, the explanation seems
>>> sound, so a benchmark should show this.
>>
>> Do you have a particular test pattern in mind? If there is nothing going on
>> in the VM XBZRLE will not be better than normal copy at all.
>>
>> Otherwise you will have N xbzrle misses and 0 xbzrle pages without the
>> patch
>> and 0 xbzrle misses and 0 xbzrle pages with the patch.
>
> How about a migration of a guest running the synthetic r/w load
> generator in docs/xbzrle.txt?
>
Good idea. I will leave max downtime and bandwidth at default values.

Would you be happy with 1GB vRAM and 256MB page cache?

Peter
Eric Blake March 1, 2013, 2:23 p.m. UTC | #6
On 03/01/2013 07:13 AM, Peter Lieven wrote:
> On 01.03.2013 15:08, Eric Blake wrote:
>> On 03/01/2013 07:06 AM, Peter Lieven wrote:
>>>> Do you have any benchmark numbers?  At any rate, the explanation seems
>>>> sound, so a benchmark should show this.
>>>
>>> Do you have a particular test pattern in mind? If there is nothing
>>> going on
>>> in the VM XBZRLE will not be better than normal copy at all.
>>>
>>> Otherwise you will have N xbzrle misses and 0 xbzrle pages without the
>>> patch
>>> and 0 xbzrle misses and 0 xbzrle pages with the patch.
>>
>> How about a migration of a guest running the synthetic r/w load
>> generator in docs/xbzrle.txt?
>>
> Good idea. I will leave max downtime and bandwidth at default values.
> 
> Would you be happy with 1GB vRAM and 256MB page cache?

Sure - just any run that you can do that shows before and after numbers,
and that is described well enough to be a reproducible test.  Final
statistics on the migration (pages transferred, cache hits and misses,
etc) and time spent on the migration will hopefully show an improvement,
but most important is that they do not show a regression.
Peter Lieven March 1, 2013, 2:50 p.m. UTC | #7
On 01.03.2013 15:23, Eric Blake wrote:
> On 03/01/2013 07:13 AM, Peter Lieven wrote:
>> On 01.03.2013 15:08, Eric Blake wrote:
>>> On 03/01/2013 07:06 AM, Peter Lieven wrote:
>>>>> Do you have any benchmark numbers?  At any rate, the explanation seems
>>>>> sound, so a benchmark should show this.
>>>>
>>>> Do you have a particular test pattern in mind? If there is nothing
>>>> going on
>>>> in the VM XBZRLE will not be better than normal copy at all.
>>>>
>>>> Otherwise you will have N xbzrle misses and 0 xbzrle pages without the
>>>> patch
>>>> and 0 xbzrle misses and 0 xbzrle pages with the patch.
>>>
>>> How about a migration of a guest running the synthetic r/w load
>>> generator in docs/xbzrle.txt?
>>>
>> Good idea. I will leave max downtime and bandwidth at default values.
>>
>> Would you be happy with 1GB vRAM and 256MB page cache?
>
> Sure - just any run that you can do that shows before and after numbers,
> and that is described well enough to be a reproducible test.  Final
> statistics on the migration (pages transferred, cache hits and misses,
> etc) and time spent on the migration will hopefully show an improvement,
> but most important is that they do not show a regression.
>

just a quick test on my desktop:

~/git/qemu$ x86_64-softmmu/qemu-system-x86_64 -k de -enable-kvm -m 1024 -drive if=virtio,file=iscsi://172.21.200.31/iqn.2001-05.com.equallogic:0-8a0906-16470e107-713001aa6de511e0-001-test/0 -vnc :1 -boot dc -monitor stdio

using ubuntu 12.04.1 desktop and the example from docs/xbzrle.txt

a) with the patch

(qemu) info migrate
capabilities: xbzrle: on
Migration status: completed
total time: 22185 milliseconds
downtime: 29 milliseconds
transferred ram: 706034 kbytes
remaining ram: 0 kbytes
total ram: 1057216 kbytes
duplicate: 108556 pages
normal: 175146 pages
normal bytes: 700584 kbytes
cache size: 67108864 bytes
xbzrle transferred: 3127 kbytes
xbzrle pages: 117811 pages
xbzrle cache miss: 18750
xbzrle overflow : 0

b) without the patch

(qemu) info migrate
capabilities: xbzrle: on
Migration status: completed
total time: 22410 milliseconds
downtime: 21 milliseconds
transferred ram: 721318 kbytes
remaining ram: 0 kbytes
total ram: 1057216 kbytes
duplicate: 105553 pages
normal: 179589 pages
normal bytes: 718356 kbytes
cache size: 67108864 bytes
xbzrle transferred: 630 kbytes
xbzrle pages: 21527 pages
xbzrle cache miss: 179589
xbzrle overflow : 0
Eric Blake March 1, 2013, 4:04 p.m. UTC | #8
On 03/01/2013 07:50 AM, Peter Lieven wrote:

> just a quick test on my desktop:
> 
> ~/git/qemu$ x86_64-softmmu/qemu-system-x86_64 -k de -enable-kvm -m 1024
> -drive
> if=virtio,file=iscsi://172.21.200.31/iqn.2001-05.com.equallogic:0-8a0906-16470e107-713001aa6de511e0-001-test/0
> -vnc :1 -boot dc -monitor stdio
> 
> using ubuntu 12.04.1 desktop and the example from docs/xbzrle.txt

Thanks.  Reformatting a bit:

> 
> a) with the patch

designated with '+'

> b) without the patch

designated with '-'

+ total time: 22185 milliseconds
- total time: 22410 milliseconds

Shaved 0.3 seconds, better than 1%!

+ downtime: 29 milliseconds
- downtime: 21 milliseconds

Not sure why downtime seemed worse, but probably not the end of the world.

+ transferred ram: 706034 kbytes
- transferred ram: 721318 kbytes

Fewer bytes sent - good.

+ remaining ram: 0 kbytes
- remaining ram: 0 kbytes
+ total ram: 1057216 kbytes
- total ram: 1057216 kbytes
+ duplicate: 108556 pages
- duplicate: 105553 pages
+ normal: 175146 pages
- normal: 179589 pages
+ normal bytes: 700584 kbytes
- normal bytes: 718356 kbytes

Fewer normal bytes...

+ cache size: 67108864 bytes
- cache size: 67108864 bytes
+ xbzrle transferred: 3127 kbytes
- xbzrle transferred: 630 kbytes

...and more compressed pages sent - good.

+ xbzrle pages: 117811 pages
- xbzrle pages: 21527 pages
+ xbzrle cache miss: 18750
- xbzrle cache miss: 179589

And very good improvement on the cache miss rate.

+ xbzrle overflow : 0
- xbzrle overflow : 0

Thanks, this proves it's a good patch.
Peter Lieven March 4, 2013, 5:10 p.m. UTC | #9
Am 01.03.2013 um 17:04 schrieb Eric Blake <eblake@redhat.com>:

> On 03/01/2013 07:50 AM, Peter Lieven wrote:
> 
>> just a quick test on my desktop:
>> 
>> ~/git/qemu$ x86_64-softmmu/qemu-system-x86_64 -k de -enable-kvm -m 1024
>> -drive
>> if=virtio,file=iscsi://172.21.200.31/iqn.2001-05.com.equallogic:0-8a0906-16470e107-713001aa6de511e0-001-test/0
>> -vnc :1 -boot dc -monitor stdio
>> 
>> using ubuntu 12.04.1 desktop and the example from docs/xbzrle.txt
> 
> Thanks.  Reformatting a bit:
> 
>> 
>> a) with the patch
> 
> designated with '+'
> 
>> b) without the patch
> 
> designated with '-'
> 
> + total time: 22185 milliseconds
> - total time: 22410 milliseconds
> 
> Shaved 0.3 seconds, better than 1%!
> 
> + downtime: 29 milliseconds
> - downtime: 21 milliseconds
> 
> Not sure why downtime seemed worse, but probably not the end of the world.
> 
> + transferred ram: 706034 kbytes
> - transferred ram: 721318 kbytes
> 
> Fewer bytes sent - good.
> 
> + remaining ram: 0 kbytes
> - remaining ram: 0 kbytes
> + total ram: 1057216 kbytes
> - total ram: 1057216 kbytes
> + duplicate: 108556 pages
> - duplicate: 105553 pages
> + normal: 175146 pages
> - normal: 179589 pages
> + normal bytes: 700584 kbytes
> - normal bytes: 718356 kbytes
> 
> Fewer normal bytes...
> 
> + cache size: 67108864 bytes
> - cache size: 67108864 bytes
> + xbzrle transferred: 3127 kbytes
> - xbzrle transferred: 630 kbytes
> 
> ...and more compressed pages sent - good.
> 
> + xbzrle pages: 117811 pages
> - xbzrle pages: 21527 pages
> + xbzrle cache miss: 18750
> - xbzrle cache miss: 179589
> 
> And very good improvement on the cache miss rate.
> 
> + xbzrle overflow : 0
> - xbzrle overflow : 0
> 
> Thanks, this proves it's a good patch.

At least for the artificially generated load.

@Paolo: Have you seen my other question? Can the same page be transferred in the same round
more than once? If yes, I have to improve the patch for that case.

Peter
Paolo Bonzini March 4, 2013, 5:33 p.m. UTC | #10
Il 04/03/2013 18:10, Peter Lieven ha scritto:
>> > + xbzrle overflow : 0
>> > - xbzrle overflow : 0
>> > 
>> > Thanks, this proves it's a good patch.
> At least for the artificially generated load.
> 
> @Paolo: Have you seen my other question? Can the same page be transferred in the same round
> more than once? If yes, I have to improve the patch for that case.

I have no idea, I was leaving that to Orit.

How would it affect the patch?

Paolo
Peter Lieven March 4, 2013, 5:39 p.m. UTC | #11
Am 04.03.2013 um 18:33 schrieb Paolo Bonzini <pbonzini@redhat.com>:

> Il 04/03/2013 18:10, Peter Lieven ha scritto:
>>>> + xbzrle overflow : 0
>>>> - xbzrle overflow : 0
>>>> 
>>>> Thanks, this proves it's a good patch.
>> At least for the artificially generated load.
>> 
>> @Paolo: Have you seen my other question? Can the same page be transferred in the same round
>> more than once? If yes, I have to improve the patch for that case.
> 
> I have no idea, I was leaving that to Orit.
> 
> How would it affect the patch?

The patch only starts using XBZRLE once the first complete round has finished. If it were possible to transfer
the same page more than once in the first round, the patch would significantly slow that case down.

I will dig into the code and try to find out as well.

Peter


> 
> Paolo
Orit Wasserman March 4, 2013, 6:51 p.m. UTC | #12
On 03/04/2013 07:10 PM, Peter Lieven wrote:
> 
> Am 01.03.2013 um 17:04 schrieb Eric Blake <eblake@redhat.com>:
> 
>> On 03/01/2013 07:50 AM, Peter Lieven wrote:
>>
>>> just a quick test on my desktop:
>>>
>>> ~/git/qemu$ x86_64-softmmu/qemu-system-x86_64 -k de -enable-kvm -m 1024
>>> -drive
>>> if=virtio,file=iscsi://172.21.200.31/iqn.2001-05.com.equallogic:0-8a0906-16470e107-713001aa6de511e0-001-test/0
>>> -vnc :1 -boot dc -monitor stdio
>>>
>>> using ubuntu 12.04.1 desktop and the example from docs/xbzrle.txt
>>
>> Thanks.  Reformatting a bit:
>>
>>>
>>> a) with the patch
>>
>> designated with '+'
>>
>>> b) without the patch
>>
>> designated with '-'
>>
>> + total time: 22185 milliseconds
>> - total time: 22410 milliseconds
>>
>> Shaved 0.3 seconds, better than 1%!
>>
>> + downtime: 29 milliseconds
>> - downtime: 21 milliseconds
>>
>> Not sure why downtime seemed worse, but probably not the end of the world.
>>
>> + transferred ram: 706034 kbytes
>> - transferred ram: 721318 kbytes
>>
>> Fewer bytes sent - good.
>>
>> + remaining ram: 0 kbytes
>> - remaining ram: 0 kbytes
>> + total ram: 1057216 kbytes
>> - total ram: 1057216 kbytes
>> + duplicate: 108556 pages
>> - duplicate: 105553 pages
>> + normal: 175146 pages
>> - normal: 179589 pages
>> + normal bytes: 700584 kbytes
>> - normal bytes: 718356 kbytes
>>
>> Fewer normal bytes...
>>
>> + cache size: 67108864 bytes
>> - cache size: 67108864 bytes
>> + xbzrle transferred: 3127 kbytes
>> - xbzrle transferred: 630 kbytes
>>
>> ...and more compressed pages sent - good.
>>
>> + xbzrle pages: 117811 pages
>> - xbzrle pages: 21527 pages
>> + xbzrle cache miss: 18750
>> - xbzrle cache miss: 179589
>>
>> And very good improvement on the cache miss rate.
>>
>> + xbzrle overflow : 0
>> - xbzrle overflow : 0
>>
>> Thanks, this proves it's a good patch.
> 
> At least for the artificially generated load.
> 
> @Paolo: Have you seen my other question? Can the same page be transferred in the same round
> more than once? If yes, I have to improve the patch for that case.
The same page can't be transferred more than once in the same round.

Orit
> 
> Peter
>

Patch

diff --git a/arch_init.c b/arch_init.c
index 8da868b..24241e0 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -347,6 +347,7 @@  static ram_addr_t last_offset;
  static unsigned long *migration_bitmap;
  static uint64_t migration_dirty_pages;
  static uint32_t last_version;
+static bool ram_bulk_stage;

  static inline
  ram_addr_t migration_bitmap_find_and_reset_dirty(MemoryRegion *mr,
@@ -451,6 +452,7 @@  static int ram_save_block(QEMUFile *f, bool last_stage)
              if (!block) {
                  block = QTAILQ_FIRST(&ram_list.blocks);
                  complete_round = true;
+                ram_bulk_stage = false;
              }
          } else {
              uint8_t *p;
@@ -467,7 +469,7 @@  static int ram_save_block(QEMUFile *f, bool last_stage)
                                              RAM_SAVE_FLAG_COMPRESS);
                  qemu_put_byte(f, *p);
                  bytes_sent += 1;
-            } else if (migrate_use_xbzrle()) {
+            } else if (!ram_bulk_stage && migrate_use_xbzrle()) {
                  current_addr = block->offset + offset;
                  bytes_sent = save_xbzrle_page(f, p, current_addr, block,
                                                offset, cont, last_stage);
@@ -554,6 +556,7 @@  static void reset_ram_globals(void)
      last_sent_block = NULL;
      last_offset = 0;
      last_version = ram_list.version;
+    ram_bulk_stage = true;
  }

  #define MAX_WAIT 50 /* ms, half buffered_file limit */