Patchwork [v15,2/9] Add XBZRLE documentation

Submitter Orit Wasserman
Date July 5, 2012, 12:51 p.m.
Message ID <1341492709-13897-3-git-send-email-owasserm@redhat.com>
Permalink /patch/169158/
State New

Comments

Orit Wasserman - July 5, 2012, 12:51 p.m.
Signed-off-by: Orit Wasserman <owasserm@redhat.com>
---
 docs/xbzrle.txt |  136 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 136 insertions(+), 0 deletions(-)
 create mode 100644 docs/xbzrle.txt
Eric Blake - July 5, 2012, 1:24 p.m.
On 07/05/2012 06:51 AM, Orit Wasserman wrote:
> Signed-off-by: Orit Wasserman <owasserm@redhat.com>
> ---
>  docs/xbzrle.txt |  136 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 136 insertions(+), 0 deletions(-)
>  create mode 100644 docs/xbzrle.txt
> 

> +
> +Example
> +old buffer:
> +1001 zeros
> +05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 00 00 6b 00 6d
> +3074 zeros

This _still_ doesn't add up to 4096:

1001 + 20 + 3074 = 4095

> +
> +new buffer:
> +1001 zeros
> +01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 00 00 67 00 69
> +3704 zeros

Still a transposition error.

Also, this still has the flaw that it is too weak of an example - the
only unchanged bytes happen to also be zero bytes to begin with; it
would be much nicer if the example included at least one non-zero byte
that did not change between old and new.

> +
> +encoded buffer:
> +
> +encoded length 24
> +e9 07 0f 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 03 01 67 01 01 69
                                                         ^^
That says you have a zrun of 3 bytes, but the example only shows a zrun
of 2 bytes.

It feels like I'm pulling teeth to get a good example.  If you will just
squash in the following (hand-written) diff below, you will then have
4096 bytes in both old and new buffers, and your encoded buffer listing
a zrun of 3 will be correct, plus you will be demonstrating a non-zero
byte that remained unchanged.

@@ ???,??? @@
 Example
 old buffer:
 1001 zeros
-05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 00 00 6b 00 6d
+05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 68 00 00 6b 00 6d
 3074 zeros

 new buffer:
 1001 zeros
-01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 00 00 67 00 69
-3704 zeros
+01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 68 00 00 67 00 69
+3074 zeros

 encoded buffer:
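
The corrected example can be checked mechanically. Below is a minimal
sketch of a decoder, assuming only the format described in the patch
(alternating zrun/nzrun pairs with ULEB128 lengths). The names
read_uleb128, xbzrle_decode_sketch and check_review_example are
hypothetical and are not taken from the QEMU sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read one ULEB128 integer from src at *pos. Returns 0 on success,
 * -1 if the input is truncated or the value needs more than 5 bytes. */
static int read_uleb128(const uint8_t *src, size_t len, size_t *pos,
                        uint32_t *out)
{
    uint32_t value = 0;
    unsigned shift = 0;

    while (*pos < len && shift < 32) {
        uint8_t b = src[(*pos)++];
        value |= (uint32_t)(b & 0x7f) << shift;
        if (!(b & 0x80)) {
            *out = value;
            return 0;
        }
        shift += 7;
    }
    return -1;
}

/* page holds the old content on entry and the new content on success.
 * Returns 0 on success, -1 on malformed input. */
static int xbzrle_decode_sketch(const uint8_t *enc, size_t enc_len,
                                uint8_t *page, size_t page_len)
{
    size_t in = 0, out = 0;

    assert(enc != NULL && page != NULL);
    while (in < enc_len) {
        uint32_t run;

        /* zrun: unchanged bytes, keep the old content and skip ahead */
        if (read_uleb128(enc, enc_len, &in, &run) || run > page_len - out) {
            return -1;
        }
        out += run;

        /* nzrun: a length followed by that many bytes of new data */
        if (read_uleb128(enc, enc_len, &in, &run) ||
            run > page_len - out || run > enc_len - in) {
            return -1;
        }
        memcpy(page + out, enc + in, run);
        in += run;
        out += run;
    }
    return 0;
}

/* Rebuild the corrected old/new buffers from the diff above and check that
 * the 24-byte encoded buffer from the patch turns old into new. */
static int check_review_example(void)
{
    static const uint8_t old_mid[21] = {
        0x05, 0x06, 0x07, 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f,
        0x10, 0x11, 0x12, 0x13, 0x68, 0x00, 0x00, 0x6b, 0x00, 0x6d };
    static const uint8_t new_mid[21] = {
        0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0x0a, 0x0b,
        0x0c, 0x0d, 0x0e, 0x0f, 0x68, 0x00, 0x00, 0x67, 0x00, 0x69 };
    static const uint8_t enc[24] = {
        0xe9, 0x07, 0x0f, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08,
        0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, 0x03, 0x01, 0x67, 0x01,
        0x01, 0x69 };
    uint8_t page[4096] = { 0 }, expect[4096] = { 0 };

    memcpy(page + 1001, old_mid, sizeof(old_mid));   /* 1001 + 21 + 3074 */
    memcpy(expect + 1001, new_mid, sizeof(new_mid));
    return xbzrle_decode_sketch(enc, sizeof(enc), page, sizeof(page)) == 0 &&
           memcmp(page, expect, sizeof(page)) == 0;
}
```

With the corrected buffers, check_review_example() should return 1: the
leading e9 07 is the ULEB128 form of 1001, and the zrun of 3 skips the
unchanged bytes 68 00 00.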
Orit Wasserman - July 5, 2012, 5:22 p.m.
On 07/05/2012 04:24 PM, Eric Blake wrote:
> On 07/05/2012 06:51 AM, Orit Wasserman wrote:
>> Signed-off-by: Orit Wasserman <owasserm@redhat.com>
>> ---
>>  docs/xbzrle.txt |  136 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 files changed, 136 insertions(+), 0 deletions(-)
>>  create mode 100644 docs/xbzrle.txt
>>
> 
>> +
>> +Example
>> +old buffer:
>> +1001 zeros
>> +05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 00 00 6b 00 6d
>> +3074 zeros
> 
> This _still_ doesn't add up to 4096:
> 
> 1001 + 20 + 3074 = 4095
> 
>> +
>> +new buffer:
>> +1001 zeros
>> +01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 00 00 67 00 69
>> +3704 zeros
> 
> Still a transposition error.
> 
> Also, this still has the flaw that it is too weak of an example - the
> only unchanged bytes happen to also be zero bytes to begin with; it
> would be much nicer if the example included at least one non-zero byte
> that did not change between old and new.
> 
>> +
>> +encoded buffer:
>> +
>> +encoded length 24
>> +e9 07 0f 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 03 01 67 01 01 69
>                                                          ^^
> That says you have a zrun of 3 bytes, but the example only shows a zrun
> of 2 bytes.
> 
> It feels like I'm pulling teeth to get a good example.  If you will just
> squash in the following (hand-written) diff below, you will then have
> 4096 bytes in both old and new buffers, and your encoded buffer listing
> a zrun of 3 will be correct, plus you will be demonstrating a non-zero
> byte that remained unchanged.
> 
> @@ ???,??? @@
>  Example
>  old buffer:
>  1001 zeros
> -05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 00 00 6b 00 6d
> +05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 68 00 00 6b 00 6d
>  3074 zeros
> 
>  new buffer:
>  1001 zeros
> -01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 00 00 67 00 69
> -3704 zeros
> +01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 68 00 00 67 00 69
> +3074 zeros
> 
>  encoded buffer:
> 
I will use this example,
thanks,
Orit

Patch

diff --git a/docs/xbzrle.txt b/docs/xbzrle.txt
new file mode 100644
index 0000000..cb567e6
--- /dev/null
+++ b/docs/xbzrle.txt
@@ -0,0 +1,136 @@ 
+XBZRLE (Xor Based Zero Run Length Encoding)
+===========================================
+
+Using XBZRLE (Xor Based Zero Run Length Encoding) allows for the reduction
+of VM downtime and the total live-migration time of virtual machines.
+It is particularly useful for virtual machines running memory-write-intensive
+workloads that are typical of large enterprise applications such as SAP ERP
+Systems, and generally speaking for any application that uses a sparse memory
+update pattern.
+
+Instead of sending the changed guest memory page, this solution will send a
+compressed version of the updates, thus reducing the amount of data sent during
+live migration.
+In order to be able to calculate the update, the previous memory pages need to
+be stored on the source. Those pages are stored in a dedicated cache
+(hash table) and are
+accessed by their address.
+The larger the cache size, the better the chances are that the page has already
+been stored in the cache.
+A small cache size will result in a high cache miss rate.
+The cache size can be changed before and during migration.
+
+Format
+=======
+
+The compression format performs an XOR between the previous and current content
+of the page, where zero represents an unchanged value.
+The page data delta is represented by zero and non-zero runs.
+A zero run is represented by its length (in bytes).
+A non-zero run is represented by its length (in bytes) and the new data.
+The run length is encoded using ULEB128 (http://en.wikipedia.org/wiki/LEB128).
+
+There can be more than one valid encoding; the sender may send a longer encoding
+to reduce computation cost.
+
+page = zrun nzrun
+       | zrun nzrun page
+
+zrun = length
+
+nzrun = length byte...
+
+length = uleb128 encoded integer
+
+On the sender side XBZRLE is used as a compact delta encoding of page updates,
+retrieving the old page content from the cache (default size of 512 MB). The
+receiving side uses the existing page's content and XBZRLE to decode the new
+page's content.
+
+This work was originally based on research results published at
+VEE 2011: Evaluation of Delta Compression Techniques for Efficient Live
+Migration of Large Virtual Machines by Benoit, Svard, Tordsson and Elmroth.
+Additionally, the delta encoder XBRLE was further improved, resulting in
+XBZRLE.
+
+XBZRLE has a sustained bandwidth of 2-2.5 GB/s for typical workloads, making it
+ideal for in-line, real-time encoding such as is needed for live migration.
+
+Example
+old buffer:
+1001 zeros
+05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 00 00 6b 00 6d
+3074 zeros
+
+new buffer:
+1001 zeros
+01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 00 00 67 00 69
+3704 zeros
+
+encoded buffer:
+
+encoded length 24
+e9 07 0f 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 03 01 67 01 01 69
+
+Migration Capabilities
+======================
+In order to use XBZRLE, the destination QEMU version should be able to
+decode the new format.
+A new migration capabilities command is added that allows external management
+to query for its support.
+A typical use for the destination:
+    {qemu} info migrate_capabilities
+    {qemu} xbzrle, ...
+
+In order to enable capabilities for future live migration,
+a new command migrate_set_parameter is introduced:
+    {qemu} migrate_set_parameter xbzrle
+
+Usage
+======
+
+1. Activate XBZRLE
+2. Set the XBZRLE cache size - the cache size is in MBytes and should be a
+power of 2. The cache default value is 64MBytes.
+3. Start outgoing migration
+
+A typical usage scenario:
+On the incoming QEMU:
+    {qemu} migrate_set_parameter xbzrle on
+On the outgoing QEMU:
+    {qemu} migrate_set_parameter xbzrle on
+    {qemu} migrate_set_cachesize 256m
+    {qemu} migrate -d tcp:destination.host:4444
+    {qemu} info migrate
+    ...
+    cache size: 67108864 bytes
+    transferred ram-duplicate: A kbytes
+    transferred ram-normal: B kbytes
+    transferred ram-xbrle: C kbytes
+    overflow ram-xbrle: D pages
+    cache-miss ram-xbrle: E pages
+
+cache-miss: the number of cache misses to date - a high cache-miss rate
+indicates that the cache size is set too low.
+overflow: the number of overflows in the encoding, where the delta could
+not be compressed. This can happen if the changes in the pages are too large
+or there are many short changes; for example, changing every second byte (half a
+page).
+
+Testing: Testing indicated that live migration with XBZRLE completed in 110
+seconds, whereas without it the migration was not able to complete.
+
+A simple synthetic memory r/w load generator:
+..    #include <stdlib.h>
+..    #include <stdio.h>
+..    int main(void)
+..    {
+..        char *buf = (char *) calloc(4096, 4096);
+..        while (1) {
+..            int i;
+..            for (i = 0; i < 4096 * 4; i++) {
+..                buf[i * 4096 / 4]++;
+..            }
+..            printf(".");
+..        }
+..    }
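
The Format section of the patch describes the encoding only as a grammar.
As an illustration, here is a minimal sketch of a sender-side encoder,
assuming a simple byte-by-byte comparison and a trailing unchanged run that
is not encoded (as in the 24-byte example above). The names put_uleb128 and
xbzrle_encode_sketch are hypothetical; this is not the implementation from
the patch series, and since more than one valid encoding exists, the real
encoder may produce different output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write value as ULEB128 into dst (at most 5 bytes); returns bytes written. */
static size_t put_uleb128(uint8_t *dst, uint32_t value)
{
    size_t n = 0;

    do {
        uint8_t b = value & 0x7f;
        value >>= 7;
        if (value) {
            b |= 0x80;          /* more bytes follow */
        }
        dst[n++] = b;
    } while (value);
    return n;
}

/* Encode the difference between old_buf and new_buf (both len bytes) as
 * zrun/nzrun pairs. Returns the encoded length, or -1 if the result does
 * not fit in dst_len bytes (the "overflow" case described in the patch).
 * Assumption: identical buffers yield an empty encoding here; the document
 * does not say what the real sender does in that case. */
static long xbzrle_encode_sketch(const uint8_t *old_buf,
                                 const uint8_t *new_buf, size_t len,
                                 uint8_t *dst, size_t dst_len)
{
    size_t i = 0, out = 0;

    assert(len <= UINT32_MAX);
    while (i < len) {
        uint8_t hdr[10];
        size_t start = i, zrun, nzrun, h;

        /* zrun: count bytes whose XOR is zero, i.e. unchanged bytes */
        while (i < len && old_buf[i] == new_buf[i]) {
            i++;
        }
        if (i == len) {
            break;              /* trailing unchanged bytes are not sent */
        }
        zrun = i - start;

        /* nzrun: count changed bytes; their new values are sent verbatim */
        start = i;
        while (i < len && old_buf[i] != new_buf[i]) {
            i++;
        }
        nzrun = i - start;

        h = put_uleb128(hdr, (uint32_t)zrun);
        h += put_uleb128(hdr + h, (uint32_t)nzrun);
        if (h + nzrun > dst_len - out) {
            return -1;          /* overflow: delta does not fit */
        }
        memcpy(dst + out, hdr, h);
        out += h;
        memcpy(dst + out, new_buf + start, nzrun);
        out += nzrun;
    }
    return (long)out;
}
```

A caller would pass the cached old page and the current page, and fall back
to sending the full page when the function returns -1.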