xbzrle: don't check the value in the vm ram repeatedly

Message ID	1396079541-7112-1-git-send-email-arei.gonglei@huawei.com
State	New
Headers	show Return-Path: <qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org> From: <arei.gonglei@huawei.com> To: <qemu-devel@nongnu.org> Date: Sat, 29 Mar 2014 15:52:21 +0800 Message-ID: <1396079541-7112-1-git-send-email-arei.gonglei@huawei.com> MIME-Version: 1.0 Content-Type: text/plain Cc: ChenLiang <chenliang88@huawei.com>, weidong.huang@huawei.com, quintela@redhat.com, dgilbert@redhat.com, owasserm@redhat.com, Gonglei <arei.gonglei@huawei.com>, mrhines@us.ibm.com, pbonzini@redhat.com Subject: [Qemu-devel] [PATCH] xbzrle: don't check the value in the vm ram repeatedly Precedence: list Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org

Gonglei (Arei) March 29, 2014, 7:52 a.m. UTC

From: ChenLiang <chenliang88@huawei.com>

xbzrle_encode_buffer checks the value in the ram repeatedly.
It is risk if runs xbzrle_encode_buffer on changing data.
And it is not necessary.

Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 xbzrle.c | 4 ++++
 1 file changed, 4 insertions(+)

Eric Blake March 29, 2014, 1:53 p.m. UTC | #1

On 03/29/2014 01:52 AM, arei.gonglei@huawei.com wrote:
> From: ChenLiang <chenliang88@huawei.com>
> 
> xbzrle_encode_buffer checks the value in the ram repeatedly.
> It is risk if runs xbzrle_encode_buffer on changing data.
> And it is not necessary.
> 
> Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Signed-off-by: ChenLiang <chenliang88@huawei.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> ---
>  xbzrle.c | 4 ++++
>  1 file changed, 4 insertions(+)

Insufficient.  Observe what we did at line 52 when looking for the zero-run:

>>         /* word at a time for speed */
>>         if (!res) {
>>             while (i < slen &&
>>                    (*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {

This dereferences 8 bytes at new_buf[i] (on a 64-bit machine)...

>>                 i += sizeof(long);
>>                 zrun_len += sizeof(long);
>>             }
>> 
>>             /* go over the rest */
>>             while (i < slen && old_buf[i] == new_buf[i]) {

...and this also dereferences those same bytes.  But your argument is
that new_buf[i] is volatile.

You MUST read new_buf[i] into a temporary variable, and if the first
read found a difference, then the "go over the rest" analysis must be on
that temporary, rather than going back to reading directly from new_buf.

陈梁 March 29, 2014, 2:15 p.m. UTC | #2

> On 03/29/2014 01:52 AM, arei.gonglei@huawei.com wrote:
>> From: ChenLiang <chenliang88@huawei.com>
>> 
>> xbzrle_encode_buffer checks the value in the ram repeatedly.
>> It is risk if runs xbzrle_encode_buffer on changing data.
>> And it is not necessary.
>> 
>> Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> Signed-off-by: ChenLiang <chenliang88@huawei.com>
>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>> ---
>> xbzrle.c | 4 ++++
>> 1 file changed, 4 insertions(+)
> 
> Insufficient.  Observe what we did at line 52 when looking for the zero-run:
> 
>>>        /* word at a time for speed */
>>>        if (!res) {
>>>            while (i < slen &&
>>>                   (*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
> 
> This dereferences 8 bytes at new_buf[i] (on a 64-bit machine)...
> 
>>>                i += sizeof(long);
>>>                zrun_len += sizeof(long);
>>>            }
>>> 
>>>            /* go over the rest */
>>>            while (i < slen && old_buf[i] == new_buf[i]) {
> 
> ...and this also dereferences those same bytes.  But your argument is
> that new_buf[i] is volatile.
> 
> You MUST read new_buf[i] into a temporary variable, and if the first
> read found a difference, then the "go over the rest" analysis must be on
> that temporary, rather than going back to reading directly from new_buf.
> 
It is ok, we just need to guarantee that the pages in cache are same to the page in dest side. 
Don‘t care about whether they are same to src side. Because the modified pages during this
time will be sent at next time.
> -- 
> Eric Blake   eblake redhat com    +1-919-301-3266
> Libvirt virtualization library http://libvirt.org
>

Eric Blake March 29, 2014, 2:26 p.m. UTC | #3

On 03/29/2014 08:15 AM, 陈梁 wrote:

>>
>> Insufficient.  Observe what we did at line 52 when looking for the zero-run:
>>
>>>>        /* word at a time for speed */
>>>>        if (!res) {
>>>>            while (i < slen &&
>>>>                   (*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
>>
>> This dereferences 8 bytes at new_buf[i] (on a 64-bit machine)...
>>
>>>>                i += sizeof(long);
>>>>                zrun_len += sizeof(long);
>>>>            }
>>>>
>>>>            /* go over the rest */
>>>>            while (i < slen && old_buf[i] == new_buf[i]) {
>>
>> ...and this also dereferences those same bytes.  But your argument is
>> that new_buf[i] is volatile.
>>
>> You MUST read new_buf[i] into a temporary variable, and if the first
>> read found a difference, then the "go over the rest" analysis must be on
>> that temporary, rather than going back to reading directly from new_buf.
>>
> It is ok, we just need to guarantee that the pages in cache are same to the page in dest side. 
> Don‘t care about whether they are same to src side. Because the modified pages during this
> time will be sent at next time.

No, it is not okay - if you are reading the same bytes from new_buf, and
new_buf is volatile, then you MUST read each byte of new_buf exactly
once.  Consider this sequence:

old_buf contains

00 00 00 00 00 00 00 00

new_buf initially contains

00 00 00 00 00 00 00 01

so the zrun detection spots that the two buffers are different on the
8-byte read.  But then another thread modifies new_buf to be all zeros.
 Now the "go over the rest" loop does 8 reads, expecting to find a
difference, but it can't find one.

You really need to do the "go over the rest" loop on an 8-byte temporary
variable.  Ever since your patch made new_buf be a volatile buffer,
rather than a static copy, you MUST visit each byte of new_buf exactly once.

Eric Blake March 29, 2014, 3:33 p.m. UTC | #4

On 03/29/2014 09:00 AM, 陈梁 wrote:

>>> You really need to do the "go over the rest" loop on an 8-byte temporary
>>> variable.  Ever since your patch made new_buf be a volatile buffer,
>>> rather than a static copy, you MUST visit each byte of new_buf exactly once.
>>>
>> hmm, thanks. get it. Maybe we can do it like this

> sorry, it should like this
> 
>         /* word at a time for speed */
>         if (!res) {
>             while (i < slen &&
>                    (*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
>                 i += sizeof(long);
>                 zrun_len += sizeof(long);
>             }
> 
>             /* go over the rest */
>             //while (i < slen && old_buf[i] == new_buf[i]) {
>             //    zrun_len++;
>             //    i++;
>             //}
>         }

No, that's not right either.  Once you have made a decision based on
something you have read, you must proceed with that decision.  In the
case, you broke out of the while loop because you found a difference.
Now you must report the location of that difference, as of the time
where you read it and without re-reading from new_buf.  The ONLY viable
solution is to read the contents of new_buf into a temporary, then do
all subsequent actions on that temporary.

Eric Blake March 29, 2014, 3:38 p.m. UTC | #5

On 03/29/2014 09:12 AM, 陈梁 wrote:

>        /* word at a time for speed */
>        if (!res) {
>            while (i < slen &&
>                   (*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
>                i += sizeof(long);
>                zrun_len += sizeof(long);
>            }
> 
>            /* go over the rest */
>            //while (i < slen && old_buf[i] == new_buf[i]) {
>            //    zrun_len++;
>            //    i++;
>            //}
>        }
>        i += sizeof(long);
>        nzrun_len += sizeof(long);

That does not produce a minimal compression (it treats all 8 bytes as
different, even if 7 of them were the same).  It might be a viable
solution if the extra overhead of the additional bytes sent over the
wire is less than the overhead saved by not re-checking the temporary
variable to determine a better compression.  But I'm not convinced - it
seems that reading memory into a register, then doing multiple
operations on that cached value, will be a win (that is, minimal
compression matters, because time spent transmitting bytes over the
network is slower than time spent calculating how to avoid bytes to
transmit).

Dr. David Alan Gilbert March 31, 2014, 9:40 a.m. UTC | #6

* ???? (chenliang0016@icloud.com) wrote:

> It is ok, we just need to guarantee that the pages in cache are same to the page in dest side. 
> Don??t care about whether they are same to src side. Because the modified pages during this
> time will be sent at next time.

It's an interesting, if unusual, observation; it means we can send
completely bogus data at this point because we know it will get
overwritten later; I think the requirements are:

  1) That we meet the protocol (which seems to require that the run lengths are
     not allowed to be 0)
  2) That we don't get stuck in any loops or go over the end of the page
     (I think this means we have to be careful of those byte loops within
     the word-at-a-time cases)
  3) The page that ends up in our xbzrle cache must match the destination
     page, since the next cycle of xbzrle will use it as reference.

Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

xbzrle: don't check the value in the vm ram repeatedly

Commit Message

Comments

Patch