Patchwork [v13,10/13] Add xbzrle_encode_buffer and xbzrle_decode_buffer functions

Submitter Orit Wasserman
Date June 27, 2012, 10:34 a.m.
Message ID <1340793261-11400-11-git-send-email-owasserm@redhat.com>
Permalink /patch/167606/
State New

Comments

Orit Wasserman - June 27, 2012, 10:34 a.m.
Signed-off-by: Benoit Hudzia <benoit.hudzia@sap.com>
Signed-off-by: Petter Svard <petters@cs.umu.se>
Signed-off-by: Aidan Shribman <aidan.shribman@sap.com>
Signed-off-by: Orit Wasserman <owasserm@redhat.com>
---
 migration.h |    4 ++
 savevm.c    |  145 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 149 insertions(+), 0 deletions(-)
Eric Blake - June 27, 2012, 7:31 p.m.
On 06/27/2012 04:34 AM, Orit Wasserman wrote:
> Signed-off-by: Benoit Hudzia <benoit.hudzia@sap.com>
> Signed-off-by: Petter Svard <petters@cs.umu.se>
> Signed-off-by: Aidan Shribman <aidan.shribman@sap.com>
> Signed-off-by: Orit Wasserman <owasserm@redhat.com>

> +int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
> +                         uint8_t *dst, int dlen)
> +{
> +    uint32_t zrun_len = 0, nzrun_len = 0;
> +    int d = 0 , i = 0;

s/0 ,/0,/

> +    int res, xor;

Bug.  You are declaring xor as an int, but assigning it by operations on
a long, and making conditional jumps based on the assignment.  If
sizeof(long) > sizeof(int), you will have truncation cause false positives.
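
To make the truncation concrete, here is a minimal sketch.  It uses
fixed-width types rather than long and int so that the behavior does not
depend on the host ABI; the names narrow_xor_misses_difference and
narrow_xor_demo are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true when two 64-bit words differ but the
 * difference becomes invisible once the xor is stored in 32 bits.
 */
static bool narrow_xor_misses_difference(uint64_t a, uint64_t b)
{
    uint64_t wide = a ^ b;             /* full-width xor, like long on LP64 */
    uint32_t narrow = (uint32_t)wide;  /* keeps only the low 32 bits */

    return wide != 0 && narrow == 0;
}

/* Small demonstration of the two interesting cases. */
void narrow_xor_demo(void)
{
    /* Words differ only in the upper half: the narrow xor reads as zero. */
    assert(narrow_xor_misses_difference(0x0000000100000000ULL, 0));
    /* Words differ in the low half: the narrow xor still sees it. */
    assert(!narrow_xor_misses_difference(1, 0));
}
```

With the xor held in a variable as wide as the loads, the first case can
no longer be mistaken for part of a zero run.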

> +    uint8_t *nzrun_start = NULL;

The algorithm will misbehave (run quite slowly or even cause SIGBUS,
depending on the host architecture) if old_buf and new_buf have
different mis-alignments or if slen is not a multiple of sizeof(long),
so guaranteeing alignment up front saves us the effort of dealing with
corner cases.  You need to add something like this:

g_assert(!((uintptr_t)old_buf & (sizeof(long) - 1)) &&
         !((uintptr_t)new_buf & (sizeof(long) - 1)) &&
         !(slen & (sizeof(long) - 1)));

After all, we are only ever using this function to compress page data,
and pages should be aligned on entry as well as being a nice multiple in
length.
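
For illustration, the same precondition can be written as a predicate
that a unit test can call without aborting.  This is only a sketch using
plain assert() in place of g_assert(); the function names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when p sits on a sizeof(long) boundary. */
static bool is_long_aligned(const void *p)
{
    return ((uintptr_t)p & (sizeof(long) - 1)) == 0;
}

/* Hypothetical predicate mirroring the g_assert() suggested above. */
static bool xbzrle_inputs_aligned(const uint8_t *old_buf,
                                  const uint8_t *new_buf, int slen)
{
    return slen >= 0 &&
           is_long_aligned(old_buf) &&
           is_long_aligned(new_buf) &&
           ((size_t)slen & (sizeof(long) - 1)) == 0;
}

/* How an encoder entry point might enforce it. */
void xbzrle_check_inputs(const uint8_t *old_buf, const uint8_t *new_buf,
                         int slen)
{
    assert(xbzrle_inputs_aligned(old_buf, new_buf, slen));
}
```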

> +
> +    while (i < slen) {
> +        /* overflow */
> +        if (d + 2 > dlen) {
> +            return -1;
> +        }
> +
> +        /* not aligned to sizeof(long) */
> +        res = (slen - i) % sizeof(long);
> +        if (res) {
> +            while (!(old_buf[i] ^ new_buf[i]) && ++i <= res) {

Using '^' implies that you care about the difference, but in reality,
you only care about whether there is a difference, not what the
difference is.  I would use '==' instead of '^' since some architectures
can compute (in)equality more efficiently than xor.

while (old_buf[i] == new_buf[i] && ++i <= res) {

> +                zrun_len++;
> +            }
> +        }
> +
> +        xor = (*(long *)(old_buf + i)) ^ (*(long *)(new_buf + i));
> +        while (i <= slen - sizeof(long) && !xor) {
> +            i += sizeof(long);
> +            zrun_len += sizeof(long);
> +            xor = (*(long *)(old_buf + i)) ^ (*(long *)(new_buf + i));
> +        }

Again, you aren't using xor for its value, so you can simplify this
entire loop:

while (i < slen && *(long *)(old_buf + i) == *(long*)(new_buf + i)) {
    i += sizeof(long);
    zrun_len += sizeof(long);
}

> +
> +        /* not aligned to sizeof(long) */
> +        res = (slen - i) % sizeof(long);
> +        if (res) {
> +            while (!(old_buf[i] ^ new_buf[i]) && ++i <= res) {
> +                zrun_len++;
> +            }
> +        }

Can have same simplification as above.

> +
> +        /* buffer unchanged */
> +        if (zrun_len == slen) {
> +            return 0;
> +        }
> +
> +        /* skip last zero run */
> +        if (i == slen + 1) {
> +            return d;
> +        }
> +
> +        d += uleb128_encode_small(dst + d, zrun_len);
> +
> +        zrun_len = 0;
> +        nzrun_start = new_buf + i;
> +
> +        /* not aligned to sizeof(long) */
> +        res = (slen - i) % sizeof(long);
> +        if (res) {
> +            while ((old_buf[i] ^ new_buf[i]) != 0 && ++i <= res) {
> +                nzrun_len++;
> +            }
> +        }

Can have same simplification as above, except using != instead of == for
checking for bytes that differ.

> +
> +        xor = (*(long *)(old_buf + i)) ^ (*(long *)(new_buf + i));
> +        while (i <= slen - sizeof(long) && xor != 0) {
> +            i += sizeof(long);
> +            nzrun_len += sizeof(long);
> +            xor = (*(long *)(old_buf + i)) ^ (*(long *)(new_buf + i));
> +        }

Unlike the zrun (where checking that two longs are equal means you can
increment by sizeof(long)), checking for the end of an nzrun requires
finding a 0 byte embedded within the xor of the two longs.  And that is
no longer something trivially easy to write.  Source code of strcmp() to
the rescue:

long mask = 0x0101010101010101ULL; /* truncation to 32-bit long okay */
xor = *(long *)(old_buf + i) ^ *(long *)(new_buf + i);
if ((xor - mask) & ~xor & (mask << 7)) {
    /* found the end of an nzrun within the current long */
} else {
    i += sizeof(long);
    nzrun_len += sizeof(long);
}

and wrap that in the appropriate while loop.
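
One possible shape for that loop, as a sketch only: it uses uint64_t
instead of long, loads words with memcpy() to sidestep alignment and
aliasing questions, and finishes with a byte loop.  The names
has_zero_byte and nzrun_length are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* True if any byte of v is zero (the strcmp-style trick shown above). */
static bool has_zero_byte(uint64_t v)
{
    const uint64_t ones = 0x0101010101010101ULL;

    return ((v - ones) & ~v & (ones << 7)) != 0;
}

/*
 * Length of the run of differing bytes starting at offset i.
 * A zero byte in the xor of two words means some byte pair is equal,
 * so the run ends somewhere inside that word.
 */
static size_t nzrun_length(const uint8_t *old_buf, const uint8_t *new_buf,
                           size_t i, size_t slen)
{
    size_t start = i;

    assert(i <= slen);

    /* Skip whole words while every byte in them differs. */
    while (slen - i >= sizeof(uint64_t)) {
        uint64_t a, b;

        memcpy(&a, old_buf + i, sizeof(a));
        memcpy(&b, new_buf + i, sizeof(b));
        if (has_zero_byte(a ^ b)) {
            break;
        }
        i += sizeof(uint64_t);
    }

    /* Locate the exact end of the run byte by byte. */
    while (i < slen && old_buf[i] != new_buf[i]) {
        i++;
    }

    return i - start;
}
```

The byte loop at the end also covers a tail shorter than one word, so
no separate remainder handling is needed in this form.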

> +
> +        /* not aligned to sizeof(long) */
> +        res = (slen - i) % sizeof(long);
> +        if (res) {
> +            while ((old_buf[i] ^ new_buf[i]) != 0 && ++i <= res) {
> +                nzrun_len++;
> +            }
> +        }
> +
> +        /* overflow */
> +        if (d + nzrun_len + 2 > dlen) {
> +            return -1;
> +        }
> +
> +        d += uleb128_encode_small(dst + d, nzrun_len);
> +        memcpy(dst + d, nzrun_start, nzrun_len);
> +        d += nzrun_len;
> +        nzrun_len = 0;
> +    }
> +
> +    return d;
> +}

Definitely some work before the encode is correct.  I know you tested
migration speed, but did you test migration accuracy?  I'm afraid that
you ended up benchmarking with memory corruption rather than actual
migration.

> +
> +int xbzrle_decode_buffer(uint8_t *src, int slen, uint8_t *dst, int dlen)

No comment before the function?

> +{
> +    int i = 0, d = 0;
> +    int ret;
> +    uint32_t count = 0;
> +
> +    while (i < slen) {
> +
> +        /* zrun */
> +        ret = uleb128_decode_small(src + i, &count);

If the user sends you malicious data, then they can arrange for the last
byte in the buffer to have bit 0x80 set, and uleb128_decode_small will
happily read not only the last byte of the buffer, but the next byte
beyond; this could even SIGSEGV if the buffer ended on a page boundary.
Thankfully, it's trivial to prevent this from being a problem in
practice: our encoding requires us to end on an nzrun with non-zero
length, and therefore you are guaranteed that when decoding a zrun,
there will always be at least two more bytes in a valid stream, so you
should add this prior to the decode:

/* invalid input, since there must be room for an nzrun */
if (i == slen - 1) {
    return -1;
}
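
An alternative is to make the decoder itself aware of how many bytes
remain.  The sketch below assumes that a "small" ULEB128 length
occupies at most two bytes (14 bits of payload); that limit, the
function name, and the extra avail parameter are assumptions for
illustration, not the signature used in this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical bounded variant: decode a 1- or 2-byte ULEB128 value from
 * in[0..avail).  Returns the number of bytes consumed, or -1 if the
 * input is truncated or needs more than two bytes.
 */
static int uleb128_decode_small_bounded(const uint8_t *in, size_t avail,
                                        uint32_t *n)
{
    assert(in != NULL && n != NULL);

    if (avail < 1) {
        return -1;
    }
    if (!(in[0] & 0x80)) {
        *n = in[0];                 /* single-byte value, 0..127 */
        return 1;
    }
    if (avail < 2) {
        return -1;                  /* continuation bit set, no next byte */
    }
    if (in[1] & 0x80) {
        return -1;                  /* would exceed 14 bits */
    }
    *n = (uint32_t)(in[0] & 0x7f) | ((uint32_t)in[1] << 7);
    return 2;
}
```

A caller would pass slen - i as avail, so a stray 0x80 in the last byte
is rejected instead of triggering a read past the buffer.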

> +        if (ret < 0) {
> +            return -1;
> +        }
> +        i += ret;
> +        d += count;
> +
> +        /* overflow */
> +        if (d > dlen) {
> +            return -1;
> +        }
> +
> +        /* completed decoding */
> +        if (i == slen - 1) {
> +            return d;
> +        }

It looks like you thought about the idea of bad input, but you didn't
get the check quite right - you don't want to return success here.  This
is another place where a valid stream has at least two bytes (the
smallest possible nzrun is exactly two bytes, 1 for the length, and 1
byte of data).  I would replace this with:

/* invalid input, since an nzrun must have data */
if (i >= slen - 1) {
    return -1;
}

> +
> +        /* nzrun */
> +        ret = uleb128_decode_small(src + i, &count);
> +        if (ret < 0) {
> +            return -1;
> +        }
> +        i += ret;
> +
> +        /* overflow */
> +        if (d + count > dlen) {
> +            return -1;
> +        }

Missing one more overflow check - a malicious input could cause us to
try to read beyond slen.  You need:

if (i + count > slen) {
    return -1;
}
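
Both bounds for the memcpy() can be checked together.  The sketch below
compares count against the remaining space by subtraction, so the check
does not rely on i + count or d + count staying in range; it assumes
0 <= i <= slen and 0 <= d <= dlen already hold at that point, and the
helper name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true when copying count bytes from src + i to
 * dst + d stays inside both buffers.
 */
static bool nzrun_copy_in_bounds(int i, int slen, int d, int dlen,
                                 uint32_t count)
{
    assert(i >= 0 && i <= slen && d >= 0 && d <= dlen);

    return count <= (uint32_t)(slen - i) &&   /* enough input left */
           count <= (uint32_t)(dlen - d);     /* enough output left */
}
```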

> +
> +        memcpy(dst + d , src + i, count);
> +        d += count;
> +        i += count;
> +    }
> +
> +    return d;
> +}
>
Orit Wasserman - June 28, 2012, 7:16 a.m.
On 06/27/2012 10:31 PM, Eric Blake wrote:
> On 06/27/2012 04:34 AM, Orit Wasserman wrote:
>> Signed-off-by: Benoit Hudzia <benoit.hudzia@sap.com>
>> Signed-off-by: Petter Svard <petters@cs.umu.se>
>> Signed-off-by: Aidan Shribman <aidan.shribman@sap.com>
>> Signed-off-by: Orit Wasserman <owasserm@redhat.com>
> 
>> +int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
>> +                         uint8_t *dst, int dlen)
>> +{
>> +    uint32_t zrun_len = 0, nzrun_len = 0;
>> +    int d = 0 , i = 0;
> 
> s/0 ,/0,/
> 
>> +    int res, xor;
> 
> Bug.  You are declaring xor as an int, but assigning it by operations on
> a long, and making conditional jumps based on the assignment.  If
> sizeof(long) > sizeof(int), you will have truncation cause false positives.
> 
Correct.
>> +    uint8_t *nzrun_start = NULL;
> 
> The algorithm will misbehave (run quite slowly or even cause SIGBUS,
> depending on the host architecture) if old_buf and new_buf have
> different mis-alignments or if slen is not a multiple of sizeof(long),
> so guaranteeing alignment up front saves us the effort of dealing with
> corner cases.  You need to add something like this:
> 
> g_assert(!((uintptr_t)old_buf & (sizeof(long) - 1)) &&
>          !((uintptr_t)new_buf & (sizeof(long) - 1)) &&
>          !(slen & (sizeof(long) - 1)));
> 
> After all, we are only ever using this function to compress page data,
> and pages should be aligned on entry as well as being a nice multiple in
> length.
I will add the check. 
> 
>> +
>> +    while (i < slen) {
>> +        /* overflow */
>> +        if (d + 2 > dlen) {
>> +            return -1;
>> +        }
>> +
>> +        /* not aligned to sizeof(long) */
>> +        res = (slen - i) % sizeof(long);
>> +        if (res) {
>> +            while (!(old_buf[i] ^ new_buf[i]) && ++i <= res) {
> 
> Using '^' implies that you care about the difference, but in reality,
> you only care about whether there is a difference, not what the
> difference is.  I would use '==' instead of '^' since some architectures
> can compute (in)equality more efficiently than xor.
> 
OK
> while (old_buf[i] == new_buf[i] && ++i <= res) {
> 
>> +                zrun_len++;
>> +            }
>> +        }
>> +
>> +        xor = (*(long *)(old_buf + i)) ^ (*(long *)(new_buf + i));
>> +        while (i <= slen - sizeof(long) && !xor) {
>> +            i += sizeof(long);
>> +            zrun_len += sizeof(long);
>> +            xor = (*(long *)(old_buf + i)) ^ (*(long *)(new_buf + i));
>> +        }
> 
> Again, you aren't using xor for its value, so you can simplify this
> entire loop:
> 
> while (i < slen && *(long *)(old_buf + i) == *(long*)(new_buf + i)) {
>     i += sizeof(long);
>     zrun_len += sizeof(long);
> }
> 
>> +
>> +        /* not aligned to sizeof(long) */
>> +        res = (slen - i) % sizeof(long);
>> +        if (res) {
>> +            while (!(old_buf[i] ^ new_buf[i]) && ++i <= res) {
>> +                zrun_len++;
>> +            }
>> +        }
> 
> Can have same simplification as above.
> 
>> +
>> +        /* buffer unchanged */
>> +        if (zrun_len == slen) {
>> +            return 0;
>> +        }
>> +
>> +        /* skip last zero run */
>> +        if (i == slen + 1) {
>> +            return d;
>> +        }
>> +
>> +        d += uleb128_encode_small(dst + d, zrun_len);
>> +
>> +        zrun_len = 0;
>> +        nzrun_start = new_buf + i;
>> +
>> +        /* not aligned to sizeof(long) */
>> +        res = (slen - i) % sizeof(long);
>> +        if (res) {
>> +            while ((old_buf[i] ^ new_buf[i]) != 0 && ++i <= res) {
>> +                nzrun_len++;
>> +            }
>> +        }
> 
> Can have same simplification as above, except using != instead of == for
> checking for bytes that differ.
> 
>> +
>> +        xor = (*(long *)(old_buf + i)) ^ (*(long *)(new_buf + i));
>> +        while (i <= slen - sizeof(long) && xor != 0) {
>> +            i += sizeof(long);
>> +            nzrun_len += sizeof(long);
>> +            xor = (*(long *)(old_buf + i)) ^ (*(long *)(new_buf + i));
>> +        }
> 
> Unlike the zrun (where checking that two longs are equal means you can
> increment by sizeof(long)), checking for the end of an nzrun requires
> finding a 0 byte embedded within the xor of the two longs.  And that is
> no longer something trivially easy to write.  Source code of strcmp() to
> the rescue:
> 
> long mask = 0x0101010101010101ULL; /* truncation to 32-bit long okay */
> xor = *(long *)(old_buf + i) ^ *(long *)(new_buf + i);
> if ((xor - mask) & ~xor & (mask << 7)) {
>     /* found the end of an nzrun within the current long */
> } else {
>     i += sizeof(long);
>     nzrun_len += sizeof(long);
> }
> 
> and wrap that in the appropriate while loop.
> 
I will look into it.
>> +
>> +        /* not aligned to sizeof(long) */
>> +        res = (slen - i) % sizeof(long);
>> +        if (res) {
>> +            while ((old_buf[i] ^ new_buf[i]) != 0 && ++i <= res) {
>> +                nzrun_len++;
>> +            }
>> +        }
>> +
>> +        /* overflow */
>> +        if (d + nzrun_len + 2 > dlen) {
>> +            return -1;
>> +        }
>> +
>> +        d += uleb128_encode_small(dst + d, nzrun_len);
>> +        memcpy(dst + d, nzrun_start, nzrun_len);
>> +        d += nzrun_len;
>> +        nzrun_len = 0;
>> +    }
>> +
>> +    return d;
>> +}
> 
> Definitely some work before the encode is correct.  I know you tested
> migration speed, but did you test migration accuracy?  I'm afraid that
> you ended up benchmarking with memory corruption rather than actual
> migration.
I have a unit test for encoding/decoding (finding memory corruption can be very hard).
It is not complete yet; I am adding test scenarios as I go.

> 
>> +
>> +int xbzrle_decode_buffer(uint8_t *src, int slen, uint8_t *dst, int dlen)
> 
> No comment before the function?
> 
>> +{
>> +    int i = 0, d = 0;
>> +    int ret;
>> +    uint32_t count = 0;
>> +
>> +    while (i < slen) {
>> +
>> +        /* zrun */
>> +        ret = uleb128_decode_small(src + i, &count);
> 
> If the user sends you malicious data, then they can arrange for the last
> byte in the buffer to have bit 0x80 set, and uleb128_decode_small will
> happily read not only the last byte of the buffer, but the next byte
> beyond; this could even SIGSEGV if the buffer ended on a page boundary.
> Thankfully, it's trivial to prevent this from being a problem in
> practice: our encoding requires us to end on an nzrun with non-zero
> length, and therefore you are guaranteed that when decoding a zrun,
> there will always be at least two more bytes in a valid stream, so you
> should add this prior to the decode:
I will add a check that, if only one byte is left, its 0x80 bit is not set.
> 
> /* invalid input, since there must be room for an nzrun */
> if (i == slen - 1) {
>     return -1;
> }
> 
>> +        if (ret < 0) {
>> +            return -1;
>> +        }
>> +        i += ret;
>> +        d += count;
>> +
>> +        /* overflow */
>> +        if (d > dlen) {
>> +            return -1;
>> +        }
>> +
>> +        /* completed decoding */
>> +        if (i == slen - 1) {
>> +            return d;
>> +        }
> 
> It looks like you thought about the idea of bad input, but you didn't
> get the check quite right - you don't want to return success here.  This
> is another place where a valid stream has at least two bytes (the
> smallest possible nzrun is exactly two bytes, 1 for the length, and 1
> byte of data).  I would replace this with:
> 
> /* invalid input, since an nzrun must have data */
> if (i >= slen - 1) {
>     return -1;
> }
As an optimization, the last zero run is not sent, so this case is valid.
> 
>> +
>> +        /* nzrun */
>> +        ret = uleb128_decode_small(src + i, &count);
>> +        if (ret < 0) {
>> +            return -1;
>> +        }
>> +        i += ret;
>> +
>> +        /* overflow */
>> +        if (d + count > dlen) {
>> +            return -1;
>> +        }
> 
> Missing one more overflow check - a malicious input could cause us to
> try to read beyond slen.  You need:
> 
> if (i + count > slen) {
>     return -1;
> }
> 
I will add it.

Orit
>> +
>> +        memcpy(dst + d , src + i, count);
>> +        d += count;
>> +        i += count;
>> +    }
>> +
>> +    return d;
>> +}
>>
>

Patch

diff --git a/migration.h b/migration.h
index 1ae99f1..7582ecb 100644
--- a/migration.h
+++ b/migration.h
@@ -99,4 +99,8 @@  void migrate_add_blocker(Error *reason);
  */
 void migrate_del_blocker(Error *reason);
 
+int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
+                         uint8_t *dst, int dlen);
+int xbzrle_decode_buffer(uint8_t *src, int slen, uint8_t *dst, int dlen);
+
 #endif
diff --git a/savevm.c b/savevm.c
index d1d9020..26e7901 100644
--- a/savevm.c
+++ b/savevm.c
@@ -2374,3 +2374,148 @@  void vmstate_register_ram_global(MemoryRegion *mr)
 {
     vmstate_register_ram(mr, NULL);
 }
+
+/*
+  page = zrun nzrun
+       | zrun nzrun page
+
+  zrun = length
+
+  nzrun = length byte...
+
+  length = uleb128 encoded integer
+ */
+int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
+                         uint8_t *dst, int dlen)
+{
+    uint32_t zrun_len = 0, nzrun_len = 0;
+    int d = 0 , i = 0;
+    int res, xor;
+    uint8_t *nzrun_start = NULL;
+
+    while (i < slen) {
+        /* overflow */
+        if (d + 2 > dlen) {
+            return -1;
+        }
+
+        /* not aligned to sizeof(long) */
+        res = (slen - i) % sizeof(long);
+        if (res) {
+            while (!(old_buf[i] ^ new_buf[i]) && ++i <= res) {
+                zrun_len++;
+            }
+        }
+
+        xor = (*(long *)(old_buf + i)) ^ (*(long *)(new_buf + i));
+        while (i <= slen - sizeof(long) && !xor) {
+            i += sizeof(long);
+            zrun_len += sizeof(long);
+            xor = (*(long *)(old_buf + i)) ^ (*(long *)(new_buf + i));
+        }
+
+        /* not aligned to sizeof(long) */
+        res = (slen - i) % sizeof(long);
+        if (res) {
+            while (!(old_buf[i] ^ new_buf[i]) && ++i <= res) {
+                zrun_len++;
+            }
+        }
+
+        /* buffer unchanged */
+        if (zrun_len == slen) {
+            return 0;
+        }
+
+        /* skip last zero run */
+        if (i == slen + 1) {
+            return d;
+        }
+
+        d += uleb128_encode_small(dst + d, zrun_len);
+
+        zrun_len = 0;
+        nzrun_start = new_buf + i;
+
+        /* not aligned to sizeof(long) */
+        res = (slen - i) % sizeof(long);
+        if (res) {
+            while ((old_buf[i] ^ new_buf[i]) != 0 && ++i <= res) {
+                nzrun_len++;
+            }
+        }
+
+        xor = (*(long *)(old_buf + i)) ^ (*(long *)(new_buf + i));
+        while (i <= slen - sizeof(long) && xor != 0) {
+            i += sizeof(long);
+            nzrun_len += sizeof(long);
+            xor = (*(long *)(old_buf + i)) ^ (*(long *)(new_buf + i));
+        }
+
+        /* not aligned to sizeof(long) */
+        res = (slen - i) % sizeof(long);
+        if (res) {
+            while ((old_buf[i] ^ new_buf[i]) != 0 && ++i <= res) {
+                nzrun_len++;
+            }
+        }
+
+        /* overflow */
+        if (d + nzrun_len + 2 > dlen) {
+            return -1;
+        }
+
+        d += uleb128_encode_small(dst + d, nzrun_len);
+        memcpy(dst + d, nzrun_start, nzrun_len);
+        d += nzrun_len;
+        nzrun_len = 0;
+    }
+
+    return d;
+}
+
+int xbzrle_decode_buffer(uint8_t *src, int slen, uint8_t *dst, int dlen)
+{
+    int i = 0, d = 0;
+    int ret;
+    uint32_t count = 0;
+
+    while (i < slen) {
+
+        /* zrun */
+        ret = uleb128_decode_small(src + i, &count);
+        if (ret < 0) {
+            return -1;
+        }
+        i += ret;
+        d += count;
+
+        /* overflow */
+        if (d > dlen) {
+            return -1;
+        }
+
+        /* completed decoding */
+        if (i == slen - 1) {
+            return d;
+        }
+
+        /* nzrun */
+        ret = uleb128_decode_small(src + i, &count);
+        if (ret < 0) {
+            return -1;
+        }
+        i += ret;
+
+        /* overflow */
+        if (d + count > dlen) {
+            return -1;
+        }
+
+        memcpy(dst + d , src + i, count);
+        d += count;
+        i += count;
+    }
+
+    return d;
+}