Patchwork [v14,10/13] Add xbzrle_encode_buffer and xbzrle_decode_buffer functions

login
register
mail settings
Submitter Orit Wasserman
Date July 3, 2012, 1:52 p.m.
Message ID <1341323574-23206-11-git-send-email-owasserm@redhat.com>
Download mbox | patch
Permalink /patch/168818/
State New
Headers show

Comments

Orit Wasserman - July 3, 2012, 1:52 p.m.
Signed-off-by: Benoit Hudzia <benoit.hudzia@sap.com>
Signed-off-by: Petter Svard <petters@cs.umu.se>
Signed-off-by: Aidan Shribman <aidan.shribman@sap.com>
Signed-off-by: Orit Wasserman <owasserm@redhat.com>
---
 migration.h |    4 ++
 savevm.c    |  172 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 176 insertions(+), 0 deletions(-)
Eric Blake - July 3, 2012, 9:32 p.m.
On 07/03/2012 07:52 AM, Orit Wasserman wrote:
> Signed-off-by: Benoit Hudzia <benoit.hudzia@sap.com>
> Signed-off-by: Petter Svard <petters@cs.umu.se>
> Signed-off-by: Aidan Shribman <aidan.shribman@sap.com>
> Signed-off-by: Orit Wasserman <owasserm@redhat.com>

> +int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
> +                         uint8_t *dst, int dlen)
> +{
> +    uint32_t zrun_len = 0, nzrun_len = 0;
> +    int d = 0, i = 0, start;
> +    long res, xor;
> +    uint8_t *nzrun_start = NULL;
> +
> +    g_assert(!((uintptr_t)old_buf & (sizeof(long) - 1)) &&
> +             !((uintptr_t)new_buf & (sizeof(long) - 1)) &&
> +             !(slen & (sizeof(long) - 1)));

I know I suggested this in v13, so now I'm bike-shedding my own review,
but I think it might be a bit more legible as:

g_assert(!(((uintptr_t)old_buf) | ((uintptr_t)new_buf) | slen) &
           (sizeof(long) - 1)));

> +
> +    while (i < slen) {
> +        /* overflow */
> +        if (d + 2 > dlen) {
> +            return -1;
> +        }
> +
> +        /* not aligned to sizeof(long) */
> +        res = (slen - i) % sizeof(long);
> +        if (res) {
> +            start = i;
> +            while (!(old_buf[i] ^ new_buf[i]) && (i - start) <= res) {
> +                zrun_len++;
> +                i++;
> +            }
> +        }

I'm not sure why you need 'start'; this same block will do the same
amount of computation with less typing and register pressure:

res = (slen - i) % sizeof(long);
while (res && old_buf[i] == new_buf[i]) {
    zrun_len++;
    i++;
    res--;
}

> +
> +        if (zrun_len == res) {

If you use my above rewrite, then this condition changes to:

if (!res) {

> +            while (i <= (slen - sizeof(long)) &&

I'd shorten this to:

while (i < slen &&

> +                   (*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
> +                i += sizeof(long);
> +                zrun_len += sizeof(long);
> +            }
> +
> +            /* go over the rest */
> +            while (old_buf[i] == new_buf[i] && ++i <= slen) {

Buffer overrun: if the page ends on a zrun, then at this point, i ==
slen, and old_buf[i] is now pointing one past the end of the array.  You
need to swap the conditions, and only increment 'i' inside the {}, as in:

while (i < slen && old_buf[i] == new_buf[i]) {
    i++;

> +                zrun_len++;
> +            }
> +        }
> +
> +        /* buffer unchanged */
> +        if (zrun_len == slen) {
> +            return 0;
> +        }
> +
> +        /* skip last zero run */
> +        if (i == slen + 1) {

Buffer overrun.  The loop should terminate at 'i == slen', not at 'i ==
slen + 1', since old_buf[slen - 1] is the last addressable byte.

> +            return d;
> +        }
> +
> +        d += uleb128_encode_small(dst + d, zrun_len);
> +
> +        /* no nzrun */
> +        if (i == slen) {
> +            return d;
> +        }

This is the correct code but repeats the intent of what you were trying
just before the uleb128_encode_small().

> +
> +        zrun_len = 0;
> +        nzrun_start = new_buf + i;

Missing an overflow check here.  If there are fewer than 2 bytes before
we hit dlen, then there is no way that we can fit in an nzrun, so we
might as well quit here.

> +
> +        /* not aligned to sizeof(long) */
> +        res = (slen - i) % sizeof(long);
> +        if (res) {
> +            start = i;
> +            while (old_buf[i] != new_buf[i] && i - start < res) {
> +                i++;
> +                nzrun_len++;
> +            }
> +        }

Again, you can exploit the fact that slen is aligned, and shorten to:

res = (slen - i) % sizeof(long);
while (res && old_bf[i] != new_buf[i]) {
    i++;
    nzrun_len++;
    res--;
}

> +
> +        if (nzrun_len == res) {

and this would change to:

if (!res) {

> +            /* truncation to 32-bit long okay */
> +            long mask = 0x0101010101010101ULL;
> +            xor = *(long *)(old_buf + i) ^ *(long *)(new_buf + i);
> +            printf("cont\n");

Stray debugging comment.

> +            while (i <= (slen - sizeof(long))) {

shorter to write as:

while (i < slen) {

> +                if ((xor - mask) & ~xor & (mask << 7)) {
> +                    /* found the end of an nzrun within the current long */
> +                    while (old_buf[i] != new_buf[i] && ++i <= slen) {

No need to compare against slen here, since the outer while guarantees
that.  You can simplify to:

while (old_buf[i] != new_buf[i]) {
    i++;

> +                        nzrun_len++;
> +                    }
> +                    break;
> +                } else {
> +                    i += sizeof(long);
> +                    nzrun_len += sizeof(long);
> +                    xor = *(long *)(old_buf + i) ^ *(long *)(new_buf + i);
> +                }
> +            }
> +

At this point, you are guaranteed to have found the end of the nzrun
(either you found two equal bytes, or you hit slen), which means...

> +            while (old_buf[i] != new_buf[i] && ++i <= slen) {
> +                nzrun_len++;
> +            }

this while loop is dead code.  Furthermore, this while loop is a
potential out-of-bounds-read (remember, old_buf[slen] is not valid
memory), so nuking it is the way to go.

> +        }
> +
> +        /* overflow */
> +        if (d + nzrun_len + 2 > dlen) {
> +            return -1;

There is a corner case where this reports overflow even when there was
none - that is when the old_buf ends in an nzrun of 0x7f bytes or less,
so the encoding occupies only 1 byte, and the encoded version would
still fit in dlen.  I think you want to defer your overflow checking
until after...

> +        }
> +
> +        d += uleb128_encode_small(dst + d, nzrun_len);

...you know the encoded size.  But that only works if you made sure
earlier that there was room to encode nzrun_len (up to 2 bytes) in the
first place.

> +        memcpy(dst + d, nzrun_start, nzrun_len);
> +        d += nzrun_len;
> +        nzrun_len = 0;
> +    }
> +
> +    return d;
> +}
> +

Do you have any unit tests that pass in various inputs and compare
against expected outputs?

> +int xbzrle_decode_buffer(uint8_t *src, int slen, uint8_t *dst, int dlen)
> +{
> +    int i = 0, d = 0;
> +    int ret;
> +    uint32_t count = 0;
> +
> +    while (i < slen) {
> +
> +        /* zrun */
> +        if ((slen - i) < 2 && *(src + i) & 0x80) {
> +            return -1;
> +        }

You can simplify this - the protocol demands an nzrun after every zrun,
which means at this point in the decode, there must be at least 3 bytes
to be valid.  It's simpler to write:

if (slen - i < 2) {
    return -1;
}

> +
> +        ret = uleb128_decode_small(src + i, &count);
> +        if (ret < 0) {
> +            return -1;
> +        }

I'd also add a check that we only ever get a zero-length zrun at the
start of a buffer (anywhere else, a zero-length zrun is invalid); as in:

if (!count && i) {
    return -1;
}

> +        i += ret;
> +        d += count;
> +
> +        /* overflow */
> +        if (d > dlen) {
> +            return -1;
> +        }
> +
> +        /* completed decoding */

No, if we got here, then we were handed an invalid stream.  Remember,
the protocol says that every zrun must be followed by an nzrun.

> +        if (i == slen - 1) {
> +            return d;
> +        }
> +
> +        /* nzrun */
> +        if ((slen - i) < 2 && *(src + i) & 0x80) {
> +            return -1;
> +        }

Again, a valid nzrun is at least 2 bytes, so no need to worry about the
contents of *src, it is sufficient to check:

if (slen - i < 2) {
    return -1;
}

> +        ret = uleb128_decode_small(src + i, &count);
> +        if (ret < 0) {

An nzrun should be a non-zero value; I'd write this as (ret <= 0) to
rule out an attempt to pass a zero-length nzrun.

> +            return -1;
> +        }
> +        i += ret;
> +
> +        /* overflow */
> +        if (d + count > dlen) {
> +            return -1;
> +        }
> +
> +        memcpy(dst + d , src + i, count);
> +        d += count;
> +        i += count;
> +    }
> +
> +    return d;
> +}
>
Eric Blake - July 3, 2012, 9:39 p.m.
On 07/03/2012 03:32 PM, Eric Blake wrote:

>> +        ret = uleb128_decode_small(src + i, &count);
>> +        if (ret < 0) {
> 
> An nzrun should be a non-zero value; I'd write this as (ret <= 0) to
> rule out an attempt to pass a zero-length nzrun.

Correcting myself,

if (ret < 0 || !count) {

At this point, I think I will just bite the bullet and post a version of
this code that incorporates my review.
Eric Blake - July 4, 2012, 12:20 a.m.
On 07/03/2012 03:39 PM, Eric Blake wrote:
> On 07/03/2012 03:32 PM, Eric Blake wrote:
> 
>>> +        ret = uleb128_decode_small(src + i, &count);
>>> +        if (ret < 0) {
>>
>> An nzrun should be a non-zero value; I'd write this as (ret <= 0) to
>> rule out an attempt to pass a zero-length nzrun.
> 
> Correcting myself,
> 
> if (ret < 0 || !count) {
> 
> At this point, I think I will just bite the bullet and post a version of
> this code that incorporates my review.

Something like this (lightly tested):

/*
  page = zrun nzrun
       | zrun nzrun page

  zrun = length

  nzrun = length byte...

  length = uleb128 encoded integer
 */
int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
                         uint8_t *dst, int dlen)
{
    uint32_t zrun_len = 0, nzrun_len = 0;
    int d = 0, i = 0;
    long res, xor;
    uint8_t *nzrun_start = NULL;

    g_assert(!(((uintptr_t)old_buf | (uintptr_t)new_buf | slen) %
	       sizeof(long)));

    while (i < slen) {
        /* overflow */
        if (d + 2 > dlen) {
            return -1;
        }

        /* not aligned to sizeof(long) */
        res = (slen - i) % sizeof(long);
        while (res && old_buf[i] == new_buf[i]) {
            zrun_len++;
            i++;
            res--;
        }

        if (!res) {
            while (i < slen &&
                   (*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
                i += sizeof(long);
                zrun_len += sizeof(long);
            }

            /* go over the rest */
            while (i < slen && old_buf[i] == new_buf[i]) {
                zrun_len++;
		i++;
            }
        }

        /* buffer unchanged */
        if (zrun_len == slen) {
            return 0;
        }

        /* skip last zero run */
        if (i == slen) {
            return d;
        }

        d += uleb128_encode_small(dst + d, zrun_len);

        zrun_len = 0;
        nzrun_start = new_buf + i;

        /* overflow */
        if (d + 2 > dlen) {
            return -1;
        }

        /* not aligned to sizeof(long) */
        res = (slen - i) % sizeof(long);
        while (res && old_buf[i] != new_buf[i]) {
            nzrun_len++;
            i++;
            res--;
        }

        if (!res) {
            /* truncation to 32-bit long okay */
            long mask = 0x0101010101010101ULL;
            while (i < slen) {
                xor = *(long *)(old_buf + i) ^ *(long *)(new_buf + i);
	        if ((xor - mask) & ~xor & (mask << 7)) {
		    /* found the end of an nzrun within the current long */
		    while (old_buf[i] != new_buf[i]) {
                        nzrun_len++;
                        i++;
                    }
                    break;
                } else {
                    i += sizeof(long);
                    nzrun_len += sizeof(long);
                }
            }
        }

        d += uleb128_encode_small(dst + d, nzrun_len);

        /* overflow */
        if (d + nzrun_len > dlen) {
            return -1;
        }

        memcpy(dst + d, nzrun_start, nzrun_len);
        d += nzrun_len;
        nzrun_len = 0;
    }

    return d;
}

int xbzrle_decode_buffer(uint8_t *src, int slen, uint8_t *dst, int dlen)
{
    int i = 0, d = 0;
    int ret;
    uint32_t count = 0;

    while (i < slen) {

        /* zrun */
        if (slen - i < 2) {
            return -1;
        }

        ret = uleb128_decode_small(src + i, &count);
        if (ret < 0 || (i && !count)) {
            return -1;
        }
        i += ret;
        d += count;

        /* overflow */
        if (d > dlen) {
            return -1;
        }

        /* nzrun */
        if (slen - i < 2) {
            return -1;
        }
        ret = uleb128_decode_small(src + i, &count);
        if (ret < 0 || !count) {
            return -1;
        }
        i += ret;

        /* overflow */
        if (d + count > dlen || i + count > slen) {
            return -1;
        }

        memcpy(dst + d , src + i, count);
        d += count;
        i += count;
    }

    return d;
}
Orit Wasserman - July 4, 2012, 7:24 a.m.
On 07/04/2012 12:32 AM, Eric Blake wrote:
> On 07/03/2012 07:52 AM, Orit Wasserman wrote:
>> Signed-off-by: Benoit Hudzia <benoit.hudzia@sap.com>
>> Signed-off-by: Petter Svard <petters@cs.umu.se>
>> Signed-off-by: Aidan Shribman <aidan.shribman@sap.com>
>> Signed-off-by: Orit Wasserman <owasserm@redhat.com>
> 
>> +int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
>> +                         uint8_t *dst, int dlen)
>> +{
>> +    uint32_t zrun_len = 0, nzrun_len = 0;
>> +    int d = 0, i = 0, start;
>> +    long res, xor;
>> +    uint8_t *nzrun_start = NULL;
>> +
>> +    g_assert(!((uintptr_t)old_buf & (sizeof(long) - 1)) &&
>> +             !((uintptr_t)new_buf & (sizeof(long) - 1)) &&
>> +             !(slen & (sizeof(long) - 1)));
> 
> I know I suggested this in v13, so now I'm bike-shedding my own review,
> but I think it might be a bit more legible as:
> 
> g_assert(!(((uintptr_t)old_buf) | ((uintptr_t)new_buf) | slen) &
>            (sizeof(long) - 1)));
> 
>> +
>> +    while (i < slen) {
>> +        /* overflow */
>> +        if (d + 2 > dlen) {
>> +            return -1;
>> +        }
>> +
>> +        /* not aligned to sizeof(long) */
>> +        res = (slen - i) % sizeof(long);
>> +        if (res) {
>> +            start = i;
>> +            while (!(old_buf[i] ^ new_buf[i]) && (i - start) <= res) {
>> +                zrun_len++;
>> +                i++;
>> +            }
>> +        }
> 
> I'm not sure why you need 'start'; this same block will do the same
> amount of computation with less typing and register pressure:
> 
> res = (slen - i) % sizeof(long);
> while (res && old_buf[i] == new_buf[i]) {
>     zrun_len++;
>     i++;
>     res--;
> }
> 
>> +
>> +        if (zrun_len == res) {
> 
> If you use my above rewrite, then this condition changes to:
> 
> if (!res) {
> 
>> +            while (i <= (slen - sizeof(long)) &&
> 
> I'd shorten this to:
> 
> while (i < slen &&
> 
>> +                   (*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
>> +                i += sizeof(long);
>> +                zrun_len += sizeof(long);
>> +            }
>> +
>> +            /* go over the rest */
>> +            while (old_buf[i] == new_buf[i] && ++i <= slen) {
> 
> Buffer overrun: if the page ends on a zrun, then at this point, i ==
> slen, and old_buf[i] is now pointing one past the end of the array.  You
> need to swap the conditions, and only increment 'i' inside the {}, as in:
> 
> while (i < slen && old_buf[i] == new_buf[i]) {
>     i++;
> 
>> +                zrun_len++;
>> +            }
>> +        }
>> +
>> +        /* buffer unchanged */
>> +        if (zrun_len == slen) {
>> +            return 0;
>> +        }
>> +
>> +        /* skip last zero run */
>> +        if (i == slen + 1) {
> 
> Buffer overrun.  The loop should terminate at 'i == slen', not at 'i ==
> slen + 1', since old_buf[slen - 1] is the last addressable byte.
> 
>> +            return d;
>> +        }
>> +
>> +        d += uleb128_encode_small(dst + d, zrun_len);
>> +
>> +        /* no nzrun */
>> +        if (i == slen) {
>> +            return d;
>> +        }
> 
> This is the correct code but repeats the intent of what you were trying
> just before the uleb128_encode_small().
> 
>> +
>> +        zrun_len = 0;
>> +        nzrun_start = new_buf + i;
> 
> Missing an overflow check here.  If there are fewer than 2 bytes before
> we hit dlen, then there is no way that we can fit in an nzrun, so we
> might as well quit here.
> 
>> +
>> +        /* not aligned to sizeof(long) */
>> +        res = (slen - i) % sizeof(long);
>> +        if (res) {
>> +            start = i;
>> +            while (old_buf[i] != new_buf[i] && i - start < res) {
>> +                i++;
>> +                nzrun_len++;
>> +            }
>> +        }
> 
> Again, you can exploit the fact that slen is aligned, and shorten to:
> 
> res = (slen - i) % sizeof(long);
> while (res && old_bf[i] != new_buf[i]) {
>     i++;
>     nzrun_len++;
>     res--;
> }
> 
>> +
>> +        if (nzrun_len == res) {
> 
> and this would change to:
> 
> if (!res) {
> 
>> +            /* truncation to 32-bit long okay */
>> +            long mask = 0x0101010101010101ULL;
>> +            xor = *(long *)(old_buf + i) ^ *(long *)(new_buf + i);
>> +            printf("cont\n");
> 
> Stray debugging comment.
> 
>> +            while (i <= (slen - sizeof(long))) {
> 
> shorter to write as:
> 
> while (i < slen) {
> 
>> +                if ((xor - mask) & ~xor & (mask << 7)) {
>> +                    /* found the end of an nzrun within the current long */
>> +                    while (old_buf[i] != new_buf[i] && ++i <= slen) {
> 
> No need to compare against slen here, since the outer while guarantees
> that.  You can simplify to:
> 
> while (old_buf[i] != new_buf[i]) {
>     i++;
> 
>> +                        nzrun_len++;
>> +                    }
>> +                    break;
>> +                } else {
>> +                    i += sizeof(long);
>> +                    nzrun_len += sizeof(long);
>> +                    xor = *(long *)(old_buf + i) ^ *(long *)(new_buf + i);
>> +                }
>> +            }
>> +
> 
> At this point, you are guaranteed to have found the end of the nzrun
> (either you found two equal bytes, or you hit slen), which means...
> 
>> +            while (old_buf[i] != new_buf[i] && ++i <= slen) {
>> +                nzrun_len++;
>> +            }
> 
> this while loop is dead code.  Furthermore, this while loop is a
> potential out-of-bounds-read (remember, old_buf[slen] is not valid
> memory), so nuking it is the way to go.
> 
>> +        }
>> +
>> +        /* overflow */
>> +        if (d + nzrun_len + 2 > dlen) {
>> +            return -1;
> 
> There is a corner case where this reports overflow even when there was
> none - that is when the old_buf ends in an nzrun of 0x7f bytes or less,
> so the encoding occupies only 1 byte, and the encoded version would
> still fit in dlen.  I think you want to defer your overflow checking
> until after...
> 
>> +        }
>> +
>> +        d += uleb128_encode_small(dst + d, nzrun_len);
> 
> ...you know the encoded size.  But that only works if you made sure
> earlier that there was room to encode nzrun_len (up to 2 bytes) in the
> first place.
> 
>> +        memcpy(dst + d, nzrun_start, nzrun_len);
>> +        d += nzrun_len;
>> +        nzrun_len = 0;
>> +    }
>> +
>> +    return d;
>> +}
>> +
> 
> Do you have any unit tests that pass in various inputs and compare
> against expected outputs?
> 
>> +int xbzrle_decode_buffer(uint8_t *src, int slen, uint8_t *dst, int dlen)
>> +{
>> +    int i = 0, d = 0;
>> +    int ret;
>> +    uint32_t count = 0;
>> +
>> +    while (i < slen) {
>> +
>> +        /* zrun */
>> +        if ((slen - i) < 2 && *(src + i) & 0x80) {
>> +            return -1;
>> +        }
> 
> You can simplify this - the protocol demands an nzrun after every zrun,
> which means at this point in the decode, there must be at least 3 bytes
> to be valid.  It's simpler to write:
> 
> if (slen - i < 2) {
>     return -1;
> }
> 
>> +
>> +        ret = uleb128_decode_small(src + i, &count);
>> +        if (ret < 0) {
>> +            return -1;
>> +        }
> 
> I'd also add a check that we only ever get a zero-length zrun at the
> start of a buffer (anywhere else, a zero-length zrun is invalid); as in:
> 
> if (!count && i) {
>     return -1;
> }
> 
>> +        i += ret;
>> +        d += count;
>> +
>> +        /* overflow */
>> +        if (d > dlen) {
>> +            return -1;
>> +        }
>> +
>> +        /* completed decoding */
> 
> No, if we got here, then we were handed an invalid stream.  Remember,
> the protocol says that every zrun must be followed by an nzrun.
> 
>> +        if (i == slen - 1) {
>> +            return d;
>> +        }
>> +
>> +        /* nzrun */
>> +        if ((slen - i) < 2 && *(src + i) & 0x80) {
>> +            return -1;
>> +        }
> 
> Again, a valid nzrun is at least 2 bytes, so no need to worry about the
> contents of *src, it is sufficient to check:
> 
> if (slen - i < 2) {
>     return -1;
> }
> 
>> +        ret = uleb128_decode_small(src + i, &count);
>> +        if (ret < 0) {
> 
> An nzrun should be a non-zero value; I'd write this as (ret <= 0) to
> rule out an attempt to pass a zero-length nzrun.
decode can only return -1 (invalid) or the decoded len 1 or 2
so maybe you meant I should check that count is bigger than zero?

Orit
> 
>> +            return -1;
>> +        }
>> +        i += ret;
>> +
>> +        /* overflow */
>> +        if (d + count > dlen) {
>> +            return -1;
>> +        }
>> +
>> +        memcpy(dst + d , src + i, count);
>> +        d += count;
>> +        i += count;
>> +    }
>> +
>> +    return d;
>> +}
>>
>
Eric Blake - July 4, 2012, 11:36 a.m.
On 07/04/2012 01:24 AM, Orit Wasserman wrote:

>>
>>> +        ret = uleb128_decode_small(src + i, &count);
>>> +        if (ret < 0) {
>>
>> An nzrun should be a non-zero value; I'd write this as (ret <= 0) to
>> rule out an attempt to pass a zero-length nzrun.
> decode can only return -1 (invalid) or the decoded len 1 or 2
> so maybe you meant I should check that count is bigger than zero?

Yes, and I even corrected myself:
https://lists.gnu.org/archive/html/qemu-devel/2012-07/msg00368.html

as well as incorporated my comments (which probably picked up a couple
other fixes in the process):
https://lists.gnu.org/archive/html/qemu-devel/2012-07/msg00369.html
Orit Wasserman - July 4, 2012, 12:51 p.m.
On 07/04/2012 03:20 AM, Eric Blake wrote:
> On 07/03/2012 03:39 PM, Eric Blake wrote:
>> On 07/03/2012 03:32 PM, Eric Blake wrote:
>>
>>>> +        ret = uleb128_decode_small(src + i, &count);
>>>> +        if (ret < 0) {
>>>
>>> An nzrun should be a non-zero value; I'd write this as (ret <= 0) to
>>> rule out an attempt to pass a zero-length nzrun.
>>
>> Correcting myself,
>>
>> if (ret < 0 || !count) {
>>
>> At this point, I think I will just bite the bullet and post a version of
>> this code that incorporates my review.
> 
> Something like this (lightly tested):
> 
> /*
>   page = zrun nzrun
>        | zrun nzrun page
> 
>   zrun = length
> 
>   nzrun = length byte...
> 
>   length = uleb128 encoded integer
>  */
> int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
>                          uint8_t *dst, int dlen)
> {
>     uint32_t zrun_len = 0, nzrun_len = 0;
>     int d = 0, i = 0;
>     long res, xor;
>     uint8_t *nzrun_start = NULL;
> 
>     g_assert(!(((uintptr_t)old_buf | (uintptr_t)new_buf | slen) %
> 	       sizeof(long)));
> 
>     while (i < slen) {
>         /* overflow */
>         if (d + 2 > dlen) {
>             return -1;
>         }
> 
>         /* not aligned to sizeof(long) */
>         res = (slen - i) % sizeof(long);
>         while (res && old_buf[i] == new_buf[i]) {
>             zrun_len++;
>             i++;
>             res--;
>         }
> 
>         if (!res) {
>             while (i < slen &&
>                    (*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
>                 i += sizeof(long);
>                 zrun_len += sizeof(long);
>             }
> 
>             /* go over the rest */
>             while (i < slen && old_buf[i] == new_buf[i]) {
>                 zrun_len++;
> 		i++;
>             }
>         }
> 
>         /* buffer unchanged */
>         if (zrun_len == slen) {
>             return 0;
>         }
> 
>         /* skip last zero run */
>         if (i == slen) {
>             return d;
>         }
> 
>         d += uleb128_encode_small(dst + d, zrun_len);
> 
>         zrun_len = 0;
>         nzrun_start = new_buf + i;
> 
>         /* overflow */
>         if (d + 2 > dlen) {
>             return -1;
>         }
> 
>         /* not aligned to sizeof(long) */
>         res = (slen - i) % sizeof(long);
>         while (res && old_buf[i] != new_buf[i]) {
>             nzrun_len++;
>             i++;
>             res--;
>         }
> 
>         if (!res) {
>             /* truncation to 32-bit long okay */
>             long mask = 0x0101010101010101ULL;
>             while (i < slen) {
>                 xor = *(long *)(old_buf + i) ^ *(long *)(new_buf + i);
> 	        if ((xor - mask) & ~xor & (mask << 7)) {
> 		    /* found the end of an nzrun within the current long */
> 		    while (old_buf[i] != new_buf[i]) {
>                         nzrun_len++;
>                         i++;
>                     }
>                     break;
>                 } else {
>                     i += sizeof(long);
>                     nzrun_len += sizeof(long);
>                 }
>             }
>         }
> 
>         d += uleb128_encode_small(dst + d, nzrun_len);
> 
>         /* overflow */
>         if (d + nzrun_len > dlen) {
>             return -1;
>         }
> 
>         memcpy(dst + d, nzrun_start, nzrun_len);
>         d += nzrun_len;
>         nzrun_len = 0;
>     }
> 
>     return d;
> }
> 
> int xbzrle_decode_buffer(uint8_t *src, int slen, uint8_t *dst, int dlen)
> {
>     int i = 0, d = 0;
>     int ret;
>     uint32_t count = 0;
> 
>     while (i < slen) {
> 
>         /* zrun */
>         if (slen - i < 2) {
>             return -1;
>         }
> 
>         ret = uleb128_decode_small(src + i, &count);
>         if (ret < 0 || (i && !count)) {
>             return -1;
>         }
>         i += ret;
>         d += count;
> 
>         /* overflow */
>         if (d > dlen) {
>             return -1;
>         }
> 
>         /* nzrun */
>         if (slen - i < 2) {
>             return -1;
>         }
>         ret = uleb128_decode_small(src + i, &count);
>         if (ret < 0 || !count) {
>             return -1;
>         }
>         i += ret;
> 
>         /* overflow */
>         if (d + count > dlen || i + count > slen) {
>             return -1;
>         }
> 
>         memcpy(dst + d , src + i, count);
>         d += count;
>         i += count;
>     }
> 
>     return d;
> }
> 
thanks

Patch

diff --git a/migration.h b/migration.h
index 1ae99f1..7582ecb 100644
--- a/migration.h
+++ b/migration.h
@@ -99,4 +99,8 @@  void migrate_add_blocker(Error *reason);
  */
 void migrate_del_blocker(Error *reason);
 
+int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
+                         uint8_t *dst, int dlen);
+int xbzrle_decode_buffer(uint8_t *src, int slen, uint8_t *dst, int dlen);
+
 #endif
diff --git a/savevm.c b/savevm.c
index d1d9020..2ab0986 100644
--- a/savevm.c
+++ b/savevm.c
@@ -2374,3 +2374,175 @@  void vmstate_register_ram_global(MemoryRegion *mr)
 {
     vmstate_register_ram(mr, NULL);
 }
+
+/*
+  page = zrun nzrun
+       | zrun nzrun page
+
+  zrun = length
+
+  nzrun = length byte...
+
+  length = uleb128 encoded integer
+ */
+int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
+                         uint8_t *dst, int dlen)
+{
+    uint32_t zrun_len = 0, nzrun_len = 0;
+    int d = 0, i = 0, start;
+    long res, xor;
+    uint8_t *nzrun_start = NULL;
+
+    g_assert(!((uintptr_t)old_buf & (sizeof(long) - 1)) &&
+             !((uintptr_t)new_buf & (sizeof(long) - 1)) &&
+             !(slen & (sizeof(long) - 1)));
+
+    while (i < slen) {
+        /* overflow */
+        if (d + 2 > dlen) {
+            return -1;
+        }
+
+        /* not aligned to sizeof(long) */
+        res = (slen - i) % sizeof(long);
+        if (res) {
+            start = i;
+            while (!(old_buf[i] ^ new_buf[i]) && (i - start) <= res) {
+                zrun_len++;
+                i++;
+            }
+        }
+
+        if (zrun_len == res) {
+            while (i <= (slen - sizeof(long)) &&
+                   (*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
+                i += sizeof(long);
+                zrun_len += sizeof(long);
+            }
+
+            /* go over the rest */
+            while (old_buf[i] == new_buf[i] && ++i <= slen) {
+                zrun_len++;
+            }
+        }
+
+        /* buffer unchanged */
+        if (zrun_len == slen) {
+            return 0;
+        }
+
+        /* skip last zero run */
+        if (i == slen + 1) {
+            return d;
+        }
+
+        d += uleb128_encode_small(dst + d, zrun_len);
+
+        /* no nzrun */
+        if (i == slen) {
+            return d;
+        }
+
+        zrun_len = 0;
+        nzrun_start = new_buf + i;
+
+        /* not aligned to sizeof(long) */
+        res = (slen - i) % sizeof(long);
+        if (res) {
+            start = i;
+            while (old_buf[i] != new_buf[i] && i - start < res) {
+                i++;
+                nzrun_len++;
+            }
+        }
+
+        if (nzrun_len == res) {
+            /* truncation to 32-bit long okay */
+            long mask = 0x0101010101010101ULL;
+            xor = *(long *)(old_buf + i) ^ *(long *)(new_buf + i);
+            printf("cont\n");
+            while (i <= (slen - sizeof(long))) {
+                if ((xor - mask) & ~xor & (mask << 7)) {
+                    /* found the end of an nzrun within the current long */
+                    while (old_buf[i] != new_buf[i] && ++i <= slen) {
+                        nzrun_len++;
+                    }
+                    break;
+                } else {
+                    i += sizeof(long);
+                    nzrun_len += sizeof(long);
+                    xor = *(long *)(old_buf + i) ^ *(long *)(new_buf + i);
+                }
+            }
+
+            while (old_buf[i] != new_buf[i] && ++i <= slen) {
+                nzrun_len++;
+            }
+        }
+
+        /* overflow */
+        if (d + nzrun_len + 2 > dlen) {
+            return -1;
+        }
+
+        d += uleb128_encode_small(dst + d, nzrun_len);
+        memcpy(dst + d, nzrun_start, nzrun_len);
+        d += nzrun_len;
+        nzrun_len = 0;
+    }
+
+    return d;
+}
+
+int xbzrle_decode_buffer(uint8_t *src, int slen, uint8_t *dst, int dlen)
+{
+    int i = 0, d = 0;
+    int ret;
+    uint32_t count = 0;
+
+    while (i < slen) {
+
+        /* zrun */
+        if ((slen - i) < 2 && *(src + i) & 0x80) {
+            return -1;
+        }
+
+        ret = uleb128_decode_small(src + i, &count);
+        if (ret < 0) {
+            return -1;
+        }
+        i += ret;
+        d += count;
+
+        /* overflow */
+        if (d > dlen) {
+            return -1;
+        }
+
+        /* completed decoding */
+        if (i == slen - 1) {
+            return d;
+        }
+
+        /* nzrun */
+        if ((slen - i) < 2 && *(src + i) & 0x80) {
+            return -1;
+        }
+        ret = uleb128_decode_small(src + i, &count);
+        if (ret < 0) {
+            return -1;
+        }
+        i += ret;
+
+        /* overflow */
+        if (d + count > dlen) {
+            return -1;
+        }
+
+        memcpy(dst + d , src + i, count);
+        d += count;
+        i += count;
+    }
+
+    return d;
+}