
[4/9] hbitmap: store / restore

Message ID 1418307457-25996-5-git-send-email-vsementsov@parallels.com
State New

Commit Message

Vladimir Sementsov-Ogievskiy Dec. 11, 2014, 2:17 p.m. UTC
Functions to store / restore HBitmap. HBitmap should be saved to a linear
bitmap format independently of endianness.

Because restoring is done in several steps, each step writes only the last
level of the bitmap. All other levels are restored by
hbitmap_restore_finish as the last step of restoring, so the HBitmap is
inconsistent while restoring.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@parallels.com>
---
 include/qemu/hbitmap.h | 49 +++++++++++++++++++++++++++++
 util/hbitmap.c         | 84 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 133 insertions(+)

Comments

John Snow Jan. 8, 2015, 9:21 p.m. UTC | #1
On 12/11/2014 09:17 AM, Vladimir Sementsov-Ogievskiy wrote:
> Functions to store / restore HBitmap. HBitmap should be saved to linear
> bitmap format independently of endianess.
>
> Because of restoring in several steps, every step writes only the last
> level of the bitmap. All other levels are restored by
> hbitmap_restore_finish as a last step of restoring. So, HBitmap is
> inconsistent while restoring.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@parallels.com>
> ---
>   include/qemu/hbitmap.h | 49 +++++++++++++++++++++++++++++
>   util/hbitmap.c         | 84 ++++++++++++++++++++++++++++++++++++++++++++++++++
>   2 files changed, 133 insertions(+)
>
> diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
> index b645cfc..e57b610 100644
> --- a/include/qemu/hbitmap.h
> +++ b/include/qemu/hbitmap.h
> @@ -126,6 +126,55 @@ void hbitmap_reset(HBitmap *hb, uint64_t start, uint64_t count);
>   bool hbitmap_get(const HBitmap *hb, uint64_t item);
>
>   /**
> + * hbitmap_data_size:
> + * @hb: HBitmap to operate on.
> + * @count: Number of bits
> + *
> + * Return amount of bytes hbitmap_store_data needs
> + */
> +uint64_t hbitmap_data_size(const HBitmap *hb, uint64_t count);
> +
> +/**
> + * hbitmap_store_data
> + * @hb: HBitmap to oprate on.
> + * @buf: Buffer to store bitmap data.
> + * @start: First bit to store.
> + * @count: Number of bits to store.
> + *
> + * Stores HBitmap data corresponding to given region. The format of saved data
> + * is linear sequence of bits, so it can be used by hbitmap_restore_data
> + * independently of endianess
> + */
> +void hbitmap_store_data(const HBitmap *hb, uint8_t *buf,
> +                        uint64_t start, uint64_t count);
> +
> +/**
> + * hbitmap_restore_data
> + * @hb: HBitmap to oprate on.
> + * @buf: Buffer to restore bitmap data from.
> + * @start: First bit to restore.
> + * @count: Number of bits to restore.
> + *
> + * Retores HBitmap data corresponding to given region. The format is the same
> + * as for hbitmap_store_data.
> + *
> + * ! The bitmap becomes inconsistent after this operation.
> + * hbitmap_restore_finish should be called before using the bitmap after
> + * data restoring.
> + */
> +void hbitmap_restore_data(HBitmap *hb, uint8_t *buf,
> +                          uint64_t start, uint64_t count);
> +

I guess by nature of how bitmap migration has been implemented, we 
cannot help but restore parts of the bitmap piece by piece, which 
requires us to force the bitmap into an inconsistent state.

Yuck.

It might be "nice" to turn on a disable bit inside the hbitmap that 
disallows its use until it is made consistent again, but I don't know 
what the performance hit of the extra conditionals everywhere would be like.
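
For illustration, such a guard might look roughly like this (purely a
hypothetical sketch; the field and the assert are invented here and are
not proposed code):

struct HBitmap {
    /* ... existing fields ... */
    bool restoring;   /* hypothetical: set by restore_data, cleared by restore_finish */
};

bool hbitmap_get(const HBitmap *hb, uint64_t item)
{
    assert(!hb->restoring);   /* the extra conditional in question */
    /* ... existing lookup ... */
}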

> +/**
> + * hbitmap_restore_finish
> + * @hb: HBitmap to operate on.
> + *
> + * Repair HBitmap after calling hbitmap_restore_data. Actuall all HBitmap
> + * layers are restore here.
> + */
> +void hbitmap_restore_finish(HBitmap *hb);
> +
> +/**
>    * hbitmap_free:
>    * @hb: HBitmap to operate on.
>    *

These are just biased opinions:

- It might be nice to name the store/restore functions "serialize" and 
"deserialize," to describe exactly what they are doing.

- I might refer to "restore_finish" as "post_load" or "post_restore" or 
something similar to mimic how device migration functions are named. I 
think hbitmap_restore_data_finalize would also be fine; the key part 
here is clearly naming it relative to "restore_data."

> diff --git a/util/hbitmap.c b/util/hbitmap.c
> index 8aa7406..ac0323f 100644
> --- a/util/hbitmap.c
> +++ b/util/hbitmap.c
> @@ -362,6 +362,90 @@ bool hbitmap_get(const HBitmap *hb, uint64_t item)
>       return (hb->levels[HBITMAP_LEVELS - 1][pos >> BITS_PER_LEVEL] & bit) != 0;
>   }
>
> +uint64_t hbitmap_data_size(const HBitmap *hb, uint64_t count)
> +{
> +    uint64_t size;
> +
> +    if (count == 0) {
> +        return 0;
> +    }
> +
> +    size = (((count - 1) >> hb->granularity) >> BITS_PER_LEVEL) + 1;
> +
> +    return size * sizeof(unsigned long);
> +}
> +

This seems flawed to me: number of bits without an offset can't be 
mapped to a number of real bytes, because "two bits" may or may not 
cross a granularity boundary, depending on *WHERE* you start counting.

e.g.

granularity = 1 (i.e. 2^1 = 2 virtual bits per 1 real bit)
virtual: 001100
real:     0 1 0

The amount of space required to hold "two bits" here could be as little 
as one bit, if the offset is k={0,2,4} but it could be as much as two 
bits if the offset is k={1,3}.

You may never use the function in this way, but mapping virtual bits to 
an implementation byte-size seems like it is inviting an inconsistency.

> +void hbitmap_store_data(const HBitmap *hb, uint8_t *buf,
> +                        uint64_t start, uint64_t count)
> +{
> +    uint64_t last = start + count - 1;
> +    unsigned long *out = (unsigned long *)buf;
> +
> +    if (count == 0) {
> +        return;
> +    }
> +
> +    start = (start >> hb->granularity) >> BITS_PER_LEVEL;
> +    last = (last >> hb->granularity) >> BITS_PER_LEVEL;
> +    count = last - start + 1;
> +
> +#ifdef __BIG_ENDIAN_BITFIELD
> +    for (i = start; i <= last; ++i) {
> +        unsigned long el = hb->levels[HBITMAP_LEVELS - 1][i];
> +        out[i] = (BITS_PER_LONG == 32 ? cpu_to_le32(el) : cpu_to_le64(el));
> +    }
> +#else
> +    memcpy(out, &hb->levels[HBITMAP_LEVELS - 1][start],
> +           count * sizeof(unsigned long));
> +#endif
> +}
> +

I suppose the ifdefs are trying to take an optimization by using memcpy 
if at all possible, without a conversion.

Why are the conversions to little endian, though? Shouldn't we be 
serializing to a Big Endian format?

> +void hbitmap_restore_data(HBitmap *hb, uint8_t *buf,
> +                          uint64_t start, uint64_t count)
> +{
> +    uint64_t last = start + count - 1;
> +    unsigned long *in = (unsigned long *)buf;
> +
> +    if (count == 0) {
> +        return;
> +    }
> +
> +    start = (start >> hb->granularity) >> BITS_PER_LEVEL;
> +    last = (last >> hb->granularity) >> BITS_PER_LEVEL;
> +    count = last - start + 1;
> +
> +#ifdef __BIG_ENDIAN_BITFIELD
> +    for (i = start; i <= last; ++i) {
> +        hb->levels[HBITMAP_LEVELS - 1][i] =
> +            (BITS_PER_LONG == 32 ? be32_to_cpu(in[i]) : be64_to_cpu(in[i]));
> +    }
> +#else
> +    memcpy(&hb->levels[HBITMAP_LEVELS - 1][start], in,
> +           count * sizeof(unsigned long));
> +#endif
> +}
> +

...? We're storing as LE but restoring from BE? I'm confused.

I'm also not clear on the __BIG_ENDIAN_BITFIELD macro. Why do we want to 
pack differently based on how C-bitfields are packed by the compiler? I 
don't think that has any impact on how longs are stored (always in the 
host native format.)

I would rather see the serialize/deserialize always convert to and from 
BE explicitly with cpu_to_be[32|64] and be[32|64]_to_cpu. I think trying 
to optimize the pack/unpack in the no-op condition with a # define and a 
memcpy is inviting portability problems.

> +void hbitmap_restore_finish(HBitmap *bitmap)
> +{
> +    int64_t i, size;
> +    int lev, j;
> +
> +    /* restore levels starting from penultimate to zero level, assuming
> +     * that the last one is ok */
> +    size = MAX((bitmap->size + BITS_PER_LONG - 1) >> BITS_PER_LEVEL, 1);
> +    for (lev = HBITMAP_LEVELS - 1; lev-- > 0; ) {
> +        size = MAX((size + BITS_PER_LONG - 1) >> BITS_PER_LEVEL, 1);
> +        for (i = 0; i < size; ++i) {
> +            bitmap->levels[lev][i] = 0;
> +            for (j = 0; j < BITS_PER_LONG; ++j) {
> +                if (bitmap->levels[lev+1][(i << BITS_PER_LEVEL) + j]) {
> +                    bitmap->levels[lev][i] |=  (1 << j);
> +                }
> +            }

Just a musing: Should we cache the size of each level? I know we can 
keep recalculating it, but... why bother? We recalculate it in so many 
places now.

> +        }
> +    }
> +}
> +
>   void hbitmap_free(HBitmap *hb)
>   {
>       unsigned i;
>
Paolo Bonzini Jan. 8, 2015, 9:37 p.m. UTC | #2
On 08/01/2015 22:21, John Snow wrote:
> Why are the conversions to little endian, though? Shouldn't we be
> serializing to a Big Endian format?

Because reading two 32-bit little-endian longs or a 64-bit little-endian
long gives the same value.  This is not true for big-endian.

Take the following two 32-bit values:

   0x04030201    0x08070605

representing offsets 0 and 32 in the bitmap.  Parse them back as one
64-bit little endian value:

   0x0807060504030201 = 0x04030201 + 0x08070605 * 2^32

Now parse them as as one 64-bit big endian value:

   0x0403020108070605 => 0x08070605 + 0x04030201 * 2^32

so the values are swapped.
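
As a standalone illustration (hypothetical demo program, not QEMU code;
hand-rolled LE stores are used instead of cpu_to_le32/cpu_to_le64): two
32-bit words written little-endian produce exactly the bytes of the
corresponding 64-bit little-endian value, so a 32-bit writer and a 64-bit
reader agree on the serialized format.

#include <assert.h>
#include <stdint.h>
#include <string.h>

static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = v & 0xff;
    p[1] = (v >> 8) & 0xff;
    p[2] = (v >> 16) & 0xff;
    p[3] = (v >> 24) & 0xff;
}

static void put_le64(uint8_t *p, uint64_t v)
{
    put_le32(p, (uint32_t)v);
    put_le32(p + 4, (uint32_t)(v >> 32));
}

int main(void)
{
    uint8_t a[8], b[8];

    put_le32(a, 0x04030201);            /* word at bitmap offset 0 */
    put_le32(a + 4, 0x08070605);        /* word at bitmap offset 32 */
    put_le64(b, 0x0807060504030201ULL); /* same bits as one 64-bit word */

    assert(memcmp(a, b, sizeof(a)) == 0);   /* byte sequences are identical */
    return 0;
}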

>> +void hbitmap_restore_data(HBitmap *hb, uint8_t *buf,
>> +                          uint64_t start, uint64_t count)
>> +{
>> +    uint64_t last = start + count - 1;
>> +    unsigned long *in = (unsigned long *)buf;
>> +
>> +    if (count == 0) {
>> +        return;
>> +    }
>> +
>> +    start = (start >> hb->granularity) >> BITS_PER_LEVEL;
>> +    last = (last >> hb->granularity) >> BITS_PER_LEVEL;
>> +    count = last - start + 1;
>> +
>> +#ifdef __BIG_ENDIAN_BITFIELD
>> +    for (i = start; i <= last; ++i) {
>> +        hb->levels[HBITMAP_LEVELS - 1][i] =
>> +            (BITS_PER_LONG == 32 ? be32_to_cpu(in[i]) :
>> be64_to_cpu(in[i]));
>> +    }
>> +#else
>> +    memcpy(&hb->levels[HBITMAP_LEVELS - 1][start], in,
>> +           count * sizeof(unsigned long));
>> +#endif
>> +}
>> +
> 
> ...? We're storing as LE but restoring from BE? I'm confused.
> 
> I'm also not clear on the __BIG_ENDIAN_BITFIELD macro. Why do we want to
> pack differently based on how C-bitfields are packed by the compiler? I
> don't think that has any impact on how longs are stored (always in the
> host native format.)

I agree.

Paolo
Vladimir Sementsov-Ogievskiy Jan. 13, 2015, 12:59 p.m. UTC | #3
Best regards,
Vladimir

On 09.01.2015 00:21, John Snow wrote:
>
>
> On 12/11/2014 09:17 AM, Vladimir Sementsov-Ogievskiy wrote:
>> Functions to store / restore HBitmap. HBitmap should be saved to linear
>> bitmap format independently of endianess.
>>
>> Because of restoring in several steps, every step writes only the last
>> level of the bitmap. All other levels are restored by
>> hbitmap_restore_finish as a last step of restoring. So, HBitmap is
>> inconsistent while restoring.
.....
>
> I guess by nature of how bitmap migration has been implemented, we 
> cannot help but restore parts of the bitmap piece by piece, which 
> requires us to force the bitmap into an inconsistent state.
>
> Yuck.
>
> It might be "nice" to turn on a disable bit inside the hbitmap that 
> disallows its use until it is made consistent again, but I don't know 
> what the performance hit of the extra conditionals everywhere would be 
> like.
>

Another approach is to restore the bitmap after each piece is set. It is a 
bit slower of course, but the bitmap will be consistent after every 
hbitmap_restore_data() call.

>> +/**
>> + * hbitmap_restore_finish
>> + * @hb: HBitmap to operate on.
>> + *
>> + * Repair HBitmap after calling hbitmap_restore_data. Actuall all 
>> HBitmap
>> + * layers are restore here.
>> + */
>> +void hbitmap_restore_finish(HBitmap *hb);
>> +
>> +/**
>>    * hbitmap_free:
>>    * @hb: HBitmap to operate on.
>>    *
>
> These are just biased opinions:
>
> - It might be nice to name the store/restore functions "serialize" and 
> "deserialize," to describe exactly what they are doing.
>
> - I might refer to "restore_finish" as "post_load" or "post_restore" 
> or something similar to mimic how device migration functions are 
> named. I think hbitmap_restore_data_finalize would also be fine; the 
> key part here is clearly naming it relative to "restore_data."
>

Hmm. Ok, what about the following set:
     hbitmap_serialize()
     hbitmap_deserialize_part()
     hbitmap_deserialize_finish()

>> diff --git a/util/hbitmap.c b/util/hbitmap.c
>> index 8aa7406..ac0323f 100644
>> --- a/util/hbitmap.c
>> +++ b/util/hbitmap.c
>> @@ -362,6 +362,90 @@ bool hbitmap_get(const HBitmap *hb, uint64_t item)
>>       return (hb->levels[HBITMAP_LEVELS - 1][pos >> BITS_PER_LEVEL] & 
>> bit) != 0;
>>   }
>>
>> +uint64_t hbitmap_data_size(const HBitmap *hb, uint64_t count)
>> +{
>> +    uint64_t size;
>> +
>> +    if (count == 0) {
>> +        return 0;
>> +    }
>> +
>> +    size = (((count - 1) >> hb->granularity) >> BITS_PER_LEVEL) + 1;
>> +
>> +    return size * sizeof(unsigned long);
>> +}
>> +
>
> This seems flawed to me: number of bits without an offset can't be 
> mapped to a number of real bytes, because "two bits" may or may not 
> cross a granularity boundary, depending on *WHERE* you start counting.
>
> e.g.
>
> granularity = 1 (i.e. 2^1 = 2 virtual bits per 1 real bit)
> virtual: 001100
> real:     0 1 0
>
> The amount of space required to hold "two bits" here could be as 
> little as one bit, if the offset is k={0,2,4} but it could be as much 
> as two bits if the offset is k={1,3}.
>
> You may never use the function in this way, but mapping virtual bits 
> to an implementation byte-size seems like it is inviting an 
> inconsistency.
I dislike this function too... But unfortunately we need the size in 
bytes used for serialization.

Hmm. Ok, without loss of generality, let offset be less than 
granularity. The precise formula should look like:

size = (((offset + count - 1) >> hb->granularity) >> BITS_PER_LEVEL) + 1;

So,
size = ((((1 << hb->granularity) + count - 2) >> hb->granularity) >> 
BITS_PER_LEVEL) + 1;
would be enough in any case. Ok?
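
If the start offset were passed in as well, the exact value could be
computed directly; a sketch (illustrative only, keeping the names and the
BITS_PER_LEVEL macro from the patch, not proposed code):

/* Illustrative only: exact byte count for serializing [start, start+count),
 * assuming the same last-level layout as in the patch. */
uint64_t hbitmap_data_size(const HBitmap *hb, uint64_t start, uint64_t count)
{
    uint64_t first, last;

    if (count == 0) {
        return 0;
    }
    first = (start >> hb->granularity) >> BITS_PER_LEVEL;
    last = ((start + count - 1) >> hb->granularity) >> BITS_PER_LEVEL;

    return (last - first + 1) * sizeof(unsigned long);
}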



>
>> +void hbitmap_store_data(const HBitmap *hb, uint8_t *buf,
>> +                        uint64_t start, uint64_t count)
>> +{
>> +    uint64_t last = start + count - 1;
>> +    unsigned long *out = (unsigned long *)buf;
>> +
>> +    if (count == 0) {
>> +        return;
>> +    }
>> +
>> +    start = (start >> hb->granularity) >> BITS_PER_LEVEL;
>> +    last = (last >> hb->granularity) >> BITS_PER_LEVEL;
>> +    count = last - start + 1;
>> +
>> +#ifdef __BIG_ENDIAN_BITFIELD
>> +    for (i = start; i <= last; ++i) {
>> +        unsigned long el = hb->levels[HBITMAP_LEVELS - 1][i];
>> +        out[i] = (BITS_PER_LONG == 32 ? cpu_to_le32(el) : 
>> cpu_to_le64(el));
>> +    }
>> +#else
>> +    memcpy(out, &hb->levels[HBITMAP_LEVELS - 1][start],
>> +           count * sizeof(unsigned long));
>> +#endif
>> +}
>> +
>
> I suppose the ifdefs are trying to take an optimization by using 
> memcpy if at all possible, without a conversion.
>
> Why are the conversions to little endian, though? Shouldn't we be 
> serializing to a Big Endian format?
As Paolo already explained, it is for portability. The LE bitmap 
representation is independent of the size of the bitmap array elements, 
which may be 32-bit, 64-bit, or whatever else, as long as LE format is used.
>
>> +void hbitmap_restore_data(HBitmap *hb, uint8_t *buf,
>> +                          uint64_t start, uint64_t count)
>> +{
>> +    uint64_t last = start + count - 1;
>> +    unsigned long *in = (unsigned long *)buf;
>> +
>> +    if (count == 0) {
>> +        return;
>> +    }
>> +
>> +    start = (start >> hb->granularity) >> BITS_PER_LEVEL;
>> +    last = (last >> hb->granularity) >> BITS_PER_LEVEL;
>> +    count = last - start + 1;
>> +
>> +#ifdef __BIG_ENDIAN_BITFIELD
>> +    for (i = start; i <= last; ++i) {
>> +        hb->levels[HBITMAP_LEVELS - 1][i] =
>> +            (BITS_PER_LONG == 32 ? be32_to_cpu(in[i]) : 
>> be64_to_cpu(in[i]));
>> +    }
>> +#else
>> +    memcpy(&hb->levels[HBITMAP_LEVELS - 1][start], in,
>> +           count * sizeof(unsigned long));
>> +#endif
>> +}
>> +
>
> ...? We're storing as LE but restoring from BE? I'm confused.
Oh, it's a mistake. Thanks. I missed it when testing because the 
#ifdef __BIG_ENDIAN_BITFIELD branch was never compiled.
>
> I'm also not clear on the __BIG_ENDIAN_BITFIELD macro. Why do we want 
> to pack differently based on how C-bitfields are packed by the 
> compiler? I don't think that has any impact on how longs are stored 
> (always in the host native format.)
>
> I would rather see the serialize/deserialize always convert to and 
> from BE explicitly with cpu_to_be[32|64] and be[32|64]_to_cpu. I think 
> trying to optimize the pack/unpack in the no-op condition with a # 
> define and a memcpy is inviting portability problems.
Ok. I'll leave only the for(...) branch in v2.
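
Roughly, the unconditional variant could look like this (a sketch only,
not the actual v2 code; note that i would need to be declared, and the
buffer is indexed relative to start here):

    /* sketch: store always emits little-endian words */
    for (i = start; i <= last; ++i) {
        unsigned long el = hb->levels[HBITMAP_LEVELS - 1][i];
        out[i - start] =
            (BITS_PER_LONG == 32 ? cpu_to_le32(el) : cpu_to_le64(el));
    }

    /* sketch: restore always reads little-endian words */
    for (i = start; i <= last; ++i) {
        unsigned long el = in[i - start];
        hb->levels[HBITMAP_LEVELS - 1][i] =
            (BITS_PER_LONG == 32 ? le32_to_cpu(el) : le64_to_cpu(el));
    }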
>
>> +void hbitmap_restore_finish(HBitmap *bitmap)
>> +{
>> +    int64_t i, size;
>> +    int lev, j;
>> +
>> +    /* restore levels starting from penultimate to zero level, assuming
>> +     * that the last one is ok */
>> +    size = MAX((bitmap->size + BITS_PER_LONG - 1) >> BITS_PER_LEVEL, 
>> 1);
>> +    for (lev = HBITMAP_LEVELS - 1; lev-- > 0; ) {
>> +        size = MAX((size + BITS_PER_LONG - 1) >> BITS_PER_LEVEL, 1);
>> +        for (i = 0; i < size; ++i) {
>> +            bitmap->levels[lev][i] = 0;
>> +            for (j = 0; j < BITS_PER_LONG; ++j) {
>> +                if (bitmap->levels[lev+1][(i << BITS_PER_LEVEL) + j]) {
>> +                    bitmap->levels[lev][i] |=  (1 << j);
>> +                }
>> +            }
>
> Just a musing: Should we cache the size of each level? I know we can 
> keep recalculating it, but... why bother? We recalculate it in so many 
> places now.
Interesting point.

However, there is another real bug: there may be no longs on the 
next level for the last bits of the current one. The new version of this 
function will be in v2.
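
One possible shape for the fixed loop (a sketch only, not the v2 code):
track the length of the child level and iterate over that, instead of
assuming every parent word has a full BITS_PER_LONG of child words.

void hbitmap_restore_finish(HBitmap *bitmap)
{
    uint64_t size, prev_size, i;
    int lev;

    /* sketch: rebuild levels from the penultimate one down, assuming the
     * last level has just been filled in by hbitmap_restore_data() */
    size = MAX((bitmap->size + BITS_PER_LONG - 1) >> BITS_PER_LEVEL, 1);
    for (lev = HBITMAP_LEVELS - 1; lev-- > 0; ) {
        prev_size = size;
        size = MAX((size + BITS_PER_LONG - 1) >> BITS_PER_LEVEL, 1);
        memset(bitmap->levels[lev], 0, size * sizeof(unsigned long));

        /* only visit child words that actually exist */
        for (i = 0; i < prev_size; ++i) {
            if (bitmap->levels[lev + 1][i]) {
                bitmap->levels[lev][i >> BITS_PER_LEVEL] |=
                    1UL << (i & (BITS_PER_LONG - 1));
            }
        }
    }
}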
>
>> +        }
>> +    }
>> +}
>> +
>>   void hbitmap_free(HBitmap *hb)
>>   {
>>       unsigned i;
>>
John Snow Jan. 13, 2015, 5:08 p.m. UTC | #4
On 01/13/2015 07:59 AM, Vladimir Sementsov-Ogievskiy wrote:
>
> Best regards,
> Vladimir
>
> On 09.01.2015 00:21, John Snow wrote:
>>
>>
>> On 12/11/2014 09:17 AM, Vladimir Sementsov-Ogievskiy wrote:
>>> Functions to store / restore HBitmap. HBitmap should be saved to linear
>>> bitmap format independently of endianess.
>>>
>>> Because of restoring in several steps, every step writes only the last
>>> level of the bitmap. All other levels are restored by
>>> hbitmap_restore_finish as a last step of restoring. So, HBitmap is
>>> inconsistent while restoring.
> .....
>>
>> I guess by nature of how bitmap migration has been implemented, we
>> cannot help but restore parts of the bitmap piece by piece, which
>> requires us to force the bitmap into an inconsistent state.
>>
>> Yuck.
>>
>> It might be "nice" to turn on a disable bit inside the hbitmap that
>> disallows its use until it is made consistent again, but I don't know
>> what the performance hit of the extra conditionals everywhere would be
>> like.
>>
>
> Another approach is to restore the bitmap after each piece set. It is a
> bit slower of course but the bitmap will be consistent after
> hbitmap_restore_data()
>

Yeah, that's not great either. The approach you have already taken is 
probably the best.
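
That is, the destination-side flow stays roughly like this (hypothetical
caller; the chunk structure is invented here purely for illustration):

void bitmap_load(HBitmap *hb, BitmapChunk *chunks, int nchunks)
{
    int i;

    /* BitmapChunk is a made-up container for (buf, start, count) */
    for (i = 0; i < nchunks; i++) {
        /* hb is inconsistent from the first call onwards */
        hbitmap_restore_data(hb, chunks[i].buf, chunks[i].start,
                             chunks[i].count);
    }
    hbitmap_restore_finish(hb);   /* hb is consistent again */
}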

>>> +/**
>>> + * hbitmap_restore_finish
>>> + * @hb: HBitmap to operate on.
>>> + *
>>> + * Repair HBitmap after calling hbitmap_restore_data. Actuall all
>>> HBitmap
>>> + * layers are restore here.
>>> + */
>>> +void hbitmap_restore_finish(HBitmap *hb);
>>> +
>>> +/**
>>>    * hbitmap_free:
>>>    * @hb: HBitmap to operate on.
>>>    *
>>
>> These are just biased opinions:
>>
>> - It might be nice to name the store/restore functions "serialize" and
>> "deserialize," to describe exactly what they are doing.
>>
>> - I might refer to "restore_finish" as "post_load" or "post_restore"
>> or something similar to mimic how device migration functions are
>> named. I think hbitmap_restore_data_finalize would also be fine; the
>> key part here is clearly naming it relative to "restore_data."
>>
>
> Hmm. Ok, what about the following set:
>      hbitmap_serialize()
>      hbitmap_deserialize_part()
>      hbitmap_deserialize_finish()
>

Looks good to me!

>>> diff --git a/util/hbitmap.c b/util/hbitmap.c
>>> index 8aa7406..ac0323f 100644
>>> --- a/util/hbitmap.c
>>> +++ b/util/hbitmap.c
>>> @@ -362,6 +362,90 @@ bool hbitmap_get(const HBitmap *hb, uint64_t item)
>>>       return (hb->levels[HBITMAP_LEVELS - 1][pos >> BITS_PER_LEVEL] &
>>> bit) != 0;
>>>   }
>>>
>>> +uint64_t hbitmap_data_size(const HBitmap *hb, uint64_t count)
>>> +{
>>> +    uint64_t size;
>>> +
>>> +    if (count == 0) {
>>> +        return 0;
>>> +    }
>>> +
>>> +    size = (((count - 1) >> hb->granularity) >> BITS_PER_LEVEL) + 1;
>>> +
>>> +    return size * sizeof(unsigned long);
>>> +}
>>> +
>>
>> This seems flawed to me: number of bits without an offset can't be
>> mapped to a number of real bytes, because "two bits" may or may not
>> cross a granularity boundary, depending on *WHERE* you start counting.
>>
>> e.g.
>>
>> granularity = 1 (i.e. 2^1 = 2 virtual bits per 1 real bit)
>> virtual: 001100
>> real:     0 1 0
>>
>> The amount of space required to hold "two bits" here could be as
>> little as one bit, if the offset is k={0,2,4} but it could be as much
>> as two bits if the offset is k={1,3}.
>>
>> You may never use the function in this way, but mapping virtual bits
>> to an implementation byte-size seems like it is inviting an
>> inconsistency.
> I dislike this function too.. But unfortunately we need the size in
> bytes used for serialization.
>
> Hmm. Ok, without loss of generality, let offset be less than
> granularity. The precise formula should look like:
>
> size = (((offset+count-1) >> hb->granularity) >> BITS_PER_LEVEL);
>
> So,
> size = ((((1 << hb->granularity) + count - 2) >> hb->granularity) >>
> BITS_PER_LEVEL) + 1;
> would be enough in any case. Ok?
>

I think so, as long as deserialization then works correctly, regardless 
of whether or not you start on an even multiple of the granularity.

>
>
>>
>>> +void hbitmap_store_data(const HBitmap *hb, uint8_t *buf,
>>> +                        uint64_t start, uint64_t count)
>>> +{
>>> +    uint64_t last = start + count - 1;
>>> +    unsigned long *out = (unsigned long *)buf;
>>> +
>>> +    if (count == 0) {
>>> +        return;
>>> +    }
>>> +
>>> +    start = (start >> hb->granularity) >> BITS_PER_LEVEL;
>>> +    last = (last >> hb->granularity) >> BITS_PER_LEVEL;
>>> +    count = last - start + 1;
>>> +
>>> +#ifdef __BIG_ENDIAN_BITFIELD
>>> +    for (i = start; i <= last; ++i) {
>>> +        unsigned long el = hb->levels[HBITMAP_LEVELS - 1][i];
>>> +        out[i] = (BITS_PER_LONG == 32 ? cpu_to_le32(el) :
>>> cpu_to_le64(el));
>>> +    }
>>> +#else
>>> +    memcpy(out, &hb->levels[HBITMAP_LEVELS - 1][start],
>>> +           count * sizeof(unsigned long));
>>> +#endif
>>> +}
>>> +
>>
>> I suppose the ifdefs are trying to take an optimization by using
>> memcpy if at all possible, without a conversion.
>>
>> Why are the conversions to little endian, though? Shouldn't we be
>> serializing to a Big Endian format?
> As Paolo already explained it is for portability. LE bitmap
> representation is independent of size of bitmap array elements, which
> may be 32bit/64bit, or whatever else with LE format.

Yes, that makes sense. :)

>>
>>> +void hbitmap_restore_data(HBitmap *hb, uint8_t *buf,
>>> +                          uint64_t start, uint64_t count)
>>> +{
>>> +    uint64_t last = start + count - 1;
>>> +    unsigned long *in = (unsigned long *)buf;
>>> +
>>> +    if (count == 0) {
>>> +        return;
>>> +    }
>>> +
>>> +    start = (start >> hb->granularity) >> BITS_PER_LEVEL;
>>> +    last = (last >> hb->granularity) >> BITS_PER_LEVEL;
>>> +    count = last - start + 1;
>>> +
>>> +#ifdef __BIG_ENDIAN_BITFIELD
>>> +    for (i = start; i <= last; ++i) {
>>> +        hb->levels[HBITMAP_LEVELS - 1][i] =
>>> +            (BITS_PER_LONG == 32 ? be32_to_cpu(in[i]) :
>>> be64_to_cpu(in[i]));
>>> +    }
>>> +#else
>>> +    memcpy(&hb->levels[HBITMAP_LEVELS - 1][start], in,
>>> +           count * sizeof(unsigned long));
>>> +#endif
>>> +}
>>> +
>>
>> ...? We're storing as LE but restoring from BE? I'm confused.
> Oh, it's a mistake. Thanks. I've missed it when testing because of
> {#ifdef __BIG_ENDIAN_BITFIELD } branch was never compiled.

Glad I am not crazy :)

>>
>> I'm also not clear on the __BIG_ENDIAN_BITFIELD macro. Why do we want
>> to pack differently based on how C-bitfields are packed by the
>> compiler? I don't think that has any impact on how longs are stored
>> (always in the host native format.)
>>
>> I would rather see the serialize/deserialize always convert to and
>> from BE explicitly with cpu_to_be[32|64] and be[32|64]_to_cpu. I think
>> trying to optimize the pack/unpack in the no-op condition with a #
>> define and a memcpy is inviting portability problems.
> Ok. I'll leave only branch with for(...) in v2.

You can correct me if I am mistaken, but I seem to recall that determining 
endianness in a platform-, compiler- and libc-independent way is very 
difficult at the preprocessor stage.

If there _IS_ some way to do this, the optimization is fine, but I do 
not think it will work as written.

The un-optimized version is the safest.

>>
>>> +void hbitmap_restore_finish(HBitmap *bitmap)
>>> +{
>>> +    int64_t i, size;
>>> +    int lev, j;
>>> +
>>> +    /* restore levels starting from penultimate to zero level, assuming
>>> +     * that the last one is ok */
>>> +    size = MAX((bitmap->size + BITS_PER_LONG - 1) >> BITS_PER_LEVEL,
>>> 1);
>>> +    for (lev = HBITMAP_LEVELS - 1; lev-- > 0; ) {
>>> +        size = MAX((size + BITS_PER_LONG - 1) >> BITS_PER_LEVEL, 1);
>>> +        for (i = 0; i < size; ++i) {
>>> +            bitmap->levels[lev][i] = 0;
>>> +            for (j = 0; j < BITS_PER_LONG; ++j) {
>>> +                if (bitmap->levels[lev+1][(i << BITS_PER_LEVEL) + j]) {
>>> +                    bitmap->levels[lev][i] |=  (1 << j);
>>> +                }
>>> +            }
>>
>> Just a musing: Should we cache the size of each level? I know we can
>> keep recalculating it, but... why bother? We recalculate it in so many
>> places now.
> Interesting point.
>
> However, there is another real bug: there are may be no longs on the
> next level for last bits of the current one. The new version of this
> function will be in v2.

Great :)

>>
>>> +        }
>>> +    }
>>> +}
>>> +
>>>   void hbitmap_free(HBitmap *hb)
>>>   {
>>>       unsigned i;
>>>
>

Thank you!
--js
Vladimir Sementsov-Ogievskiy Jan. 14, 2015, 10:29 a.m. UTC | #5
Best regards,
Vladimir

On 13.01.2015 20:08, John Snow wrote:
> On 01/13/2015 07:59 AM, Vladimir Sementsov-Ogievskiy wrote:
>> On 09.01.2015 00:21, John Snow wrote:
>>> On 12/11/2014 09:17 AM, Vladimir Sementsov-Ogievskiy wrote:
> ....
>>>> +/**
>>>> + * hbitmap_restore_finish
>>>> + * @hb: HBitmap to operate on.
>>>> + *
>>>> + * Repair HBitmap after calling hbitmap_restore_data. Actuall all
>>>> HBitmap
>>>> + * layers are restore here.
>>>> + */
>>>> +void hbitmap_restore_finish(HBitmap *hb);
>>>> +
>>>> +/**
>>>>    * hbitmap_free:
>>>>    * @hb: HBitmap to operate on.
>>>>    *
>>>
>>> These are just biased opinions:
>>>
>>> - It might be nice to name the store/restore functions "serialize" and
>>> "deserialize," to describe exactly what they are doing.
>>>
>>> - I might refer to "restore_finish" as "post_load" or "post_restore"
>>> or something similar to mimic how device migration functions are
>>> named. I think hbitmap_restore_data_finalize would also be fine; the
>>> key part here is clearly naming it relative to "restore_data."
>>>
>>
>> Hmm. Ok, what about the following set:
>>      hbitmap_serialize()
>>      hbitmap_deserialize_part()
>>      hbitmap_deserialize_finish()
>>
>
> Looks good to me!
* hbitmap_serialize_part() would be better, to keep it consistent with its pair function
>
>>>> diff --git a/util/hbitmap.c b/util/hbitmap.c
>>>> index 8aa7406..ac0323f 100644
>>>> --- a/util/hbitmap.c
>>>> +++ b/util/hbitmap.c
>>>> @@ -362,6 +362,90 @@ bool hbitmap_get(const HBitmap *hb, uint64_t 
>>>> item)
>>>>       return (hb->levels[HBITMAP_LEVELS - 1][pos >> BITS_PER_LEVEL] &
>>>> bit) != 0;
>>>>   }
>>>>
>>>> +uint64_t hbitmap_data_size(const HBitmap *hb, uint64_t count)
>>>> +{
>>>> +    uint64_t size;
>>>> +
>>>> +    if (count == 0) {
>>>> +        return 0;
>>>> +    }
>>>> +
>>>> +    size = (((count - 1) >> hb->granularity) >> BITS_PER_LEVEL) + 1;
>>>> +
>>>> +    return size * sizeof(unsigned long);
>>>> +}
>>>> +
>>>
>>> This seems flawed to me: number of bits without an offset can't be
>>> mapped to a number of real bytes, because "two bits" may or may not
>>> cross a granularity boundary, depending on *WHERE* you start counting.
>>>
>>> e.g.
>>>
>>> granularity = 1 (i.e. 2^1 = 2 virtual bits per 1 real bit)
>>> virtual: 001100
>>> real:     0 1 0
>>>
>>> The amount of space required to hold "two bits" here could be as
>>> little as one bit, if the offset is k={0,2,4} but it could be as much
>>> as two bits if the offset is k={1,3}.
>>>
>>> You may never use the function in this way, but mapping virtual bits
>>> to an implementation byte-size seems like it is inviting an
>>> inconsistency.
>> I dislike this function too.. But unfortunately we need the size in
>> bytes used for serialization.
>>
>> Hmm. Ok, without loss of generality, let offset be less than
>> granularity. The precise formula should look like:
>>
>> size = (((offset+count-1) >> hb->granularity) >> BITS_PER_LEVEL);
>>
>> So,
>> size = ((((1 << hb->granularity) + count - 2) >> hb->granularity) >>
>> BITS_PER_LEVEL) + 1;
>> would be enough in any case. Ok?
>>
>
> I think so, as long as when you deserialize the object it does so 
> correctly, regardless of whether or not you start on an even multiple 
> of the granularity.
>
Maybe also rename hbitmap_data_size to hbitmap_serialize_size, or even 
drop it and make hbitmap_store_data() return the necessary size 
when buf is NULL.
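
For example, the combined variant could look roughly like this
(illustrative sketch only, not proposed code):

/* Illustrative sketch: when buf is NULL, only report the required size. */
uint64_t hbitmap_store_data(const HBitmap *hb, uint8_t *buf,
                            uint64_t start, uint64_t count)
{
    uint64_t first, last, size;

    if (count == 0) {
        return 0;
    }
    first = (start >> hb->granularity) >> BITS_PER_LEVEL;
    last = ((start + count - 1) >> hb->granularity) >> BITS_PER_LEVEL;
    size = (last - first + 1) * sizeof(unsigned long);

    if (buf != NULL) {
        /* ... serialize words [first, last] of the last level into buf ... */
    }
    return size;
}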

Patch

diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
index b645cfc..e57b610 100644
--- a/include/qemu/hbitmap.h
+++ b/include/qemu/hbitmap.h
@@ -126,6 +126,55 @@  void hbitmap_reset(HBitmap *hb, uint64_t start, uint64_t count);
 bool hbitmap_get(const HBitmap *hb, uint64_t item);
 
 /**
+ * hbitmap_data_size:
+ * @hb: HBitmap to operate on.
+ * @count: Number of bits
+ *
+ * Return the number of bytes hbitmap_store_data needs
+ */
+uint64_t hbitmap_data_size(const HBitmap *hb, uint64_t count);
+
+/**
+ * hbitmap_store_data
+ * @hb: HBitmap to operate on.
+ * @buf: Buffer to store bitmap data.
+ * @start: First bit to store.
+ * @count: Number of bits to store.
+ *
+ * Stores HBitmap data corresponding to given region. The format of saved data
+ * is a linear sequence of bits, so it can be used by hbitmap_restore_data
+ * independently of endianness
+ */
+void hbitmap_store_data(const HBitmap *hb, uint8_t *buf,
+                        uint64_t start, uint64_t count);
+
+/**
+ * hbitmap_restore_data
+ * @hb: HBitmap to operate on.
+ * @buf: Buffer to restore bitmap data from.
+ * @start: First bit to restore.
+ * @count: Number of bits to restore.
+ *
+ * Restores HBitmap data corresponding to given region. The format is the same
+ * as for hbitmap_store_data.
+ *
+ * ! The bitmap becomes inconsistent after this operation.
+ * hbitmap_restore_finish should be called before using the bitmap after
+ * data restoring.
+ */
+void hbitmap_restore_data(HBitmap *hb, uint8_t *buf,
+                          uint64_t start, uint64_t count);
+
+/**
+ * hbitmap_restore_finish
+ * @hb: HBitmap to operate on.
+ *
+ * Repair HBitmap after calling hbitmap_restore_data. Actually all HBitmap
+ * layers are restored here.
+ */
+void hbitmap_restore_finish(HBitmap *hb);
+
+/**
  * hbitmap_free:
  * @hb: HBitmap to operate on.
  *
diff --git a/util/hbitmap.c b/util/hbitmap.c
index 8aa7406..ac0323f 100644
--- a/util/hbitmap.c
+++ b/util/hbitmap.c
@@ -362,6 +362,90 @@  bool hbitmap_get(const HBitmap *hb, uint64_t item)
     return (hb->levels[HBITMAP_LEVELS - 1][pos >> BITS_PER_LEVEL] & bit) != 0;
 }
 
+uint64_t hbitmap_data_size(const HBitmap *hb, uint64_t count)
+{
+    uint64_t size;
+
+    if (count == 0) {
+        return 0;
+    }
+
+    size = (((count - 1) >> hb->granularity) >> BITS_PER_LEVEL) + 1;
+
+    return size * sizeof(unsigned long);
+}
+
+void hbitmap_store_data(const HBitmap *hb, uint8_t *buf,
+                        uint64_t start, uint64_t count)
+{
+    uint64_t last = start + count - 1;
+    unsigned long *out = (unsigned long *)buf;
+
+    if (count == 0) {
+        return;
+    }
+
+    start = (start >> hb->granularity) >> BITS_PER_LEVEL;
+    last = (last >> hb->granularity) >> BITS_PER_LEVEL;
+    count = last - start + 1;
+
+#ifdef __BIG_ENDIAN_BITFIELD
+    for (i = start; i <= last; ++i) {
+        unsigned long el = hb->levels[HBITMAP_LEVELS - 1][i];
+        out[i] = (BITS_PER_LONG == 32 ? cpu_to_le32(el) : cpu_to_le64(el));
+    }
+#else
+    memcpy(out, &hb->levels[HBITMAP_LEVELS - 1][start],
+           count * sizeof(unsigned long));
+#endif
+}
+
+void hbitmap_restore_data(HBitmap *hb, uint8_t *buf,
+                          uint64_t start, uint64_t count)
+{
+    uint64_t last = start + count - 1;
+    unsigned long *in = (unsigned long *)buf;
+
+    if (count == 0) {
+        return;
+    }
+
+    start = (start >> hb->granularity) >> BITS_PER_LEVEL;
+    last = (last >> hb->granularity) >> BITS_PER_LEVEL;
+    count = last - start + 1;
+
+#ifdef __BIG_ENDIAN_BITFIELD
+    for (i = start; i <= last; ++i) {
+        hb->levels[HBITMAP_LEVELS - 1][i] =
+            (BITS_PER_LONG == 32 ? be32_to_cpu(in[i]) : be64_to_cpu(in[i]));
+    }
+#else
+    memcpy(&hb->levels[HBITMAP_LEVELS - 1][start], in,
+           count * sizeof(unsigned long));
+#endif
+}
+
+void hbitmap_restore_finish(HBitmap *bitmap)
+{
+    int64_t i, size;
+    int lev, j;
+
+    /* restore levels starting from penultimate to zero level, assuming
+     * that the last one is ok */
+    size = MAX((bitmap->size + BITS_PER_LONG - 1) >> BITS_PER_LEVEL, 1);
+    for (lev = HBITMAP_LEVELS - 1; lev-- > 0; ) {
+        size = MAX((size + BITS_PER_LONG - 1) >> BITS_PER_LEVEL, 1);
+        for (i = 0; i < size; ++i) {
+            bitmap->levels[lev][i] = 0;
+            for (j = 0; j < BITS_PER_LONG; ++j) {
+                if (bitmap->levels[lev+1][(i << BITS_PER_LEVEL) + j]) {
+                    bitmap->levels[lev][i] |=  (1 << j);
+                }
+            }
+        }
+    }
+}
+
 void hbitmap_free(HBitmap *hb)
 {
     unsigned i;