diff mbox

[v7] spec: add qcow2 bitmaps extension specification

Message ID 1452517517-3953-1-git-send-email-vsementsov@virtuozzo.com
State New
Headers show

Commit Message

Vladimir Sementsov-Ogievskiy Jan. 11, 2016, 1:05 p.m. UTC
The new feature for qcow2: storing bitmaps.

This patch adds new header extension to qcow2 - Bitmaps Extension. It
provides an ability to store virtual disk related bitmaps in a qcow2
image. For now there is only one type of such bitmaps: Dirty Tracking
Bitmap, which just tracks virtual disk changes from some moment.

Note: Only bitmaps, relative to the virtual disk, stored in qcow2 file,
should be stored in this qcow2 file. The size of each bitmap
(considering its granularity) is equal to virtual disk size.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---

v7:

- Rewordings, grammar.
  Max, Eric, John, thank you very much.

- add last paragraph: remaining bits in bitmap data clusters must be
  zero.

- s/Bitmap Directory/bitmap directory/ and other names like this at
  the request of Max.

v6:

- reword bitmap_directory_size description
- bitmap type: make 0 reserved
- extra_data_size: resize to 4bytes
  Also, I've marked this field as "must be zero". We can always change
  it, if we decide allowing managing app to specify any extra data, by
  defining some magic value as a top of user extra data.. So, for now
  non zeor extra_data_size should be considered as an error.
- swap name and extra_data to give good alignment to extra_data.


v5:

- 'Dirty bitmaps' renamed to 'Bitmaps', as we may have several types of
  bitmaps.
- rewordings
- move upper bounds to "Notes about Qemu limits"
- s/should/must somewhere. (but not everywhere)
- move name_size field closer to name itself in bitmap header
- add extra data area to bitmap header
- move bitmap data description to separate section

 docs/specs/qcow2.txt | 172 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 171 insertions(+), 1 deletion(-)

Comments

John Snow Jan. 12, 2016, 12:30 a.m. UTC | #1
On 01/11/2016 08:05 AM, Vladimir Sementsov-Ogievskiy wrote:
> The new feature for qcow2: storing bitmaps.
> 
> This patch adds new header extension to qcow2 - Bitmaps Extension. It
> provides an ability to store virtual disk related bitmaps in a qcow2
> image. For now there is only one type of such bitmaps: Dirty Tracking
> Bitmap, which just tracks virtual disk changes from some moment.
> 
> Note: Only bitmaps, relative to the virtual disk, stored in qcow2 file,
> should be stored in this qcow2 file. The size of each bitmap
> (considering its granularity) is equal to virtual disk size.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
> 
> v7:
> 
> - Rewordings, grammar.
>   Max, Eric, John, thank you very much.
> 
> - add last paragraph: remaining bits in bitmap data clusters must be
>   zero.
> 
> - s/Bitmap Directory/bitmap directory/ and other names like this at
>   the request of Max.
> 
> v6:
> 
> - reword bitmap_directory_size description
> - bitmap type: make 0 reserved
> - extra_data_size: resize to 4bytes
>   Also, I've marked this field as "must be zero". We can always change
>   it, if we decide allowing managing app to specify any extra data, by
>   defining some magic value as a top of user extra data.. So, for now
>   non zeor extra_data_size should be considered as an error.
> - swap name and extra_data to give good alignment to extra_data.
> 
> 
> v5:
> 
> - 'Dirty bitmaps' renamed to 'Bitmaps', as we may have several types of
>   bitmaps.
> - rewordings
> - move upper bounds to "Notes about Qemu limits"
> - s/should/must somewhere. (but not everywhere)
> - move name_size field closer to name itself in bitmap header
> - add extra data area to bitmap header
> - move bitmap data description to separate section
> 
>  docs/specs/qcow2.txt | 172 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 171 insertions(+), 1 deletion(-)
> 
> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
> index 121dfc8..997239d 100644
> --- a/docs/specs/qcow2.txt
> +++ b/docs/specs/qcow2.txt
> @@ -103,7 +103,18 @@ in the description of a field.
>                      write to an image with unknown auto-clear features if it
>                      clears the respective bits from this field first.
>  
> -                    Bits 0-63:  Reserved (set to 0)
> +                    Bit 0:      Bitmaps extension bit
> +                                This bit indicates consistency for the bitmaps
> +                                extension data.
> +
> +                                It is an error if this bit is set without the
> +                                bitmaps extension present.
> +
> +                                If the bitmaps extension is present but this
> +                                bit is unset, the bitmaps extension data is
> +                                inconsistent.
> +
> +                    Bits 1-63:  Reserved (set to 0)
>  
>           96 -  99:  refcount_order
>                      Describes the width of a reference count block entry (width
> @@ -123,6 +134,7 @@ be stored. Each extension has a structure like the following:
>                          0x00000000 - End of the header extension area
>                          0xE2792ACA - Backing file format name
>                          0x6803f857 - Feature name table
> +                        0x23852875 - Bitmaps extension
>                          other      - Unknown header extension, can be safely
>                                       ignored
>  
> @@ -166,6 +178,34 @@ the header extension data. Each entry look like this:
>                      terminated if it has full length)
>  
>  
> +== Bitmaps extension ==
> +
> +The bitmaps extension is an optional header extension. It provides the ability
> +to store bitmaps related to a virtual disk. For now, there is only one bitmap
> +type: the dirty tracking bitmap, which tracks virtual disk changes from some
> +point in time.
> +
> +The data of the extension should be considered consistent only if the
> +corresponding auto-clear feature bit is set, see autoclear_features above.
> +
> +The fields of the bitmaps extension are:
> +
> +          0 -  3:  nb_bitmaps
> +                   The number of bitmaps contained in the image. Must be
> +                   greater than or equal to 1.
> +
> +                   Note: Qemu currently only supports up to 65535 bitmaps per
> +                   image.
> +
> +          4 -  7:  bitmap_directory_size
> +                   Size of the bitmap directory in bytes. It is the cumulative
> +                   size of all (nb_bitmaps) bitmap headers.
> +
> +          8 - 15:  bitmap_directory_offset
> +                   Offset into the image file at which the bitmap directory
> +                   starts. Must be aligned to a cluster boundary.
> +
> +
>  == Host cluster management ==
>  
>  qcow2 manages the allocation of host clusters by maintaining a reference count
> @@ -360,3 +400,133 @@ Snapshot table entry:
>  
>          variable:   Padding to round up the snapshot table entry size to the
>                      next multiple of 8.
> +
> +
> +== Bitmaps ==
> +
> +As mentioned above, the bitmaps extension provides the ability to store bitmaps
> +related a virtual disk. This section describes how these bitmaps are stored.
> +
> +Note: all bitmaps are related to the virtual disk stored in this image.
> +
> +=== Bitmap directory ===
> +
> +Each bitmap saved in the image is described in a bitmap directory entry. The
> +bitmap directory is a contiguous area in the image file, whose starting offset
> +and length are given by the header extension fields bitmap_directory_offset and
> +bitmap_directory_size. The entries of the bitmap directory have variable
> +length, depending on the length of the bitmap name and extra data. These
> +entries are also called bitmap headers.
> +
> +Structure of a bitmap directory entry:
> +
> +    Byte 0 -  7:    bitmap_table_offset
> +                    Offset into the image file at which the bitmap table
> +                    (described below) for the bitmap starts. Must be aligned to
> +                    a cluster boundary.
> +
> +         8 - 11:    bitmap_table_size
> +                    Number of entries in the bitmap table of the bitmap.
> +
> +        12 - 15:    flags
> +                    Bit
> +                      0: in_use
> +                         The bitmap was not saved correctly and may be
> +                         inconsistent.
> +
> +                      1: auto
> +                         The bitmap must reflect all changes of the virtual
> +                         disk by any application that would write to this qcow2
> +                         file (including writes, snapshot switching, etc.). The
> +                         type of this bitmap must be 'dirty tracking bitmap'.
> +
> +                    Bits 2 - 31 are reserved and must be 0.
> +
> +             16:    type
> +                    This field describes the sort of the bitmap.
> +                    Values:
> +                      1: Dirty tracking bitmap
> +
> +                    Values 0, 2 - 255 are reserved.
> +
> +             17:    granularity_bits
> +                    Granularity bits. Valid values: 0 - 63.
> +
> +                    Note: Qemu currently doesn't support granularity_bits
> +                    greater than 31.
> +
> +                    Granularity is calculated as
> +                        granularity = 1 << granularity_bits
> +
> +                    A bitmap's granularity is how many bytes of the image
> +                    accounts for one bit of the bitmap.
> +
> +        18 - 19:    name_size
> +                    Size of the bitmap name. Must be non-zero.
> +
> +                    Note: Qemu currently doesn't support values greater than
> +                    1023.
> +
> +        20 - 23:    extra_data_size
> +                    Size of type-specific extra data.
> +
> +                    For now, as no extra data is defined, extra_data_size is
> +                    reserved and must be zero.
> +
> +        variable:   Type-specific extra data for the bitmap.
> +
> +        variable:   The name of the bitmap (not null terminated). Must be
> +                    unique among all bitmap names within the bitmaps extension.
> +
> +        variable:   Padding to round up the bitmap directory entry size to the
> +                    next multiple of 8.
> +
> +=== Bitmap table ===
> +
> +Bitmaps are stored using a one-level structure (as opposed to two-level
> +structure like for refcounts and guest clusters mapping) for the mapping of
> +bitmap data to host clusters. This structure is called the bitmap table.
> +
> +Each bitmap table has a variable size (stored in the bitmap directory Entry)
> +and may use multiple clusters, however, it must be contiguous in the image
> +file.
> +
> +Structure of a bitmap table entry:
> +
> +    Bit       0:    Reserved and must be zero if bits 9 - 55 are non-zero.
> +                    If bits 9 - 55 are zero:
> +                      0: Cluster should be read as all zeros.
> +                      1: Cluster should be read as all ones.
> +
> +         1 -  8:    Reserved and must be zero.
> +
> +         9 - 55:    Bits 9 - 55 of the host cluster offset. Must be aligned to
> +                    a cluster boundary. If the offset is 0, the cluster is
> +                    unallocated; in that case, bit 0 determines how this
> +                    cluster should be treated when read from.
> +
> +        56 - 63:    Reserved and must be zero.
> +
> +=== Bitmap data ===
> +
> +As noted above, bitmap data is stored in separate clusters, described by the
> +bitmap table. Given an offset (in bytes) into the bitmap data, the offset into
> +the image file can be obtained as follows:
> +
> +    image_offset =
> +        bitmap_table[bitmap_data_offset / cluster_size] +
> +            (bitmap_data_offset % cluster_size)
> +
> +This offset is not defined if bits 9 - 55 of bitmap table entry are zero (see
> +above).
> +
> +Given an offset byte_nr into the virtual disk and the bitmap's granularity, the
> +bit offset into the bitmap can be calculated like this:
> +
> +    bit_offset =
> +        image_offset(byte_nr / granularity / 8) * 8 +
> +            (byte_nr / granularity) % 8
> +
> +If the size of the bitmap data is not a multiply of cluster size then the last
> +cluster of the bitmap data contains some unused tail bits. These bits must be
> +zero.
> 

s/multiply/multiple/.

Looks good otherwise, pending the results of our discussion about what
to do about the extra_data_size / type-specific data fields.

Thanks!
Denis V. Lunev Jan. 14, 2016, 11:35 a.m. UTC | #2
On 01/12/2016 03:30 AM, John Snow wrote:
>
> On 01/11/2016 08:05 AM, Vladimir Sementsov-Ogievskiy wrote:
>> The new feature for qcow2: storing bitmaps.
>>
>> This patch adds new header extension to qcow2 - Bitmaps Extension. It
>> provides an ability to store virtual disk related bitmaps in a qcow2
>> image. For now there is only one type of such bitmaps: Dirty Tracking
>> Bitmap, which just tracks virtual disk changes from some moment.
>>
>> Note: Only bitmaps, relative to the virtual disk, stored in qcow2 file,
>> should be stored in this qcow2 file. The size of each bitmap
>> (considering its granularity) is equal to virtual disk size.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>
>> v7:
>>
>> - Rewordings, grammar.
>>    Max, Eric, John, thank you very much.
>>
>> - add last paragraph: remaining bits in bitmap data clusters must be
>>    zero.
>>
>> - s/Bitmap Directory/bitmap directory/ and other names like this at
>>    the request of Max.
>>
>> v6:
>>
>> - reword bitmap_directory_size description
>> - bitmap type: make 0 reserved
>> - extra_data_size: resize to 4bytes
>>    Also, I've marked this field as "must be zero". We can always change
>>    it, if we decide allowing managing app to specify any extra data, by
>>    defining some magic value as a top of user extra data.. So, for now
>>    non zeor extra_data_size should be considered as an error.
>> - swap name and extra_data to give good alignment to extra_data.
>>
>>
>> v5:
>>
>> - 'Dirty bitmaps' renamed to 'Bitmaps', as we may have several types of
>>    bitmaps.
>> - rewordings
>> - move upper bounds to "Notes about Qemu limits"
>> - s/should/must somewhere. (but not everywhere)
>> - move name_size field closer to name itself in bitmap header
>> - add extra data area to bitmap header
>> - move bitmap data description to separate section
>>
>>   docs/specs/qcow2.txt | 172 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>>   1 file changed, 171 insertions(+), 1 deletion(-)
>>
>> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
>> index 121dfc8..997239d 100644
>> --- a/docs/specs/qcow2.txt
>> +++ b/docs/specs/qcow2.txt
>> @@ -103,7 +103,18 @@ in the description of a field.
>>                       write to an image with unknown auto-clear features if it
>>                       clears the respective bits from this field first.
>>   
>> -                    Bits 0-63:  Reserved (set to 0)
>> +                    Bit 0:      Bitmaps extension bit
>> +                                This bit indicates consistency for the bitmaps
>> +                                extension data.
>> +
>> +                                It is an error if this bit is set without the
>> +                                bitmaps extension present.
>> +
>> +                                If the bitmaps extension is present but this
>> +                                bit is unset, the bitmaps extension data is
>> +                                inconsistent.
>> +
>> +                    Bits 1-63:  Reserved (set to 0)
>>   
>>            96 -  99:  refcount_order
>>                       Describes the width of a reference count block entry (width
>> @@ -123,6 +134,7 @@ be stored. Each extension has a structure like the following:
>>                           0x00000000 - End of the header extension area
>>                           0xE2792ACA - Backing file format name
>>                           0x6803f857 - Feature name table
>> +                        0x23852875 - Bitmaps extension
>>                           other      - Unknown header extension, can be safely
>>                                        ignored
>>   
>> @@ -166,6 +178,34 @@ the header extension data. Each entry look like this:
>>                       terminated if it has full length)
>>   
>>   
>> +== Bitmaps extension ==
>> +
>> +The bitmaps extension is an optional header extension. It provides the ability
>> +to store bitmaps related to a virtual disk. For now, there is only one bitmap
>> +type: the dirty tracking bitmap, which tracks virtual disk changes from some
>> +point in time.
>> +
>> +The data of the extension should be considered consistent only if the
>> +corresponding auto-clear feature bit is set, see autoclear_features above.
>> +
>> +The fields of the bitmaps extension are:
>> +
>> +          0 -  3:  nb_bitmaps
>> +                   The number of bitmaps contained in the image. Must be
>> +                   greater than or equal to 1.
>> +
>> +                   Note: Qemu currently only supports up to 65535 bitmaps per
>> +                   image.
>> +
>> +          4 -  7:  bitmap_directory_size
>> +                   Size of the bitmap directory in bytes. It is the cumulative
>> +                   size of all (nb_bitmaps) bitmap headers.
>> +
>> +          8 - 15:  bitmap_directory_offset
>> +                   Offset into the image file at which the bitmap directory
>> +                   starts. Must be aligned to a cluster boundary.
>> +
>> +
>>   == Host cluster management ==
>>   
>>   qcow2 manages the allocation of host clusters by maintaining a reference count
>> @@ -360,3 +400,133 @@ Snapshot table entry:
>>   
>>           variable:   Padding to round up the snapshot table entry size to the
>>                       next multiple of 8.
>> +
>> +
>> +== Bitmaps ==
>> +
>> +As mentioned above, the bitmaps extension provides the ability to store bitmaps
>> +related a virtual disk. This section describes how these bitmaps are stored.
>> +
>> +Note: all bitmaps are related to the virtual disk stored in this image.
>> +
>> +=== Bitmap directory ===
>> +
>> +Each bitmap saved in the image is described in a bitmap directory entry. The
>> +bitmap directory is a contiguous area in the image file, whose starting offset
>> +and length are given by the header extension fields bitmap_directory_offset and
>> +bitmap_directory_size. The entries of the bitmap directory have variable
>> +length, depending on the length of the bitmap name and extra data. These
>> +entries are also called bitmap headers.
>> +
>> +Structure of a bitmap directory entry:
>> +
>> +    Byte 0 -  7:    bitmap_table_offset
>> +                    Offset into the image file at which the bitmap table
>> +                    (described below) for the bitmap starts. Must be aligned to
>> +                    a cluster boundary.
>> +
>> +         8 - 11:    bitmap_table_size
>> +                    Number of entries in the bitmap table of the bitmap.
>> +
>> +        12 - 15:    flags
>> +                    Bit
>> +                      0: in_use
>> +                         The bitmap was not saved correctly and may be
>> +                         inconsistent.
>> +
>> +                      1: auto
>> +                         The bitmap must reflect all changes of the virtual
>> +                         disk by any application that would write to this qcow2
>> +                         file (including writes, snapshot switching, etc.). The
>> +                         type of this bitmap must be 'dirty tracking bitmap'.
>> +
>> +                    Bits 2 - 31 are reserved and must be 0.
>> +
>> +             16:    type
>> +                    This field describes the sort of the bitmap.
>> +                    Values:
>> +                      1: Dirty tracking bitmap
>> +
>> +                    Values 0, 2 - 255 are reserved.
>> +
>> +             17:    granularity_bits
>> +                    Granularity bits. Valid values: 0 - 63.
>> +
>> +                    Note: Qemu currently doesn't support granularity_bits
>> +                    greater than 31.
>> +
>> +                    Granularity is calculated as
>> +                        granularity = 1 << granularity_bits
>> +
>> +                    A bitmap's granularity is how many bytes of the image
>> +                    accounts for one bit of the bitmap.
>> +
>> +        18 - 19:    name_size
>> +                    Size of the bitmap name. Must be non-zero.
>> +
>> +                    Note: Qemu currently doesn't support values greater than
>> +                    1023.
>> +
>> +        20 - 23:    extra_data_size
>> +                    Size of type-specific extra data.
>> +
>> +                    For now, as no extra data is defined, extra_data_size is
>> +                    reserved and must be zero.
>> +
>> +        variable:   Type-specific extra data for the bitmap.
>> +
>> +        variable:   The name of the bitmap (not null terminated). Must be
>> +                    unique among all bitmap names within the bitmaps extension.
>> +
>> +        variable:   Padding to round up the bitmap directory entry size to the
>> +                    next multiple of 8.
>> +
>> +=== Bitmap table ===
>> +
>> +Bitmaps are stored using a one-level structure (as opposed to two-level
>> +structure like for refcounts and guest clusters mapping) for the mapping of
>> +bitmap data to host clusters. This structure is called the bitmap table.
>> +
>> +Each bitmap table has a variable size (stored in the bitmap directory Entry)
>> +and may use multiple clusters, however, it must be contiguous in the image
>> +file.
>> +
>> +Structure of a bitmap table entry:
>> +
>> +    Bit       0:    Reserved and must be zero if bits 9 - 55 are non-zero.
>> +                    If bits 9 - 55 are zero:
>> +                      0: Cluster should be read as all zeros.
>> +                      1: Cluster should be read as all ones.
>> +
>> +         1 -  8:    Reserved and must be zero.
>> +
>> +         9 - 55:    Bits 9 - 55 of the host cluster offset. Must be aligned to
>> +                    a cluster boundary. If the offset is 0, the cluster is
>> +                    unallocated; in that case, bit 0 determines how this
>> +                    cluster should be treated when read from.
>> +
>> +        56 - 63:    Reserved and must be zero.
>> +
>> +=== Bitmap data ===
>> +
>> +As noted above, bitmap data is stored in separate clusters, described by the
>> +bitmap table. Given an offset (in bytes) into the bitmap data, the offset into
>> +the image file can be obtained as follows:
>> +
>> +    image_offset =
>> +        bitmap_table[bitmap_data_offset / cluster_size] +
>> +            (bitmap_data_offset % cluster_size)
>> +
>> +This offset is not defined if bits 9 - 55 of bitmap table entry are zero (see
>> +above).
>> +
>> +Given an offset byte_nr into the virtual disk and the bitmap's granularity, the
>> +bit offset into the bitmap can be calculated like this:
>> +
>> +    bit_offset =
>> +        image_offset(byte_nr / granularity / 8) * 8 +
>> +            (byte_nr / granularity) % 8
>> +
>> +If the size of the bitmap data is not a multiply of cluster size then the last
>> +cluster of the bitmap data contains some unused tail bits. These bits must be
>> +zero.
>>
> s/multiply/multiple/.
>
> Looks good otherwise, pending the results of our discussion about what
> to do about the extra_data_size / type-specific data fields.
>
> Thanks!
>
should we re-submit? Is this version good enough to be accepted? :)

Den
John Snow Jan. 14, 2016, 4:42 p.m. UTC | #3
On 01/14/2016 06:35 AM, Denis V. Lunev wrote:
> On 01/12/2016 03:30 AM, John Snow wrote:
>>
>> On 01/11/2016 08:05 AM, Vladimir Sementsov-Ogievskiy wrote:
>>> The new feature for qcow2: storing bitmaps.
>>>
>>> This patch adds new header extension to qcow2 - Bitmaps Extension. It
>>> provides an ability to store virtual disk related bitmaps in a qcow2
>>> image. For now there is only one type of such bitmaps: Dirty Tracking
>>> Bitmap, which just tracks virtual disk changes from some moment.
>>>
>>> Note: Only bitmaps, relative to the virtual disk, stored in qcow2 file,
>>> should be stored in this qcow2 file. The size of each bitmap
>>> (considering its granularity) is equal to virtual disk size.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> ---
>>>
>>> v7:
>>>
>>> - Rewordings, grammar.
>>>    Max, Eric, John, thank you very much.
>>>
>>> - add last paragraph: remaining bits in bitmap data clusters must be
>>>    zero.
>>>
>>> - s/Bitmap Directory/bitmap directory/ and other names like this at
>>>    the request of Max.
>>>
>>> v6:
>>>
>>> - reword bitmap_directory_size description
>>> - bitmap type: make 0 reserved
>>> - extra_data_size: resize to 4bytes
>>>    Also, I've marked this field as "must be zero". We can always change
>>>    it, if we decide allowing managing app to specify any extra data, by
>>>    defining some magic value as a top of user extra data.. So, for now
>>>    non zeor extra_data_size should be considered as an error.
>>> - swap name and extra_data to give good alignment to extra_data.
>>>
>>>
>>> v5:
>>>
>>> - 'Dirty bitmaps' renamed to 'Bitmaps', as we may have several types of
>>>    bitmaps.
>>> - rewordings
>>> - move upper bounds to "Notes about Qemu limits"
>>> - s/should/must somewhere. (but not everywhere)
>>> - move name_size field closer to name itself in bitmap header
>>> - add extra data area to bitmap header
>>> - move bitmap data description to separate section
>>>
>>>   docs/specs/qcow2.txt | 172
>>> ++++++++++++++++++++++++++++++++++++++++++++++++++-
>>>   1 file changed, 171 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
>>> index 121dfc8..997239d 100644
>>> --- a/docs/specs/qcow2.txt
>>> +++ b/docs/specs/qcow2.txt
>>> @@ -103,7 +103,18 @@ in the description of a field.
>>>                       write to an image with unknown auto-clear
>>> features if it
>>>                       clears the respective bits from this field first.
>>>   -                    Bits 0-63:  Reserved (set to 0)
>>> +                    Bit 0:      Bitmaps extension bit
>>> +                                This bit indicates consistency for
>>> the bitmaps
>>> +                                extension data.
>>> +
>>> +                                It is an error if this bit is set
>>> without the
>>> +                                bitmaps extension present.
>>> +
>>> +                                If the bitmaps extension is present
>>> but this
>>> +                                bit is unset, the bitmaps extension
>>> data is
>>> +                                inconsistent.
>>> +
>>> +                    Bits 1-63:  Reserved (set to 0)
>>>              96 -  99:  refcount_order
>>>                       Describes the width of a reference count block
>>> entry (width
>>> @@ -123,6 +134,7 @@ be stored. Each extension has a structure like
>>> the following:
>>>                           0x00000000 - End of the header extension area
>>>                           0xE2792ACA - Backing file format name
>>>                           0x6803f857 - Feature name table
>>> +                        0x23852875 - Bitmaps extension
>>>                           other      - Unknown header extension, can
>>> be safely
>>>                                        ignored
>>>   @@ -166,6 +178,34 @@ the header extension data. Each entry look
>>> like this:
>>>                       terminated if it has full length)
>>>     +== Bitmaps extension ==
>>> +
>>> +The bitmaps extension is an optional header extension. It provides
>>> the ability
>>> +to store bitmaps related to a virtual disk. For now, there is only
>>> one bitmap
>>> +type: the dirty tracking bitmap, which tracks virtual disk changes
>>> from some
>>> +point in time.
>>> +
>>> +The data of the extension should be considered consistent only if the
>>> +corresponding auto-clear feature bit is set, see autoclear_features
>>> above.
>>> +
>>> +The fields of the bitmaps extension are:
>>> +
>>> +          0 -  3:  nb_bitmaps
>>> +                   The number of bitmaps contained in the image.
>>> Must be
>>> +                   greater than or equal to 1.
>>> +
>>> +                   Note: Qemu currently only supports up to 65535
>>> bitmaps per
>>> +                   image.
>>> +
>>> +          4 -  7:  bitmap_directory_size
>>> +                   Size of the bitmap directory in bytes. It is the
>>> cumulative
>>> +                   size of all (nb_bitmaps) bitmap headers.
>>> +
>>> +          8 - 15:  bitmap_directory_offset
>>> +                   Offset into the image file at which the bitmap
>>> directory
>>> +                   starts. Must be aligned to a cluster boundary.
>>> +
>>> +
>>>   == Host cluster management ==
>>>     qcow2 manages the allocation of host clusters by maintaining a
>>> reference count
>>> @@ -360,3 +400,133 @@ Snapshot table entry:
>>>             variable:   Padding to round up the snapshot table entry
>>> size to the
>>>                       next multiple of 8.
>>> +
>>> +
>>> +== Bitmaps ==
>>> +
>>> +As mentioned above, the bitmaps extension provides the ability to
>>> store bitmaps
>>> +related a virtual disk. This section describes how these bitmaps are
>>> stored.
>>> +
>>> +Note: all bitmaps are related to the virtual disk stored in this image.
>>> +
>>> +=== Bitmap directory ===
>>> +
>>> +Each bitmap saved in the image is described in a bitmap directory
>>> entry. The
>>> +bitmap directory is a contiguous area in the image file, whose
>>> starting offset
>>> +and length are given by the header extension fields
>>> bitmap_directory_offset and
>>> +bitmap_directory_size. The entries of the bitmap directory have
>>> variable
>>> +length, depending on the length of the bitmap name and extra data.
>>> These
>>> +entries are also called bitmap headers.
>>> +
>>> +Structure of a bitmap directory entry:
>>> +
>>> +    Byte 0 -  7:    bitmap_table_offset
>>> +                    Offset into the image file at which the bitmap
>>> table
>>> +                    (described below) for the bitmap starts. Must be
>>> aligned to
>>> +                    a cluster boundary.
>>> +
>>> +         8 - 11:    bitmap_table_size
>>> +                    Number of entries in the bitmap table of the
>>> bitmap.
>>> +
>>> +        12 - 15:    flags
>>> +                    Bit
>>> +                      0: in_use
>>> +                         The bitmap was not saved correctly and may be
>>> +                         inconsistent.
>>> +
>>> +                      1: auto
>>> +                         The bitmap must reflect all changes of the
>>> virtual
>>> +                         disk by any application that would write to
>>> this qcow2
>>> +                         file (including writes, snapshot switching,
>>> etc.). The
>>> +                         type of this bitmap must be 'dirty tracking
>>> bitmap'.
>>> +
>>> +                    Bits 2 - 31 are reserved and must be 0.
>>> +
>>> +             16:    type
>>> +                    This field describes the sort of the bitmap.
>>> +                    Values:
>>> +                      1: Dirty tracking bitmap
>>> +
>>> +                    Values 0, 2 - 255 are reserved.
>>> +
>>> +             17:    granularity_bits
>>> +                    Granularity bits. Valid values: 0 - 63.
>>> +
>>> +                    Note: Qemu currently doesn't support
>>> granularity_bits
>>> +                    greater than 31.
>>> +
>>> +                    Granularity is calculated as
>>> +                        granularity = 1 << granularity_bits
>>> +
>>> +                    A bitmap's granularity is how many bytes of the
>>> image
>>> +                    accounts for one bit of the bitmap.
>>> +
>>> +        18 - 19:    name_size
>>> +                    Size of the bitmap name. Must be non-zero.
>>> +
>>> +                    Note: Qemu currently doesn't support values
>>> greater than
>>> +                    1023.
>>> +
>>> +        20 - 23:    extra_data_size
>>> +                    Size of type-specific extra data.
>>> +
>>> +                    For now, as no extra data is defined,
>>> extra_data_size is
>>> +                    reserved and must be zero.
>>> +
>>> +        variable:   Type-specific extra data for the bitmap.
>>> +
>>> +        variable:   The name of the bitmap (not null terminated).
>>> Must be
>>> +                    unique among all bitmap names within the bitmaps
>>> extension.
>>> +
>>> +        variable:   Padding to round up the bitmap directory entry
>>> size to the
>>> +                    next multiple of 8.
>>> +
>>> +=== Bitmap table ===
>>> +
>>> +Bitmaps are stored using a one-level structure (as opposed to two-level
>>> +structure like for refcounts and guest clusters mapping) for the
>>> mapping of
>>> +bitmap data to host clusters. This structure is called the bitmap
>>> table.
>>> +
>>> +Each bitmap table has a variable size (stored in the bitmap
>>> directory Entry)
>>> +and may use multiple clusters, however, it must be contiguous in the
>>> image
>>> +file.
>>> +
>>> +Structure of a bitmap table entry:
>>> +
>>> +    Bit       0:    Reserved and must be zero if bits 9 - 55 are
>>> non-zero.
>>> +                    If bits 9 - 55 are zero:
>>> +                      0: Cluster should be read as all zeros.
>>> +                      1: Cluster should be read as all ones.
>>> +
>>> +         1 -  8:    Reserved and must be zero.
>>> +
>>> +         9 - 55:    Bits 9 - 55 of the host cluster offset. Must be
>>> aligned to
>>> +                    a cluster boundary. If the offset is 0, the
>>> cluster is
>>> +                    unallocated; in that case, bit 0 determines how
>>> this
>>> +                    cluster should be treated when read from.
>>> +
>>> +        56 - 63:    Reserved and must be zero.
>>> +
>>> +=== Bitmap data ===
>>> +
>>> +As noted above, bitmap data is stored in separate clusters,
>>> described by the
>>> +bitmap table. Given an offset (in bytes) into the bitmap data, the
>>> offset into
>>> +the image file can be obtained as follows:
>>> +
>>> +    image_offset =
>>> +        bitmap_table[bitmap_data_offset / cluster_size] +
>>> +            (bitmap_data_offset % cluster_size)
>>> +
>>> +This offset is not defined if bits 9 - 55 of bitmap table entry are
>>> zero (see
>>> +above).
>>> +
>>> +Given an offset byte_nr into the virtual disk and the bitmap's
>>> granularity, the
>>> +bit offset into the bitmap can be calculated like this:
>>> +
>>> +    bit_offset =
>>> +        image_offset(byte_nr / granularity / 8) * 8 +
>>> +            (byte_nr / granularity) % 8
>>> +
>>> +If the size of the bitmap data is not a multiply of cluster size
>>> then the last
>>> +cluster of the bitmap data contains some unused tail bits. These
>>> bits must be
>>> +zero.
>>>
>> s/multiply/multiple/.
>>
>> Looks good otherwise, pending the results of our discussion about what
>> to do about the extra_data_size / type-specific data fields.
>>
>> Thanks!
>>
> should we re-submit? Is this version good enough to be accepted? :)
> 
> Den

For my money, I'm happy -- though depending on how Vladimir wishes to
use his type-specific data field, it may require some clarification. (I
had a question over whether or not we needed versioning for this field
if the plan was to add type-specific data to a type later, or if the
extra data was always defined directly alongside a new type.)

The 2.6 window isn't close to over yet, though, so I would personally be
OK with follow-up patches.

Max, Eric, any further input?

--js
Eric Blake Jan. 14, 2016, 10:08 p.m. UTC | #4
On 01/11/2016 06:05 AM, Vladimir Sementsov-Ogievskiy wrote:
> The new feature for qcow2: storing bitmaps.
> 
> This patch adds new header extension to qcow2 - Bitmaps Extension. It
> provides an ability to store virtual disk related bitmaps in a qcow2
> image. For now there is only one type of such bitmaps: Dirty Tracking
> Bitmap, which just tracks virtual disk changes from some moment.
> 
> Note: Only bitmaps, relative to the virtual disk, stored in qcow2 file,
> should be stored in this qcow2 file. The size of each bitmap
> (considering its granularity) is equal to virtual disk size.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
> 

> @@ -166,6 +178,34 @@ the header extension data. Each entry look like this:
>                      terminated if it has full length)
>  
>  
> +== Bitmaps extension ==

> +          0 -  3:  nb_bitmaps
> +                   The number of bitmaps contained in the image. Must be
> +                   greater than or equal to 1.
> +
> +                   Note: Qemu currently only supports up to 65535 bitmaps per
> +                   image.
> +
> +          4 -  7:  bitmap_directory_size
> +                   Size of the bitmap directory in bytes. It is the cumulative
> +                   size of all (nb_bitmaps) bitmap headers.

Only 4 bytes - if we ever raise our 64k entry restriction (nb_bitmaps),
could we run into an image that has so many directory entries as to make
the directory itself spill past 4G?  But I don't think it is likely, so
I can live with your choice.

> +
> +== Bitmaps ==
> +
> +As mentioned above, the bitmaps extension provides the ability to store bitmaps
> +related a virtual disk. This section describes how these bitmaps are stored.
> +
> +Note: all bitmaps are related to the virtual disk stored in this image.
> +
> +=== Bitmap directory ===
> +
> +Each bitmap saved in the image is described in a bitmap directory entry. The
> +bitmap directory is a contiguous area in the image file, whose starting offset
> +and length are given by the header extension fields bitmap_directory_offset and
> +bitmap_directory_size. The entries of the bitmap directory have variable
> +length, depending on the length of the bitmap name and extra data. These
> +entries are also called bitmap headers.
> +
> +Structure of a bitmap directory entry:
> +
> +    Byte 0 -  7:    bitmap_table_offset
> +                    Offset into the image file at which the bitmap table
> +                    (described below) for the bitmap starts. Must be aligned to
> +                    a cluster boundary.
> +
> +         8 - 11:    bitmap_table_size
> +                    Number of entries in the bitmap table of the bitmap.

Should this be the size in bytes, instead of the number of entries? But
at least the entries are fixed width of 8 bytes each, so this lets you
get a bitmap table up to 32G bytes rather than just 4G in size.  (Let's
see here - if we have 32G bytes in the bitmap table, that means 4G
clusters occupied by the bitmap itself; in the worst case of 512-byte
clusters and granularity 0, that is a maximum bitmap size of 2T
describing 16T of virtual guest image; with larger cluster size and/or
larger granularity, we cover a lot more virtual guest space with less
bitmap size; so I guess we aren't too worried about running out of space?).

> +        20 - 23:    extra_data_size
> +                    Size of type-specific extra data.
> +
> +                    For now, as no extra data is defined, extra_data_size is
> +                    reserved and must be zero.
> +
> +        variable:   Type-specific extra data for the bitmap.

I'd write this as:
           variable:   extra_data
                       Type-specific extra data for the bitmap,
                       occupying extra_data_size bytes.

> +
> +        variable:   The name of the bitmap (not null terminated). Must be
> +                    unique among all bitmap names within the bitmaps extension.
> +
> +        variable:   Padding to round up the bitmap directory entry size to the
> +                    next multiple of 8.

Should we require the padding to be all NUL bytes?  (We aren't
consistent on whether we require that for other locations of padding in
the spec, so that could be a followup patch).

> +
> +=== Bitmap table ===
> +
> +Bitmaps are stored using a one-level structure (as opposed to two-level
> +structure like for refcounts and guest clusters mapping) for the mapping of
> +bitmap data to host clusters. This structure is called the bitmap table.

Possible wording tweak:
Bitmaps are stored using a one-level structure (as opposed to the
two-level structures for refcounts and guest cluster mapping), and are
used for the mapping of bitmap data to host clusters

> +
> +Each bitmap table has a variable size (stored in the bitmap directory Entry)

Does 'Entry' still need to be capitalized?

> +and may use multiple clusters, however, it must be contiguous in the image
> +file.
> +
> +Structure of a bitmap table entry:
> +
> +    Bit       0:    Reserved and must be zero if bits 9 - 55 are non-zero.
> +                    If bits 9 - 55 are zero:
> +                      0: Cluster should be read as all zeros.
> +                      1: Cluster should be read as all ones.
> +
> +         1 -  8:    Reserved and must be zero.
> +
> +         9 - 55:    Bits 9 - 55 of the host cluster offset. Must be aligned to
> +                    a cluster boundary. If the offset is 0, the cluster is
> +                    unallocated; in that case, bit 0 determines how this
> +                    cluster should be treated when read from.

Possible wording tweak:
s/when read from/during reads/.

> +
> +        56 - 63:    Reserved and must be zero.
> +
> +=== Bitmap data ===
> +
> +As noted above, bitmap data is stored in separate clusters, described by the
> +bitmap table. Given an offset (in bytes) into the bitmap data, the offset into
> +the image file can be obtained as follows:
> +
> +    image_offset =
> +        bitmap_table[bitmap_data_offset / cluster_size] +
> +            (bitmap_data_offset % cluster_size)
> +
> +This offset is not defined if bits 9 - 55 of bitmap table entry are zero (see
> +above).
> +
> +Given an offset byte_nr into the virtual disk and the bitmap's granularity, the
> +bit offset into the bitmap can be calculated like this:
> +
> +    bit_offset =
> +        image_offset(byte_nr / granularity / 8) * 8 +
> +            (byte_nr / granularity) % 8
> +
> +If the size of the bitmap data is not a multiply of cluster size then the last

s/multiply of cluster size/multiple of the cluster size,/

> +cluster of the bitmap data contains some unused tail bits. These bits must be
> +zero.
>
John Snow Jan. 14, 2016, 11:26 p.m. UTC | #5
On 01/14/2016 05:08 PM, Eric Blake wrote:
> On 01/11/2016 06:05 AM, Vladimir Sementsov-Ogievskiy wrote:
>> The new feature for qcow2: storing bitmaps.
>>
>> This patch adds new header extension to qcow2 - Bitmaps Extension. It
>> provides an ability to store virtual disk related bitmaps in a qcow2
>> image. For now there is only one type of such bitmaps: Dirty Tracking
>> Bitmap, which just tracks virtual disk changes from some moment.
>>
>> Note: Only bitmaps, relative to the virtual disk, stored in qcow2 file,
>> should be stored in this qcow2 file. The size of each bitmap
>> (considering its granularity) is equal to virtual disk size.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>
> 
>> @@ -166,6 +178,34 @@ the header extension data. Each entry look like this:
>>                      terminated if it has full length)
>>  
>>  
>> +== Bitmaps extension ==
> 
>> +          0 -  3:  nb_bitmaps
>> +                   The number of bitmaps contained in the image. Must be
>> +                   greater than or equal to 1.
>> +
>> +                   Note: Qemu currently only supports up to 65535 bitmaps per
>> +                   image.
>> +
>> +          4 -  7:  bitmap_directory_size
>> +                   Size of the bitmap directory in bytes. It is the cumulative
>> +                   size of all (nb_bitmaps) bitmap headers.
> 
> Only 4 bytes - if we ever raise our 64k entry restriction (nb_bitmaps),
> could we run into an image that has so many directory entries as to make
> the directory itself spill past 4G?  But I don't think it is likely, so
> I can live with your choice.
> 

"We'll never need this!"

I hope someone in 2082 is reading this right now and is quite angry.

(But really, I can't foresee needing this many per each drive -- and if
we do, we have external storage mechanisms in development to handle such
wild cases.)

>> +
>> +== Bitmaps ==
>> +
>> +As mentioned above, the bitmaps extension provides the ability to store bitmaps
>> +related a virtual disk. This section describes how these bitmaps are stored.
>> +
>> +Note: all bitmaps are related to the virtual disk stored in this image.
>> +
>> +=== Bitmap directory ===
>> +
>> +Each bitmap saved in the image is described in a bitmap directory entry. The
>> +bitmap directory is a contiguous area in the image file, whose starting offset
>> +and length are given by the header extension fields bitmap_directory_offset and
>> +bitmap_directory_size. The entries of the bitmap directory have variable
>> +length, depending on the length of the bitmap name and extra data. These
>> +entries are also called bitmap headers.
>> +
>> +Structure of a bitmap directory entry:
>> +
>> +    Byte 0 -  7:    bitmap_table_offset
>> +                    Offset into the image file at which the bitmap table
>> +                    (described below) for the bitmap starts. Must be aligned to
>> +                    a cluster boundary.
>> +
>> +         8 - 11:    bitmap_table_size
>> +                    Number of entries in the bitmap table of the bitmap.
> 
> Should this be the size in bytes, instead of the number of entries? But

For what benefit? We can calculate either from the other, and this gives
us a better resolution.

> at least the entries are fixed width of 8 bytes each, so this lets you
> get a bitmap table up to 32G bytes rather than just 4G in size.  (Let's
> see here - if we have 32G bytes in the bitmap table, that means 4G
> clusters occupied by the bitmap itself; in the worst case of 512-byte
> clusters and granularity 0, that is a maximum bitmap size of 2T
> describing 16T of virtual guest image; with larger cluster size and/or
> larger granularity, we cover a lot more virtual guest space with less
> bitmap size; so I guess we aren't too worried about running out of space?).
> 

Yes, worst case of g=0 and cluster size of 512 bytes, we can get 2T
bitmaps describing 16T of virtual data.

"default case" of 64K clusters and 64K granularity: 256TiB bitmaps
describing ... let's see ... if my math is right, 128EiB?

We're probably fine :)

(Cue future space-person from 2159 wondering how I could have ever been
so naive. Sorry, future space-person!)

>> +        20 - 23:    extra_data_size
>> +                    Size of type-specific extra data.
>> +
>> +                    For now, as no extra data is defined, extra_data_size is
>> +                    reserved and must be zero.
>> +
>> +        variable:   Type-specific extra data for the bitmap.
> 
> I'd write this as:
>            variable:   extra_data
>                        Type-specific extra data for the bitmap,
>                        occupying extra_data_size bytes.
> 
>> +
>> +        variable:   The name of the bitmap (not null terminated). Must be
>> +                    unique among all bitmap names within the bitmaps extension.
>> +
>> +        variable:   Padding to round up the bitmap directory entry size to the
>> +                    next multiple of 8.
> 
> Should we require the padding to be all NUL bytes?  (We aren't
> consistent on whether we require that for other locations of padding in
> the spec, so that could be a followup patch).
> 
>> +
>> +=== Bitmap table ===
>> +
>> +Bitmaps are stored using a one-level structure (as opposed to two-level
>> +structure like for refcounts and guest clusters mapping) for the mapping of
>> +bitmap data to host clusters. This structure is called the bitmap table.
> 
> Possible wording tweak:
> Bitmaps are stored using a one-level structure (as opposed to the
> two-level structures for refcounts and guest cluster mapping), and are
> used for the mapping of bitmap data to host clusters
> 
>> +
>> +Each bitmap table has a variable size (stored in the bitmap directory Entry)
> 
> Does 'Entry' still need to be capitalized?
> 
>> +and may use multiple clusters, however, it must be contiguous in the image
>> +file.
>> +
>> +Structure of a bitmap table entry:
>> +
>> +    Bit       0:    Reserved and must be zero if bits 9 - 55 are non-zero.
>> +                    If bits 9 - 55 are zero:
>> +                      0: Cluster should be read as all zeros.
>> +                      1: Cluster should be read as all ones.
>> +
>> +         1 -  8:    Reserved and must be zero.
>> +
>> +         9 - 55:    Bits 9 - 55 of the host cluster offset. Must be aligned to
>> +                    a cluster boundary. If the offset is 0, the cluster is
>> +                    unallocated; in that case, bit 0 determines how this
>> +                    cluster should be treated when read from.
> 
> Possible wording tweak:
> s/when read from/during reads/.
> 
>> +
>> +        56 - 63:    Reserved and must be zero.
>> +
>> +=== Bitmap data ===
>> +
>> +As noted above, bitmap data is stored in separate clusters, described by the
>> +bitmap table. Given an offset (in bytes) into the bitmap data, the offset into
>> +the image file can be obtained as follows:
>> +
>> +    image_offset =
>> +        bitmap_table[bitmap_data_offset / cluster_size] +
>> +            (bitmap_data_offset % cluster_size)
>> +
>> +This offset is not defined if bits 9 - 55 of bitmap table entry are zero (see
>> +above).
>> +
>> +Given an offset byte_nr into the virtual disk and the bitmap's granularity, the
>> +bit offset into the bitmap can be calculated like this:
>> +
>> +    bit_offset =
>> +        image_offset(byte_nr / granularity / 8) * 8 +
>> +            (byte_nr / granularity) % 8
>> +
>> +If the size of the bitmap data is not a multiply of cluster size then the last
> 
> s/multiply of cluster size/multiple of the cluster size,/
> 
>> +cluster of the bitmap data contains some unused tail bits. These bits must be
>> +zero.
>>
> 

Thanks!

--js
Vladimir Sementsov-Ogievskiy Jan. 16, 2016, 2:06 p.m. UTC | #6
On 15.01.2016 02:26, John Snow wrote:
>
> On 01/14/2016 05:08 PM, Eric Blake wrote:
>> On 01/11/2016 06:05 AM, Vladimir Sementsov-Ogievskiy wrote:
>>> The new feature for qcow2: storing bitmaps.
>>>
>>> This patch adds new header extension to qcow2 - Bitmaps Extension. It
>>> provides an ability to store virtual disk related bitmaps in a qcow2
>>> image. For now there is only one type of such bitmaps: Dirty Tracking
>>> Bitmap, which just tracks virtual disk changes from some moment.
>>>
>>> Note: Only bitmaps, relative to the virtual disk, stored in qcow2 file,
>>> should be stored in this qcow2 file. The size of each bitmap
>>> (considering its granularity) is equal to virtual disk size.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> ---
>>>
>>> @@ -166,6 +178,34 @@ the header extension data. Each entry look like this:
>>>                       terminated if it has full length)
>>>   
>>>   
>>> +== Bitmaps extension ==
>>> +          0 -  3:  nb_bitmaps
>>> +                   The number of bitmaps contained in the image. Must be
>>> +                   greater than or equal to 1.
>>> +
>>> +                   Note: Qemu currently only supports up to 65535 bitmaps per
>>> +                   image.
>>> +
>>> +          4 -  7:  bitmap_directory_size
>>> +                   Size of the bitmap directory in bytes. It is the cumulative
>>> +                   size of all (nb_bitmaps) bitmap headers.
>> Only 4 bytes - if we ever raise our 64k entry restriction (nb_bitmaps),
>> could we run into an image that has so many directory entries as to make
>> the directory itself spill past 4G?  But I don't think it is likely, so
>> I can live with your choice.
>>
> "We'll never need this!"
>
> I hope someone in 2082 is reading this right now and is quite angry.
>
> (But really, I can't foresee needing this many per each drive -- and if
> we do, we have external storage mechanisms in development to handle such
> wild cases.)

Let's change it to 8 bytes, and add 4bytes reserved field to keep 8b 
alignment.

>
>>> +
>>> +== Bitmaps ==
>>> +
>>> +As mentioned above, the bitmaps extension provides the ability to store bitmaps
>>> +related a virtual disk. This section describes how these bitmaps are stored.
>>> +
>>> +Note: all bitmaps are related to the virtual disk stored in this image.
>>> +
>>> +=== Bitmap directory ===
>>> +
>>> +Each bitmap saved in the image is described in a bitmap directory entry. The
>>> +bitmap directory is a contiguous area in the image file, whose starting offset
>>> +and length are given by the header extension fields bitmap_directory_offset and
>>> +bitmap_directory_size. The entries of the bitmap directory have variable
>>> +length, depending on the length of the bitmap name and extra data. These
>>> +entries are also called bitmap headers.
>>> +
>>> +Structure of a bitmap directory entry:
>>> +
>>> +    Byte 0 -  7:    bitmap_table_offset
>>> +                    Offset into the image file at which the bitmap table
>>> +                    (described below) for the bitmap starts. Must be aligned to
>>> +                    a cluster boundary.
>>> +
>>> +         8 - 11:    bitmap_table_size
>>> +                    Number of entries in the bitmap table of the bitmap.
>> Should this be the size in bytes, instead of the number of entries? But
> For what benefit? We can calculate either from the other, and this gives
> us a better resolution.
>
>> at least the entries are fixed width of 8 bytes each, so this lets you
>> get a bitmap table up to 32G bytes rather than just 4G in size.  (Let's
>> see here - if we have 32G bytes in the bitmap table, that means 4G
>> clusters occupied by the bitmap itself; in the worst case of 512-byte
>> clusters and granularity 0, that is a maximum bitmap size of 2T
>> describing 16T of virtual guest image; with larger cluster size and/or
>> larger granularity, we cover a lot more virtual guest space with less
>> bitmap size; so I guess we aren't too worried about running out of space?).
>>
> Yes, worst case of g=0 and cluster size of 512 bytes, we can get 2T
> bitmaps describing 16T of virtual data.
>
> "default case" of 64K clusters and 64K granularity: 256TiB bitmaps
> describing ... let's see ... if my math is right, 128EiB?
>
> We're probably fine :)
>
> (Cue future space-person from 2159 wondering how I could have ever been
> so naive. Sorry, future space-person!)

it is just like l1_size from qcow2 header.

          36 - 39:   l1_size
                     Number of entries in the active L1 table


>
>>> +        20 - 23:    extra_data_size
>>> +                    Size of type-specific extra data.
>>> +
>>> +                    For now, as no extra data is defined, extra_data_size is
>>> +                    reserved and must be zero.
>>> +
>>> +        variable:   Type-specific extra data for the bitmap.
>> I'd write this as:
>>             variable:   extra_data
>>                         Type-specific extra data for the bitmap,
>>                         occupying extra_data_size bytes.

Ok.. then, similar should be done for the name.

About extra data and our discussion of versioning it. For now, we don't 
have concrete cases and requirements for this data, we are only foresee, 
that it may be needed in future, and waste time in assumptions about 
what kind of data will it be..

Will it be really type specific, i.e. bitmap type strictly define its 
structure? Or it will contain self-defined fields, some of them will be 
optional, some of them will depend on bitmap type or something like 
this? We have no concrete cases and strategy of using this data... We 
need not versioning for now, as we have no versions).

The only question, that should be answered, what should do the software, 
if it find extra_data of unknown format. (currently any extra data is 
unknown).

variants:

1) abort (or, more precisely: "image file must not be written") (like 
incompatible feature)
2) delete extra data (like autoclear feature)
3) ignore (like compatible feature)

We can use two bits of flags field to specify the behavior.

===========================================

Or, the other way, we should fix extra data format like this:

Extra data contains several fields of the following format:

4b: field magic
1b: flags: specifies abort/delete/ignore behavior if the field is unknown
1b: field name size - for error message if the field is unknown
2b: field size
var: field name
var: field data
var: padding

- where I've seen it...

=================

And the last option:

We can select abort behavior for now for any extra data, i.e. just 
remove "variable extra_data" and make field extra_data_size 'reserved 
must be zero'.

================


So, lets finally choose something and go on.


Note: for ways 1 and 3, we should in future define extra data in the way 
that gives a software clear method to understand, is this extra data 
known or unknown for it. I.e. it should have some magic number, defining 
format, or version, or structure like in way 2.

>>
>>> +
>>> +        variable:   The name of the bitmap (not null terminated). Must be
>>> +                    unique among all bitmap names within the bitmaps extension.
>>> +
>>> +        variable:   Padding to round up the bitmap directory entry size to the
>>> +                    next multiple of 8.
>> Should we require the padding to be all NUL bytes?  (We aren't
>> consistent on whether we require that for other locations of padding in
>> the spec, so that could be a followup patch).

Let's require this, I do not see any shortcomings of this, only benefits.

>>
>>> +
>>> +=== Bitmap table ===
>>> +
>>> +Bitmaps are stored using a one-level structure (as opposed to two-level
>>> +structure like for refcounts and guest clusters mapping) for the mapping of
>>> +bitmap data to host clusters. This structure is called the bitmap table.
>> Possible wording tweak:
>> Bitmaps are stored using a one-level structure (as opposed to the
>> two-level structures for refcounts and guest cluster mapping), and are
>> used for the mapping of bitmap data to host clusters

No, this is wrong.. Who are used? Bitmaps are stored and are used? No, 
not bitmaps, structures are used.

>>
>>> +
>>> +Each bitmap table has a variable size (stored in the bitmap directory Entry)
>> Does 'Entry' still need to be capitalized?

It's a mistake, thanks.

>>
>>> +and may use multiple clusters, however, it must be contiguous in the image
>>> +file.
>>> +
>>> +Structure of a bitmap table entry:
>>> +
>>> +    Bit       0:    Reserved and must be zero if bits 9 - 55 are non-zero.
>>> +                    If bits 9 - 55 are zero:
>>> +                      0: Cluster should be read as all zeros.
>>> +                      1: Cluster should be read as all ones.
>>> +
>>> +         1 -  8:    Reserved and must be zero.
>>> +
>>> +         9 - 55:    Bits 9 - 55 of the host cluster offset. Must be aligned to
>>> +                    a cluster boundary. If the offset is 0, the cluster is
>>> +                    unallocated; in that case, bit 0 determines how this
>>> +                    cluster should be treated when read from.
>> Possible wording tweak:
>> s/when read from/during reads/.

ok

>>
>>> +
>>> +        56 - 63:    Reserved and must be zero.
>>> +
>>> +=== Bitmap data ===
>>> +
>>> +As noted above, bitmap data is stored in separate clusters, described by the
>>> +bitmap table. Given an offset (in bytes) into the bitmap data, the offset into
>>> +the image file can be obtained as follows:
>>> +
>>> +    image_offset =
>>> +        bitmap_table[bitmap_data_offset / cluster_size] +
>>> +            (bitmap_data_offset % cluster_size)
>>> +
>>> +This offset is not defined if bits 9 - 55 of bitmap table entry are zero (see
>>> +above).
>>> +
>>> +Given an offset byte_nr into the virtual disk and the bitmap's granularity, the
>>> +bit offset into the bitmap can be calculated like this:
>>> +
>>> +    bit_offset =
>>> +        image_offset(byte_nr / granularity / 8) * 8 +
>>> +            (byte_nr / granularity) % 8
>>> +
>>> +If the size of the bitmap data is not a multiply of cluster size then the last
>> s/multiply of cluster size/multiple of the cluster size,/

ok, thanks

>>
>>> +cluster of the bitmap data contains some unused tail bits. These bits must be
>>> +zero.
>>>
> Thanks!
>
> --js

Please, let's decide finally about extra data, than I'll reroll it and, 
I hope, it will be committed, to make it possible to continue work on 
persistence series. About extra data, I'm ready to accept any variant, 
strictly defining, what software should do with unknown extra data.
John Snow Jan. 18, 2016, 4:54 p.m. UTC | #7
On 01/16/2016 09:06 AM, Vladimir Sementsov-Ogievskiy wrote:
> On 15.01.2016 02:26, John Snow wrote:
>>
>> On 01/14/2016 05:08 PM, Eric Blake wrote:
>>> On 01/11/2016 06:05 AM, Vladimir Sementsov-Ogievskiy wrote:
>>>> The new feature for qcow2: storing bitmaps.
>>>>
>>>> This patch adds new header extension to qcow2 - Bitmaps Extension. It
>>>> provides an ability to store virtual disk related bitmaps in a qcow2
>>>> image. For now there is only one type of such bitmaps: Dirty Tracking
>>>> Bitmap, which just tracks virtual disk changes from some moment.
>>>>
>>>> Note: Only bitmaps, relative to the virtual disk, stored in qcow2 file,
>>>> should be stored in this qcow2 file. The size of each bitmap
>>>> (considering its granularity) is equal to virtual disk size.
>>>>
>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>> ---
>>>>
>>>> @@ -166,6 +178,34 @@ the header extension data. Each entry look like
>>>> this:
>>>>                       terminated if it has full length)
>>>>     +== Bitmaps extension ==
>>>> +          0 -  3:  nb_bitmaps
>>>> +                   The number of bitmaps contained in the image.
>>>> Must be
>>>> +                   greater than or equal to 1.
>>>> +
>>>> +                   Note: Qemu currently only supports up to 65535
>>>> bitmaps per
>>>> +                   image.
>>>> +
>>>> +          4 -  7:  bitmap_directory_size
>>>> +                   Size of the bitmap directory in bytes. It is the
>>>> cumulative
>>>> +                   size of all (nb_bitmaps) bitmap headers.
>>> Only 4 bytes - if we ever raise our 64k entry restriction (nb_bitmaps),
>>> could we run into an image that has so many directory entries as to make
>>> the directory itself spill past 4G?  But I don't think it is likely, so
>>> I can live with your choice.
>>>
>> "We'll never need this!"
>>
>> I hope someone in 2082 is reading this right now and is quite angry.
>>
>> (But really, I can't foresee needing this many per each drive -- and if
>> we do, we have external storage mechanisms in development to handle such
>> wild cases.)
> 
> Let's change it to 8 bytes, and add 4bytes reserved field to keep 8b
> alignment.
> 
>>
>>>> +
>>>> +== Bitmaps ==
>>>> +
>>>> +As mentioned above, the bitmaps extension provides the ability to
>>>> store bitmaps
>>>> +related a virtual disk. This section describes how these bitmaps
>>>> are stored.
>>>> +
>>>> +Note: all bitmaps are related to the virtual disk stored in this
>>>> image.
>>>> +
>>>> +=== Bitmap directory ===
>>>> +
>>>> +Each bitmap saved in the image is described in a bitmap directory
>>>> entry. The
>>>> +bitmap directory is a contiguous area in the image file, whose
>>>> starting offset
>>>> +and length are given by the header extension fields
>>>> bitmap_directory_offset and
>>>> +bitmap_directory_size. The entries of the bitmap directory have
>>>> variable
>>>> +length, depending on the length of the bitmap name and extra data.
>>>> These
>>>> +entries are also called bitmap headers.
>>>> +
>>>> +Structure of a bitmap directory entry:
>>>> +
>>>> +    Byte 0 -  7:    bitmap_table_offset
>>>> +                    Offset into the image file at which the bitmap
>>>> table
>>>> +                    (described below) for the bitmap starts. Must
>>>> be aligned to
>>>> +                    a cluster boundary.
>>>> +
>>>> +         8 - 11:    bitmap_table_size
>>>> +                    Number of entries in the bitmap table of the
>>>> bitmap.
>>> Should this be the size in bytes, instead of the number of entries? But
>> For what benefit? We can calculate either from the other, and this gives
>> us a better resolution.
>>
>>> at least the entries are fixed width of 8 bytes each, so this lets you
>>> get a bitmap table up to 32G bytes rather than just 4G in size.  (Let's
>>> see here - if we have 32G bytes in the bitmap table, that means 4G
>>> clusters occupied by the bitmap itself; in the worst case of 512-byte
>>> clusters and granularity 0, that is a maximum bitmap size of 2T
>>> describing 16T of virtual guest image; with larger cluster size and/or
>>> larger granularity, we cover a lot more virtual guest space with less
>>> bitmap size; so I guess we aren't too worried about running out of
>>> space?).
>>>
>> Yes, worst case of g=0 and cluster size of 512 bytes, we can get 2T
>> bitmaps describing 16T of virtual data.
>>
>> "default case" of 64K clusters and 64K granularity: 256TiB bitmaps
>> describing ... let's see ... if my math is right, 128EiB?
>>
>> We're probably fine :)
>>
>> (Cue future space-person from 2159 wondering how I could have ever been
>> so naive. Sorry, future space-person!)
> 
> it is just like l1_size from qcow2 header.
> 
>          36 - 39:   l1_size
>                     Number of entries in the active L1 table
> 
> 
>>
>>>> +        20 - 23:    extra_data_size
>>>> +                    Size of type-specific extra data.
>>>> +
>>>> +                    For now, as no extra data is defined,
>>>> extra_data_size is
>>>> +                    reserved and must be zero.
>>>> +
>>>> +        variable:   Type-specific extra data for the bitmap.
>>> I'd write this as:
>>>             variable:   extra_data
>>>                         Type-specific extra data for the bitmap,
>>>                         occupying extra_data_size bytes.
> 
> Ok.. then, similar should be done for the name.
> 
> About extra data and our discussion of versioning it. For now, we don't
> have concrete cases and requirements for this data, we are only foresee,
> that it may be needed in future, and waste time in assumptions about
> what kind of data will it be..
> 
> Will it be really type specific, i.e. bitmap type strictly define its
> structure? Or it will contain self-defined fields, some of them will be
> optional, some of them will depend on bitmap type or something like
> this? We have no concrete cases and strategy of using this data... We
> need not versioning for now, as we have no versions).
> 
> The only question, that should be answered, what should do the software,
> if it find extra_data of unknown format. (currently any extra data is
> unknown).
> 
> variants:
> 
> 1) abort (or, more precisely: "image file must not be written") (like
> incompatible feature)
> 2) delete extra data (like autoclear feature)
> 3) ignore (like compatible feature)
> 
> We can use two bits of flags field to specify the behavior.
> 
> ===========================================
> 
> Or, the other way, we should fix extra data format like this:
> 
> Extra data contains several fields of the following format:
> 
> 4b: field magic
> 1b: flags: specifies abort/delete/ignore behavior if the field is unknown
> 1b: field name size - for error message if the field is unknown
> 2b: field size
> var: field name
> var: field data
> var: padding
> 
> - where I've seen it...
> 
> =================
> 
> And the last option:
> 
> We can select abort behavior for now for any extra data, i.e. just
> remove "variable extra_data" and make field extra_data_size 'reserved
> must be zero'.
> 
> ================
> 
> 
> So, lets finally choose something and go on.
> 
> 
> Note: for ways 1 and 3, we should in future define extra data in the way
> that gives a software clear method to understand, is this extra data
> known or unknown for it. I.e. it should have some magic number, defining
> format, or version, or structure like in way 2.
> 
>>>
>>>> +
>>>> +        variable:   The name of the bitmap (not null terminated).
>>>> Must be
>>>> +                    unique among all bitmap names within the
>>>> bitmaps extension.
>>>> +
>>>> +        variable:   Padding to round up the bitmap directory entry
>>>> size to the
>>>> +                    next multiple of 8.
>>> Should we require the padding to be all NUL bytes?  (We aren't
>>> consistent on whether we require that for other locations of padding in
>>> the spec, so that could be a followup patch).
> 
> Let's require this, I do not see any shortcomings of this, only benefits.
> 
>>>
>>>> +
>>>> +=== Bitmap table ===
>>>> +
>>>> +Bitmaps are stored using a one-level structure (as opposed to
>>>> two-level
>>>> +structure like for refcounts and guest clusters mapping) for the
>>>> mapping of
>>>> +bitmap data to host clusters. This structure is called the bitmap
>>>> table.
>>> Possible wording tweak:
>>> Bitmaps are stored using a one-level structure (as opposed to the
>>> two-level structures for refcounts and guest cluster mapping), and are
>>> used for the mapping of bitmap data to host clusters
> 
> No, this is wrong.. Who are used? Bitmaps are stored and are used? No,
> not bitmaps, structures are used.
> 
>>>
>>>> +
>>>> +Each bitmap table has a variable size (stored in the bitmap
>>>> directory Entry)
>>> Does 'Entry' still need to be capitalized?
> 
> It's a mistake, thanks.
> 
>>>
>>>> +and may use multiple clusters, however, it must be contiguous in
>>>> the image
>>>> +file.
>>>> +
>>>> +Structure of a bitmap table entry:
>>>> +
>>>> +    Bit       0:    Reserved and must be zero if bits 9 - 55 are
>>>> non-zero.
>>>> +                    If bits 9 - 55 are zero:
>>>> +                      0: Cluster should be read as all zeros.
>>>> +                      1: Cluster should be read as all ones.
>>>> +
>>>> +         1 -  8:    Reserved and must be zero.
>>>> +
>>>> +         9 - 55:    Bits 9 - 55 of the host cluster offset. Must be
>>>> aligned to
>>>> +                    a cluster boundary. If the offset is 0, the
>>>> cluster is
>>>> +                    unallocated; in that case, bit 0 determines how
>>>> this
>>>> +                    cluster should be treated when read from.
>>> Possible wording tweak:
>>> s/when read from/during reads/.
> 
> ok
> 
>>>
>>>> +
>>>> +        56 - 63:    Reserved and must be zero.
>>>> +
>>>> +=== Bitmap data ===
>>>> +
>>>> +As noted above, bitmap data is stored in separate clusters,
>>>> described by the
>>>> +bitmap table. Given an offset (in bytes) into the bitmap data, the
>>>> offset into
>>>> +the image file can be obtained as follows:
>>>> +
>>>> +    image_offset =
>>>> +        bitmap_table[bitmap_data_offset / cluster_size] +
>>>> +            (bitmap_data_offset % cluster_size)
>>>> +
>>>> +This offset is not defined if bits 9 - 55 of bitmap table entry are
>>>> zero (see
>>>> +above).
>>>> +
>>>> +Given an offset byte_nr into the virtual disk and the bitmap's
>>>> granularity, the
>>>> +bit offset into the bitmap can be calculated like this:
>>>> +
>>>> +    bit_offset =
>>>> +        image_offset(byte_nr / granularity / 8) * 8 +
>>>> +            (byte_nr / granularity) % 8
>>>> +
>>>> +If the size of the bitmap data is not a multiply of cluster size
>>>> then the last
>>> s/multiply of cluster size/multiple of the cluster size,/
> 
> ok, thanks
> 
>>>
>>>> +cluster of the bitmap data contains some unused tail bits. These
>>>> bits must be
>>>> +zero.
>>>>
>> Thanks!
>>
>> --js
> 
> Please, let's decide finally about extra data, than I'll reroll it and,
> I hope, it will be committed, to make it possible to continue work on
> persistence series. About extra data, I'm ready to accept any variant,
> strictly defining, what software should do with unknown extra data.
> 
> 

I discussed this with Eric Blake on IRC briefly, and I mentioned I was
concerned that we didn't specify a format at all for the extra data.
Eric felt that it was not unusual to leave a space for future expansion
and that as we haven't used it yet, we don't need to solidify it.

He also felt it would be unusual to stipulate the format of data that we
don't even intend to use yet.

In short, I'm being too proactive.

A commit message mention that, should anyone wish to expand the
type-specific data in the future that adding a 2-byte version as the
first field in extra data would probably be sufficient, and we can worry
about the spec wording later. It is fine to assume for now that if
extra_data_size is 0 that the version/format of the data is "v0" and
that does not limit our future expansion.

Sound good? Sorry for the red tape.
--John Snow
Eric Blake Jan. 18, 2016, 9:16 p.m. UTC | #8
On 01/18/2016 09:54 AM, John Snow wrote:

>> Please, let's decide finally about extra data, than I'll reroll it and,
>> I hope, it will be committed, to make it possible to continue work on
>> persistence series. About extra data, I'm ready to accept any variant,
>> strictly defining, what software should do with unknown extra data.
>>
>>
> 
> I discussed this with Eric Blake on IRC briefly, and I mentioned I was
> concerned that we didn't specify a format at all for the extra data.
> Eric felt that it was not unusual to leave a space for future expansion
> and that as we haven't used it yet, we don't need to solidify it.
> 
> He also felt it would be unusual to stipulate the format of data that we
> don't even intend to use yet.
> 
> In short, I'm being too proactive.
> 
> A commit message mention that, should anyone wish to expand the
> type-specific data in the future that adding a 2-byte version as the
> first field in extra data would probably be sufficient, and we can worry
> about the spec wording later. It is fine to assume for now that if
> extra_data_size is 0 that the version/format of the data is "v0" and
> that does not limit our future expansion.

Or put another way:

I'm just fine if our initial implementation provides sufficient
information for us to completely parse the file even when the file is
generated by a newer qemu (we have a length, so we know how far to skip
to find the next entry), while at the same time throwing up our hands if
the length is non-zero (we won't read the bitmap at all, because we
don't know if the non-zero extra_data contains instructions that would
change how to interpret the data) or even prevent writes (if the bitmap
entry is marked automatic, we must refuse any write that would requiring
an update to the bitmap because we don't know how to write to a bitmap
while correctly preserving semantics of those extra_data bytes).  We
have enough room for future extension, and that's good enough for now;
the commit message can document that we thought about the future,
without having to actually nail down in the spec what the future will
actually do.
Vladimir Sementsov-Ogievskiy Jan. 19, 2016, 8:57 a.m. UTC | #9
On 19.01.2016 00:16, Eric Blake wrote:
> preserving semantics of those extra_data bytes).  We
> have enough room for future extension, and that's good e

Ok, so, what should go to the spec? Current wording is ok? Just delete 
"Type-specific":

+
+        20 - 23:    extra_data_size
+                    Size of type-specific extra data.
+
+                    For now, as no extra data is defined, extra_data_size is
+                    reserved and must be zero.
+
+        variable:   Extra data for the bitmap.
+
Kevin Wolf Jan. 19, 2016, 5:27 p.m. UTC | #10
Am 18.01.2016 um 22:16 hat Eric Blake geschrieben:
> On 01/18/2016 09:54 AM, John Snow wrote:
> 
> >> Please, let's decide finally about extra data, than I'll reroll it and,
> >> I hope, it will be committed, to make it possible to continue work on
> >> persistence series. About extra data, I'm ready to accept any variant,
> >> strictly defining, what software should do with unknown extra data.
> >>
> >>
> > 
> > I discussed this with Eric Blake on IRC briefly, and I mentioned I was
> > concerned that we didn't specify a format at all for the extra data.
> > Eric felt that it was not unusual to leave a space for future expansion
> > and that as we haven't used it yet, we don't need to solidify it.
> > 
> > He also felt it would be unusual to stipulate the format of data that we
> > don't even intend to use yet.
> > 
> > In short, I'm being too proactive.
> > 
> > A commit message mention that, should anyone wish to expand the
> > type-specific data in the future that adding a 2-byte version as the
> > first field in extra data would probably be sufficient, and we can worry
> > about the spec wording later. It is fine to assume for now that if
> > extra_data_size is 0 that the version/format of the data is "v0" and
> > that does not limit our future expansion.
> 
> Or put another way:
> 
> I'm just fine if our initial implementation provides sufficient
> information for us to completely parse the file even when the file is
> generated by a newer qemu (we have a length, so we know how far to skip
> to find the next entry), while at the same time throwing up our hands if
> the length is non-zero (we won't read the bitmap at all, because we
> don't know if the non-zero extra_data contains instructions that would
> change how to interpret the data) or even prevent writes (if the bitmap
> entry is marked automatic, we must refuse any write that would requiring
> an update to the bitmap because we don't know how to write to a bitmap
> while correctly preserving semantics of those extra_data bytes).

Can we assume that the extra_data doesn't contain references to
clusters? Otherwise we need to forbid 'qemu-img check -r leaks' when
there is unknown extra_data.

FWIW, this assumption is already made for snapshots, so it seems okay to
make it here as well. But we could be explicit about it.

Kevin
Kevin Wolf Jan. 19, 2016, 5:29 p.m. UTC | #11
Am 19.01.2016 um 09:57 hat Vladimir Sementsov-Ogievskiy geschrieben:
> On 19.01.2016 00:16, Eric Blake wrote:
> >preserving semantics of those extra_data bytes).  We
> >have enough room for future extension, and that's good e
> 
> Ok, so, what should go to the spec? Current wording is ok? Just
> delete "Type-specific":
> 
> +
> +        20 - 23:    extra_data_size
> +                    Size of type-specific extra data.
> +
> +                    For now, as no extra data is defined, extra_data_size is
> +                    reserved and must be zero.
> +
> +        variable:   Extra data for the bitmap.

Please be explicit that if extra_data_size is non-zero, the bitmap must
not be used (i.e. specify the incompatible-feature-bit-like behaviour).

Kevin
Kevin Wolf Jan. 19, 2016, 5:48 p.m. UTC | #12
Am 11.01.2016 um 14:05 hat Vladimir Sementsov-Ogievskiy geschrieben:
> The new feature for qcow2: storing bitmaps.
> 
> This patch adds new header extension to qcow2 - Bitmaps Extension. It
> provides an ability to store virtual disk related bitmaps in a qcow2
> image. For now there is only one type of such bitmaps: Dirty Tracking
> Bitmap, which just tracks virtual disk changes from some moment.
> 
> Note: Only bitmaps, relative to the virtual disk, stored in qcow2 file,
> should be stored in this qcow2 file. The size of each bitmap
> (considering its granularity) is equal to virtual disk size.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
> 
> v7:
> 
> - Rewordings, grammar.
>   Max, Eric, John, thank you very much.
> 
> - add last paragraph: remaining bits in bitmap data clusters must be
>   zero.
> 
> - s/Bitmap Directory/bitmap directory/ and other names like this at
>   the request of Max.
> 
> v6:
> 
> - reword bitmap_directory_size description
> - bitmap type: make 0 reserved
> - extra_data_size: resize to 4bytes
>   Also, I've marked this field as "must be zero". We can always change
>   it, if we decide allowing managing app to specify any extra data, by
>   defining some magic value as a top of user extra data.. So, for now
>   non zeor extra_data_size should be considered as an error.
> - swap name and extra_data to give good alignment to extra_data.
> 
> 
> v5:
> 
> - 'Dirty bitmaps' renamed to 'Bitmaps', as we may have several types of
>   bitmaps.
> - rewordings
> - move upper bounds to "Notes about Qemu limits"
> - s/should/must somewhere. (but not everywhere)
> - move name_size field closer to name itself in bitmap header
> - add extra data area to bitmap header
> - move bitmap data description to separate section
> 
>  docs/specs/qcow2.txt | 172 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 171 insertions(+), 1 deletion(-)
> 
> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
> index 121dfc8..997239d 100644
> --- a/docs/specs/qcow2.txt
> +++ b/docs/specs/qcow2.txt
> @@ -103,7 +103,18 @@ in the description of a field.
>                      write to an image with unknown auto-clear features if it
>                      clears the respective bits from this field first.
>  
> -                    Bits 0-63:  Reserved (set to 0)
> +                    Bit 0:      Bitmaps extension bit
> +                                This bit indicates consistency for the bitmaps
> +                                extension data.
> +
> +                                It is an error if this bit is set without the
> +                                bitmaps extension present.
> +
> +                                If the bitmaps extension is present but this
> +                                bit is unset, the bitmaps extension data is
> +                                inconsistent.

It may as well be consistent, but we don't know.

Perhaps something like "must be considered inconsistent" or "is
potentially inconsistent".

> +
> +                    Bits 1-63:  Reserved (set to 0)
>  
>           96 -  99:  refcount_order
>                      Describes the width of a reference count block entry (width
> @@ -123,6 +134,7 @@ be stored. Each extension has a structure like the following:
>                          0x00000000 - End of the header extension area
>                          0xE2792ACA - Backing file format name
>                          0x6803f857 - Feature name table
> +                        0x23852875 - Bitmaps extension
>                          other      - Unknown header extension, can be safely
>                                       ignored
>  
> @@ -166,6 +178,34 @@ the header extension data. Each entry look like this:
>                      terminated if it has full length)
>  
>  
> +== Bitmaps extension ==
> +
> +The bitmaps extension is an optional header extension. It provides the ability
> +to store bitmaps related to a virtual disk. For now, there is only one bitmap
> +type: the dirty tracking bitmap, which tracks virtual disk changes from some
> +point in time.

I have one major problem with this patch, and it starts here.

The spec talks about dirty tracking bitmaps all the way, but it never
really defines what a dirty tracking bitmap even contains. It has a few
hints here and there, but they aren't consistent.

Here's the first hint: They track "virtual disk changes", which implies
they track guest clusters rather than host clusters.

> +The data of the extension should be considered consistent only if the
> +corresponding auto-clear feature bit is set, see autoclear_features above.
> +
> +The fields of the bitmaps extension are:
> +
> +          0 -  3:  nb_bitmaps
> +                   The number of bitmaps contained in the image. Must be
> +                   greater than or equal to 1.
> +
> +                   Note: Qemu currently only supports up to 65535 bitmaps per
> +                   image.
> +
> +          4 -  7:  bitmap_directory_size
> +                   Size of the bitmap directory in bytes. It is the cumulative
> +                   size of all (nb_bitmaps) bitmap headers.
> +
> +          8 - 15:  bitmap_directory_offset
> +                   Offset into the image file at which the bitmap directory
> +                   starts. Must be aligned to a cluster boundary.
> +
> +
>  == Host cluster management ==
>  
>  qcow2 manages the allocation of host clusters by maintaining a reference count
> @@ -360,3 +400,133 @@ Snapshot table entry:
>  
>          variable:   Padding to round up the snapshot table entry size to the
>                      next multiple of 8.
> +
> +
> +== Bitmaps ==
> +
> +As mentioned above, the bitmaps extension provides the ability to store bitmaps
> +related a virtual disk. This section describes how these bitmaps are stored.

s/related/related to/

> +Note: all bitmaps are related to the virtual disk stored in this image.
> +
> +=== Bitmap directory ===
> +
> +Each bitmap saved in the image is described in a bitmap directory entry. The
> +bitmap directory is a contiguous area in the image file, whose starting offset
> +and length are given by the header extension fields bitmap_directory_offset and
> +bitmap_directory_size. The entries of the bitmap directory have variable
> +length, depending on the length of the bitmap name and extra data. These
> +entries are also called bitmap headers.
> +
> +Structure of a bitmap directory entry:
> +
> +    Byte 0 -  7:    bitmap_table_offset
> +                    Offset into the image file at which the bitmap table
> +                    (described below) for the bitmap starts. Must be aligned to
> +                    a cluster boundary.
> +
> +         8 - 11:    bitmap_table_size
> +                    Number of entries in the bitmap table of the bitmap.
> +
> +        12 - 15:    flags
> +                    Bit
> +                      0: in_use
> +                         The bitmap was not saved correctly and may be
> +                         inconsistent.
> +
> +                      1: auto
> +                         The bitmap must reflect all changes of the virtual
> +                         disk by any application that would write to this qcow2
> +                         file (including writes, snapshot switching, etc.). The
> +                         type of this bitmap must be 'dirty tracking bitmap'.

This suggests that we can represent snapshot switching in a dirty
tracking bitmap. Which is almost impossible if we track guest clusters,
except if loading a snapshot should mean that all bits in the bitmap are
set. However, that feels a bit useless and dirty tracking across
snapshots doesn't seem to make a whole lot of sense anyway.

Maybe what would make more sense is that a bitmap is tied to a specific
snapshot or bitmaps are included in a snapshot, so that you can revert
to the dirty status as it was when the snapshot was taken.

Or, if you really meant, that the tracking words on a host cluster
level, what clusters would be included in the bitmap? Would only the
virtual disk be included? VM state? Any metadata?

It seems we definitely need a new section on dirty tracking bitmaps that
describes what the bitmap actually means and how it's supposed to work
with snapshots. I guess we could also talk about how it works with other
changes to the image like resizing.

> +                    Bits 2 - 31 are reserved and must be 0.
> +
> +             16:    type
> +                    This field describes the sort of the bitmap.
> +                    Values:
> +                      1: Dirty tracking bitmap
> +
> +                    Values 0, 2 - 255 are reserved.
> +
> +             17:    granularity_bits
> +                    Granularity bits. Valid values: 0 - 63.
> +
> +                    Note: Qemu currently doesn't support granularity_bits
> +                    greater than 31.
> +
> +                    Granularity is calculated as
> +                        granularity = 1 << granularity_bits
> +
> +                    A bitmap's granularity is how many bytes of the image
> +                    accounts for one bit of the bitmap.
> +
> +        18 - 19:    name_size
> +                    Size of the bitmap name. Must be non-zero.
> +
> +                    Note: Qemu currently doesn't support values greater than
> +                    1023.
> +
> +        20 - 23:    extra_data_size
> +                    Size of type-specific extra data.
> +
> +                    For now, as no extra data is defined, extra_data_size is
> +                    reserved and must be zero.
> +
> +        variable:   Type-specific extra data for the bitmap.

We talked about this in the other subthread.

> +        variable:   The name of the bitmap (not null terminated). Must be
> +                    unique among all bitmap names within the bitmaps extension.
> +
> +        variable:   Padding to round up the bitmap directory entry size to the
> +                    next multiple of 8.
> +
> +=== Bitmap table ===
> +
> +Bitmaps are stored using a one-level structure (as opposed to two-level
> +structure like for refcounts and guest clusters mapping) for the mapping of
> +bitmap data to host clusters. This structure is called the bitmap table.
> +
> +Each bitmap table has a variable size (stored in the bitmap directory Entry)
> +and may use multiple clusters, however, it must be contiguous in the image
> +file.
> +
> +Structure of a bitmap table entry:
> +
> +    Bit       0:    Reserved and must be zero if bits 9 - 55 are non-zero.
> +                    If bits 9 - 55 are zero:
> +                      0: Cluster should be read as all zeros.
> +                      1: Cluster should be read as all ones.
> +
> +         1 -  8:    Reserved and must be zero.
> +
> +         9 - 55:    Bits 9 - 55 of the host cluster offset. Must be aligned to
> +                    a cluster boundary. If the offset is 0, the cluster is
> +                    unallocated; in that case, bit 0 determines how this
> +                    cluster should be treated when read from.
> +
> +        56 - 63:    Reserved and must be zero.
> +
> +=== Bitmap data ===
> +
> +As noted above, bitmap data is stored in separate clusters, described by the
> +bitmap table. Given an offset (in bytes) into the bitmap data, the offset into
> +the image file can be obtained as follows:
> +
> +    image_offset =
> +        bitmap_table[bitmap_data_offset / cluster_size] +
> +            (bitmap_data_offset % cluster_size)
> +
> +This offset is not defined if bits 9 - 55 of bitmap table entry are zero (see
> +above).
> +
> +Given an offset byte_nr into the virtual disk and the bitmap's granularity, the
> +bit offset into the bitmap can be calculated like this:
> +
> +    bit_offset =
> +        image_offset(byte_nr / granularity / 8) * 8 +
> +            (byte_nr / granularity) % 8
> +
> +If the size of the bitmap data is not a multiply of cluster size then the last
> +cluster of the bitmap data contains some unused tail bits. These bits must be
> +zero.

In which order are the bits stored in the bitmap?

Kevin
Vladimir Sementsov-Ogievskiy Jan. 20, 2016, 12:34 p.m. UTC | #13
On 19.01.2016 20:48, Kevin Wolf wrote:
> Am 11.01.2016 um 14:05 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> The new feature for qcow2: storing bitmaps.
>>
>> This patch adds new header extension to qcow2 - Bitmaps Extension. It
>> provides an ability to store virtual disk related bitmaps in a qcow2
>> image. For now there is only one type of such bitmaps: Dirty Tracking
>> Bitmap, which just tracks virtual disk changes from some moment.
>>
>> Note: Only bitmaps, relative to the virtual disk, stored in qcow2 file,
>> should be stored in this qcow2 file. The size of each bitmap
>> (considering its granularity) is equal to virtual disk size.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>
>> v7:
>>
>> - Rewordings, grammar.
>>    Max, Eric, John, thank you very much.
>>
>> - add last paragraph: remaining bits in bitmap data clusters must be
>>    zero.
>>
>> - s/Bitmap Directory/bitmap directory/ and other names like this at
>>    the request of Max.
>>
>> v6:
>>
>> - reword bitmap_directory_size description
>> - bitmap type: make 0 reserved
>> - extra_data_size: resize to 4bytes
>>    Also, I've marked this field as "must be zero". We can always change
>>    it, if we decide allowing managing app to specify any extra data, by
>>    defining some magic value as a top of user extra data.. So, for now
>>    non zeor extra_data_size should be considered as an error.
>> - swap name and extra_data to give good alignment to extra_data.
>>
>>
>> v5:
>>
>> - 'Dirty bitmaps' renamed to 'Bitmaps', as we may have several types of
>>    bitmaps.
>> - rewordings
>> - move upper bounds to "Notes about Qemu limits"
>> - s/should/must somewhere. (but not everywhere)
>> - move name_size field closer to name itself in bitmap header
>> - add extra data area to bitmap header
>> - move bitmap data description to separate section
>>
>>   docs/specs/qcow2.txt | 172 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>>   1 file changed, 171 insertions(+), 1 deletion(-)
>>
>> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
>> index 121dfc8..997239d 100644
>> --- a/docs/specs/qcow2.txt
>> +++ b/docs/specs/qcow2.txt
>> @@ -103,7 +103,18 @@ in the description of a field.
>>                       write to an image with unknown auto-clear features if it
>>                       clears the respective bits from this field first.
>>   
>> -                    Bits 0-63:  Reserved (set to 0)
>> +                    Bit 0:      Bitmaps extension bit
>> +                                This bit indicates consistency for the bitmaps
>> +                                extension data.
>> +
>> +                                It is an error if this bit is set without the
>> +                                bitmaps extension present.
>> +
>> +                                If the bitmaps extension is present but this
>> +                                bit is unset, the bitmaps extension data is
>> +                                inconsistent.
> It may as well be consistent, but we don't know.
>
> Perhaps something like "must be considered inconsistent" or "is
> potentially inconsistent".
>
>> +
>> +                    Bits 1-63:  Reserved (set to 0)
>>   
>>            96 -  99:  refcount_order
>>                       Describes the width of a reference count block entry (width
>> @@ -123,6 +134,7 @@ be stored. Each extension has a structure like the following:
>>                           0x00000000 - End of the header extension area
>>                           0xE2792ACA - Backing file format name
>>                           0x6803f857 - Feature name table
>> +                        0x23852875 - Bitmaps extension
>>                           other      - Unknown header extension, can be safely
>>                                        ignored
>>   
>> @@ -166,6 +178,34 @@ the header extension data. Each entry look like this:
>>                       terminated if it has full length)
>>   
>>   
>> +== Bitmaps extension ==
>> +
>> +The bitmaps extension is an optional header extension. It provides the ability
>> +to store bitmaps related to a virtual disk. For now, there is only one bitmap
>> +type: the dirty tracking bitmap, which tracks virtual disk changes from some
>> +point in time.
> I have one major problem with this patch, and it starts here.
>
> The spec talks about dirty tracking bitmaps all the way, but it never
> really defines what a dirty tracking bitmap even contains. It has a few
> hints here and there, but they aren't consistent.
>
> Here's the first hint: They track "virtual disk changes", which implies
> they track guest clusters rather than host clusters.
>
>> +The data of the extension should be considered consistent only if the
>> +corresponding auto-clear feature bit is set, see autoclear_features above.
>> +
>> +The fields of the bitmaps extension are:
>> +
>> +          0 -  3:  nb_bitmaps
>> +                   The number of bitmaps contained in the image. Must be
>> +                   greater than or equal to 1.
>> +
>> +                   Note: Qemu currently only supports up to 65535 bitmaps per
>> +                   image.
>> +
>> +          4 -  7:  bitmap_directory_size
>> +                   Size of the bitmap directory in bytes. It is the cumulative
>> +                   size of all (nb_bitmaps) bitmap headers.
>> +
>> +          8 - 15:  bitmap_directory_offset
>> +                   Offset into the image file at which the bitmap directory
>> +                   starts. Must be aligned to a cluster boundary.
>> +
>> +
>>   == Host cluster management ==
>>   
>>   qcow2 manages the allocation of host clusters by maintaining a reference count
>> @@ -360,3 +400,133 @@ Snapshot table entry:
>>   
>>           variable:   Padding to round up the snapshot table entry size to the
>>                       next multiple of 8.
>> +
>> +
>> +== Bitmaps ==
>> +
>> +As mentioned above, the bitmaps extension provides the ability to store bitmaps
>> +related a virtual disk. This section describes how these bitmaps are stored.
> s/related/related to/
>
>> +Note: all bitmaps are related to the virtual disk stored in this image.
>> +
>> +=== Bitmap directory ===
>> +
>> +Each bitmap saved in the image is described in a bitmap directory entry. The
>> +bitmap directory is a contiguous area in the image file, whose starting offset
>> +and length are given by the header extension fields bitmap_directory_offset and
>> +bitmap_directory_size. The entries of the bitmap directory have variable
>> +length, depending on the length of the bitmap name and extra data. These
>> +entries are also called bitmap headers.
>> +
>> +Structure of a bitmap directory entry:
>> +
>> +    Byte 0 -  7:    bitmap_table_offset
>> +                    Offset into the image file at which the bitmap table
>> +                    (described below) for the bitmap starts. Must be aligned to
>> +                    a cluster boundary.
>> +
>> +         8 - 11:    bitmap_table_size
>> +                    Number of entries in the bitmap table of the bitmap.
>> +
>> +        12 - 15:    flags
>> +                    Bit
>> +                      0: in_use
>> +                         The bitmap was not saved correctly and may be
>> +                         inconsistent.
>> +
>> +                      1: auto
>> +                         The bitmap must reflect all changes of the virtual
>> +                         disk by any application that would write to this qcow2
>> +                         file (including writes, snapshot switching, etc.). The
>> +                         type of this bitmap must be 'dirty tracking bitmap'.
> This suggests that we can represent snapshot switching in a dirty
> tracking bitmap. Which is almost impossible if we track guest clusters,
> except if loading a snapshot should mean that all bits in the bitmap are
> set. However, that feels a bit useless and dirty tracking across
> snapshots doesn't seem to make a whole lot of sense anyway.
>
> Maybe what would make more sense is that a bitmap is tied to a specific
> snapshot or bitmaps are included in a snapshot, so that you can revert
> to the dirty status as it was when the snapshot was taken.
>
> Or, if you really meant, that the tracking words on a host cluster
> level, what clusters would be included in the bitmap? Would only the
> virtual disk be included? VM state? Any metadata?
>
> It seems we definitely need a new section on dirty tracking bitmaps that
> describes what the bitmap actually means and how it's supposed to work
> with snapshots. I guess we could also talk about how it works with other
> changes to the image like resizing.

Bitmaps track guest clusters.
Backup is not related to snapshots, so, I think set all bits in the 
dirty bitmap on snapshot switch is ok. Of course we can't just ignore 
snapshot switch, as it changes how user (and backup) sees the virtual 
disk. Deleting the bitmap on snapshot switch is not good option too I think.

Bitmap size is defined to be equal to virtual disk size. So on resize, 
the one who resize must resize the bitmap too, to maintain accordance of 
the image file and the spec.

I'll think about such section, ok.

>
>> +                    Bits 2 - 31 are reserved and must be 0.
>> +
>> +             16:    type
>> +                    This field describes the sort of the bitmap.
>> +                    Values:
>> +                      1: Dirty tracking bitmap
>> +
>> +                    Values 0, 2 - 255 are reserved.
>> +
>> +             17:    granularity_bits
>> +                    Granularity bits. Valid values: 0 - 63.
>> +
>> +                    Note: Qemu currently doesn't support granularity_bits
>> +                    greater than 31.
>> +
>> +                    Granularity is calculated as
>> +                        granularity = 1 << granularity_bits
>> +
>> +                    A bitmap's granularity is how many bytes of the image
>> +                    accounts for one bit of the bitmap.
>> +
>> +        18 - 19:    name_size
>> +                    Size of the bitmap name. Must be non-zero.
>> +
>> +                    Note: Qemu currently doesn't support values greater than
>> +                    1023.
>> +
>> +        20 - 23:    extra_data_size
>> +                    Size of type-specific extra data.
>> +
>> +                    For now, as no extra data is defined, extra_data_size is
>> +                    reserved and must be zero.
>> +
>> +        variable:   Type-specific extra data for the bitmap.
> We talked about this in the other subthread.
>
>> +        variable:   The name of the bitmap (not null terminated). Must be
>> +                    unique among all bitmap names within the bitmaps extension.
>> +
>> +        variable:   Padding to round up the bitmap directory entry size to the
>> +                    next multiple of 8.
>> +
>> +=== Bitmap table ===
>> +
>> +Bitmaps are stored using a one-level structure (as opposed to two-level
>> +structure like for refcounts and guest clusters mapping) for the mapping of
>> +bitmap data to host clusters. This structure is called the bitmap table.
>> +
>> +Each bitmap table has a variable size (stored in the bitmap directory Entry)
>> +and may use multiple clusters, however, it must be contiguous in the image
>> +file.
>> +
>> +Structure of a bitmap table entry:
>> +
>> +    Bit       0:    Reserved and must be zero if bits 9 - 55 are non-zero.
>> +                    If bits 9 - 55 are zero:
>> +                      0: Cluster should be read as all zeros.
>> +                      1: Cluster should be read as all ones.
>> +
>> +         1 -  8:    Reserved and must be zero.
>> +
>> +         9 - 55:    Bits 9 - 55 of the host cluster offset. Must be aligned to
>> +                    a cluster boundary. If the offset is 0, the cluster is
>> +                    unallocated; in that case, bit 0 determines how this
>> +                    cluster should be treated when read from.
>> +
>> +        56 - 63:    Reserved and must be zero.
>> +
>> +=== Bitmap data ===
>> +
>> +As noted above, bitmap data is stored in separate clusters, described by the
>> +bitmap table. Given an offset (in bytes) into the bitmap data, the offset into
>> +the image file can be obtained as follows:
>> +
>> +    image_offset =
>> +        bitmap_table[bitmap_data_offset / cluster_size] +
>> +            (bitmap_data_offset % cluster_size)
>> +
>> +This offset is not defined if bits 9 - 55 of bitmap table entry are zero (see
>> +above).
>> +
>> +Given an offset byte_nr into the virtual disk and the bitmap's granularity, the
>> +bit offset into the bitmap can be calculated like this:
>> +
>> +    bit_offset =
>> +        image_offset(byte_nr / granularity / 8) * 8 +
>> +            (byte_nr / granularity) % 8
>> +
>> +If the size of the bitmap data is not a multiply of cluster size then the last
>> +cluster of the bitmap data contains some unused tail bits. These bits must be
>> +zero.
> In which order are the bits stored in the bitmap?

What do you mean?

>
> Kevin
John Snow Jan. 20, 2016, 9:22 p.m. UTC | #14
On 01/20/2016 07:34 AM, Vladimir Sementsov-Ogievskiy wrote:
> On 19.01.2016 20:48, Kevin Wolf wrote:
>> Am 11.01.2016 um 14:05 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>> The new feature for qcow2: storing bitmaps.
>>>
>>> This patch adds new header extension to qcow2 - Bitmaps Extension. It
>>> provides an ability to store virtual disk related bitmaps in a qcow2
>>> image. For now there is only one type of such bitmaps: Dirty Tracking
>>> Bitmap, which just tracks virtual disk changes from some moment.
>>>
>>> Note: Only bitmaps, relative to the virtual disk, stored in qcow2 file,
>>> should be stored in this qcow2 file. The size of each bitmap
>>> (considering its granularity) is equal to virtual disk size.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> ---
>>>
>>> v7:
>>>
>>> - Rewordings, grammar.
>>>    Max, Eric, John, thank you very much.
>>>
>>> - add last paragraph: remaining bits in bitmap data clusters must be
>>>    zero.
>>>
>>> - s/Bitmap Directory/bitmap directory/ and other names like this at
>>>    the request of Max.
>>>
>>> v6:
>>>
>>> - reword bitmap_directory_size description
>>> - bitmap type: make 0 reserved
>>> - extra_data_size: resize to 4bytes
>>>    Also, I've marked this field as "must be zero". We can always change
>>>    it, if we decide allowing managing app to specify any extra data, by
>>>    defining some magic value as a top of user extra data.. So, for now
>>>    non zeor extra_data_size should be considered as an error.
>>> - swap name and extra_data to give good alignment to extra_data.
>>>
>>>
>>> v5:
>>>
>>> - 'Dirty bitmaps' renamed to 'Bitmaps', as we may have several types of
>>>    bitmaps.
>>> - rewordings
>>> - move upper bounds to "Notes about Qemu limits"
>>> - s/should/must somewhere. (but not everywhere)
>>> - move name_size field closer to name itself in bitmap header
>>> - add extra data area to bitmap header
>>> - move bitmap data description to separate section
>>>
>>>   docs/specs/qcow2.txt | 172
>>> ++++++++++++++++++++++++++++++++++++++++++++++++++-
>>>   1 file changed, 171 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
>>> index 121dfc8..997239d 100644
>>> --- a/docs/specs/qcow2.txt
>>> +++ b/docs/specs/qcow2.txt
>>> @@ -103,7 +103,18 @@ in the description of a field.
>>>                       write to an image with unknown auto-clear
>>> features if it
>>>                       clears the respective bits from this field first.
>>>   -                    Bits 0-63:  Reserved (set to 0)
>>> +                    Bit 0:      Bitmaps extension bit
>>> +                                This bit indicates consistency for
>>> the bitmaps
>>> +                                extension data.
>>> +
>>> +                                It is an error if this bit is set
>>> without the
>>> +                                bitmaps extension present.
>>> +
>>> +                                If the bitmaps extension is present
>>> but this
>>> +                                bit is unset, the bitmaps extension
>>> data is
>>> +                                inconsistent.
>> It may as well be consistent, but we don't know.
>>
>> Perhaps something like "must be considered inconsistent" or "is
>> potentially inconsistent".
>>
>>> +
>>> +                    Bits 1-63:  Reserved (set to 0)
>>>              96 -  99:  refcount_order
>>>                       Describes the width of a reference count block
>>> entry (width
>>> @@ -123,6 +134,7 @@ be stored. Each extension has a structure like
>>> the following:
>>>                           0x00000000 - End of the header extension area
>>>                           0xE2792ACA - Backing file format name
>>>                           0x6803f857 - Feature name table
>>> +                        0x23852875 - Bitmaps extension
>>>                           other      - Unknown header extension, can
>>> be safely
>>>                                        ignored
>>>   @@ -166,6 +178,34 @@ the header extension data. Each entry look
>>> like this:
>>>                       terminated if it has full length)
>>>     +== Bitmaps extension ==
>>> +
>>> +The bitmaps extension is an optional header extension. It provides
>>> the ability
>>> +to store bitmaps related to a virtual disk. For now, there is only
>>> one bitmap
>>> +type: the dirty tracking bitmap, which tracks virtual disk changes
>>> from some
>>> +point in time.
>> I have one major problem with this patch, and it starts here.
>>
>> The spec talks about dirty tracking bitmaps all the way, but it never
>> really defines what a dirty tracking bitmap even contains. It has a few
>> hints here and there, but they aren't consistent.
>>
>> Here's the first hint: They track "virtual disk changes", which implies
>> they track guest clusters rather than host clusters.
>>
>>> +The data of the extension should be considered consistent only if the
>>> +corresponding auto-clear feature bit is set, see autoclear_features
>>> above.
>>> +
>>> +The fields of the bitmaps extension are:
>>> +
>>> +          0 -  3:  nb_bitmaps
>>> +                   The number of bitmaps contained in the image.
>>> Must be
>>> +                   greater than or equal to 1.
>>> +
>>> +                   Note: Qemu currently only supports up to 65535
>>> bitmaps per
>>> +                   image.
>>> +
>>> +          4 -  7:  bitmap_directory_size
>>> +                   Size of the bitmap directory in bytes. It is the
>>> cumulative
>>> +                   size of all (nb_bitmaps) bitmap headers.
>>> +
>>> +          8 - 15:  bitmap_directory_offset
>>> +                   Offset into the image file at which the bitmap
>>> directory
>>> +                   starts. Must be aligned to a cluster boundary.
>>> +
>>> +
>>>   == Host cluster management ==
>>>     qcow2 manages the allocation of host clusters by maintaining a
>>> reference count
>>> @@ -360,3 +400,133 @@ Snapshot table entry:
>>>             variable:   Padding to round up the snapshot table entry
>>> size to the
>>>                       next multiple of 8.
>>> +
>>> +
>>> +== Bitmaps ==
>>> +
>>> +As mentioned above, the bitmaps extension provides the ability to
>>> store bitmaps
>>> +related a virtual disk. This section describes how these bitmaps are
>>> stored.
>> s/related/related to/
>>
>>> +Note: all bitmaps are related to the virtual disk stored in this image.
>>> +
>>> +=== Bitmap directory ===
>>> +
>>> +Each bitmap saved in the image is described in a bitmap directory
>>> entry. The
>>> +bitmap directory is a contiguous area in the image file, whose
>>> starting offset
>>> +and length are given by the header extension fields
>>> bitmap_directory_offset and
>>> +bitmap_directory_size. The entries of the bitmap directory have
>>> variable
>>> +length, depending on the length of the bitmap name and extra data.
>>> These
>>> +entries are also called bitmap headers.
>>> +
>>> +Structure of a bitmap directory entry:
>>> +
>>> +    Byte 0 -  7:    bitmap_table_offset
>>> +                    Offset into the image file at which the bitmap
>>> table
>>> +                    (described below) for the bitmap starts. Must be
>>> aligned to
>>> +                    a cluster boundary.
>>> +
>>> +         8 - 11:    bitmap_table_size
>>> +                    Number of entries in the bitmap table of the
>>> bitmap.
>>> +
>>> +        12 - 15:    flags
>>> +                    Bit
>>> +                      0: in_use
>>> +                         The bitmap was not saved correctly and may be
>>> +                         inconsistent.
>>> +
>>> +                      1: auto
>>> +                         The bitmap must reflect all changes of the
>>> virtual
>>> +                         disk by any application that would write to
>>> this qcow2
>>> +                         file (including writes, snapshot switching,
>>> etc.). The
>>> +                         type of this bitmap must be 'dirty tracking
>>> bitmap'.
>> This suggests that we can represent snapshot switching in a dirty
>> tracking bitmap. Which is almost impossible if we track guest clusters,
>> except if loading a snapshot should mean that all bits in the bitmap are
>> set. However, that feels a bit useless and dirty tracking across
>> snapshots doesn't seem to make a whole lot of sense anyway.
>>
>> Maybe what would make more sense is that a bitmap is tied to a specific
>> snapshot or bitmaps are included in a snapshot, so that you can revert
>> to the dirty status as it was when the snapshot was taken.
>>
>> Or, if you really meant, that the tracking words on a host cluster
>> level, what clusters would be included in the bitmap? Would only the
>> virtual disk be included? VM state? Any metadata?
>>
>> It seems we definitely need a new section on dirty tracking bitmaps that
>> describes what the bitmap actually means and how it's supposed to work
>> with snapshots. I guess we could also talk about how it works with other
>> changes to the image like resizing.
> 
> Bitmaps track guest clusters.
> Backup is not related to snapshots, so, I think set all bits in the
> dirty bitmap on snapshot switch is ok. Of course we can't just ignore
> snapshot switch, as it changes how user (and backup) sees the virtual
> disk. Deleting the bitmap on snapshot switch is not good option too I
> think.
> 
> Bitmap size is defined to be equal to virtual disk size. So on resize,
> the one who resize must resize the bitmap too, to maintain accordance of
> the image file and the spec.
> 
> I'll think about such section, ok.
> 
>>
>>> +                    Bits 2 - 31 are reserved and must be 0.
>>> +
>>> +             16:    type
>>> +                    This field describes the sort of the bitmap.
>>> +                    Values:
>>> +                      1: Dirty tracking bitmap
>>> +
>>> +                    Values 0, 2 - 255 are reserved.
>>> +
>>> +             17:    granularity_bits
>>> +                    Granularity bits. Valid values: 0 - 63.
>>> +
>>> +                    Note: Qemu currently doesn't support
>>> granularity_bits
>>> +                    greater than 31.
>>> +
>>> +                    Granularity is calculated as
>>> +                        granularity = 1 << granularity_bits
>>> +
>>> +                    A bitmap's granularity is how many bytes of the
>>> image
>>> +                    accounts for one bit of the bitmap.
>>> +
>>> +        18 - 19:    name_size
>>> +                    Size of the bitmap name. Must be non-zero.
>>> +
>>> +                    Note: Qemu currently doesn't support values
>>> greater than
>>> +                    1023.
>>> +
>>> +        20 - 23:    extra_data_size
>>> +                    Size of type-specific extra data.
>>> +
>>> +                    For now, as no extra data is defined,
>>> extra_data_size is
>>> +                    reserved and must be zero.
>>> +
>>> +        variable:   Type-specific extra data for the bitmap.
>> We talked about this in the other subthread.
>>
>>> +        variable:   The name of the bitmap (not null terminated).
>>> Must be
>>> +                    unique among all bitmap names within the bitmaps
>>> extension.
>>> +
>>> +        variable:   Padding to round up the bitmap directory entry
>>> size to the
>>> +                    next multiple of 8.
>>> +
>>> +=== Bitmap table ===
>>> +
>>> +Bitmaps are stored using a one-level structure (as opposed to two-level
>>> +structure like for refcounts and guest clusters mapping) for the
>>> mapping of
>>> +bitmap data to host clusters. This structure is called the bitmap
>>> table.
>>> +
>>> +Each bitmap table has a variable size (stored in the bitmap
>>> directory Entry)
>>> +and may use multiple clusters, however, it must be contiguous in the
>>> image
>>> +file.
>>> +
>>> +Structure of a bitmap table entry:
>>> +
>>> +    Bit       0:    Reserved and must be zero if bits 9 - 55 are
>>> non-zero.
>>> +                    If bits 9 - 55 are zero:
>>> +                      0: Cluster should be read as all zeros.
>>> +                      1: Cluster should be read as all ones.
>>> +
>>> +         1 -  8:    Reserved and must be zero.
>>> +
>>> +         9 - 55:    Bits 9 - 55 of the host cluster offset. Must be
>>> aligned to
>>> +                    a cluster boundary. If the offset is 0, the
>>> cluster is
>>> +                    unallocated; in that case, bit 0 determines how
>>> this
>>> +                    cluster should be treated when read from.
>>> +
>>> +        56 - 63:    Reserved and must be zero.
>>> +
>>> +=== Bitmap data ===
>>> +
>>> +As noted above, bitmap data is stored in separate clusters,
>>> described by the
>>> +bitmap table. Given an offset (in bytes) into the bitmap data, the
>>> offset into
>>> +the image file can be obtained as follows:
>>> +
>>> +    image_offset =
>>> +        bitmap_table[bitmap_data_offset / cluster_size] +
>>> +            (bitmap_data_offset % cluster_size)
>>> +
>>> +This offset is not defined if bits 9 - 55 of bitmap table entry are
>>> zero (see
>>> +above).
>>> +
>>> +Given an offset byte_nr into the virtual disk and the bitmap's
>>> granularity, the
>>> +bit offset into the bitmap can be calculated like this:
>>> +
>>> +    bit_offset =
>>> +        image_offset(byte_nr / granularity / 8) * 8 +
>>> +            (byte_nr / granularity) % 8
>>> +
>>> +If the size of the bitmap data is not a multiply of cluster size
>>> then the last
>>> +cluster of the bitmap data contains some unused tail bits. These
>>> bits must be
>>> +zero.
>> In which order are the bits stored in the bitmap?
> 
> What do you mean?

He means BE or LE. You can intuit it from the formula, but it's not
explicitly stated, I think.

> 
>>
>> Kevin
> 
>
Vladimir Sementsov-Ogievskiy Jan. 21, 2016, 8:22 a.m. UTC | #15
On 21.01.2016 00:22, John Snow wrote:
>
> On 01/20/2016 07:34 AM, Vladimir Sementsov-Ogievskiy wrote:
>> On 19.01.2016 20:48, Kevin Wolf wrote:
>>> Am 11.01.2016 um 14:05 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>> The new feature for qcow2: storing bitmaps.
>>>>
>>>> This patch adds new header extension to qcow2 - Bitmaps Extension. It
>>>> provides an ability to store virtual disk related bitmaps in a qcow2
>>>> image. For now there is only one type of such bitmaps: Dirty Tracking
>>>> Bitmap, which just tracks virtual disk changes from some moment.
>>>>
>>>> Note: Only bitmaps, relative to the virtual disk, stored in qcow2 file,
>>>> should be stored in this qcow2 file. The size of each bitmap
>>>> (considering its granularity) is equal to virtual disk size.
>>>>
>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>> ---
>>>>
>>>> v7:
>>>>
>>>> - Rewordings, grammar.
>>>>     Max, Eric, John, thank you very much.
>>>>
>>>> - add last paragraph: remaining bits in bitmap data clusters must be
>>>>     zero.
>>>>
>>>> - s/Bitmap Directory/bitmap directory/ and other names like this at
>>>>     the request of Max.
>>>>
>>>> v6:
>>>>
>>>> - reword bitmap_directory_size description
>>>> - bitmap type: make 0 reserved
>>>> - extra_data_size: resize to 4bytes
>>>>     Also, I've marked this field as "must be zero". We can always change
>>>>     it, if we decide allowing managing app to specify any extra data, by
>>>>     defining some magic value as a top of user extra data.. So, for now
>>>>     non zeor extra_data_size should be considered as an error.
>>>> - swap name and extra_data to give good alignment to extra_data.
>>>>
>>>>
>>>> v5:
>>>>
>>>> - 'Dirty bitmaps' renamed to 'Bitmaps', as we may have several types of
>>>>     bitmaps.
>>>> - rewordings
>>>> - move upper bounds to "Notes about Qemu limits"
>>>> - s/should/must somewhere. (but not everywhere)
>>>> - move name_size field closer to name itself in bitmap header
>>>> - add extra data area to bitmap header
>>>> - move bitmap data description to separate section
>>>>
>>>>    docs/specs/qcow2.txt | 172
>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++-
>>>>    1 file changed, 171 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
>>>> index 121dfc8..997239d 100644
>>>> --- a/docs/specs/qcow2.txt
>>>> +++ b/docs/specs/qcow2.txt
>>>> @@ -103,7 +103,18 @@ in the description of a field.
>>>>                        write to an image with unknown auto-clear
>>>> features if it
>>>>                        clears the respective bits from this field first.
>>>>    -                    Bits 0-63:  Reserved (set to 0)
>>>> +                    Bit 0:      Bitmaps extension bit
>>>> +                                This bit indicates consistency for
>>>> the bitmaps
>>>> +                                extension data.
>>>> +
>>>> +                                It is an error if this bit is set
>>>> without the
>>>> +                                bitmaps extension present.
>>>> +
>>>> +                                If the bitmaps extension is present
>>>> but this
>>>> +                                bit is unset, the bitmaps extension
>>>> data is
>>>> +                                inconsistent.
>>> It may as well be consistent, but we don't know.
>>>
>>> Perhaps something like "must be considered inconsistent" or "is
>>> potentially inconsistent".
>>>
>>>> +
>>>> +                    Bits 1-63:  Reserved (set to 0)
>>>>               96 -  99:  refcount_order
>>>>                        Describes the width of a reference count block
>>>> entry (width
>>>> @@ -123,6 +134,7 @@ be stored. Each extension has a structure like
>>>> the following:
>>>>                            0x00000000 - End of the header extension area
>>>>                            0xE2792ACA - Backing file format name
>>>>                            0x6803f857 - Feature name table
>>>> +                        0x23852875 - Bitmaps extension
>>>>                            other      - Unknown header extension, can
>>>> be safely
>>>>                                         ignored
>>>>    @@ -166,6 +178,34 @@ the header extension data. Each entry look
>>>> like this:
>>>>                        terminated if it has full length)
>>>>      +== Bitmaps extension ==
>>>> +
>>>> +The bitmaps extension is an optional header extension. It provides
>>>> the ability
>>>> +to store bitmaps related to a virtual disk. For now, there is only
>>>> one bitmap
>>>> +type: the dirty tracking bitmap, which tracks virtual disk changes
>>>> from some
>>>> +point in time.
>>> I have one major problem with this patch, and it starts here.
>>>
>>> The spec talks about dirty tracking bitmaps all the way, but it never
>>> really defines what a dirty tracking bitmap even contains. It has a few
>>> hints here and there, but they aren't consistent.
>>>
>>> Here's the first hint: They track "virtual disk changes", which implies
>>> they track guest clusters rather than host clusters.
>>>
>>>> +The data of the extension should be considered consistent only if the
>>>> +corresponding auto-clear feature bit is set, see autoclear_features
>>>> above.
>>>> +
>>>> +The fields of the bitmaps extension are:
>>>> +
>>>> +          0 -  3:  nb_bitmaps
>>>> +                   The number of bitmaps contained in the image.
>>>> Must be
>>>> +                   greater than or equal to 1.
>>>> +
>>>> +                   Note: Qemu currently only supports up to 65535
>>>> bitmaps per
>>>> +                   image.
>>>> +
>>>> +          4 -  7:  bitmap_directory_size
>>>> +                   Size of the bitmap directory in bytes. It is the
>>>> cumulative
>>>> +                   size of all (nb_bitmaps) bitmap headers.
>>>> +
>>>> +          8 - 15:  bitmap_directory_offset
>>>> +                   Offset into the image file at which the bitmap
>>>> directory
>>>> +                   starts. Must be aligned to a cluster boundary.
>>>> +
>>>> +
>>>>    == Host cluster management ==
>>>>      qcow2 manages the allocation of host clusters by maintaining a
>>>> reference count
>>>> @@ -360,3 +400,133 @@ Snapshot table entry:
>>>>              variable:   Padding to round up the snapshot table entry
>>>> size to the
>>>>                        next multiple of 8.
>>>> +
>>>> +
>>>> +== Bitmaps ==
>>>> +
>>>> +As mentioned above, the bitmaps extension provides the ability to
>>>> store bitmaps
>>>> +related a virtual disk. This section describes how these bitmaps are
>>>> stored.
>>> s/related/related to/
>>>
>>>> +Note: all bitmaps are related to the virtual disk stored in this image.
>>>> +
>>>> +=== Bitmap directory ===
>>>> +
>>>> +Each bitmap saved in the image is described in a bitmap directory
>>>> entry. The
>>>> +bitmap directory is a contiguous area in the image file, whose
>>>> starting offset
>>>> +and length are given by the header extension fields
>>>> bitmap_directory_offset and
>>>> +bitmap_directory_size. The entries of the bitmap directory have
>>>> variable
>>>> +length, depending on the length of the bitmap name and extra data.
>>>> These
>>>> +entries are also called bitmap headers.
>>>> +
>>>> +Structure of a bitmap directory entry:
>>>> +
>>>> +    Byte 0 -  7:    bitmap_table_offset
>>>> +                    Offset into the image file at which the bitmap
>>>> table
>>>> +                    (described below) for the bitmap starts. Must be
>>>> aligned to
>>>> +                    a cluster boundary.
>>>> +
>>>> +         8 - 11:    bitmap_table_size
>>>> +                    Number of entries in the bitmap table of the
>>>> bitmap.
>>>> +
>>>> +        12 - 15:    flags
>>>> +                    Bit
>>>> +                      0: in_use
>>>> +                         The bitmap was not saved correctly and may be
>>>> +                         inconsistent.
>>>> +
>>>> +                      1: auto
>>>> +                         The bitmap must reflect all changes of the
>>>> virtual
>>>> +                         disk by any application that would write to
>>>> this qcow2
>>>> +                         file (including writes, snapshot switching,
>>>> etc.). The
>>>> +                         type of this bitmap must be 'dirty tracking
>>>> bitmap'.
>>> This suggests that we can represent snapshot switching in a dirty
>>> tracking bitmap. Which is almost impossible if we track guest clusters,
>>> except if loading a snapshot should mean that all bits in the bitmap are
>>> set. However, that feels a bit useless and dirty tracking across
>>> snapshots doesn't seem to make a whole lot of sense anyway.
>>>
>>> Maybe what would make more sense is that a bitmap is tied to a specific
>>> snapshot or bitmaps are included in a snapshot, so that you can revert
>>> to the dirty status as it was when the snapshot was taken.
>>>
>>> Or, if you really meant, that the tracking words on a host cluster
>>> level, what clusters would be included in the bitmap? Would only the
>>> virtual disk be included? VM state? Any metadata?
>>>
>>> It seems we definitely need a new section on dirty tracking bitmaps that
>>> describes what the bitmap actually means and how it's supposed to work
>>> with snapshots. I guess we could also talk about how it works with other
>>> changes to the image like resizing.
>> Bitmaps track guest clusters.
>> Backup is not related to snapshots, so, I think set all bits in the
>> dirty bitmap on snapshot switch is ok. Of course we can't just ignore
>> snapshot switch, as it changes how user (and backup) sees the virtual
>> disk. Deleting the bitmap on snapshot switch is not good option too I
>> think.
>>
>> Bitmap size is defined to be equal to virtual disk size. So on resize,
>> the one who resize must resize the bitmap too, to maintain accordance of
>> the image file and the spec.
>>
>> I'll think about such section, ok.
>>
>>>> +                    Bits 2 - 31 are reserved and must be 0.
>>>> +
>>>> +             16:    type
>>>> +                    This field describes the sort of the bitmap.
>>>> +                    Values:
>>>> +                      1: Dirty tracking bitmap
>>>> +
>>>> +                    Values 0, 2 - 255 are reserved.
>>>> +
>>>> +             17:    granularity_bits
>>>> +                    Granularity bits. Valid values: 0 - 63.
>>>> +
>>>> +                    Note: Qemu currently doesn't support
>>>> granularity_bits
>>>> +                    greater than 31.
>>>> +
>>>> +                    Granularity is calculated as
>>>> +                        granularity = 1 << granularity_bits
>>>> +
>>>> +                    A bitmap's granularity is how many bytes of the
>>>> image
>>>> +                    accounts for one bit of the bitmap.
>>>> +
>>>> +        18 - 19:    name_size
>>>> +                    Size of the bitmap name. Must be non-zero.
>>>> +
>>>> +                    Note: Qemu currently doesn't support values
>>>> greater than
>>>> +                    1023.
>>>> +
>>>> +        20 - 23:    extra_data_size
>>>> +                    Size of type-specific extra data.
>>>> +
>>>> +                    For now, as no extra data is defined,
>>>> extra_data_size is
>>>> +                    reserved and must be zero.
>>>> +
>>>> +        variable:   Type-specific extra data for the bitmap.
>>> We talked about this in the other subthread.
>>>
>>>> +        variable:   The name of the bitmap (not null terminated).
>>>> Must be
>>>> +                    unique among all bitmap names within the bitmaps
>>>> extension.
>>>> +
>>>> +        variable:   Padding to round up the bitmap directory entry
>>>> size to the
>>>> +                    next multiple of 8.
>>>> +
>>>> +=== Bitmap table ===
>>>> +
>>>> +Bitmaps are stored using a one-level structure (as opposed to two-level
>>>> +structure like for refcounts and guest clusters mapping) for the
>>>> mapping of
>>>> +bitmap data to host clusters. This structure is called the bitmap
>>>> table.
>>>> +
>>>> +Each bitmap table has a variable size (stored in the bitmap
>>>> directory Entry)
>>>> +and may use multiple clusters, however, it must be contiguous in the
>>>> image
>>>> +file.
>>>> +
>>>> +Structure of a bitmap table entry:
>>>> +
>>>> +    Bit       0:    Reserved and must be zero if bits 9 - 55 are
>>>> non-zero.
>>>> +                    If bits 9 - 55 are zero:
>>>> +                      0: Cluster should be read as all zeros.
>>>> +                      1: Cluster should be read as all ones.
>>>> +
>>>> +         1 -  8:    Reserved and must be zero.
>>>> +
>>>> +         9 - 55:    Bits 9 - 55 of the host cluster offset. Must be
>>>> aligned to
>>>> +                    a cluster boundary. If the offset is 0, the
>>>> cluster is
>>>> +                    unallocated; in that case, bit 0 determines how
>>>> this
>>>> +                    cluster should be treated when read from.
>>>> +
>>>> +        56 - 63:    Reserved and must be zero.
>>>> +
>>>> +=== Bitmap data ===
>>>> +
>>>> +As noted above, bitmap data is stored in separate clusters,
>>>> described by the
>>>> +bitmap table. Given an offset (in bytes) into the bitmap data, the
>>>> offset into
>>>> +the image file can be obtained as follows:
>>>> +
>>>> +    image_offset =
>>>> +        bitmap_table[bitmap_data_offset / cluster_size] +
>>>> +            (bitmap_data_offset % cluster_size)
>>>> +
>>>> +This offset is not defined if bits 9 - 55 of bitmap table entry are
>>>> zero (see
>>>> +above).
>>>> +
>>>> +Given an offset byte_nr into the virtual disk and the bitmap's
>>>> granularity, the
>>>> +bit offset into the bitmap can be calculated like this:
>>>> +
>>>> +    bit_offset =
>>>> +        image_offset(byte_nr / granularity / 8) * 8 +
>>>> +            (byte_nr / granularity) % 8
>>>> +
>>>> +If the size of the bitmap data is not a multiply of cluster size
>>>> then the last
>>>> +cluster of the bitmap data contains some unused tail bits. These
>>>> bits must be
>>>> +zero.
>>> In which order are the bits stored in the bitmap?
>> What do you mean?
> He means BE or LE. You can intuit it from the formula, but it's not
> explicitly stated, I think

byte ordering? But it make sense when we deal with integers, which 
occupies more then 1 byte. Here is plain byte sequence. Bitmap. Like a 
string.. What should I write here additionally?

>>> Kevin
>>
Kevin Wolf Jan. 21, 2016, 9:53 a.m. UTC | #16
Am 21.01.2016 um 09:22 hat Vladimir Sementsov-Ogievskiy geschrieben:
> On 21.01.2016 00:22, John Snow wrote:
> >
> >On 01/20/2016 07:34 AM, Vladimir Sementsov-Ogievskiy wrote:
> >>On 19.01.2016 20:48, Kevin Wolf wrote:
> >>>Am 11.01.2016 um 14:05 hat Vladimir Sementsov-Ogievskiy geschrieben:
> >>>>The new feature for qcow2: storing bitmaps.
> >>>>
> >>>>This patch adds new header extension to qcow2 - Bitmaps Extension. It
> >>>>provides an ability to store virtual disk related bitmaps in a qcow2
> >>>>image. For now there is only one type of such bitmaps: Dirty Tracking
> >>>>Bitmap, which just tracks virtual disk changes from some moment.
> >>>>
> >>>>Note: Only bitmaps, relative to the virtual disk, stored in qcow2 file,
> >>>>should be stored in this qcow2 file. The size of each bitmap
> >>>>(considering its granularity) is equal to virtual disk size.
> >>>>
> >>>>Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> >>>>---
> >>>>
> >>>>v7:
> >>>>
> >>>>- Rewordings, grammar.
> >>>>    Max, Eric, John, thank you very much.
> >>>>
> >>>>- add last paragraph: remaining bits in bitmap data clusters must be
> >>>>    zero.
> >>>>
> >>>>- s/Bitmap Directory/bitmap directory/ and other names like this at
> >>>>    the request of Max.
> >>>>
> >>>>v6:
> >>>>
> >>>>- reword bitmap_directory_size description
> >>>>- bitmap type: make 0 reserved
> >>>>- extra_data_size: resize to 4bytes
> >>>>    Also, I've marked this field as "must be zero". We can always change
> >>>>    it, if we decide allowing managing app to specify any extra data, by
> >>>>    defining some magic value as a top of user extra data.. So, for now
> >>>>    non zeor extra_data_size should be considered as an error.
> >>>>- swap name and extra_data to give good alignment to extra_data.
> >>>>
> >>>>
> >>>>v5:
> >>>>
> >>>>- 'Dirty bitmaps' renamed to 'Bitmaps', as we may have several types of
> >>>>    bitmaps.
> >>>>- rewordings
> >>>>- move upper bounds to "Notes about Qemu limits"
> >>>>- s/should/must somewhere. (but not everywhere)
> >>>>- move name_size field closer to name itself in bitmap header
> >>>>- add extra data area to bitmap header
> >>>>- move bitmap data description to separate section
> >>>>
> >>>>   docs/specs/qcow2.txt | 172
> >>>>++++++++++++++++++++++++++++++++++++++++++++++++++-
> >>>>   1 file changed, 171 insertions(+), 1 deletion(-)
> >>>>
> >>>>diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
> >>>>index 121dfc8..997239d 100644
> >>>>--- a/docs/specs/qcow2.txt
> >>>>+++ b/docs/specs/qcow2.txt
> >>>>@@ -103,7 +103,18 @@ in the description of a field.
> >>>>                       write to an image with unknown auto-clear
> >>>>features if it
> >>>>                       clears the respective bits from this field first.
> >>>>   -                    Bits 0-63:  Reserved (set to 0)
> >>>>+                    Bit 0:      Bitmaps extension bit
> >>>>+                                This bit indicates consistency for
> >>>>the bitmaps
> >>>>+                                extension data.
> >>>>+
> >>>>+                                It is an error if this bit is set
> >>>>without the
> >>>>+                                bitmaps extension present.
> >>>>+
> >>>>+                                If the bitmaps extension is present
> >>>>but this
> >>>>+                                bit is unset, the bitmaps extension
> >>>>data is
> >>>>+                                inconsistent.
> >>>It may as well be consistent, but we don't know.
> >>>
> >>>Perhaps something like "must be considered inconsistent" or "is
> >>>potentially inconsistent".
> >>>
> >>>>+
> >>>>+                    Bits 1-63:  Reserved (set to 0)
> >>>>              96 -  99:  refcount_order
> >>>>                       Describes the width of a reference count block
> >>>>entry (width
> >>>>@@ -123,6 +134,7 @@ be stored. Each extension has a structure like
> >>>>the following:
> >>>>                           0x00000000 - End of the header extension area
> >>>>                           0xE2792ACA - Backing file format name
> >>>>                           0x6803f857 - Feature name table
> >>>>+                        0x23852875 - Bitmaps extension
> >>>>                           other      - Unknown header extension, can
> >>>>be safely
> >>>>                                        ignored
> >>>>   @@ -166,6 +178,34 @@ the header extension data. Each entry look
> >>>>like this:
> >>>>                       terminated if it has full length)
> >>>>     +== Bitmaps extension ==
> >>>>+
> >>>>+The bitmaps extension is an optional header extension. It provides
> >>>>the ability
> >>>>+to store bitmaps related to a virtual disk. For now, there is only
> >>>>one bitmap
> >>>>+type: the dirty tracking bitmap, which tracks virtual disk changes
> >>>>from some
> >>>>+point in time.
> >>>I have one major problem with this patch, and it starts here.
> >>>
> >>>The spec talks about dirty tracking bitmaps all the way, but it never
> >>>really defines what a dirty tracking bitmap even contains. It has a few
> >>>hints here and there, but they aren't consistent.
> >>>
> >>>Here's the first hint: They track "virtual disk changes", which implies
> >>>they track guest clusters rather than host clusters.
> >>>
> >>>>+The data of the extension should be considered consistent only if the
> >>>>+corresponding auto-clear feature bit is set, see autoclear_features
> >>>>above.
> >>>>+
> >>>>+The fields of the bitmaps extension are:
> >>>>+
> >>>>+          0 -  3:  nb_bitmaps
> >>>>+                   The number of bitmaps contained in the image.
> >>>>Must be
> >>>>+                   greater than or equal to 1.
> >>>>+
> >>>>+                   Note: Qemu currently only supports up to 65535
> >>>>bitmaps per
> >>>>+                   image.
> >>>>+
> >>>>+          4 -  7:  bitmap_directory_size
> >>>>+                   Size of the bitmap directory in bytes. It is the
> >>>>cumulative
> >>>>+                   size of all (nb_bitmaps) bitmap headers.
> >>>>+
> >>>>+          8 - 15:  bitmap_directory_offset
> >>>>+                   Offset into the image file at which the bitmap
> >>>>directory
> >>>>+                   starts. Must be aligned to a cluster boundary.
> >>>>+
> >>>>+
> >>>>   == Host cluster management ==
> >>>>     qcow2 manages the allocation of host clusters by maintaining a
> >>>>reference count
> >>>>@@ -360,3 +400,133 @@ Snapshot table entry:
> >>>>             variable:   Padding to round up the snapshot table entry
> >>>>size to the
> >>>>                       next multiple of 8.
> >>>>+
> >>>>+
> >>>>+== Bitmaps ==
> >>>>+
> >>>>+As mentioned above, the bitmaps extension provides the ability to
> >>>>store bitmaps
> >>>>+related a virtual disk. This section describes how these bitmaps are
> >>>>stored.
> >>>s/related/related to/
> >>>
> >>>>+Note: all bitmaps are related to the virtual disk stored in this image.
> >>>>+
> >>>>+=== Bitmap directory ===
> >>>>+
> >>>>+Each bitmap saved in the image is described in a bitmap directory
> >>>>entry. The
> >>>>+bitmap directory is a contiguous area in the image file, whose
> >>>>starting offset
> >>>>+and length are given by the header extension fields
> >>>>bitmap_directory_offset and
> >>>>+bitmap_directory_size. The entries of the bitmap directory have
> >>>>variable
> >>>>+length, depending on the length of the bitmap name and extra data.
> >>>>These
> >>>>+entries are also called bitmap headers.
> >>>>+
> >>>>+Structure of a bitmap directory entry:
> >>>>+
> >>>>+    Byte 0 -  7:    bitmap_table_offset
> >>>>+                    Offset into the image file at which the bitmap
> >>>>table
> >>>>+                    (described below) for the bitmap starts. Must be
> >>>>aligned to
> >>>>+                    a cluster boundary.
> >>>>+
> >>>>+         8 - 11:    bitmap_table_size
> >>>>+                    Number of entries in the bitmap table of the
> >>>>bitmap.
> >>>>+
> >>>>+        12 - 15:    flags
> >>>>+                    Bit
> >>>>+                      0: in_use
> >>>>+                         The bitmap was not saved correctly and may be
> >>>>+                         inconsistent.
> >>>>+
> >>>>+                      1: auto
> >>>>+                         The bitmap must reflect all changes of the
> >>>>virtual
> >>>>+                         disk by any application that would write to
> >>>>this qcow2
> >>>>+                         file (including writes, snapshot switching,
> >>>>etc.). The
> >>>>+                         type of this bitmap must be 'dirty tracking
> >>>>bitmap'.
> >>>This suggests that we can represent snapshot switching in a dirty
> >>>tracking bitmap. Which is almost impossible if we track guest clusters,
> >>>except if loading a snapshot should mean that all bits in the bitmap are
> >>>set. However, that feels a bit useless and dirty tracking across
> >>>snapshots doesn't seem to make a whole lot of sense anyway.
> >>>
> >>>Maybe what would make more sense is that a bitmap is tied to a specific
> >>>snapshot or bitmaps are included in a snapshot, so that you can revert
> >>>to the dirty status as it was when the snapshot was taken.
> >>>
> >>>Or, if you really meant, that the tracking words on a host cluster
> >>>level, what clusters would be included in the bitmap? Would only the
> >>>virtual disk be included? VM state? Any metadata?
> >>>
> >>>It seems we definitely need a new section on dirty tracking bitmaps that
> >>>describes what the bitmap actually means and how it's supposed to work
> >>>with snapshots. I guess we could also talk about how it works with other
> >>>changes to the image like resizing.
> >>Bitmaps track guest clusters.
> >>Backup is not related to snapshots, so, I think set all bits in the
> >>dirty bitmap on snapshot switch is ok. Of course we can't just ignore
> >>snapshot switch, as it changes how user (and backup) sees the virtual
> >>disk. Deleting the bitmap on snapshot switch is not good option too I
> >>think.
> >>
> >>Bitmap size is defined to be equal to virtual disk size. So on resize,
> >>the one who resize must resize the bitmap too, to maintain accordance of
> >>the image file and the spec.
> >>
> >>I'll think about such section, ok.
> >>
> >>>>+                    Bits 2 - 31 are reserved and must be 0.
> >>>>+
> >>>>+             16:    type
> >>>>+                    This field describes the sort of the bitmap.
> >>>>+                    Values:
> >>>>+                      1: Dirty tracking bitmap
> >>>>+
> >>>>+                    Values 0, 2 - 255 are reserved.
> >>>>+
> >>>>+             17:    granularity_bits
> >>>>+                    Granularity bits. Valid values: 0 - 63.
> >>>>+
> >>>>+                    Note: Qemu currently doesn't support
> >>>>granularity_bits
> >>>>+                    greater than 31.
> >>>>+
> >>>>+                    Granularity is calculated as
> >>>>+                        granularity = 1 << granularity_bits
> >>>>+
> >>>>+                    A bitmap's granularity is how many bytes of the
> >>>>image
> >>>>+                    accounts for one bit of the bitmap.
> >>>>+
> >>>>+        18 - 19:    name_size
> >>>>+                    Size of the bitmap name. Must be non-zero.
> >>>>+
> >>>>+                    Note: Qemu currently doesn't support values
> >>>>greater than
> >>>>+                    1023.
> >>>>+
> >>>>+        20 - 23:    extra_data_size
> >>>>+                    Size of type-specific extra data.
> >>>>+
> >>>>+                    For now, as no extra data is defined,
> >>>>extra_data_size is
> >>>>+                    reserved and must be zero.
> >>>>+
> >>>>+        variable:   Type-specific extra data for the bitmap.
> >>>We talked about this in the other subthread.
> >>>
> >>>>+        variable:   The name of the bitmap (not null terminated).
> >>>>Must be
> >>>>+                    unique among all bitmap names within the bitmaps
> >>>>extension.
> >>>>+
> >>>>+        variable:   Padding to round up the bitmap directory entry
> >>>>size to the
> >>>>+                    next multiple of 8.
> >>>>+
> >>>>+=== Bitmap table ===
> >>>>+
> >>>>+Bitmaps are stored using a one-level structure (as opposed to two-level
> >>>>+structure like for refcounts and guest clusters mapping) for the
> >>>>mapping of
> >>>>+bitmap data to host clusters. This structure is called the bitmap
> >>>>table.
> >>>>+
> >>>>+Each bitmap table has a variable size (stored in the bitmap
> >>>>directory Entry)
> >>>>+and may use multiple clusters, however, it must be contiguous in the
> >>>>image
> >>>>+file.
> >>>>+
> >>>>+Structure of a bitmap table entry:
> >>>>+
> >>>>+    Bit       0:    Reserved and must be zero if bits 9 - 55 are
> >>>>non-zero.
> >>>>+                    If bits 9 - 55 are zero:
> >>>>+                      0: Cluster should be read as all zeros.
> >>>>+                      1: Cluster should be read as all ones.
> >>>>+
> >>>>+         1 -  8:    Reserved and must be zero.
> >>>>+
> >>>>+         9 - 55:    Bits 9 - 55 of the host cluster offset. Must be
> >>>>aligned to
> >>>>+                    a cluster boundary. If the offset is 0, the
> >>>>cluster is
> >>>>+                    unallocated; in that case, bit 0 determines how
> >>>>this
> >>>>+                    cluster should be treated when read from.
> >>>>+
> >>>>+        56 - 63:    Reserved and must be zero.
> >>>>+
> >>>>+=== Bitmap data ===
> >>>>+
> >>>>+As noted above, bitmap data is stored in separate clusters,
> >>>>described by the
> >>>>+bitmap table. Given an offset (in bytes) into the bitmap data, the
> >>>>offset into
> >>>>+the image file can be obtained as follows:
> >>>>+
> >>>>+    image_offset =
> >>>>+        bitmap_table[bitmap_data_offset / cluster_size] +
> >>>>+            (bitmap_data_offset % cluster_size)
> >>>>+
> >>>>+This offset is not defined if bits 9 - 55 of bitmap table entry are
> >>>>zero (see
> >>>>+above).
> >>>>+
> >>>>+Given an offset byte_nr into the virtual disk and the bitmap's
> >>>>granularity, the
> >>>>+bit offset into the bitmap can be calculated like this:
> >>>>+
> >>>>+    bit_offset =
> >>>>+        image_offset(byte_nr / granularity / 8) * 8 +
> >>>>+            (byte_nr / granularity) % 8
> >>>>+
> >>>>+If the size of the bitmap data is not a multiply of cluster size
> >>>>then the last
> >>>>+cluster of the bitmap data contains some unused tail bits. These
> >>>>bits must be
> >>>>+zero.
> >>>In which order are the bits stored in the bitmap?
> >>What do you mean?
> >He means BE or LE. You can intuit it from the formula, but it's not
> >explicitly stated, I think
> 
> byte ordering? But it make sense when we deal with integers, which
> occupies more then 1 byte. Here is plain byte sequence. Bitmap. Like
> a string.. What should I write here additionally?

I meant sub-byte ordering. Whether bit 0 is the LSB or the MSB. I assume
you intend to use the LSB for bit 0, but it's not written anywhere.

Actually, that's an assumption that is made for bitmaps in the whole
spec (including e.g. feature flags), even though it is only explicitly
spelt out for the refcount blocks. I guess we should move it to the
'General' section.

Kevin
Vladimir Sementsov-Ogievskiy Jan. 21, 2016, 10:44 a.m. UTC | #17
On 21.01.2016 12:53, Kevin Wolf wrote:
> Am 21.01.2016 um 09:22 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> On 21.01.2016 00:22, John Snow wrote:
>>> On 01/20/2016 07:34 AM, Vladimir Sementsov-Ogievskiy wrote:
>>>> On 19.01.2016 20:48, Kevin Wolf wrote:
>>>>> Am 11.01.2016 um 14:05 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>>>> The new feature for qcow2: storing bitmaps.
>>>>>>
>>>>>> This patch adds new header extension to qcow2 - Bitmaps Extension. It
>>>>>> provides an ability to store virtual disk related bitmaps in a qcow2
>>>>>> image. For now there is only one type of such bitmaps: Dirty Tracking
>>>>>> Bitmap, which just tracks virtual disk changes from some moment.
>>>>>>
>>>>>> Note: Only bitmaps, relative to the virtual disk, stored in qcow2 file,
>>>>>> should be stored in this qcow2 file. The size of each bitmap
>>>>>> (considering its granularity) is equal to virtual disk size.
>>>>>>
>>>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>>>> ---
>>>>>>
>>>>>> v7:
>>>>>>
>>>>>> - Rewordings, grammar.
>>>>>>     Max, Eric, John, thank you very much.
>>>>>>
>>>>>> - add last paragraph: remaining bits in bitmap data clusters must be
>>>>>>     zero.
>>>>>>
>>>>>> - s/Bitmap Directory/bitmap directory/ and other names like this at
>>>>>>     the request of Max.
>>>>>>
>>>>>> v6:
>>>>>>
>>>>>> - reword bitmap_directory_size description
>>>>>> - bitmap type: make 0 reserved
>>>>>> - extra_data_size: resize to 4bytes
>>>>>>     Also, I've marked this field as "must be zero". We can always change
>>>>>>     it, if we decide allowing managing app to specify any extra data, by
>>>>>>     defining some magic value as a top of user extra data.. So, for now
>>>>>>     non zeor extra_data_size should be considered as an error.
>>>>>> - swap name and extra_data to give good alignment to extra_data.
>>>>>>
>>>>>>
>>>>>> v5:
>>>>>>
>>>>>> - 'Dirty bitmaps' renamed to 'Bitmaps', as we may have several types of
>>>>>>     bitmaps.
>>>>>> - rewordings
>>>>>> - move upper bounds to "Notes about Qemu limits"
>>>>>> - s/should/must somewhere. (but not everywhere)
>>>>>> - move name_size field closer to name itself in bitmap header
>>>>>> - add extra data area to bitmap header
>>>>>> - move bitmap data description to separate section
>>>>>>
>>>>>>    docs/specs/qcow2.txt | 172
>>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++-
>>>>>>    1 file changed, 171 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
>>>>>> index 121dfc8..997239d 100644
>>>>>> --- a/docs/specs/qcow2.txt
>>>>>> +++ b/docs/specs/qcow2.txt
>>>>>> @@ -103,7 +103,18 @@ in the description of a field.
>>>>>>                        write to an image with unknown auto-clear
>>>>>> features if it
>>>>>>                        clears the respective bits from this field first.
>>>>>>    -                    Bits 0-63:  Reserved (set to 0)
>>>>>> +                    Bit 0:      Bitmaps extension bit
>>>>>> +                                This bit indicates consistency for
>>>>>> the bitmaps
>>>>>> +                                extension data.
>>>>>> +
>>>>>> +                                It is an error if this bit is set
>>>>>> without the
>>>>>> +                                bitmaps extension present.
>>>>>> +
>>>>>> +                                If the bitmaps extension is present
>>>>>> but this
>>>>>> +                                bit is unset, the bitmaps extension
>>>>>> data is
>>>>>> +                                inconsistent.
>>>>> It may as well be consistent, but we don't know.
>>>>>
>>>>> Perhaps something like "must be considered inconsistent" or "is
>>>>> potentially inconsistent".
>>>>>
>>>>>> +
>>>>>> +                    Bits 1-63:  Reserved (set to 0)
>>>>>>               96 -  99:  refcount_order
>>>>>>                        Describes the width of a reference count block
>>>>>> entry (width
>>>>>> @@ -123,6 +134,7 @@ be stored. Each extension has a structure like
>>>>>> the following:
>>>>>>                            0x00000000 - End of the header extension area
>>>>>>                            0xE2792ACA - Backing file format name
>>>>>>                            0x6803f857 - Feature name table
>>>>>> +                        0x23852875 - Bitmaps extension
>>>>>>                            other      - Unknown header extension, can
>>>>>> be safely
>>>>>>                                         ignored
>>>>>>    @@ -166,6 +178,34 @@ the header extension data. Each entry look
>>>>>> like this:
>>>>>>                        terminated if it has full length)
>>>>>>      +== Bitmaps extension ==
>>>>>> +
>>>>>> +The bitmaps extension is an optional header extension. It provides
>>>>>> the ability
>>>>>> +to store bitmaps related to a virtual disk. For now, there is only
>>>>>> one bitmap
>>>>>> +type: the dirty tracking bitmap, which tracks virtual disk changes
>>>>> >from some
>>>>>> +point in time.
>>>>> I have one major problem with this patch, and it starts here.
>>>>>
>>>>> The spec talks about dirty tracking bitmaps all the way, but it never
>>>>> really defines what a dirty tracking bitmap even contains. It has a few
>>>>> hints here and there, but they aren't consistent.
>>>>>
>>>>> Here's the first hint: They track "virtual disk changes", which implies
>>>>> they track guest clusters rather than host clusters.
>>>>>
>>>>>> +The data of the extension should be considered consistent only if the
>>>>>> +corresponding auto-clear feature bit is set, see autoclear_features
>>>>>> above.
>>>>>> +
>>>>>> +The fields of the bitmaps extension are:
>>>>>> +
>>>>>> +          0 -  3:  nb_bitmaps
>>>>>> +                   The number of bitmaps contained in the image.
>>>>>> Must be
>>>>>> +                   greater than or equal to 1.
>>>>>> +
>>>>>> +                   Note: Qemu currently only supports up to 65535
>>>>>> bitmaps per
>>>>>> +                   image.
>>>>>> +
>>>>>> +          4 -  7:  bitmap_directory_size
>>>>>> +                   Size of the bitmap directory in bytes. It is the
>>>>>> cumulative
>>>>>> +                   size of all (nb_bitmaps) bitmap headers.
>>>>>> +
>>>>>> +          8 - 15:  bitmap_directory_offset
>>>>>> +                   Offset into the image file at which the bitmap
>>>>>> directory
>>>>>> +                   starts. Must be aligned to a cluster boundary.
>>>>>> +
>>>>>> +
>>>>>>    == Host cluster management ==
>>>>>>      qcow2 manages the allocation of host clusters by maintaining a
>>>>>> reference count
>>>>>> @@ -360,3 +400,133 @@ Snapshot table entry:
>>>>>>              variable:   Padding to round up the snapshot table entry
>>>>>> size to the
>>>>>>                        next multiple of 8.
>>>>>> +
>>>>>> +
>>>>>> +== Bitmaps ==
>>>>>> +
>>>>>> +As mentioned above, the bitmaps extension provides the ability to
>>>>>> store bitmaps
>>>>>> +related a virtual disk. This section describes how these bitmaps are
>>>>>> stored.
>>>>> s/related/related to/
>>>>>
>>>>>> +Note: all bitmaps are related to the virtual disk stored in this image.
>>>>>> +
>>>>>> +=== Bitmap directory ===
>>>>>> +
>>>>>> +Each bitmap saved in the image is described in a bitmap directory
>>>>>> entry. The
>>>>>> +bitmap directory is a contiguous area in the image file, whose
>>>>>> starting offset
>>>>>> +and length are given by the header extension fields
>>>>>> bitmap_directory_offset and
>>>>>> +bitmap_directory_size. The entries of the bitmap directory have
>>>>>> variable
>>>>>> +length, depending on the length of the bitmap name and extra data.
>>>>>> These
>>>>>> +entries are also called bitmap headers.
>>>>>> +
>>>>>> +Structure of a bitmap directory entry:
>>>>>> +
>>>>>> +    Byte 0 -  7:    bitmap_table_offset
>>>>>> +                    Offset into the image file at which the bitmap
>>>>>> table
>>>>>> +                    (described below) for the bitmap starts. Must be
>>>>>> aligned to
>>>>>> +                    a cluster boundary.
>>>>>> +
>>>>>> +         8 - 11:    bitmap_table_size
>>>>>> +                    Number of entries in the bitmap table of the
>>>>>> bitmap.
>>>>>> +
>>>>>> +        12 - 15:    flags
>>>>>> +                    Bit
>>>>>> +                      0: in_use
>>>>>> +                         The bitmap was not saved correctly and may be
>>>>>> +                         inconsistent.
>>>>>> +
>>>>>> +                      1: auto
>>>>>> +                         The bitmap must reflect all changes of the
>>>>>> virtual
>>>>>> +                         disk by any application that would write to
>>>>>> this qcow2
>>>>>> +                         file (including writes, snapshot switching,
>>>>>> etc.). The
>>>>>> +                         type of this bitmap must be 'dirty tracking
>>>>>> bitmap'.
>>>>> This suggests that we can represent snapshot switching in a dirty
>>>>> tracking bitmap. Which is almost impossible if we track guest clusters,
>>>>> except if loading a snapshot should mean that all bits in the bitmap are
>>>>> set. However, that feels a bit useless and dirty tracking across
>>>>> snapshots doesn't seem to make a whole lot of sense anyway.
>>>>>
>>>>> Maybe what would make more sense is that a bitmap is tied to a specific
>>>>> snapshot or bitmaps are included in a snapshot, so that you can revert
>>>>> to the dirty status as it was when the snapshot was taken.
>>>>>
>>>>> Or, if you really meant, that the tracking words on a host cluster
>>>>> level, what clusters would be included in the bitmap? Would only the
>>>>> virtual disk be included? VM state? Any metadata?
>>>>>
>>>>> It seems we definitely need a new section on dirty tracking bitmaps that
>>>>> describes what the bitmap actually means and how it's supposed to work
>>>>> with snapshots. I guess we could also talk about how it works with other
>>>>> changes to the image like resizing.
>>>> Bitmaps track guest clusters.
>>>> Backup is not related to snapshots, so, I think set all bits in the
>>>> dirty bitmap on snapshot switch is ok. Of course we can't just ignore
>>>> snapshot switch, as it changes how user (and backup) sees the virtual
>>>> disk. Deleting the bitmap on snapshot switch is not good option too I
>>>> think.
>>>>
>>>> Bitmap size is defined to be equal to virtual disk size. So on resize,
>>>> the one who resize must resize the bitmap too, to maintain accordance of
>>>> the image file and the spec.
>>>>
>>>> I'll think about such section, ok.
>>>>
>>>>>> +                    Bits 2 - 31 are reserved and must be 0.
>>>>>> +
>>>>>> +             16:    type
>>>>>> +                    This field describes the sort of the bitmap.
>>>>>> +                    Values:
>>>>>> +                      1: Dirty tracking bitmap
>>>>>> +
>>>>>> +                    Values 0, 2 - 255 are reserved.
>>>>>> +
>>>>>> +             17:    granularity_bits
>>>>>> +                    Granularity bits. Valid values: 0 - 63.
>>>>>> +
>>>>>> +                    Note: Qemu currently doesn't support
>>>>>> granularity_bits
>>>>>> +                    greater than 31.
>>>>>> +
>>>>>> +                    Granularity is calculated as
>>>>>> +                        granularity = 1 << granularity_bits
>>>>>> +
>>>>>> +                    A bitmap's granularity is how many bytes of the
>>>>>> image
>>>>>> +                    accounts for one bit of the bitmap.
>>>>>> +
>>>>>> +        18 - 19:    name_size
>>>>>> +                    Size of the bitmap name. Must be non-zero.
>>>>>> +
>>>>>> +                    Note: Qemu currently doesn't support values
>>>>>> greater than
>>>>>> +                    1023.
>>>>>> +
>>>>>> +        20 - 23:    extra_data_size
>>>>>> +                    Size of type-specific extra data.
>>>>>> +
>>>>>> +                    For now, as no extra data is defined,
>>>>>> extra_data_size is
>>>>>> +                    reserved and must be zero.
>>>>>> +
>>>>>> +        variable:   Type-specific extra data for the bitmap.
>>>>> We talked about this in the other subthread.
>>>>>
>>>>>> +        variable:   The name of the bitmap (not null terminated).
>>>>>> Must be
>>>>>> +                    unique among all bitmap names within the bitmaps
>>>>>> extension.
>>>>>> +
>>>>>> +        variable:   Padding to round up the bitmap directory entry
>>>>>> size to the
>>>>>> +                    next multiple of 8.
>>>>>> +
>>>>>> +=== Bitmap table ===
>>>>>> +
>>>>>> +Bitmaps are stored using a one-level structure (as opposed to two-level
>>>>>> +structure like for refcounts and guest clusters mapping) for the
>>>>>> mapping of
>>>>>> +bitmap data to host clusters. This structure is called the bitmap
>>>>>> table.
>>>>>> +
>>>>>> +Each bitmap table has a variable size (stored in the bitmap
>>>>>> directory Entry)
>>>>>> +and may use multiple clusters, however, it must be contiguous in the
>>>>>> image
>>>>>> +file.
>>>>>> +
>>>>>> +Structure of a bitmap table entry:
>>>>>> +
>>>>>> +    Bit       0:    Reserved and must be zero if bits 9 - 55 are
>>>>>> non-zero.
>>>>>> +                    If bits 9 - 55 are zero:
>>>>>> +                      0: Cluster should be read as all zeros.
>>>>>> +                      1: Cluster should be read as all ones.
>>>>>> +
>>>>>> +         1 -  8:    Reserved and must be zero.
>>>>>> +
>>>>>> +         9 - 55:    Bits 9 - 55 of the host cluster offset. Must be
>>>>>> aligned to
>>>>>> +                    a cluster boundary. If the offset is 0, the
>>>>>> cluster is
>>>>>> +                    unallocated; in that case, bit 0 determines how
>>>>>> this
>>>>>> +                    cluster should be treated when read from.
>>>>>> +
>>>>>> +        56 - 63:    Reserved and must be zero.
>>>>>> +
>>>>>> +=== Bitmap data ===
>>>>>> +
>>>>>> +As noted above, bitmap data is stored in separate clusters,
>>>>>> described by the
>>>>>> +bitmap table. Given an offset (in bytes) into the bitmap data, the
>>>>>> offset into
>>>>>> +the image file can be obtained as follows:
>>>>>> +
>>>>>> +    image_offset =
>>>>>> +        bitmap_table[bitmap_data_offset / cluster_size] +
>>>>>> +            (bitmap_data_offset % cluster_size)
>>>>>> +
>>>>>> +This offset is not defined if bits 9 - 55 of bitmap table entry are
>>>>>> zero (see
>>>>>> +above).
>>>>>> +
>>>>>> +Given an offset byte_nr into the virtual disk and the bitmap's
>>>>>> granularity, the
>>>>>> +bit offset into the bitmap can be calculated like this:
>>>>>> +
>>>>>> +    bit_offset =
>>>>>> +        image_offset(byte_nr / granularity / 8) * 8 +
>>>>>> +            (byte_nr / granularity) % 8
>>>>>> +
>>>>>> +If the size of the bitmap data is not a multiply of cluster size
>>>>>> then the last
>>>>>> +cluster of the bitmap data contains some unused tail bits. These
>>>>>> bits must be
>>>>>> +zero.
>>>>> In which order are the bits stored in the bitmap?
>>>> What do you mean?
>>> He means BE or LE. You can intuit it from the formula, but it's not
>>> explicitly stated, I think
>> byte ordering? But it make sense when we deal with integers, which
>> occupies more then 1 byte. Here is plain byte sequence. Bitmap. Like
>> a string.. What should I write here additionally?
> I meant sub-byte ordering. Whether bit 0 is the LSB or the MSB. I assume
> you intend to use the LSB for bit 0, but it's not written anywhere.
>
> Actually, that's an assumption that is made for bitmaps in the whole
> spec (including e.g. feature flags), even though it is only explicitly
> spelt out for the refcount blocks. I guess we should move it to the
> 'General' section.

Understand now. Then it should be separate patch.

>
> Kevin
Vladimir Sementsov-Ogievskiy Jan. 25, 2016, 10:15 a.m. UTC | #18
On 19.01.2016 20:29, Kevin Wolf wrote:
> Am 19.01.2016 um 09:57 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> On 19.01.2016 00:16, Eric Blake wrote:
>>> preserving semantics of those extra_data bytes).  We
>>> have enough room for future extension, and that's good e
>> Ok, so, what should go to the spec? Current wording is ok? Just
>> delete "Type-specific":
>>
>> +
>> +        20 - 23:    extra_data_size
>> +                    Size of type-specific extra data.
>> +
>> +                    For now, as no extra data is defined, extra_data_size is
>> +                    reserved and must be zero.
>> +
>> +        variable:   Extra data for the bitmap.
> Please be explicit that if extra_data_size is non-zero, the bitmap must
> not be used (i.e. specify the incompatible-feature-bit-like behaviour).

It is not enough. If there are some unknown extra data, then just 
ignoring this bitmap may lead to its inconsistency. So, if it is 
non-zero, the whole image should not be written. (real 
incompatible-feature-bit behavior).

>
> Kevin
Vladimir Sementsov-Ogievskiy Jan. 25, 2016, 10:22 a.m. UTC | #19
On 19.01.2016 20:27, Kevin Wolf wrote:
> Am 18.01.2016 um 22:16 hat Eric Blake geschrieben:
>> On 01/18/2016 09:54 AM, John Snow wrote:
>>
>>>> Please, let's decide finally about extra data, than I'll reroll it and,
>>>> I hope, it will be committed, to make it possible to continue work on
>>>> persistence series. About extra data, I'm ready to accept any variant,
>>>> strictly defining, what software should do with unknown extra data.
>>>>
>>>>
>>> I discussed this with Eric Blake on IRC briefly, and I mentioned I was
>>> concerned that we didn't specify a format at all for the extra data.
>>> Eric felt that it was not unusual to leave a space for future expansion
>>> and that as we haven't used it yet, we don't need to solidify it.
>>>
>>> He also felt it would be unusual to stipulate the format of data that we
>>> don't even intend to use yet.
>>>
>>> In short, I'm being too proactive.
>>>
>>> A commit message mention that, should anyone wish to expand the
>>> type-specific data in the future that adding a 2-byte version as the
>>> first field in extra data would probably be sufficient, and we can worry
>>> about the spec wording later. It is fine to assume for now that if
>>> extra_data_size is 0 that the version/format of the data is "v0" and
>>> that does not limit our future expansion.
>> Or put another way:
>>
>> I'm just fine if our initial implementation provides sufficient
>> information for us to completely parse the file even when the file is
>> generated by a newer qemu (we have a length, so we know how far to skip
>> to find the next entry), while at the same time throwing up our hands if
>> the length is non-zero (we won't read the bitmap at all, because we
>> don't know if the non-zero extra_data contains instructions that would
>> change how to interpret the data) or even prevent writes (if the bitmap
>> entry is marked automatic, we must refuse any write that would requiring
>> an update to the bitmap because we don't know how to write to a bitmap
>> while correctly preserving semantics of those extra_data bytes).
> Can we assume that the extra_data doesn't contain references to
> clusters? Otherwise we need to forbid 'qemu-img check -r leaks' when
> there is unknown extra_data.
>
> FWIW, this assumption is already made for snapshots, so it seems okay to
> make it here as well. But we could be explicit about it.

ok, will do.

>
> Kevin
Kevin Wolf Jan. 25, 2016, 11:09 a.m. UTC | #20
Am 25.01.2016 um 11:15 hat Vladimir Sementsov-Ogievskiy geschrieben:
> On 19.01.2016 20:29, Kevin Wolf wrote:
> >Am 19.01.2016 um 09:57 hat Vladimir Sementsov-Ogievskiy geschrieben:
> >>On 19.01.2016 00:16, Eric Blake wrote:
> >>>preserving semantics of those extra_data bytes).  We
> >>>have enough room for future extension, and that's good e
> >>Ok, so, what should go to the spec? Current wording is ok? Just
> >>delete "Type-specific":
> >>
> >>+
> >>+        20 - 23:    extra_data_size
> >>+                    Size of type-specific extra data.
> >>+
> >>+                    For now, as no extra data is defined, extra_data_size is
> >>+                    reserved and must be zero.
> >>+
> >>+        variable:   Extra data for the bitmap.
> >Please be explicit that if extra_data_size is non-zero, the bitmap must
> >not be used (i.e. specify the incompatible-feature-bit-like behaviour).
> 
> It is not enough. If there are some unknown extra data, then just
> ignoring this bitmap may lead to its inconsistency. So, if it is
> non-zero, the whole image should not be written. (real
> incompatible-feature-bit behavior).

Don't we generally ignore all bitmaps until a user actively tries to
make use of it? Of course, with the 'auto' flag set, just ignoring the
bitmap isn't possible, but I think in all other cases it should be.

If we ever add another type of bitmaps that doesn't have the 'auto' flag
set, but is still automatically used, so that the image as a whole must
become read-only without the bitmap, we can still add a normal
incompatible feature flag. But I think it's more likely that we add
extra_data that doesn't prevent use of the image, so we should have a
way to express that.

Kevin
Vladimir Sementsov-Ogievskiy Jan. 25, 2016, 12:27 p.m. UTC | #21
On 25.01.2016 14:09, Kevin Wolf wrote:
> Am 25.01.2016 um 11:15 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> On 19.01.2016 20:29, Kevin Wolf wrote:
>>> Am 19.01.2016 um 09:57 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>> On 19.01.2016 00:16, Eric Blake wrote:
>>>>> preserving semantics of those extra_data bytes).  We
>>>>> have enough room for future extension, and that's good e
>>>> Ok, so, what should go to the spec? Current wording is ok? Just
>>>> delete "Type-specific":
>>>>
>>>> +
>>>> +        20 - 23:    extra_data_size
>>>> +                    Size of type-specific extra data.
>>>> +
>>>> +                    For now, as no extra data is defined, extra_data_size is
>>>> +                    reserved and must be zero.
>>>> +
>>>> +        variable:   Extra data for the bitmap.
>>> Please be explicit that if extra_data_size is non-zero, the bitmap must
>>> not be used (i.e. specify the incompatible-feature-bit-like behaviour).
>> It is not enough. If there are some unknown extra data, then just
>> ignoring this bitmap may lead to its inconsistency. So, if it is
>> non-zero, the whole image should not be written. (real
>> incompatible-feature-bit behavior).
> Don't we generally ignore all bitmaps until a user actively tries to
> make use of it? Of course, with the 'auto' flag set, just ignoring the
> bitmap isn't possible, but I think in all other cases it should be.
>
> If we ever add another type of bitmaps that doesn't have the 'auto' flag
> set, but is still automatically used, so that the image as a whole must
> become read-only without the bitmap, we can still add a normal
> incompatible feature flag. But I think it's more likely that we add
> extra_data that doesn't prevent use of the image, so we should have a
> way to express that.
>
> Kevin

let's add one flag:

    2: extra_data_compatible
        This flag is meaningful when extra data is unknown for the software.
        If it is set, the bitmap may be used as usual, extra data must 
be left as is.
        If it is unset, the bitmap must not be used, but left as is with 
extra data.

Ok? And, if there will be incompatible features we will use normal 
incompatible feature flag.
diff mbox

Patch

diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
index 121dfc8..997239d 100644
--- a/docs/specs/qcow2.txt
+++ b/docs/specs/qcow2.txt
@@ -103,7 +103,18 @@  in the description of a field.
                     write to an image with unknown auto-clear features if it
                     clears the respective bits from this field first.
 
-                    Bits 0-63:  Reserved (set to 0)
+                    Bit 0:      Bitmaps extension bit
+                                This bit indicates consistency for the bitmaps
+                                extension data.
+
+                                It is an error if this bit is set without the
+                                bitmaps extension present.
+
+                                If the bitmaps extension is present but this
+                                bit is unset, the bitmaps extension data is
+                                inconsistent.
+
+                    Bits 1-63:  Reserved (set to 0)
 
          96 -  99:  refcount_order
                     Describes the width of a reference count block entry (width
@@ -123,6 +134,7 @@  be stored. Each extension has a structure like the following:
                         0x00000000 - End of the header extension area
                         0xE2792ACA - Backing file format name
                         0x6803f857 - Feature name table
+                        0x23852875 - Bitmaps extension
                         other      - Unknown header extension, can be safely
                                      ignored
 
@@ -166,6 +178,34 @@  the header extension data. Each entry look like this:
                     terminated if it has full length)
 
 
+== Bitmaps extension ==
+
+The bitmaps extension is an optional header extension. It provides the ability
+to store bitmaps related to a virtual disk. For now, there is only one bitmap
+type: the dirty tracking bitmap, which tracks virtual disk changes from some
+point in time.
+
+The data of the extension should be considered consistent only if the
+corresponding auto-clear feature bit is set, see autoclear_features above.
+
+The fields of the bitmaps extension are:
+
+          0 -  3:  nb_bitmaps
+                   The number of bitmaps contained in the image. Must be
+                   greater than or equal to 1.
+
+                   Note: Qemu currently only supports up to 65535 bitmaps per
+                   image.
+
+          4 -  7:  bitmap_directory_size
+                   Size of the bitmap directory in bytes. It is the cumulative
+                   size of all (nb_bitmaps) bitmap headers.
+
+          8 - 15:  bitmap_directory_offset
+                   Offset into the image file at which the bitmap directory
+                   starts. Must be aligned to a cluster boundary.
+
+
 == Host cluster management ==
 
 qcow2 manages the allocation of host clusters by maintaining a reference count
@@ -360,3 +400,133 @@  Snapshot table entry:
 
         variable:   Padding to round up the snapshot table entry size to the
                     next multiple of 8.
+
+
+== Bitmaps ==
+
+As mentioned above, the bitmaps extension provides the ability to store bitmaps
+related a virtual disk. This section describes how these bitmaps are stored.
+
+Note: all bitmaps are related to the virtual disk stored in this image.
+
+=== Bitmap directory ===
+
+Each bitmap saved in the image is described in a bitmap directory entry. The
+bitmap directory is a contiguous area in the image file, whose starting offset
+and length are given by the header extension fields bitmap_directory_offset and
+bitmap_directory_size. The entries of the bitmap directory have variable
+length, depending on the length of the bitmap name and extra data. These
+entries are also called bitmap headers.
+
+Structure of a bitmap directory entry:
+
+    Byte 0 -  7:    bitmap_table_offset
+                    Offset into the image file at which the bitmap table
+                    (described below) for the bitmap starts. Must be aligned to
+                    a cluster boundary.
+
+         8 - 11:    bitmap_table_size
+                    Number of entries in the bitmap table of the bitmap.
+
+        12 - 15:    flags
+                    Bit
+                      0: in_use
+                         The bitmap was not saved correctly and may be
+                         inconsistent.
+
+                      1: auto
+                         The bitmap must reflect all changes of the virtual
+                         disk by any application that would write to this qcow2
+                         file (including writes, snapshot switching, etc.). The
+                         type of this bitmap must be 'dirty tracking bitmap'.
+
+                    Bits 2 - 31 are reserved and must be 0.
+
+             16:    type
+                    This field describes the sort of the bitmap.
+                    Values:
+                      1: Dirty tracking bitmap
+
+                    Values 0, 2 - 255 are reserved.
+
+             17:    granularity_bits
+                    Granularity bits. Valid values: 0 - 63.
+
+                    Note: Qemu currently doesn't support granularity_bits
+                    greater than 31.
+
+                    Granularity is calculated as
+                        granularity = 1 << granularity_bits
+
+                    A bitmap's granularity is how many bytes of the image
+                    accounts for one bit of the bitmap.
+
+        18 - 19:    name_size
+                    Size of the bitmap name. Must be non-zero.
+
+                    Note: Qemu currently doesn't support values greater than
+                    1023.
+
+        20 - 23:    extra_data_size
+                    Size of type-specific extra data.
+
+                    For now, as no extra data is defined, extra_data_size is
+                    reserved and must be zero.
+
+        variable:   Type-specific extra data for the bitmap.
+
+        variable:   The name of the bitmap (not null terminated). Must be
+                    unique among all bitmap names within the bitmaps extension.
+
+        variable:   Padding to round up the bitmap directory entry size to the
+                    next multiple of 8.
+
+=== Bitmap table ===
+
+Bitmaps are stored using a one-level structure (as opposed to two-level
+structure like for refcounts and guest clusters mapping) for the mapping of
+bitmap data to host clusters. This structure is called the bitmap table.
+
+Each bitmap table has a variable size (stored in the bitmap directory Entry)
+and may use multiple clusters, however, it must be contiguous in the image
+file.
+
+Structure of a bitmap table entry:
+
+    Bit       0:    Reserved and must be zero if bits 9 - 55 are non-zero.
+                    If bits 9 - 55 are zero:
+                      0: Cluster should be read as all zeros.
+                      1: Cluster should be read as all ones.
+
+         1 -  8:    Reserved and must be zero.
+
+         9 - 55:    Bits 9 - 55 of the host cluster offset. Must be aligned to
+                    a cluster boundary. If the offset is 0, the cluster is
+                    unallocated; in that case, bit 0 determines how this
+                    cluster should be treated when read from.
+
+        56 - 63:    Reserved and must be zero.
+
+=== Bitmap data ===
+
+As noted above, bitmap data is stored in separate clusters, described by the
+bitmap table. Given an offset (in bytes) into the bitmap data, the offset into
+the image file can be obtained as follows:
+
+    image_offset =
+        bitmap_table[bitmap_data_offset / cluster_size] +
+            (bitmap_data_offset % cluster_size)
+
+This offset is not defined if bits 9 - 55 of bitmap table entry are zero (see
+above).
+
+Given an offset byte_nr into the virtual disk and the bitmap's granularity, the
+bit offset into the bitmap can be calculated like this:
+
+    bit_offset =
+        image_offset(byte_nr / granularity / 8) * 8 +
+            (byte_nr / granularity) % 8
+
+If the size of the bitmap data is not a multiply of cluster size then the last
+cluster of the bitmap data contains some unused tail bits. These bits must be
+zero.