Patchwork [1/6,v11] docs: spec for add-cow file format

login
register
mail settings
Submitter Robert Wang
Date July 31, 2012, 4:51 p.m.
Message ID <1343753510-24661-1-git-send-email-wdongxu@linux.vnet.ibm.com>
Download mbox | patch
Permalink /patch/174285/
State New
Headers show

Comments

Robert Wang - July 31, 2012, 4:51 p.m.
Introduce a new file format:add-cow. The usage can be found at this patch.

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
---
Now add-cow is still using QEMUOptionParameter, not QemuOpts,  I will send a
seperate patch series to convert.

 docs/specs/add-cow.txt |  128 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 128 insertions(+), 0 deletions(-)
 create mode 100644 docs/specs/add-cow.txt
Eric Blake - Aug. 1, 2012, 1:51 p.m.
On 07/31/2012 10:51 AM, Dong Xu Wang wrote:
> Introduce a new file format:add-cow. The usage can be found at this patch.
> 
> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
> ---
> Now add-cow is still using QEMUOptionParameter, not QemuOpts,  I will send a
> seperate patch series to convert.

s/seperate/separate/

> 
>  docs/specs/add-cow.txt |  128 ++++++++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 128 insertions(+), 0 deletions(-)
>  create mode 100644 docs/specs/add-cow.txt
> 
> diff --git a/docs/specs/add-cow.txt b/docs/specs/add-cow.txt
> new file mode 100644
> index 0000000..4793a3e
> --- /dev/null
> +++ b/docs/specs/add-cow.txt
> @@ -0,0 +1,128 @@
> +== General ==
> +
> +Raw file format does not support backing file and copy on write feature.

grammar:
The raw file format does not support backing files or the copy-on-write
feature.

> +The add-cow image format makes it possible to use backing files with raw
> +image by keeping a separate .add-cow metadata file. Once all sectors
> +have been written into the raw image it is safe to discard the .add-cow
> +and backing files, then we can use the raw image directly.

I'm still not sure how this series fits in with the recent discussions
on adding drive-mirror with the capability of creating a raw file mirror
of a qcow2 snapshot.


> +
> +While using add-cow, procedures may like this:

grammar:
An example usage of add-cow would look like:

> +(ubuntu.img is a disk image which has been installed OS.)
> +    1)  Create a raw image with the same size of ubuntu.img
> +            qemu-img create -f raw test.raw 8G
> +    2)  Create an add-cow image which will store dirty bitmap
> +            qemu-img create -f add-cow test.add-cow \
> +                -o backing_file=ubuntu.img,image_file=test.raw
> +    3)  Run qemu with add-cow image
> +            qemu -drive if=virtio,file=test.add-cow
> +
> +test.raw may be larger than ubuntu.img, in that case, the size of test.add-cow
> +will be calculated by the size of ubuntu.img, test.raw will be used from the
> +1st byte, the rest part can be used for other purpose.

Grammar was off here, but I'm not sure what you meant to suggest a
replacement.  Maybe:

the size of test.add-cow will be calculated from the size of ubuntu.img,
and extra space at the tail of test.raw can be used for other purposes.

> +(#define HEADER_SIZE (4096 * header_pages_size))
> +    Byte    0 -  7:     magic
> +                        add-cow magic string ("ADD_COW\xff").
> +
> +            8 -  11:    version
> +                        Version number (only valid value is 1 now).
> +
> +            12 - 15:    backing file name offset
> +                        Offset in the add-cow file at which the backing file
> +                        name is stored (NB: The string is not null terminated).

Nit: this should be nul-terminated (NUL is the one-byte all-0 character
in single byte encodings that ends a string, and NULL is the four- or
eight-byte value, typically all-0, for a pointer to nowhere).


> +
> +            28 - 35:    features
> +                        Currently only 3 feature bit is used:
> +                        Feature bits:
> +                            The image uses a backing file:
> +                            * ADD_COW_F_BACKING_FILE    = 0x01.

Isn't this bit redundant with the earlier field at byte 12 stating
whether a backing file is present?

> +                            The image uses a image file:
> +                            * ADD_COW_F_IMAGE_FILE      = 0x02.

The field at byte 20 implies that an image file name is mandatory,
meaning this bit is always 1 and therefore pointless.

> +                            All bits in bitmap have been set to 1, add-cow wrapper
> +                            can be discarded.
> +                            * ADD_COW_F_All_ALLOCATED   = 0x04.
> +
> +            36 - 43:    optional features
> +                        Not used now. Researved for future use.

s/Researved/Reserved/, mention that it must be set to 0

> +
> +            44 - 47:    header pages size
> +                        The header field is variable-sized. This field indicates
> +                        how many pages(4k) will be used to store add-cow header.
> +                        In add-cow v1, it is fixed to 1, so the header size will
> +                        be 4k * 1 = 4096 bytes.
> +
> +Image file name and backing file name must NOT be the same, we prevent this
> +while creating add-cow files.
> +
> +Image file and backing file are interpreted relative to the qcow2 file, not
> +to the current working directory of the process that opened the qcow2 file.

Either indent this description to match the field it is describing, or
sink it down until after you have called out the header layout.

> +
> +== Reserved ==
> +
> +    Byte    48 - 63:    backing file format
> +                        format of backing file. It will be filled with 0 if
> +                        backing file name offset is 0. If backing file name
> +                        offset is none-zero, it must be non-zero.

s/none-zero/non-zero/

Why are you defining these byte offsets in the Reserved section?  This
text should occur earlier in the mandatory header format.  Is this field
free-form ASCII?  Must the field be NUL-terminated?  For that matter, I
think you can just delete the ==Reserved== header, as you are calling
out every possible offset.

> +
> +            64 - 79:    image file format
> +                        format of image file. It must be non-zero.

Same question about whether this field is ASCII, and must be NUL-terminated.

> +
> +            80 - [HEADER_SIZE - 1]:
> +                        It is used to make sure COW bitmap field starts at the
> +                        HEADER_SIZE byte, backing file name and image file name
> +                        will be stored here.

Is it required that bytes not pointed to by backing file and image names
must have any particular value?

> +
> +== COW bitmap ==
> +
> +The "COW bitmap" field starts at the 4096th byte, stores a bitmap related to
> +backing file and image file. The bitmap will track whether the sector in
> +backing file is dirty or not.
> +
> +Each bit in the bitmap indicates one cluster's status. One cluster includes 128
> +sectors, then each bit indicates 512 * 128 = 64k bytes. the size of bitmap is
> +calculated according to virtual size of backing file. In each byte, bit 0 to 7
> +will track the 1st to 7th cluster in sequence, bit orders in one byte look like:

1st to 7th is only 7 clusters.  You mean either '0th to 7th' or '1st to
8th'.  Or just simplify to:

Within each byte, the least significant bit covers the first cluster.

> + +----+----+----+----+----+----+----+----+
> + | b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 |
> + +----+----+----+----+----+----+----+----+
> +
> +If the bit is 0, indicates the sector has not been allocated in image file, data
> +should be loaded from backing file while reading; if the bit is 1, indicates the
> +related sector has been dirty, should be loaded from image file while reading.
> +Writing to a sector causes the corresponding bit to be set to 1.
> +
> +If raw image is not an even multiple of cluster bytes, bits that correspond to
> +bytes beyond the raw file size in add-cow will be 0.

Will this affect the use of the ADD_COW_F_ALL_ALLOCATED feature bit in
the header?
Stefan Hajnoczi - Aug. 1, 2012, 1:55 p.m.
On Tue, Jul 31, 2012 at 5:51 PM, Dong Xu Wang
<wdongxu@linux.vnet.ibm.com> wrote:
> Introduce a new file format:add-cow. The usage can be found at this patch.
>
> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
> ---
> Now add-cow is still using QEMUOptionParameter, not QemuOpts,  I will send a
> seperate patch series to convert.

I suggest including a cover letter in future patch series:

git format-patch --cover-letter --numbered -o my-series/ master..

Or to do it in a single command with git-send-email(1) use the --compose option.

The cover letter makes it easy for reviewers to add their Reviewed-by:
to the entire series by replying only once.  It's also a good place to
include a changelog that tells reviewers what you changed from the
last published version.

>  docs/specs/add-cow.txt |  128 ++++++++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 128 insertions(+), 0 deletions(-)
>  create mode 100644 docs/specs/add-cow.txt
>
> diff --git a/docs/specs/add-cow.txt b/docs/specs/add-cow.txt
> new file mode 100644
> index 0000000..4793a3e
> --- /dev/null
> +++ b/docs/specs/add-cow.txt
> @@ -0,0 +1,128 @@
> +== General ==
> +
> +Raw file format does not support backing file and copy on write feature.
> +The add-cow image format makes it possible to use backing files with raw
> +image by keeping a separate .add-cow metadata file. Once all sectors
> +have been written into the raw image it is safe to discard the .add-cow
> +and backing files, then we can use the raw image directly.
> +
> +While using add-cow, procedures may like this:
> +(ubuntu.img is a disk image which has been installed OS.)
> +    1)  Create a raw image with the same size of ubuntu.img
> +            qemu-img create -f raw test.raw 8G
> +    2)  Create an add-cow image which will store dirty bitmap
> +            qemu-img create -f add-cow test.add-cow \
> +                -o backing_file=ubuntu.img,image_file=test.raw
> +    3)  Run qemu with add-cow image
> +            qemu -drive if=virtio,file=test.add-cow
> +
> +test.raw may be larger than ubuntu.img, in that case, the size of test.add-cow
> +will be calculated by the size of ubuntu.img, test.raw will be used from the
> +1st byte, the rest part can be used for other purpose.

This is not how backing files normally work.  With qcow2 or qed a
smaller backing file just means that the guest reads zeroes from the
areas beyond the end of the backing file.  Is there a special reason
why you want to implement the behavior you described in the spec?
Otherwise I suggest implementing the same behavior as qcow2/qed.

> +
> +=Specification=
> +
> +The file format looks like this:
> +
> + +---------------+-------------+-----------------+
> + |     Header    |   Reserved  |    COW bitmap   |
> + +---------------+-------------+-----------------+
> +
> +All numbers in add-cow are stored in Little Endian byte order.
> +
> +== Header ==
> +
> +The Header is included in the first bytes:
> +(#define HEADER_SIZE (4096 * header_pages_size))
> +    Byte    0 -  7:     magic
> +                        add-cow magic string ("ADD_COW\xff").
> +
> +            8 -  11:    version
> +                        Version number (only valid value is 1 now).
> +
> +            12 - 15:    backing file name offset
> +                        Offset in the add-cow file at which the backing file
> +                        name is stored (NB: The string is not null terminated).
> +                        If backing file name does NOT exist, this field will be
> +                        0. Must be between 80 and [HEADER_SIZE - 2](a file name
> +                        must be at least 1 byte).
> +
> +            16 - 19:    backing file name size
> +                        Length of the backing file name in bytes. It will be 0
> +                        if the backing file name offset is 0. If backing file
> +                        name offset is non-zero, then it must be non-zero. Must
> +                        be less than [HEADER_SIZE - 80] to fit in the reserved
> +                        part of the header.
> +
> +            20 - 23:    image file name offset
> +                        Offset in the add-cow file at which the image file name
> +                        is stored (NB: The string is not null terminated). It
> +                        must be between 80 and [HEADER_SIZE - 2].
> +
> +            24 - 27:    image file name size
> +                        Length of the image file name in bytes.
> +                        Must be less than [HEADER_SIZE - 80] to fit in the reserved
> +                        part of the header.
> +
> +            28 - 35:    features
> +                        Currently only 3 feature bit is used:
> +                        Feature bits:
> +                            The image uses a backing file:
> +                            * ADD_COW_F_BACKING_FILE    = 0x01.
> +                            The image uses a image file:
> +                            * ADD_COW_F_IMAGE_FILE      = 0x02.
> +                            All bits in bitmap have been set to 1, add-cow wrapper
> +                            can be discarded.
> +                            * ADD_COW_F_All_ALLOCATED   = 0x04.
> +
> +            36 - 43:    optional features
> +                        Not used now. Researved for future use.

s/Researved/Reserved/

> +
> +            44 - 47:    header pages size
> +                        The header field is variable-sized. This field indicates
> +                        how many pages(4k) will be used to store add-cow header.
> +                        In add-cow v1, it is fixed to 1, so the header size will
> +                        be 4k * 1 = 4096 bytes.
> +
> +Image file name and backing file name must NOT be the same, we prevent this
> +while creating add-cow files.
> +
> +Image file and backing file are interpreted relative to the qcow2 file, not
> +to the current working directory of the process that opened the qcow2 file.
> +
> +== Reserved ==
> +
> +    Byte    48 - 63:    backing file format
> +                        format of backing file. It will be filled with 0 if
> +                        backing file name offset is 0. If backing file name
> +                        offset is none-zero, it must be non-zero.

s/none-zero/non-zero/

> +
> +            64 - 79:    image file format
> +                        format of image file. It must be non-zero.
> +
> +            80 - [HEADER_SIZE - 1]:
> +                        It is used to make sure COW bitmap field starts at the
> +                        HEADER_SIZE byte, backing file name and image file name
> +                        will be stored here.
> +
> +== COW bitmap ==
> +
> +The "COW bitmap" field starts at the 4096th byte, stores a bitmap related to

I would say it starts at offset HEADER_SIZE.  4096th byte == offset
0x0fff, 4097th byte == offset 0x1000.

Stefan
Robert Wang - Aug. 2, 2012, 7:03 a.m.
On Wed, Aug 1, 2012 at 9:51 PM, Eric Blake <eblake@redhat.com> wrote:
> On 07/31/2012 10:51 AM, Dong Xu Wang wrote:
>> Introduce a new file format:add-cow. The usage can be found at this patch.
>>
>> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
>> ---
>> Now add-cow is still using QEMUOptionParameter, not QemuOpts,  I will send a
>> seperate patch series to convert.
>
> s/seperate/separate/
>
Yep.
>>
>>  docs/specs/add-cow.txt |  128 ++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 files changed, 128 insertions(+), 0 deletions(-)
>>  create mode 100644 docs/specs/add-cow.txt
>>
>> diff --git a/docs/specs/add-cow.txt b/docs/specs/add-cow.txt
>> new file mode 100644
>> index 0000000..4793a3e
>> --- /dev/null
>> +++ b/docs/specs/add-cow.txt
>> @@ -0,0 +1,128 @@
>> +== General ==
>> +
>> +Raw file format does not support backing file and copy on write feature.
>
> grammar:
> The raw file format does not support backing files or the copy-on-write
> feature.

Yep.

>
>> +The add-cow image format makes it possible to use backing files with raw
>> +image by keeping a separate .add-cow metadata file. Once all sectors
>> +have been written into the raw image it is safe to discard the .add-cow
>> +and backing files, then we can use the raw image directly.
>
> I'm still not sure how this series fits in with the recent discussions
> on adding drive-mirror with the capability of creating a raw file mirror
> of a qcow2 snapshot.
>
>
>> +
>> +While using add-cow, procedures may like this:
>
> grammar:
> An example usage of add-cow would look like:
>

Yep.

>> +(ubuntu.img is a disk image which has been installed OS.)
>> +    1)  Create a raw image with the same size of ubuntu.img
>> +            qemu-img create -f raw test.raw 8G
>> +    2)  Create an add-cow image which will store dirty bitmap
>> +            qemu-img create -f add-cow test.add-cow \
>> +                -o backing_file=ubuntu.img,image_file=test.raw
>> +    3)  Run qemu with add-cow image
>> +            qemu -drive if=virtio,file=test.add-cow
>> +
>> +test.raw may be larger than ubuntu.img, in that case, the size of test.add-cow
>> +will be calculated by the size of ubuntu.img, test.raw will be used from the
>> +1st byte, the rest part can be used for other purpose.
>
> Grammar was off here, but I'm not sure what you meant to suggest a
> replacement.  Maybe:
>
> the size of test.add-cow will be calculated from the size of ubuntu.img,
> and extra space at the tail of test.raw can be used for other purposes.

Yes, this is what I mean.


>
>> +(#define HEADER_SIZE (4096 * header_pages_size))
>> +    Byte    0 -  7:     magic
>> +                        add-cow magic string ("ADD_COW\xff").
>> +
>> +            8 -  11:    version
>> +                        Version number (only valid value is 1 now).
>> +
>> +            12 - 15:    backing file name offset
>> +                        Offset in the add-cow file at which the backing file
>> +                        name is stored (NB: The string is not null terminated).
>
> Nit: this should be nul-terminated (NUL is the one-byte all-0 character
> in single byte encodings that ends a string, and NULL is the four- or
> eight-byte value, typically all-0, for a pointer to nowhere).
>
>
>> +
>> +            28 - 35:    features
>> +                        Currently only 3 feature bit is used:
>> +                        Feature bits:
>> +                            The image uses a backing file:
>> +                            * ADD_COW_F_BACKING_FILE    = 0x01.
>
> Isn't this bit redundant with the earlier field at byte 12 stating
> whether a backing file is present?
>

Yes, it is redundant. I just want to make if the add-cow image is
using  backing_file more clear.

>> +                            The image uses a image file:
>> +                            * ADD_COW_F_IMAGE_FILE      = 0x02.
>
> The field at byte 20 implies that an image file name is mandatory,
> meaning this bit is always 1 and therefore pointless.
>
>> +                            All bits in bitmap have been set to 1, add-cow wrapper
>> +                            can be discarded.
>> +                            * ADD_COW_F_All_ALLOCATED   = 0x04.
>> +
>> +            36 - 43:    optional features
>> +                        Not used now. Researved for future use.
>
> s/Researved/Reserved/, mention that it must be set to 0

Okay.

>
>> +
>> +            44 - 47:    header pages size
>> +                        The header field is variable-sized. This field indicates
>> +                        how many pages(4k) will be used to store add-cow header.
>> +                        In add-cow v1, it is fixed to 1, so the header size will
>> +                        be 4k * 1 = 4096 bytes.
>> +
>> +Image file name and backing file name must NOT be the same, we prevent this
>> +while creating add-cow files.
>> +
>> +Image file and backing file are interpreted relative to the qcow2 file, not
>> +to the current working directory of the process that opened the qcow2 file.
>
> Either indent this description to match the field it is describing, or
> sink it down until after you have called out the header layout.

Okay.

>
>> +
>> +== Reserved ==
>> +
>> +    Byte    48 - 63:    backing file format
>> +                        format of backing file. It will be filled with 0 if
>> +                        backing file name offset is 0. If backing file name
>> +                        offset is none-zero, it must be non-zero.
>
> s/none-zero/non-zero/

Okay.

>
> Why are you defining these byte offsets in the Reserved section?  This
> text should occur earlier in the mandatory header format.  Is this field
> free-form ASCII?  Must the field be NUL-terminated?  For that matter, I
> think you can just delete the ==Reserved== header, as you are calling
> out every possible offset.
>

Okay. I will describle more clearly in v12.

>> +
>> +            64 - 79:    image file format
>> +                        format of image file. It must be non-zero.
>
> Same question about whether this field is ASCII, and must be NUL-terminated.
>

Okay.

>> +
>> +            80 - [HEADER_SIZE - 1]:
>> +                        It is used to make sure COW bitmap field starts at the
>> +                        HEADER_SIZE byte, backing file name and image file name
>> +                        will be stored here.
>
> Is it required that bytes not pointed to by backing file and image names
> must have any particular value?

It must be 0, I will correct it later.

>
>> +
>> +== COW bitmap ==
>> +
>> +The "COW bitmap" field starts at the 4096th byte, stores a bitmap related to
>> +backing file and image file. The bitmap will track whether the sector in
>> +backing file is dirty or not.
>> +
>> +Each bit in the bitmap indicates one cluster's status. One cluster includes 128
>> +sectors, then each bit indicates 512 * 128 = 64k bytes. the size of bitmap is
>> +calculated according to virtual size of backing file. In each byte, bit 0 to 7
>> +will track the 1st to 7th cluster in sequence, bit orders in one byte look like:
>
> 1st to 7th is only 7 clusters.  You mean either '0th to 7th' or '1st to
> 8th'.  Or just simplify to:
>
> Within each byte, the least significant bit covers the first cluster.

Okay.

>
>> + +----+----+----+----+----+----+----+----+
>> + | b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 |
>> + +----+----+----+----+----+----+----+----+
>> +
>> +If the bit is 0, indicates the sector has not been allocated in image file, data
>> +should be loaded from backing file while reading; if the bit is 1, indicates the
>> +related sector has been dirty, should be loaded from image file while reading.
>> +Writing to a sector causes the corresponding bit to be set to 1.
>> +
>> +If raw image is not an even multiple of cluster bytes, bits that correspond to
>> +bytes beyond the raw file size in add-cow will be 0.
>
> Will this affect the use of the ADD_COW_F_ALL_ALLOCATED feature bit in
> the header?

No, beyond the raw file size, the related bits in bitmap will make no sense.

I mean: ADD_COW_F_ALL_ALLOCATED  will be set when:

1)  If add-cow is created only use image_file option, but no
backing_file option.
In this case, bitmap will be useless.

2) when all sectors have been allocated in image_file, we will not use
image_file.
but in this case, I think it is hard to determine whether all sectors
have been allocated
or not, scanning all bitmap will take some time, maybe we can do this
when we call
"qemu-io check"? Or before opening add-cow files, we do a check?

Any comments?

Thank you, Eric,

>
> --
> Eric Blake   eblake@redhat.com    +1-919-301-3266
> Libvirt virtualization library http://libvirt.org
>
Robert Wang - Aug. 2, 2012, 7:09 a.m.
On Wed, Aug 1, 2012 at 9:55 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> On Tue, Jul 31, 2012 at 5:51 PM, Dong Xu Wang
> <wdongxu@linux.vnet.ibm.com> wrote:
>> Introduce a new file format:add-cow. The usage can be found at this patch.
>>
>> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
>> ---
>> Now add-cow is still using QEMUOptionParameter, not QemuOpts,  I will send a
>> seperate patch series to convert.
>
> I suggest including a cover letter in future patch series:
>
> git format-patch --cover-letter --numbered -o my-series/ master..
>
> Or to do it in a single command with git-send-email(1) use the --compose option.
>
> The cover letter makes it easy for reviewers to add their Reviewed-by:
> to the entire series by replying only once.  It's also a good place to
> include a changelog that tells reviewers what you changed from the
> last published version.

Okay.

>
>>  docs/specs/add-cow.txt |  128 ++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 files changed, 128 insertions(+), 0 deletions(-)
>>  create mode 100644 docs/specs/add-cow.txt
>>
>> diff --git a/docs/specs/add-cow.txt b/docs/specs/add-cow.txt
>> new file mode 100644
>> index 0000000..4793a3e
>> --- /dev/null
>> +++ b/docs/specs/add-cow.txt
>> @@ -0,0 +1,128 @@
>> +== General ==
>> +
>> +Raw file format does not support backing file and copy on write feature.
>> +The add-cow image format makes it possible to use backing files with raw
>> +image by keeping a separate .add-cow metadata file. Once all sectors
>> +have been written into the raw image it is safe to discard the .add-cow
>> +and backing files, then we can use the raw image directly.
>> +
>> +While using add-cow, procedures may like this:
>> +(ubuntu.img is a disk image which has been installed OS.)
>> +    1)  Create a raw image with the same size of ubuntu.img
>> +            qemu-img create -f raw test.raw 8G
>> +    2)  Create an add-cow image which will store dirty bitmap
>> +            qemu-img create -f add-cow test.add-cow \
>> +                -o backing_file=ubuntu.img,image_file=test.raw
>> +    3)  Run qemu with add-cow image
>> +            qemu -drive if=virtio,file=test.add-cow
>> +
>> +test.raw may be larger than ubuntu.img, in that case, the size of test.add-cow
>> +will be calculated by the size of ubuntu.img, test.raw will be used from the
>> +1st byte, the rest part can be used for other purpose.
>
> This is not how backing files normally work.  With qcow2 or qed a
> smaller backing file just means that the guest reads zeroes from the
> areas beyond the end of the backing file.  Is there a special reason
> why you want to implement the behavior you described in the spec?
> Otherwise I suggest implementing the same behavior as qcow2/qed.

The size of add-cow will be caclulated by the size of backing_file, if
backing_file can be smaller, how can I get the size of add-cow while
being created?

Do you mean like following steps?
1) qemu-img create -f qcow2 source.qcow2 *8G*
2) qemu-img create -f t.add-cow -o backing_file=source.qcow2,image_file=t *10G*
And then reading un-allocated bytes from add-cow after 8G will be 0?

>
>> +
>> +=Specification=
>> +
>> +The file format looks like this:
>> +
>> + +---------------+-------------+-----------------+
>> + |     Header    |   Reserved  |    COW bitmap   |
>> + +---------------+-------------+-----------------+
>> +
>> +All numbers in add-cow are stored in Little Endian byte order.
>> +
>> +== Header ==
>> +
>> +The Header is included in the first bytes:
>> +(#define HEADER_SIZE (4096 * header_pages_size))
>> +    Byte    0 -  7:     magic
>> +                        add-cow magic string ("ADD_COW\xff").
>> +
>> +            8 -  11:    version
>> +                        Version number (only valid value is 1 now).
>> +
>> +            12 - 15:    backing file name offset
>> +                        Offset in the add-cow file at which the backing file
>> +                        name is stored (NB: The string is not null terminated).
>> +                        If backing file name does NOT exist, this field will be
>> +                        0. Must be between 80 and [HEADER_SIZE - 2](a file name
>> +                        must be at least 1 byte).
>> +
>> +            16 - 19:    backing file name size
>> +                        Length of the backing file name in bytes. It will be 0
>> +                        if the backing file name offset is 0. If backing file
>> +                        name offset is non-zero, then it must be non-zero. Must
>> +                        be less than [HEADER_SIZE - 80] to fit in the reserved
>> +                        part of the header.
>> +
>> +            20 - 23:    image file name offset
>> +                        Offset in the add-cow file at which the image file name
>> +                        is stored (NB: The string is not null terminated). It
>> +                        must be between 80 and [HEADER_SIZE - 2].
>> +
>> +            24 - 27:    image file name size
>> +                        Length of the image file name in bytes.
>> +                        Must be less than [HEADER_SIZE - 80] to fit in the reserved
>> +                        part of the header.
>> +
>> +            28 - 35:    features
>> +                        Currently only 3 feature bit is used:
>> +                        Feature bits:
>> +                            The image uses a backing file:
>> +                            * ADD_COW_F_BACKING_FILE    = 0x01.
>> +                            The image uses a image file:
>> +                            * ADD_COW_F_IMAGE_FILE      = 0x02.
>> +                            All bits in bitmap have been set to 1, add-cow wrapper
>> +                            can be discarded.
>> +                            * ADD_COW_F_All_ALLOCATED   = 0x04.
>> +
>> +            36 - 43:    optional features
>> +                        Not used now. Researved for future use.
>
> s/Researved/Reserved/

Okay.
>
>> +
>> +            44 - 47:    header pages size
>> +                        The header field is variable-sized. This field indicates
>> +                        how many pages(4k) will be used to store add-cow header.
>> +                        In add-cow v1, it is fixed to 1, so the header size will
>> +                        be 4k * 1 = 4096 bytes.
>> +
>> +Image file name and backing file name must NOT be the same, we prevent this
>> +while creating add-cow files.
>> +
>> +Image file and backing file are interpreted relative to the qcow2 file, not
>> +to the current working directory of the process that opened the qcow2 file.
>> +
>> +== Reserved ==
>> +
>> +    Byte    48 - 63:    backing file format
>> +                        format of backing file. It will be filled with 0 if
>> +                        backing file name offset is 0. If backing file name
>> +                        offset is none-zero, it must be non-zero.
>
> s/none-zero/non-zero/
Okay.
>
>> +
>> +            64 - 79:    image file format
>> +                        format of image file. It must be non-zero.
>> +
>> +            80 - [HEADER_SIZE - 1]:
>> +                        It is used to make sure COW bitmap field starts at the
>> +                        HEADER_SIZE byte, backing file name and image file name
>> +                        will be stored here.
>> +
>> +== COW bitmap ==
>> +
>> +The "COW bitmap" field starts at the 4096th byte, stores a bitmap related to
>
> I would say it starts at offset HEADER_SIZE.  4096th byte == offset
> 0x0fff, 4097th byte == offset 0x1000.

Okay.

>
> Stefan
>

Thank you Stefan.
Stefan Hajnoczi - Aug. 2, 2012, 10:44 a.m.
On Thu, Aug 2, 2012 at 8:09 AM, Dong Xu Wang <wdongxu@linux.vnet.ibm.com> wrote:
> On Wed, Aug 1, 2012 at 9:55 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
>> On Tue, Jul 31, 2012 at 5:51 PM, Dong Xu Wang
>> <wdongxu@linux.vnet.ibm.com> wrote:
>>> +test.raw may be larger than ubuntu.img, in that case, the size of test.add-cow
>>> +will be calculated by the size of ubuntu.img, test.raw will be used from the
>>> +1st byte, the rest part can be used for other purpose.
>>
>> This is not how backing files normally work.  With qcow2 or qed a
>> smaller backing file just means that the guest reads zeroes from the
>> areas beyond the end of the backing file.  Is there a special reason
>> why you want to implement the behavior you described in the spec?
>> Otherwise I suggest implementing the same behavior as qcow2/qed.
>
> The size of add-cow will be caclulated by the size of backing_file, if
> backing_file can be smaller, how can I get the size of add-cow while
> being created?
>
> Do you mean like following steps?
> 1) qemu-img create -f qcow2 source.qcow2 *8G*
> 2) qemu-img create -f t.add-cow -o backing_file=source.qcow2,image_file=t *10G*
> And then reading un-allocated bytes from add-cow after 8G will be 0?

Yes.  You could also get the virtual disk size from the size of the
image_file during creation.

Stefan
Robert Wang - Aug. 3, 2012, 5:56 a.m.
On Thu, Aug 2, 2012 at 6:44 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> On Thu, Aug 2, 2012 at 8:09 AM, Dong Xu Wang <wdongxu@linux.vnet.ibm.com> wrote:
>> On Wed, Aug 1, 2012 at 9:55 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
>>> On Tue, Jul 31, 2012 at 5:51 PM, Dong Xu Wang
>>> <wdongxu@linux.vnet.ibm.com> wrote:
>>>> +test.raw may be larger than ubuntu.img, in that case, the size of test.add-cow
>>>> +will be calculated by the size of ubuntu.img, test.raw will be used from the
>>>> +1st byte, the rest part can be used for other purpose.
>>>
>>> This is not how backing files normally work.  With qcow2 or qed a
>>> smaller backing file just means that the guest reads zeroes from the
>>> areas beyond the end of the backing file.  Is there a special reason
>>> why you want to implement the behavior you described in the spec?
>>> Otherwise I suggest implementing the same behavior as qcow2/qed.
>>
>> The size of add-cow will be caclulated by the size of backing_file, if
>> backing_file can be smaller, how can I get the size of add-cow while
>> being created?
>>
>> Do you mean like following steps?
>> 1) qemu-img create -f qcow2 source.qcow2 *8G*
>> 2) qemu-img create -f t.add-cow -o backing_file=source.qcow2,image_file=t *10G*
>> And then reading un-allocated bytes from add-cow after 8G will be 0?
>
> Yes.  You could also get the virtual disk size from the size of the
> image_file during creation.

Eric said I shoud make sure  we also support a raw file larger than
the backing file. so
add-cow will have the same virtual size with image_file, and their
virtual size can be larger
than backing_file's.

Am I right?
Thanks.

> Stefan
>
Stefan Hajnoczi - Aug. 3, 2012, 8:26 a.m.
On Fri, Aug 3, 2012 at 6:56 AM, Dong Xu Wang <wdongxu@linux.vnet.ibm.com> wrote:
> On Thu, Aug 2, 2012 at 6:44 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
>> On Thu, Aug 2, 2012 at 8:09 AM, Dong Xu Wang <wdongxu@linux.vnet.ibm.com> wrote:
>>> On Wed, Aug 1, 2012 at 9:55 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
>>>> On Tue, Jul 31, 2012 at 5:51 PM, Dong Xu Wang
>>>> <wdongxu@linux.vnet.ibm.com> wrote:
>>>>> +test.raw may be larger than ubuntu.img, in that case, the size of test.add-cow
>>>>> +will be calculated by the size of ubuntu.img, test.raw will be used from the
>>>>> +1st byte, the rest part can be used for other purpose.
>>>>
>>>> This is not how backing files normally work.  With qcow2 or qed a
>>>> smaller backing file just means that the guest reads zeroes from the
>>>> areas beyond the end of the backing file.  Is there a special reason
>>>> why you want to implement the behavior you described in the spec?
>>>> Otherwise I suggest implementing the same behavior as qcow2/qed.
>>>
>>> The size of add-cow will be caclulated by the size of backing_file, if
>>> backing_file can be smaller, how can I get the size of add-cow while
>>> being created?
>>>
>>> Do you mean like following steps?
>>> 1) qemu-img create -f qcow2 source.qcow2 *8G*
>>> 2) qemu-img create -f t.add-cow -o backing_file=source.qcow2,image_file=t *10G*
>>> And then reading un-allocated bytes from add-cow after 8G will be 0?
>>
>> Yes.  You could also get the virtual disk size from the size of the
>> image_file during creation.
>
> Eric said I shoud make sure  we also support a raw file larger than
> the backing file. so
> add-cow will have the same virtual size with image_file, and their
> virtual size can be larger
> than backing_file's.

Yes.  Please update the spec because it says the opposite:

"test.raw may be larger than ubuntu.img, in that case, the size of
test.add-cow will be calculated by the size of ubuntu.img"

It should use the size of test.raw.

Stefan
Robert Wang - Aug. 6, 2012, 2:05 a.m.
On Fri, Aug 3, 2012 at 4:26 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> On Fri, Aug 3, 2012 at 6:56 AM, Dong Xu Wang <wdongxu@linux.vnet.ibm.com> wrote:
>> On Thu, Aug 2, 2012 at 6:44 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
>>> On Thu, Aug 2, 2012 at 8:09 AM, Dong Xu Wang <wdongxu@linux.vnet.ibm.com> wrote:
>>>> On Wed, Aug 1, 2012 at 9:55 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
>>>>> On Tue, Jul 31, 2012 at 5:51 PM, Dong Xu Wang
>>>>> <wdongxu@linux.vnet.ibm.com> wrote:
>>>>>> +test.raw may be larger than ubuntu.img, in that case, the size of test.add-cow
>>>>>> +will be calculated by the size of ubuntu.img, test.raw will be used from the
>>>>>> +1st byte, the rest part can be used for other purpose.
>>>>>
>>>>> This is not how backing files normally work.  With qcow2 or qed a
>>>>> smaller backing file just means that the guest reads zeroes from the
>>>>> areas beyond the end of the backing file.  Is there a special reason
>>>>> why you want to implement the behavior you described in the spec?
>>>>> Otherwise I suggest implementing the same behavior as qcow2/qed.
>>>>
>>>> The size of add-cow will be caclulated by the size of backing_file, if
>>>> backing_file can be smaller, how can I get the size of add-cow while
>>>> being created?
>>>>
>>>> Do you mean like following steps?
>>>> 1) qemu-img create -f qcow2 source.qcow2 *8G*
>>>> 2) qemu-img create -f t.add-cow -o backing_file=source.qcow2,image_file=t *10G*
>>>> And then reading un-allocated bytes from add-cow after 8G will be 0?
>>>
>>> Yes.  You could also get the virtual disk size from the size of the
>>> image_file during creation.
>>
>> Eric said I shoud make sure  we also support a raw file larger than
>> the backing file. so
>> add-cow will have the same virtual size with image_file, and their
>> virtual size can be larger
>> than backing_file's.
>
> Yes.  Please update the spec because it says the opposite:
>
> "test.raw may be larger than ubuntu.img, in that case, the size of
> test.add-cow will be calculated by the size of ubuntu.img"
Okay. Thanks.
>
> It should use the size of test.raw.
>
> Stefan
>

Patch

diff --git a/docs/specs/add-cow.txt b/docs/specs/add-cow.txt
new file mode 100644
index 0000000..4793a3e
--- /dev/null
+++ b/docs/specs/add-cow.txt
@@ -0,0 +1,128 @@ 
+== General ==
+
+Raw file format does not support backing file and copy on write feature.
+The add-cow image format makes it possible to use backing files with raw
+image by keeping a separate .add-cow metadata file. Once all sectors
+have been written into the raw image it is safe to discard the .add-cow
+and backing files, then we can use the raw image directly.
+
+While using add-cow, procedures may like this:
+(ubuntu.img is a disk image which has been installed OS.)
+    1)  Create a raw image with the same size of ubuntu.img
+            qemu-img create -f raw test.raw 8G
+    2)  Create an add-cow image which will store dirty bitmap
+            qemu-img create -f add-cow test.add-cow \
+                -o backing_file=ubuntu.img,image_file=test.raw
+    3)  Run qemu with add-cow image
+            qemu -drive if=virtio,file=test.add-cow
+
+test.raw may be larger than ubuntu.img, in that case, the size of test.add-cow
+will be calculated by the size of ubuntu.img, test.raw will be used from the
+1st byte, the rest part can be used for other purpose.
+
+=Specification=
+
+The file format looks like this:
+
+ +---------------+-------------+-----------------+
+ |     Header    |   Reserved  |    COW bitmap   |
+ +---------------+-------------+-----------------+
+
+All numbers in add-cow are stored in Little Endian byte order.
+
+== Header ==
+
+The Header is included in the first bytes:
+(#define HEADER_SIZE (4096 * header_pages_size))
+    Byte    0 -  7:     magic
+                        add-cow magic string ("ADD_COW\xff").
+
+            8 -  11:    version
+                        Version number (only valid value is 1 now).
+
+            12 - 15:    backing file name offset
+                        Offset in the add-cow file at which the backing file
+                        name is stored (NB: The string is not null terminated).
+                        If backing file name does NOT exist, this field will be
+                        0. Must be between 80 and [HEADER_SIZE - 2](a file name
+                        must be at least 1 byte).
+
+            16 - 19:    backing file name size
+                        Length of the backing file name in bytes. It will be 0
+                        if the backing file name offset is 0. If backing file
+                        name offset is non-zero, then it must be non-zero. Must
+                        be less than [HEADER_SIZE - 80] to fit in the reserved
+                        part of the header.
+
+            20 - 23:    image file name offset
+                        Offset in the add-cow file at which the image file name
+                        is stored (NB: The string is not null terminated). It
+                        must be between 80 and [HEADER_SIZE - 2].
+
+            24 - 27:    image file name size
+                        Length of the image file name in bytes.
+                        Must be less than [HEADER_SIZE - 80] to fit in the reserved
+                        part of the header.
+
+            28 - 35:    features
+                        Currently only 3 feature bit is used:
+                        Feature bits:
+                            The image uses a backing file:
+                            * ADD_COW_F_BACKING_FILE    = 0x01.
+                            The image uses a image file:
+                            * ADD_COW_F_IMAGE_FILE      = 0x02.
+                            All bits in bitmap have been set to 1, add-cow wrapper
+                            can be discarded.
+                            * ADD_COW_F_All_ALLOCATED   = 0x04.
+
+            36 - 43:    optional features
+                        Not used now. Researved for future use.
+
+            44 - 47:    header pages size
+                        The header field is variable-sized. This field indicates
+                        how many pages(4k) will be used to store add-cow header.
+                        In add-cow v1, it is fixed to 1, so the header size will
+                        be 4k * 1 = 4096 bytes.
+
+Image file name and backing file name must NOT be the same, we prevent this
+while creating add-cow files.
+
+Image file and backing file are interpreted relative to the qcow2 file, not
+to the current working directory of the process that opened the qcow2 file.
+
+== Reserved ==
+
+    Byte    48 - 63:    backing file format
+                        format of backing file. It will be filled with 0 if
+                        backing file name offset is 0. If backing file name
+                        offset is none-zero, it must be non-zero.
+
+            64 - 79:    image file format
+                        format of image file. It must be non-zero.
+
+            80 - [HEADER_SIZE - 1]:
+                        It is used to make sure COW bitmap field starts at the
+                        HEADER_SIZE byte, backing file name and image file name
+                        will be stored here.
+
+== COW bitmap ==
+
+The "COW bitmap" field starts at the 4096th byte, stores a bitmap related to
+backing file and image file. The bitmap will track whether the sector in
+backing file is dirty or not.
+
+Each bit in the bitmap indicates one cluster's status. One cluster includes 128
+sectors, then each bit indicates 512 * 128 = 64k bytes. the size of bitmap is
+calculated according to virtual size of backing file. In each byte, bit 0 to 7
+will track the 1st to 7th cluster in sequence, bit orders in one byte look like:
+ +----+----+----+----+----+----+----+----+
+ | b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 |
+ +----+----+----+----+----+----+----+----+
+
+If the bit is 0, indicates the sector has not been allocated in image file, data
+should be loaded from backing file while reading; if the bit is 1, indicates the
+related sector has been dirty, should be loaded from image file while reading.
+Writing to a sector causes the corresponding bit to be set to 1.
+
+If raw image is not an even multiple of cluster bytes, bits that correspond to
+bytes beyond the raw file size in add-cow will be 0.