[V12,1/6] docs: document for add-cow file format

Message ID 1344613185-12308-2-git-send-email-wdongxu@linux.vnet.ibm.com
State New

Commit Message

Robert Wang Aug. 10, 2012, 3:39 p.m. UTC
Document for add-cow format, the usage and spec of add-cow are introduced.

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
---
 docs/specs/add-cow.txt |  123 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 123 insertions(+), 0 deletions(-)
 create mode 100644 docs/specs/add-cow.txt

Comments

Michael Roth Sept. 6, 2012, 5:27 p.m. UTC | #1
On Fri, Aug 10, 2012 at 11:39:40PM +0800, Dong Xu Wang wrote:
> Document for add-cow format, the usage and spec of add-cow are introduced.
> 
> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
> ---
>  docs/specs/add-cow.txt |  123 ++++++++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 123 insertions(+), 0 deletions(-)
>  create mode 100644 docs/specs/add-cow.txt
> 
> diff --git a/docs/specs/add-cow.txt b/docs/specs/add-cow.txt
> new file mode 100644
> index 0000000..d5a7a68
> --- /dev/null
> +++ b/docs/specs/add-cow.txt
> @@ -0,0 +1,123 @@
> +== General ==
> +
> +The raw file format does not support backing files or copy on write feature.
> +The add-cow image format makes it possible to use backing files with raw
> +image by keeping a separate .add-cow metadata file. Once all sectors
> +have been written into the raw image it is safe to discard the .add-cow
> +and backing files, then we can use the raw image directly.
> +
> +An example usage of add-cow would look like::
> +(ubuntu.img is a disk image which has been installed OS.)
> +    1)  Create a raw image with the same size of ubuntu.img
> +            qemu-img create -f raw test.raw 8G
> +    2)  Create an add-cow image which will store dirty bitmap
> +            qemu-img create -f add-cow test.add-cow \
> +                -o backing_file=ubuntu.img,image_file=test.raw
> +    3)  Run qemu with add-cow image
> +            qemu -drive if=virtio,file=test.add-cow
> +
> +test.raw may be larger than ubuntu.img, in that case, the size of test.add-cow
> +will be calculated from the size of test.raw.
> +
> +=Specification=
> +
> +The file format looks like this:
> +
> + +---------------+-------------+-----------------+
> + |     Header    |   Reserved  |    COW bitmap   |
> + +---------------+-------------+-----------------+
> +
> +All numbers in add-cow are stored in Little Endian byte order.
> +
> +== Header ==
> +
> +The Header is included in the first bytes:
> +(#define HEADER_SIZE (4096 * header_pages_size))
> +    Byte    0 -  7:     magic
> +                        add-cow magic string ("ADD_COW\xff").
> +
> +            8 -  11:    version
> +                        Version number (only valid value is 1 now).
> +
> +            12 - 15:    backing file name offset
> +                        Offset in the add-cow file at which the backing file
> +                        name is stored (NB: The string is not nul-terminated).
> +                        If backing file name does NOT exist, this field will be
> +                        0. Must be between 80 and [HEADER_SIZE - 2](a file name
> +                        must be at least 1 byte).
> +
> +            16 - 19:    backing file name size
> +                        Length of the backing file name in bytes. It will be 0
> +                        if the backing file name offset is 0. If backing file
> +                        name offset is non-zero, then it must be non-zero. Must
> +                        be less than [HEADER_SIZE - 80] to fit in the reserved
> +                        part of the header.
> +
> +            20 - 23:    image file name offset
> +                        Offset in the add-cow file at which the image file name
> +                        is stored (NB: The string is not null terminated). It
> +                        must be between 80 and [HEADER_SIZE - 2].
> +
> +            24 - 27:    image file name size
> +                        Length of the image file name in bytes.
> +                        Must be less than [HEADER_SIZE - 80] to fit in the reserved
> +                        part of the header.
> +
> +            28 - 35:    features
> +                        Currently only 1 feature bit is used:
> +                        Feature bits:
> +                            * ADD_COW_F_All_ALLOCATED   = 0x01.
> +
> +            36 - 43:    optional features
> +                        Not used now. Reserved for future use. It must be set to 0.
> +
> +            44 - 47:    header pages size
> +                        The header field is variable-sized. This field indicates
> +                        how many pages(4k) will be used to store add-cow header.
> +                        In add-cow v1, it is fixed to 1, so the header size will
> +                        be 4k * 1 = 4096 bytes.
> +
> +            48 - 63:    backing file format
> +                        format of backing file. It will be filled with 0 if
> +                        backing file name offset is 0. If backing file name
> +                        offset is non-zero, it must be non-zero. It is coded
> +                        in free-form ASCII, and is not NUL-terminated.
> +
> +            64 - 79:    image file format
> +                        format of image file. It must be non-zero. It is coded
> +                        in free-form ASCII, and is not NUL-terminated.
> +
> +            80 - [HEADER_SIZE - 1]:
> +                        It is used to make sure COW bitmap field starts at the
> +                        HEADER_SIZE byte, backing file name and image file name
> +                        will be stored here. The bytes that is not pointing to
> +                        backing file and image file names will bet set to 0.
> +
> +== COW bitmap ==
> +
> +The "COW bitmap" field starts at offset HEADER_SIZE, stores a bitmap related to
> +backing file and image file. The bitmap will track whether the sector in
> +backing file is dirty or not.
> +
> +Each bit in the bitmap indicates one cluster's status. One cluster includes 128
> +sectors, then each bit indicates 512 * 128 = 64k bytes. the size of bitmap is
> +calculated according to virtual size of image file, and it also should be multipe
> +of 65536, the bits not used will be set to 0. Within each byte, the least
> +significant bit covers the first cluster. Bit orders in one byte look like:
> + +----+----+----+----+----+----+----+----+
> + | b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 |
> + +----+----+----+----+----+----+----+----+
> +
> +If the bit is 0, indicates the sector has not been allocated in image file, data
> +should be loaded from backing file while reading; if the bit is 1, indicates the
> +related sector has been dirty, should be loaded from image file while reading.
> +Writing to a sector causes the corresponding bit to be set to 1.
> +
> +If raw image is not an even multiple of cluster bytes, bits that correspond to
> +bytes beyond the raw file size in add-cow will be 0.
> +
> +Image file name and backing file name must NOT be the same, we prevent this
> +while creating add-cow files.
> +
> +Image file and backing file are interpreted relative to the qcow2 file, not

Relative to the add-cow file?

> +to the current working directory of the process that opened the qcow2 file.
> -- 
> 1.7.1
> 
>
Robert Wang Sept. 10, 2012, 1:48 a.m. UTC | #2
On Fri, Sep 7, 2012 at 1:27 AM, Michael Roth <mdroth@linux.vnet.ibm.com> wrote:
> On Fri, Aug 10, 2012 at 11:39:40PM +0800, Dong Xu Wang wrote:
>> Document for add-cow format, the usage and spec of add-cow are introduced.
>>
>> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
>> ---
>>  docs/specs/add-cow.txt |  123 ++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 files changed, 123 insertions(+), 0 deletions(-)
>>  create mode 100644 docs/specs/add-cow.txt
>>
>> diff --git a/docs/specs/add-cow.txt b/docs/specs/add-cow.txt
>> new file mode 100644
>> index 0000000..d5a7a68
>> --- /dev/null
>> +++ b/docs/specs/add-cow.txt
>> @@ -0,0 +1,123 @@
>> +== General ==
>> +
>> +The raw file format does not support backing files or copy on write feature.
>> +The add-cow image format makes it possible to use backing files with raw
>> +image by keeping a separate .add-cow metadata file. Once all sectors
>> +have been written into the raw image it is safe to discard the .add-cow
>> +and backing files, then we can use the raw image directly.
>> +
>> +An example usage of add-cow would look like::
>> +(ubuntu.img is a disk image which has been installed OS.)
>> +    1)  Create a raw image with the same size of ubuntu.img
>> +            qemu-img create -f raw test.raw 8G
>> +    2)  Create an add-cow image which will store dirty bitmap
>> +            qemu-img create -f add-cow test.add-cow \
>> +                -o backing_file=ubuntu.img,image_file=test.raw
>> +    3)  Run qemu with add-cow image
>> +            qemu -drive if=virtio,file=test.add-cow
>> +
>> +test.raw may be larger than ubuntu.img, in that case, the size of test.add-cow
>> +will be calculated from the size of test.raw.
>> +
>> +=Specification=
>> +
>> +The file format looks like this:
>> +
>> + +---------------+-------------+-----------------+
>> + |     Header    |   Reserved  |    COW bitmap   |
>> + +---------------+-------------+-----------------+
>> +
>> +All numbers in add-cow are stored in Little Endian byte order.
>> +
>> +== Header ==
>> +
>> +The Header is included in the first bytes:
>> +(#define HEADER_SIZE (4096 * header_pages_size))
>> +    Byte    0 -  7:     magic
>> +                        add-cow magic string ("ADD_COW\xff").
>> +
>> +            8 -  11:    version
>> +                        Version number (only valid value is 1 now).
>> +
>> +            12 - 15:    backing file name offset
>> +                        Offset in the add-cow file at which the backing file
>> +                        name is stored (NB: The string is not nul-terminated).
>> +                        If backing file name does NOT exist, this field will be
>> +                        0. Must be between 80 and [HEADER_SIZE - 2](a file name
>> +                        must be at least 1 byte).
>> +
>> +            16 - 19:    backing file name size
>> +                        Length of the backing file name in bytes. It will be 0
>> +                        if the backing file name offset is 0. If backing file
>> +                        name offset is non-zero, then it must be non-zero. Must
>> +                        be less than [HEADER_SIZE - 80] to fit in the reserved
>> +                        part of the header.
>> +
>> +            20 - 23:    image file name offset
>> +                        Offset in the add-cow file at which the image file name
>> +                        is stored (NB: The string is not null terminated). It
>> +                        must be between 80 and [HEADER_SIZE - 2].
>> +
>> +            24 - 27:    image file name size
>> +                        Length of the image file name in bytes.
>> +                        Must be less than [HEADER_SIZE - 80] to fit in the reserved
>> +                        part of the header.
>> +
>> +            28 - 35:    features
>> +                        Currently only 1 feature bit is used:
>> +                        Feature bits:
>> +                            * ADD_COW_F_All_ALLOCATED   = 0x01.
>> +
>> +            36 - 43:    optional features
>> +                        Not used now. Reserved for future use. It must be set to 0.
>> +
>> +            44 - 47:    header pages size
>> +                        The header field is variable-sized. This field indicates
>> +                        how many pages(4k) will be used to store add-cow header.
>> +                        In add-cow v1, it is fixed to 1, so the header size will
>> +                        be 4k * 1 = 4096 bytes.
>> +
>> +            48 - 63:    backing file format
>> +                        format of backing file. It will be filled with 0 if
>> +                        backing file name offset is 0. If backing file name
>> +                        offset is non-zero, it must be non-zero. It is coded
>> +                        in free-form ASCII, and is not NUL-terminated.
>> +
>> +            64 - 79:    image file format
>> +                        format of image file. It must be non-zero. It is coded
>> +                        in free-form ASCII, and is not NUL-terminated.
>> +
>> +            80 - [HEADER_SIZE - 1]:
>> +                        It is used to make sure COW bitmap field starts at the
>> +                        HEADER_SIZE byte, backing file name and image file name
>> +                        will be stored here. The bytes that is not pointing to
>> +                        backing file and image file names will bet set to 0.
>> +
>> +== COW bitmap ==
>> +
>> +The "COW bitmap" field starts at offset HEADER_SIZE, stores a bitmap related to
>> +backing file and image file. The bitmap will track whether the sector in
>> +backing file is dirty or not.
>> +
>> +Each bit in the bitmap indicates one cluster's status. One cluster includes 128
>> +sectors, then each bit indicates 512 * 128 = 64k bytes. the size of bitmap is
>> +calculated according to virtual size of image file, and it also should be multipe
>> +of 65536, the bits not used will be set to 0. Within each byte, the least
>> +significant bit covers the first cluster. Bit orders in one byte look like:
>> + +----+----+----+----+----+----+----+----+
>> + | b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 |
>> + +----+----+----+----+----+----+----+----+
>> +
>> +If the bit is 0, indicates the sector has not been allocated in image file, data
>> +should be loaded from backing file while reading; if the bit is 1, indicates the
>> +related sector has been dirty, should be loaded from image file while reading.
>> +Writing to a sector causes the corresponding bit to be set to 1.
>> +
>> +If raw image is not an even multiple of cluster bytes, bits that correspond to
>> +bytes beyond the raw file size in add-cow will be 0.
>> +
>> +Image file name and backing file name must NOT be the same, we prevent this
>> +while creating add-cow files.
>> +
>> +Image file and backing file are interpreted relative to the qcow2 file, not
>
> Relative to the add-cow file?
Ah, yes.
>
>> +to the current working directory of the process that opened the qcow2 file.

>> --
>> 1.7.1
>>
>>
>
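
To illustrate the rule confirmed above (file names are resolved relative to the add-cow file, not the current working directory), here is a minimal Python sketch. The function name is hypothetical, and passing absolute names through unchanged is an assumption; the spec text does not cover that case.

```python
import posixpath


def resolve_against_add_cow(add_cow_path, stored_name):
    """Resolve a name stored in the header against the add-cow file's directory.

    Assumption: an absolute stored name is used as-is.
    """
    if posixpath.isabs(stored_name):
        return stored_name
    # Directory that contains the add-cow file, not the process's cwd.
    base_dir = posixpath.dirname(add_cow_path)
    return posixpath.normpath(posixpath.join(base_dir, stored_name))
```

With this sketch, opening /vm/images/test.add-cow with image_file=test.raw would look for /vm/images/test.raw regardless of where qemu was started.
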
Kevin Wolf Sept. 10, 2012, 3:23 p.m. UTC | #3
On 10.08.2012 17:39, Dong Xu Wang wrote:
> Document for add-cow format, the usage and spec of add-cow are introduced.
> 
> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
> ---
>  docs/specs/add-cow.txt |  123 ++++++++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 123 insertions(+), 0 deletions(-)
>  create mode 100644 docs/specs/add-cow.txt
> 
> diff --git a/docs/specs/add-cow.txt b/docs/specs/add-cow.txt
> new file mode 100644
> index 0000000..d5a7a68
> --- /dev/null
> +++ b/docs/specs/add-cow.txt
> @@ -0,0 +1,123 @@
> +== General ==
> +
> +The raw file format does not support backing files or copy on write feature.
> +The add-cow image format makes it possible to use backing files with raw
> +image by keeping a separate .add-cow metadata file. Once all sectors
> +have been written into the raw image it is safe to discard the .add-cow
> +and backing files, then we can use the raw image directly.
> +
> +An example usage of add-cow would look like::
> +(ubuntu.img is a disk image which has been installed OS.)
> +    1)  Create a raw image with the same size of ubuntu.img
> +            qemu-img create -f raw test.raw 8G
> +    2)  Create an add-cow image which will store dirty bitmap
> +            qemu-img create -f add-cow test.add-cow \
> +                -o backing_file=ubuntu.img,image_file=test.raw
> +    3)  Run qemu with add-cow image
> +            qemu -drive if=virtio,file=test.add-cow
> +
> +test.raw may be larger than ubuntu.img, in that case, the size of test.add-cow
> +will be calculated from the size of test.raw.
> +
> +=Specification=
> +
> +The file format looks like this:
> +
> + +---------------+-------------+-----------------+
> + |     Header    |   Reserved  |    COW bitmap   |
> + +---------------+-------------+-----------------+
> +
> +All numbers in add-cow are stored in Little Endian byte order.
> +
> +== Header ==
> +
> +The Header is included in the first bytes:
> +(#define HEADER_SIZE (4096 * header_pages_size))
> +    Byte    0 -  7:     magic
> +                        add-cow magic string ("ADD_COW\xff").
> +
> +            8 -  11:    version
> +                        Version number (only valid value is 1 now).
> +
> +            12 - 15:    backing file name offset
> +                        Offset in the add-cow file at which the backing file
> +                        name is stored (NB: The string is not nul-terminated).
> +                        If backing file name does NOT exist, this field will be
> +                        0. Must be between 80 and [HEADER_SIZE - 2](a file name
> +                        must be at least 1 byte).
> +
> +            16 - 19:    backing file name size
> +                        Length of the backing file name in bytes. It will be 0
> +                        if the backing file name offset is 0. If backing file
> +                        name offset is non-zero, then it must be non-zero. Must
> +                        be less than [HEADER_SIZE - 80] to fit in the reserved
> +                        part of the header.
> +
> +            20 - 23:    image file name offset
> +                        Offset in the add-cow file at which the image file name
> +                        is stored (NB: The string is not null terminated). It
> +                        must be between 80 and [HEADER_SIZE - 2].
> +
> +            24 - 27:    image file name size
> +                        Length of the image file name in bytes.
> +                        Must be less than [HEADER_SIZE - 80] to fit in the reserved
> +                        part of the header.
> +
> +            28 - 35:    features
> +                        Currently only 1 feature bit is used:

What happens when opening a file with an unknown bit set? How must
unknown bits be initialised?

> +                        Feature bits:
> +                            * ADD_COW_F_All_ALLOCATED   = 0x01.

What does this flag mean, and is it required to be set on that
condition? Also, please use ALL_CAPS.

> +
> +            36 - 43:    optional features
> +                        Not used now. Reserved for future use. It must be set to 0.

And must be ignored when reading.

> +
> +            44 - 47:    header pages size
> +                        The header field is variable-sized. This field indicates
> +                        how many pages(4k) will be used to store add-cow header.
> +                        In add-cow v1, it is fixed to 1, so the header size will
> +                        be 4k * 1 = 4096 bytes.

Why arbitrarily defined "pages" instead of bytes or at least clusters?

> +
> +            48 - 63:    backing file format
> +                        format of backing file. It will be filled with 0 if
> +                        backing file name offset is 0. If backing file name
> +                        offset is non-zero, it must be non-zero. It is coded
> +                        in free-form ASCII, and is not NUL-terminated.

Zero padded on the right, I guess?

Also defining that a string must be "non-zero" looks odd, should
probably be "non-empty".

> +
> +            64 - 79:    image file format
> +                        format of image file. It must be non-zero. It is coded
> +                        in free-form ASCII, and is not NUL-terminated.

Same here.

> +
> +            80 - [HEADER_SIZE - 1]:
> +                        It is used to make sure COW bitmap field starts at the
> +                        HEADER_SIZE byte, backing file name and image file name
> +                        will be stored here. The bytes that is not pointing to
> +                        backing file and image file names will bet set to 0.

"will be set to 0" describes the behaviour of qemu. A spec should
describe the file format, not a specific implementation. Make it "must"
or "should".

> +
> +== COW bitmap ==
> +
> +The "COW bitmap" field starts at offset HEADER_SIZE, stores a bitmap related to
> +backing file and image file. The bitmap will track whether the sector in
> +backing file is dirty or not.
> +
> +Each bit in the bitmap indicates one cluster's status. One cluster includes 128
> +sectors, then each bit indicates 512 * 128 = 64k bytes.

Should we make the cluster size configurable?

> the size of bitmap is
> +calculated according to virtual size of image file, and it also should be multipe

Typo: multiple

Sure you mean "should", or should it be "must"?

> +of 65536, the bits not used will be set to 0. Within each byte, the least
> +significant bit covers the first cluster. Bit orders in one byte look like:
> + +----+----+----+----+----+----+----+----+
> + | b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 |
> + +----+----+----+----+----+----+----+----+
> +
> +If the bit is 0, indicates the sector has not been allocated in image file, data
> +should be loaded from backing file while reading; if the bit is 1, indicates the
> +related sector has been dirty, should be loaded from image file while reading.
> +Writing to a sector causes the corresponding bit to be set to 1.
> +
> +If raw image is not an even multiple of cluster bytes, bits that correspond to
> +bytes beyond the raw file size in add-cow will be 0.

"must be written as 0 and must be ignored when reading" or something
like that.

> +Image file name and backing file name must NOT be the same, we prevent this
> +while creating add-cow files.

What we do is irrelevant for a spec.

> +Image file and backing file are interpreted relative to the qcow2 file, not
> +to the current working directory of the process that opened the qcow2 file.

Kevin
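
For readers following the bitmap discussion, the layout in the quoted patch (one bit per 64k cluster, least significant bit of each byte covering the first cluster) can be sketched in Python as below. This is only an illustration of the text as posted, with a fixed cluster size; all function names are hypothetical and none of this is taken from the qemu implementation.

```python
SECTOR_SIZE = 512
SECTORS_PER_CLUSTER = 128
CLUSTER_SIZE = SECTOR_SIZE * SECTORS_PER_CLUSTER  # 65536 bytes per bit


def cluster_is_allocated(bitmap, byte_offset):
    """Return True if the cluster holding byte_offset must be read from the image file."""
    cluster = byte_offset // CLUSTER_SIZE
    # Bit 0 (least significant) of each byte covers the first of its 8 clusters.
    return bool((bitmap[cluster // 8] >> (cluster % 8)) & 1)


def mark_cluster_allocated(bitmap, byte_offset):
    """Set the bit for the cluster holding byte_offset (done on write)."""
    cluster = byte_offset // CLUSTER_SIZE
    bitmap[cluster // 8] |= 1 << (cluster % 8)


def clusters_needed(image_size):
    """Number of bitmap bits needed to cover image_size bytes (rounded up)."""
    return -(-image_size // CLUSTER_SIZE)
```

Bits past clusters_needed(image_size) are the ones the review asks to be written as 0 and ignored when reading.
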
Robert Wang Sept. 11, 2012, 2:12 a.m. UTC | #4
On Mon, Sep 10, 2012 at 11:23 PM, Kevin Wolf <kwolf@redhat.com> wrote:
> On 10.08.2012 17:39, Dong Xu Wang wrote:
>> Document for add-cow format, the usage and spec of add-cow are introduced.
>>
>> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
>> ---
>>  docs/specs/add-cow.txt |  123 ++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 files changed, 123 insertions(+), 0 deletions(-)
>>  create mode 100644 docs/specs/add-cow.txt
>>
>> diff --git a/docs/specs/add-cow.txt b/docs/specs/add-cow.txt
>> new file mode 100644
>> index 0000000..d5a7a68
>> --- /dev/null
>> +++ b/docs/specs/add-cow.txt
>> @@ -0,0 +1,123 @@
>> +== General ==
>> +
>> +The raw file format does not support backing files or copy on write feature.
>> +The add-cow image format makes it possible to use backing files with raw
>> +image by keeping a separate .add-cow metadata file. Once all sectors
>> +have been written into the raw image it is safe to discard the .add-cow
>> +and backing files, then we can use the raw image directly.
>> +
>> +An example usage of add-cow would look like::
>> +(ubuntu.img is a disk image which has been installed OS.)
>> +    1)  Create a raw image with the same size of ubuntu.img
>> +            qemu-img create -f raw test.raw 8G
>> +    2)  Create an add-cow image which will store dirty bitmap
>> +            qemu-img create -f add-cow test.add-cow \
>> +                -o backing_file=ubuntu.img,image_file=test.raw
>> +    3)  Run qemu with add-cow image
>> +            qemu -drive if=virtio,file=test.add-cow
>> +
>> +test.raw may be larger than ubuntu.img, in that case, the size of test.add-cow
>> +will be calculated from the size of test.raw.
>> +
>> +=Specification=
>> +
>> +The file format looks like this:
>> +
>> + +---------------+-------------+-----------------+
>> + |     Header    |   Reserved  |    COW bitmap   |
>> + +---------------+-------------+-----------------+
>> +
>> +All numbers in add-cow are stored in Little Endian byte order.
>> +
>> +== Header ==
>> +
>> +The Header is included in the first bytes:
>> +(#define HEADER_SIZE (4096 * header_pages_size))
>> +    Byte    0 -  7:     magic
>> +                        add-cow magic string ("ADD_COW\xff").
>> +
>> +            8 -  11:    version
>> +                        Version number (only valid value is 1 now).
>> +
>> +            12 - 15:    backing file name offset
>> +                        Offset in the add-cow file at which the backing file
>> +                        name is stored (NB: The string is not nul-terminated).
>> +                        If backing file name does NOT exist, this field will be
>> +                        0. Must be between 80 and [HEADER_SIZE - 2](a file name
>> +                        must be at least 1 byte).
>> +
>> +            16 - 19:    backing file name size
>> +                        Length of the backing file name in bytes. It will be 0
>> +                        if the backing file name offset is 0. If backing file
>> +                        name offset is non-zero, then it must be non-zero. Must
>> +                        be less than [HEADER_SIZE - 80] to fit in the reserved
>> +                        part of the header.
>> +
>> +            20 - 23:    image file name offset
>> +                        Offset in the add-cow file at which the image file name
>> +                        is stored (NB: The string is not null terminated). It
>> +                        must be between 80 and [HEADER_SIZE - 2].
>> +
>> +            24 - 27:    image file name size
>> +                        Length of the image file name in bytes.
>> +                        Must be less than [HEADER_SIZE - 80] to fit in the reserved
>> +                        part of the header.
>> +
>> +            28 - 35:    features
>> +                        Currently only 1 feature bit is used:
>
> What happens when opening a file with an unknown bit set? How must
> unknown bits be initialised?

Okay, I will handle this the same way qcow2 does and report an error via
report_unsupported_feature. I will also update the spec file.

>
>> +                        Feature bits:
>> +                            * ADD_COW_F_All_ALLOCATED   = 0x01.
>
> What does this flag mean, and is it required to be set on that
> condition? Also, please use ALL_CAPS.

This feature bit will be used as follows:
qemu-img create -f add-cow -o image_file=t.raw t.add-cow.

When an add-cow file is created without a backing_file, this feature bit
lets us avoid reading and updating the bitmap. I think this will make the
code faster.

Also, maybe I can implement add_cow_check to check whether the feature
bit should be set.
What do you think, Kevin?

>
>> +
>> +            36 - 43:    optional features
>> +                        Not used now. Reserved for future use. It must be set to 0.
>
> And must be ignored when reading.
>
Okay.

>> +
>> +            44 - 47:    header pages size
>> +                        The header field is variable-sized. This field indicates
>> +                        how many pages(4k) will be used to store add-cow header.
>> +                        In add-cow v1, it is fixed to 1, so the header size will
>> +                        be 4k * 1 = 4096 bytes.
>
> Why arbitrarily defined "pages" instead of bytes or at least clusters?

Okay, in the next version I will just calculate it in bytes.
>
>> +
>> +            48 - 63:    backing file format
>> +                        format of backing file. It will be filled with 0 if
>> +                        backing file name offset is 0. If backing file name
>> +                        offset is non-zero, it must be non-zero. It is coded
>> +                        in free-form ASCII, and is not NUL-terminated.
>
> Zero padded on the right, I guess?

Yes, will update.

>
> Also defining that a string must be "non-zero" looks odd, should
> probably be "non-empty".
>
Okay.

>> +
>> +            64 - 79:    image file format
>> +                        format of image file. It must be non-zero. It is coded
>> +                        in free-form ASCII, and is not NUL-terminated.
>
> Same here.
Okay.
>
>> +
>> +            80 - [HEADER_SIZE - 1]:
>> +                        It is used to make sure COW bitmap field starts at the
>> +                        HEADER_SIZE byte, backing file name and image file name
>> +                        will be stored here. The bytes that is not pointing to
>> +                        backing file and image file names will bet set to 0.
>
> "will be set to 0" describes the behaviour of qemu. A spec should
> describe the file format, not a specific implementation. Make it "must"
> or "should".
Okay.
>
>> +
>> +== COW bitmap ==
>> +
>> +The "COW bitmap" field starts at offset HEADER_SIZE, stores a bitmap related to
>> +backing file and image file. The bitmap will track whether the sector in
>> +backing file is dirty or not.
>> +
>> +Each bit in the bitmap indicates one cluster's status. One cluster includes 128
>> +sectors, then each bit indicates 512 * 128 = 64k bytes.
>
> Should we make the cluster size configurable?
>
>> the size of bitmap is
>> +calculated according to virtual size of image file, and it also should be multipe
>
> Typo: multiple
>
> Sure you mean "should", or should it be "must"?
Okay.

>
>> +of 65536, the bits not used will be set to 0. Within each byte, the least
>> +significant bit covers the first cluster. Bit orders in one byte look like:
>> + +----+----+----+----+----+----+----+----+
>> + | b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 |
>> + +----+----+----+----+----+----+----+----+
>> +
>> +If the bit is 0, indicates the sector has not been allocated in image file, data
>> +should be loaded from backing file while reading; if the bit is 1, indicates the
>> +related sector has been dirty, should be loaded from image file while reading.
>> +Writing to a sector causes the corresponding bit to be set to 1.
>> +
>> +If raw image is not an even multiple of cluster bytes, bits that correspond to
>> +bytes beyond the raw file size in add-cow will be 0.
>
> "must be written as 0 and must be ignored when reading" or something
> like that.

Okay.
>
>> +Image file name and backing file name must NOT be the same, we prevent this
>> +while creating add-cow files.
>
> What we do is irrelevant for a spec.

Okay.

>
>> +Image file and backing file are interpreted relative to the qcow2 file, not
>> +to the current working directory of the process that opened the qcow2 file.
>
> Kevin
>

Thank you, Kevin.
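
As a summary of the header layout in this version of the patch, together with the review points agreed above (reject unknown feature bits, ignore optional features when reading, format strings zero-padded on the right), here is a minimal Python sketch of a header reader. It is an assumption-laden illustration, not the qemu code: the function and constant names are hypothetical, the flag is spelled in ALL_CAPS as requested in the review, and file names are returned as raw bytes because the spec does not state an encoding.

```python
import struct

# Little-endian, no padding: bytes 0-79 as listed in the Header section.
HEADER_FORMAT = "<8sIIIIIQQI16s16s"
HEADER_FIXED_SIZE = struct.calcsize(HEADER_FORMAT)  # 80 bytes
MAGIC = b"ADD_COW\xff"
ADD_COW_F_ALL_ALLOCATED = 0x01
KNOWN_FEATURES = ADD_COW_F_ALL_ALLOCATED


def parse_header(buf):
    """Parse the fixed add-cow header fields from buf (bytes of the file start)."""
    (magic, version, backing_off, backing_len, image_off, image_len,
     features, _optional_features, header_pages,
     backing_fmt, image_fmt) = struct.unpack_from(HEADER_FORMAT, buf, 0)

    if magic != MAGIC:
        raise ValueError("not an add-cow file")
    if version != 1:
        raise ValueError("unsupported add-cow version %d" % version)

    # Unknown (non-optional) feature bits make the file unreadable.
    unknown = features & ~KNOWN_FEATURES
    if unknown:
        raise ValueError("unsupported feature bits: 0x%x" % unknown)
    # Optional feature bits are deliberately ignored when reading.

    backing_file = None
    if backing_off != 0:
        backing_file = buf[backing_off:backing_off + backing_len]

    return {
        "header_size": 4096 * header_pages,
        "all_allocated": bool(features & ADD_COW_F_ALL_ALLOCATED),
        "backing_file": backing_file,
        "backing_format": backing_fmt.rstrip(b"\0").decode("ascii"),
        "image_file": buf[image_off:image_off + image_len],
        "image_format": image_fmt.rstrip(b"\0").decode("ascii"),
    }
```

The sketch skips the range checks on the name offsets and sizes (between 80 and HEADER_SIZE - 2, and so on); a real reader would need to validate those before slicing.
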

Patch

diff --git a/docs/specs/add-cow.txt b/docs/specs/add-cow.txt
new file mode 100644
index 0000000..d5a7a68
--- /dev/null
+++ b/docs/specs/add-cow.txt
@@ -0,0 +1,123 @@ 
+== General ==
+
+The raw file format does not support backing files or copy on write feature.
+The add-cow image format makes it possible to use backing files with raw
+image by keeping a separate .add-cow metadata file. Once all sectors
+have been written into the raw image it is safe to discard the .add-cow
+and backing files, then we can use the raw image directly.
+
+An example usage of add-cow would look like::
+(ubuntu.img is a disk image which has been installed OS.)
+    1)  Create a raw image with the same size of ubuntu.img
+            qemu-img create -f raw test.raw 8G
+    2)  Create an add-cow image which will store dirty bitmap
+            qemu-img create -f add-cow test.add-cow \
+                -o backing_file=ubuntu.img,image_file=test.raw
+    3)  Run qemu with add-cow image
+            qemu -drive if=virtio,file=test.add-cow
+
+test.raw may be larger than ubuntu.img, in that case, the size of test.add-cow
+will be calculated from the size of test.raw.
+
+=Specification=
+
+The file format looks like this:
+
+ +---------------+-------------+-----------------+
+ |     Header    |   Reserved  |    COW bitmap   |
+ +---------------+-------------+-----------------+
+
+All numbers in add-cow are stored in Little Endian byte order.
+
+== Header ==
+
+The Header is included in the first bytes:
+(#define HEADER_SIZE (4096 * header_pages_size))
+    Byte    0 -  7:     magic
+                        add-cow magic string ("ADD_COW\xff").
+
+            8 -  11:    version
+                        Version number (only valid value is 1 now).
+
+            12 - 15:    backing file name offset
+                        Offset in the add-cow file at which the backing file
+                        name is stored (NB: The string is not NUL-terminated).
+                        If there is no backing file, this field is 0.
+                        Otherwise it must be between 80 and [HEADER_SIZE - 2]
+                        (a file name must be at least 1 byte).
+
+            16 - 19:    backing file name size
+                        Length of the backing file name in bytes. It will be 0
+                        if the backing file name offset is 0. If backing file
+                        name offset is non-zero, then it must be non-zero. Must
+                        be less than [HEADER_SIZE - 80] to fit in the reserved
+                        part of the header.
+
+            20 - 23:    image file name offset
+                        Offset in the add-cow file at which the image file name
+                        is stored (NB: The string is not NUL-terminated). It
+                        must be between 80 and [HEADER_SIZE - 2].
+
+            24 - 27:    image file name size
+                        Length of the image file name in bytes.
+                        Must be less than [HEADER_SIZE - 80] to fit in the reserved
+                        part of the header.
+
+            28 - 35:    features
+                        Currently only 1 feature bit is used:
+                        Feature bits:
+                            * ADD_COW_F_All_ALLOCATED   = 0x01.
+
+            36 - 43:    optional features
+                        Not used now. Reserved for future use. It must be set to 0.
+
+            44 - 47:    header pages size
+                        The header is variable-sized. This field indicates
+                        how many pages (4k) are used to store the add-cow header.
+                        In add-cow v1, it is fixed to 1, so the header size is
+                        4k * 1 = 4096 bytes.
+
+            48 - 63:    backing file format
+                        format of backing file. It will be filled with 0 if
+                        backing file name offset is 0. If backing file name
+                        offset is non-zero, it must be non-zero. It is coded
+                        in free-form ASCII, and is not NUL-terminated.
+
+            64 - 79:    image file format
+                        format of image file. It must be non-zero. It is coded
+                        in free-form ASCII, and is not NUL-terminated.
+
+            80 - [HEADER_SIZE - 1]:
+                        Padding that makes the COW bitmap field start at offset
+                        HEADER_SIZE. The backing file name and image file name
+                        are stored here. Bytes not occupied by the backing
+                        file and image file names will be set to 0.
+
+== COW bitmap ==
+
+The "COW bitmap" field starts at offset HEADER_SIZE and stores a bitmap related
+to the backing file and image file. The bitmap tracks whether each sector in
+the backing file is dirty or not.
+
+Each bit in the bitmap indicates one cluster's status. One cluster includes 128
+sectors, so each bit covers 512 * 128 = 64k bytes. The size of the bitmap is
+calculated from the virtual size of the image file, and it should also be a
+multiple of 65536; unused bits will be set to 0. Within each byte, the least
+significant bit covers the first cluster. Bit order within one byte looks like:
+ +----+----+----+----+----+----+----+----+
+ | b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 |
+ +----+----+----+----+----+----+----+----+
+
+If the bit is 0, the sector has not been allocated in the image file and data
+should be loaded from the backing file when reading; if the bit is 1, the
+related sector is dirty and should be loaded from the image file when reading.
+Writing to a sector causes the corresponding bit to be set to 1.
+
+If the raw image size is not an even multiple of the cluster size, bits that
+correspond to bytes beyond the raw file size in add-cow will be 0.
+
+Image file name and backing file name must NOT be the same; we prevent this
+when creating add-cow files.
+
+Image file and backing file names are interpreted relative to the add-cow
+file, not to the current working directory of the process that opened it.
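
For readers who want to check the layout above, here is a minimal Python sketch of how the 80-byte fixed header and the COW bitmap could be decoded. It is not part of the patch and not QEMU code. The helper names (parse_header, cluster_is_dirty), the UTF-8 decoding of file names, and the assumption that the 16-byte format fields are zero-padded are all illustrative guesses; only the offsets, sizes, byte order, magic and bit order come from the spec text.

```python
import struct

# Fixed part of the header (bytes 0-79), little-endian, following the field
# table in the spec: magic, version, backing name offset/size, image name
# offset/size, features, optional features, header pages, two format fields.
HEADER_FMT = "<8sIIIIIQQI16s16s"
MAGIC = b"ADD_COW\xff"
PAGE_SIZE = 4096
CLUSTER_SIZE = 512 * 128  # 128 sectors of 512 bytes = 64k per bitmap bit


def parse_header(buf):
    """Decode the header fields (hypothetical helper, not a QEMU function)."""
    (magic, version, back_off, back_len, img_off, img_len,
     features, opt_features, header_pages,
     back_fmt, img_fmt) = struct.unpack_from(HEADER_FMT, buf, 0)
    if magic != MAGIC:
        raise ValueError("bad add-cow magic")
    if version != 1:
        raise ValueError("unsupported add-cow version")
    return {
        "header_size": PAGE_SIZE * header_pages,
        "features": features,
        # Names are not NUL-terminated; offset and size delimit them.
        # Decoding as UTF-8 is an assumption, the spec names no encoding.
        "backing_file": (buf[back_off:back_off + back_len].decode()
                         if back_off else None),
        "image_file": buf[img_off:img_off + img_len].decode(),
        # Assumption: unused bytes of the 16-byte format fields are zero.
        "backing_format": back_fmt.rstrip(b"\0").decode("ascii"),
        "image_format": img_fmt.rstrip(b"\0").decode("ascii"),
    }


def cluster_is_dirty(bitmap, guest_offset):
    """Return True if the cluster holding guest_offset has its bit set.

    bitmap is the bytes of the COW bitmap, i.e. the file contents starting
    at offset HEADER_SIZE. Within each byte the least significant bit
    covers the first cluster.
    """
    cluster = guest_offset // CLUSTER_SIZE
    return bool((bitmap[cluster // 8] >> (cluster % 8)) & 1)
```

A reader would call parse_header on the first header_size bytes of the .add-cow file, then pass the remainder of the file to cluster_is_dirty to decide whether a read goes to the image file (bit 1) or the backing file (bit 0).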