Patchwork [V17,1/6] docs: document for add-cow file format

Submitter Robert Wang
Date Dec. 6, 2012, 6:51 a.m.
Message ID <1354776711-12449-2-git-send-email-wdongxu@linux.vnet.ibm.com>
Permalink /patch/204154/
State New

Comments

Robert Wang - Dec. 6, 2012, 6:51 a.m.
Document for add-cow format, the usage and spec of add-cow are introduced.

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
---
 docs/specs/add-cow.txt |  154 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 154 insertions(+), 0 deletions(-)
 create mode 100644 docs/specs/add-cow.txt
Kevin Wolf - Dec. 10, 2012, 3:39 p.m.
Am 06.12.2012 07:51, schrieb Dong Xu Wang:
> Document for add-cow format, the usage and spec of add-cow are introduced.
> 
> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
> ---
>  docs/specs/add-cow.txt |  154 ++++++++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 154 insertions(+), 0 deletions(-)
>  create mode 100644 docs/specs/add-cow.txt
> 
> diff --git a/docs/specs/add-cow.txt b/docs/specs/add-cow.txt
> new file mode 100644
> index 0000000..24e9a11
> --- /dev/null
> +++ b/docs/specs/add-cow.txt
> @@ -0,0 +1,154 @@
> +== General ==
> +
> +The raw file format does not support backing files or copy on write feature.
> +The add-cow image format makes it possible to use backing files with a raw
> +image by keeping a separate .add-cow metadata file. Once all sectors
> +have been written into the raw image it is safe to discard the .add-cow
> +and backing files, then we can use the raw image directly.
> +
> +An example usage of add-cow would look like::

Double colon.

> +(ubuntu.img is a disk image which has an installed OS.)
> +    1)  Create a raw image with the same size of ubuntu.img
> +            qemu-img create -f raw test.raw 8G
> +    2)  Create an add-cow image which will store dirty bitmap
> +            qemu-img create -f add-cow test.add-cow \
> +                -o backing_file=ubuntu.img,image_file=test.raw
> +    3)  Run qemu with add-cow image
> +            qemu -drive if=virtio,file=test.add-cow
> +
> +test.raw may be larger than ubuntu.img, in that case, the size of test.add-cow
> +will be calculated from the size of test.raw.
> +
> +image_fmt can be omitted, in that case image_fmt should be set as "raw".

By "should be set as" you mean "is assumed to be"?

> +backing_fmt can also be omitted, add-cow should do a probe operation and determine

This line takes more than 80 characters. More follow, I won't comment on
each.

> +what the backing file's format is.
> +
> +=Specification=
> +
> +The file format looks like this:
> +
> + +---------------+-------------------------------+
> + |     Header    |           COW bitmap          |
> + +---------------+-------------------------------+
> +
> +All numbers in add-cow are stored in Little Endian byte order.
> +
> +== Header ==
> +
> +The Header is included in the first bytes:
> +(HEADER_SIZE is defined in 44-47 bytes.)
> +    Byte    0  -  3:    magic
> +                        add-cow magic string ("ACOW").
> +
> +            4  -  7:    version
> +                        Version number (only valid value is 1 now).
> +
> +            8  - 11:    backing file name offset
> +                        Offset in the add-cow file at which the backing file
> +                        name is stored (NB: The string is not NUL-terminated).
> +                        If backing file name does NOT exist, this field will be
> +                        0. Must be between 80 and [HEADER_SIZE - 2](a file name
> +                        must be at least 1 byte).
> +
> +            12 - 15:    backing file name size
> +                        Length of the backing file name in bytes. It will be 0
> +                        if the backing file name offset is 0. If backing file
> +                        name offset is non-zero, then it must be non-zero. Must
> +                        be less than [HEADER_SIZE - 80] to fit in the reserved
> +                        part of the header. Backing file name offset + size
> +                        must be no more than HEADER_SIZE.
> +
> +            16 - 19:    image file name offset
> +                        Offset in the add-cow file at which the image file name
> +                        is stored (NB: The string is not NUL-terminated). It
> +                        must be between 80 and [HEADER_SIZE - 2]. Image file
> +                        name size + offset must be no more than HEADER_SIZE.
> +
> +            20 - 23:    image file name size
> +                        Length of the image file name in bytes.
> +                        Must be less than [HEADER_SIZE - 80] to fit in the reserved
> +                        part of the header.
> +
> +            24 - 27:    cluster bits
> +                        Number of bits that are used for addressing an offset
> +                        within a cluster (1 << cluster_bits is the cluster size).
> +                        Must not be less than 9 (i.e. 512 byte clusters).
> +
> +                        Note: qemu as of today has an implementation limit of 2 MB
> +                        as the maximum cluster size and won't be able to open images
> +                        with larger cluster sizes.
> +
> +            28 - 35:    features
> +                        Bitmask of features. If a feature bit is set but not recognized,
> +                        the add-cow file should be dropped. They are not used in v1.

Does v1 mean header.version = 1? I think this is wrong, we will want to
add incompatible feature flags without increasing header.version (that's
the whole point of them)

> +
> +                        Bits 0-63:  Reserved (set to 0)
> +
> +            36 - 43:    compatible features
> +                        Bitmask of compatible features. An implementation can
> +                        safely ignore any unknown bits that are set.
> +                        Bit 0:      All allocated bit.  If this bit is set then
> +                                    backing file and COW bitmap will not be used,
> +                                    and can read from or write to image file directly.
> +
> +                        Bits 1-63:  Reserved (set to 0)
> +
> +            44 - 47:    HEADER_SIZE
> +                        The header field is variable-sized. This field indicates
> +                        how many bytes will be used to store add-cow header.
> +                        In add-cow v1, it is fixed to 4096.

Same question about v1. If it's fixed, why have a field for it?

> +
> +            48 - 63:    backing file format
> +                        Format of backing file. It will be filled with 0 if
> +                        backing file name offset is 0. If backing file name
> +                        offset is non-empty, it must be non-empty. It is coded
> +                        in free-form ASCII, and is not NUL-terminated. Zero
> +                        padded on the right.
> +
> +            64 - 79:    image file format
> +                        Format of image file. It must be non-empty. It is coded
> +                        in free-form ASCII, and is not NUL-terminated. Zero
> +                        padded on the right.
> +
> +            80 - [HEADER_SIZE - 1]:
> +                        It is used to make sure COW bitmap field starts at the
> +                        HEADER_SIZE byte, backing file name and image file name
> +                        will be stored here. The bytes that are not pointing to
> +                        backing file and image file names must be set to 0.
> +
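
To check that the layout above is unambiguous, here is a minimal sketch of a
reader for the fixed 80-byte part of the header. It is Python for illustration
only, not QEMU code; the names parse_header, AddCowHeader and FIXED are made
up. The checks mirror the text above under one assumption: a name is accepted
when offset >= 80, size >= 1 and offset + size <= HEADER_SIZE. File names are
kept as bytes because the spec does not state an encoding.

```python
import struct
from typing import NamedTuple, Optional

# Bytes 0-79 of the header, little endian, in the order listed in the spec:
# magic, version, 2 x (name offset, name size), cluster bits,
# features, compatible features, HEADER_SIZE, two 16-byte format strings.
FIXED = struct.Struct("<4s6I2QI16s16s")


class AddCowHeader(NamedTuple):
    version: int
    backing_file: Optional[bytes]  # None when there is no backing file
    image_file: bytes
    cluster_bits: int
    features: int
    compat_features: int
    header_size: int
    backing_format: str
    image_format: str


def _name(buf: bytes, off: int, size: int, header_size: int) -> bytes:
    # Names are stored between byte 80 and HEADER_SIZE, not NUL-terminated.
    if off < FIXED.size or size < 1 or off + size > header_size:
        raise ValueError("file name does not fit in the header")
    return buf[off:off + size]


def parse_header(buf: bytes) -> AddCowHeader:
    if len(buf) < FIXED.size:
        raise ValueError("buffer shorter than the fixed header")
    (magic, version, b_off, b_size, i_off, i_size, cluster_bits,
     features, compat, header_size, b_fmt, i_fmt) = FIXED.unpack_from(buf)
    if magic != b"ACOW":
        raise ValueError("bad magic")
    if version != 1:
        raise ValueError("unsupported version")
    if cluster_bits < 9:
        raise ValueError("cluster_bits must be at least 9")
    if features != 0:
        # All 64 bits are reserved, so any set bit is unrecognized.
        raise ValueError("unknown incompatible feature bit")
    if header_size < FIXED.size or len(buf) < header_size:
        raise ValueError("truncated header")
    if b_off == 0:
        if b_size != 0:
            raise ValueError("backing file name size without offset")
        backing = None
    else:
        backing = _name(buf, b_off, b_size, header_size)
    image = _name(buf, i_off, i_size, header_size)
    return AddCowHeader(version, backing, image, cluster_bits, features,
                        compat, header_size,
                        b_fmt.rstrip(b"\0").decode("ascii"),
                        i_fmt.rstrip(b"\0").decode("ascii"))
```

Unknown compatible feature bits are deliberately not checked, since the text
says they can be ignored safely.
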
> +== COW bitmap ==
> +
> +The "COW bitmap" field starts at offset HEADER_SIZE, stores a bitmap related to
> +backing file and image file.  It is tracking whether the sector in image file
> +is allocated or not.
> +
> +Each bit in the bitmap tracks one cluster's status. For example, if cluster
> +bit is 16, then each bit tracks one cluster, (1 << 16) = 65536 bytes. The

clusters bit_s_

> +image file size is rounded up to cluster size (where any bytes in the
> +last cluster that do not fit in the image are ignored), then if the
> +number of clusters is not a multiple of 8, then remaining bits in the
> +bitmap will be set to 0.
> +
> +The size of bitmap is calculated according to virtual size of image file, and
> +the size of bitmap should be multiple of add-cow file's cluster size, the bits
> +not used will be set to 0. Within each byte, the least significant bit covers
> +the first cluster. Bit orders in one byte look like:
> + +----+----+----+----+----+----+----+----+
> + | b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 |
> + +----+----+----+----+----+----+----+----+
> +
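
The sizing and bit-order rules above can be cross-checked with a small sketch.
Again this is illustrative Python, not QEMU code, and the function names are
hypothetical. It assumes that "multiple of cluster size" means the bitmap
length in bytes is rounded up to the cluster size, and that padding bits stay 0.

```python
def bitmap_size_bytes(image_size: int, cluster_bits: int) -> int:
    """Size of the COW bitmap in bytes for an image of image_size bytes."""
    cluster_size = 1 << cluster_bits
    # Round the image size up to whole clusters: one bit per cluster.
    clusters = (image_size + cluster_size - 1) // cluster_size
    # Round up to whole bytes, then to a multiple of the cluster size.
    nbytes = (clusters + 7) // 8
    return (nbytes + cluster_size - 1) // cluster_size * cluster_size


def cluster_is_allocated(bitmap: bytes, offset: int, cluster_bits: int) -> bool:
    """Test the bit for the cluster that contains byte offset."""
    cluster = offset >> cluster_bits
    # Within each byte the least significant bit covers the first cluster.
    return bool((bitmap[cluster // 8] >> (cluster % 8)) & 1)


def mark_allocated(bitmap: bytearray, offset: int, cluster_bits: int) -> None:
    """Set the bit for the cluster that contains byte offset (after a write)."""
    cluster = offset >> cluster_bits
    bitmap[cluster // 8] |= 1 << (cluster % 8)
```

For the 8G example with cluster_bits = 16 this gives 131072 clusters, 16384
bytes of bits, rounded up to one 65536-byte cluster.
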
> +If the bit is 0, it indicates the sector has not been allocated in image file,

s/sector/cluster/

More instances follow, not commenting on each.

> +data should be loaded from backing file while reading; if the bit is 1, it
> +indicates the related sector has been dirty, should be loaded from image file
> +while reading. Writing to a sector causes the corresponding bit to be set to 1.
> +If there is no backing file, or if the image file is larger than the backing
> +file and the offset is beyond the end of the backing file, then the data should
> +be read as all zero bytes instead.
> +
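
The read rules in the paragraph above, written out as a sketch. This is
illustrative Python, not QEMU code: read_source and its parameters are
hypothetical, backing_size=None stands for "no backing file", and
all_allocated models compatible feature bit 0. The decision is made for a
single byte offset; a request that straddles the end of the backing file is
not handled here.

```python
from typing import Optional


def read_source(bitmap: bytes, offset: int, cluster_bits: int,
                backing_size: Optional[int],
                all_allocated: bool = False) -> str:
    """Return where a read at byte offset is served from:
    "image", "backing" or "zero"."""
    if all_allocated:
        # Compatible feature bit 0: bitmap and backing file are not used.
        return "image"
    cluster = offset >> cluster_bits
    if (bitmap[cluster // 8] >> (cluster % 8)) & 1:
        # Bit is 1: the cluster has been written to the image file.
        return "image"
    if backing_size is None or offset >= backing_size:
        # No backing file, or the offset is beyond its end: reads as zeroes.
        return "zero"
    # Bit is 0 and the backing file covers this offset.
    return "backing"
```
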
> +If raw image is not an even multiple of cluster bytes, bits that correspond to
> +bytes beyond the raw file size in add-cow must be written as 0 and must be
> +ignored when reading.

Don't refer to a "raw image", it could be any image format.

> +
> +Image file name and backing file name must NOT be the same, we prevent this
> +while creating add-cow files via qemu-img. If image file name and backing file
> +name are the same, the add-cow image must be treated as invalid.

Kevin
Robert Wang - Dec. 11, 2012, 8:02 a.m.
On Mon, Dec 10, 2012 at 11:39 PM, Kevin Wolf <kwolf@redhat.com> wrote:
> Am 06.12.2012 07:51, schrieb Dong Xu Wang:
>> Document for add-cow format, the usage and spec of add-cow are introduced.
>>
>> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
>> ---
>>  docs/specs/add-cow.txt |  154 ++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 files changed, 154 insertions(+), 0 deletions(-)
>>  create mode 100644 docs/specs/add-cow.txt
>>
>> diff --git a/docs/specs/add-cow.txt b/docs/specs/add-cow.txt
>> new file mode 100644
>> index 0000000..24e9a11
>> --- /dev/null
>> +++ b/docs/specs/add-cow.txt
>> @@ -0,0 +1,154 @@
>> +== General ==
>> +
>> +The raw file format does not support backing files or copy on write feature.
>> +The add-cow image format makes it possible to use backing files with a raw
>> +image by keeping a separate .add-cow metadata file. Once all sectors
>> +have been written into the raw image it is safe to discard the .add-cow
>> +and backing files, then we can use the raw image directly.
>> +
>> +An example usage of add-cow would look like::
>
> Double colon.

Okay.

>
>> +(ubuntu.img is a disk image which has an installed OS.)
>> +    1)  Create a raw image with the same size of ubuntu.img
>> +            qemu-img create -f raw test.raw 8G
>> +    2)  Create an add-cow image which will store dirty bitmap
>> +            qemu-img create -f add-cow test.add-cow \
>> +                -o backing_file=ubuntu.img,image_file=test.raw
>> +    3)  Run qemu with add-cow image
>> +            qemu -drive if=virtio,file=test.add-cow
>> +
>> +test.raw may be larger than ubuntu.img, in that case, the size of test.add-cow
>> +will be calculated from the size of test.raw.
>> +
>> +image_fmt can be omitted, in that case image_fmt should be set as "raw".
>
> By "should be set as" you mean "is assumed to be"?
>
Okay. Will fix.
>> +backing_fmt can also be omitted, add-cow should do a probe operation and determine
>
> This line takes more than 80 characters. More follow, I won't comment on
> each.
>
Okay, will fix.
>> +what the backing file's format is.
>> +
>> +=Specification=
>> +
>> +The file format looks like this:
>> +
>> + +---------------+-------------------------------+
>> + |     Header    |           COW bitmap          |
>> + +---------------+-------------------------------+
>> +
>> +All numbers in add-cow are stored in Little Endian byte order.
>> +
>> +== Header ==
>> +
>> +The Header is included in the first bytes:
>> +(HEADER_SIZE is defined in 44-47 bytes.)
>> +    Byte    0  -  3:    magic
>> +                        add-cow magic string ("ACOW").
>> +
>> +            4  -  7:    version
>> +                        Version number (only valid value is 1 now).
>> +
>> +            8  - 11:    backing file name offset
>> +                        Offset in the add-cow file at which the backing file
>> +                        name is stored (NB: The string is not NUL-terminated).
>> +                        If backing file name does NOT exist, this field will be
>> +                        0. Must be between 80 and [HEADER_SIZE - 2](a file name
>> +                        must be at least 1 byte).
>> +
>> +            12 - 15:    backing file name size
>> +                        Length of the backing file name in bytes. It will be 0
>> +                        if the backing file name offset is 0. If backing file
>> +                        name offset is non-zero, then it must be non-zero. Must
>> +                        be less than [HEADER_SIZE - 80] to fit in the reserved
>> +                        part of the header. Backing file name offset + size
>> +                        must be no more than HEADER_SIZE.
>> +
>> +            16 - 19:    image file name offset
>> +                        Offset in the add-cow file at which the image file name
>> +                        is stored (NB: The string is not NUL-terminated). It
>> +                        must be between 80 and [HEADER_SIZE - 2]. Image file
>> +                        name size + offset must be no more than HEADER_SIZE.
>> +
>> +            20 - 23:    image file name size
>> +                        Length of the image file name in bytes.
>> +                        Must be less than [HEADER_SIZE - 80] to fit in the reserved
>> +                        part of the header.
>> +
>> +            24 - 27:    cluster bits
>> +                        Number of bits that are used for addressing an offset
>> +                        within a cluster (1 << cluster_bits is the cluster size).
>> +                        Must not be less than 9 (i.e. 512 byte clusters).
>> +
>> +                        Note: qemu as of today has an implementation limit of 2 MB
>> +                        as the maximum cluster size and won't be able to open images
>> +                        with larger cluster sizes.
>> +
>> +            28 - 35:    features
>> +                        Bitmask of features. If a feature bit is set but not recognized,
>> +                        the add-cow file should be dropped. They are not used in v1.
>
> Does v1 mean header.version = 1? I think this is wrong, we will want to
> add incompatible feature flags without increasing header.version (that's
> the whole point of them)
Okay, will fix.
>
>> +
>> +                        Bits 0-63:  Reserved (set to 0)
>> +
>> +            36 - 43:    compatible features
>> +                        Bitmask of compatible features. An implementation can
>> +                        safely ignore any unknown bits that are set.
>> +                        Bit 0:      All allocated bit.  If this bit is set then
>> +                                    backing file and COW bitmap will not be used,
>> +                                    and can read from or write to image file directly.
>> +
>> +                        Bits 1-63:  Reserved (set to 0)
>> +
>> +            44 - 47:    HEADER_SIZE
>> +                        The header field is variable-sized. This field indicates
>> +                        how many bytes will be used to store add-cow header.
>> +                        In add-cow v1, it is fixed to 4096.
>
> Same question about v1. If it's fixed, why have a field for it?
Okay, I will make this clearer in the next version.

>
>> +
>> +            48 - 63:    backing file format
>> +                        Format of backing file. It will be filled with 0 if
>> +                        backing file name offset is 0. If backing file name
>> +                        offset is non-empty, it must be non-empty. It is coded
>> +                        in free-form ASCII, and is not NUL-terminated. Zero
>> +                        padded on the right.
>> +
>> +            64 - 79:    image file format
>> +                        Format of image file. It must be non-empty. It is coded
>> +                        in free-form ASCII, and is not NUL-terminated. Zero
>> +                        padded on the right.
>> +
>> +            80 - [HEADER_SIZE - 1]:
>> +                        It is used to make sure COW bitmap field starts at the
>> +                        HEADER_SIZE byte, backing file name and image file name
>> +                        will be stored here. The bytes that are not pointing to
>> +                        backing file and image file names must be set to 0.
>> +
>> +== COW bitmap ==
>> +
>> +The "COW bitmap" field starts at offset HEADER_SIZE, stores a bitmap related to
>> +backing file and image file.  It is tracking whether the sector in image file
>> +is allocated or not.
>> +
>> +Each bit in the bitmap tracks one cluster's status. For example, if cluster
>> +bit is 16, then each bit tracks one cluster, (1 << 16) = 65536 bytes. The
>
> clusters bit_s_
>
Okay. "cluster_bits is".
>> +image file size is rounded up to cluster size (where any bytes in the
>> +last cluster that do not fit in the image are ignored), then if the
>> +number of clusters is not a multiple of 8, then remaining bits in the
>> +bitmap will be set to 0.
>> +
>> +The size of bitmap is calculated according to virtual size of image file, and
>> +the size of bitmap should be multiple of add-cow file's cluster size, the bits
>> +not used will be set to 0. Within each byte, the least significant bit covers
>> +the first cluster. Bit orders in one byte look like:
>> + +----+----+----+----+----+----+----+----+
>> + | b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 |
>> + +----+----+----+----+----+----+----+----+
>> +
>> +If the bit is 0, it indicates the sector has not been allocated in image file,
>
> s/sector/cluster/
>
> More instances follow, not commenting on each.
>
Okay.

>> +data should be loaded from backing file while reading; if the bit is 1, it
>> +indicates the related sector has been dirty, should be loaded from image file
>> +while reading. Writing to a sector causes the corresponding bit to be set to 1.
>> +If there is no backing file, or if the image file is larger than the backing
>> +file and the offset is beyond the end of the backing file, then the data should
>> +be read as all zero bytes instead.
>> +
>> +If raw image is not an even multiple of cluster bytes, bits that correspond to
>> +bytes beyond the raw file size in add-cow must be written as 0 and must be
>> +ignored when reading.
>
> Don't refer to a "raw image", it could be any image format.
>
Okay, will fix.
>> +
>> +Image file name and backing file name must NOT be the same, we prevent this
>> +while creating add-cow files via qemu-img. If image file name and backing file
>> +name are the same, the add-cow image must be treated as invalid.
>
> Kevin
>

Patch

diff --git a/docs/specs/add-cow.txt b/docs/specs/add-cow.txt
new file mode 100644
index 0000000..24e9a11
--- /dev/null
+++ b/docs/specs/add-cow.txt
@@ -0,0 +1,154 @@ 
+== General ==
+
+The raw file format does not support backing files or copy on write feature.
+The add-cow image format makes it possible to use backing files with a raw
+image by keeping a separate .add-cow metadata file. Once all sectors
+have been written into the raw image it is safe to discard the .add-cow
+and backing files, then we can use the raw image directly.
+
+An example usage of add-cow would look like::
+(ubuntu.img is a disk image which has an installed OS.)
+    1)  Create a raw image with the same size of ubuntu.img
+            qemu-img create -f raw test.raw 8G
+    2)  Create an add-cow image which will store dirty bitmap
+            qemu-img create -f add-cow test.add-cow \
+                -o backing_file=ubuntu.img,image_file=test.raw
+    3)  Run qemu with add-cow image
+            qemu -drive if=virtio,file=test.add-cow
+
+test.raw may be larger than ubuntu.img, in that case, the size of test.add-cow
+will be calculated from the size of test.raw.
+
+image_fmt can be omitted, in that case image_fmt should be set as "raw".
+backing_fmt can also be omitted, add-cow should do a probe operation and determine
+what the backing file's format is.
+
+=Specification=
+
+The file format looks like this:
+
+ +---------------+-------------------------------+
+ |     Header    |           COW bitmap          |
+ +---------------+-------------------------------+
+
+All numbers in add-cow are stored in Little Endian byte order.
+
+== Header ==
+
+The Header is included in the first bytes:
+(HEADER_SIZE is defined in 44-47 bytes.)
+    Byte    0  -  3:    magic
+                        add-cow magic string ("ACOW").
+
+            4  -  7:    version
+                        Version number (only valid value is 1 now).
+
+            8  - 11:    backing file name offset
+                        Offset in the add-cow file at which the backing file
+                        name is stored (NB: The string is not NUL-terminated).
+                        If backing file name does NOT exist, this field will be
+                        0. Must be between 80 and [HEADER_SIZE - 2](a file name
+                        must be at least 1 byte).
+
+            12 - 15:    backing file name size
+                        Length of the backing file name in bytes. It will be 0
+                        if the backing file name offset is 0. If backing file
+                        name offset is non-zero, then it must be non-zero. Must
+                        be less than [HEADER_SIZE - 80] to fit in the reserved
+                        part of the header. Backing file name offset + size
+                        must be no more than HEADER_SIZE.
+
+            16 - 19:    image file name offset
+                        Offset in the add-cow file at which the image file name
+                        is stored (NB: The string is not NUL-terminated). It
+                        must be between 80 and [HEADER_SIZE - 2]. Image file
+                        name size + offset must be no more than HEADER_SIZE.
+
+            20 - 23:    image file name size
+                        Length of the image file name in bytes.
+                        Must be less than [HEADER_SIZE - 80] to fit in the reserved
+                        part of the header.
+
+            24 - 27:    cluster bits
+                        Number of bits that are used for addressing an offset
+                        within a cluster (1 << cluster_bits is the cluster size).
+                        Must not be less than 9 (i.e. 512 byte clusters).
+
+                        Note: qemu as of today has an implementation limit of 2 MB
+                        as the maximum cluster size and won't be able to open images
+                        with larger cluster sizes.
+
+            28 - 35:    features
+                        Bitmask of features. If a feature bit is set but not recognized,
+                        the add-cow file should be dropped. They are not used in v1.
+
+                        Bits 0-63:  Reserved (set to 0)
+
+            36 - 43:    compatible features
+                        Bitmask of compatible features. An implementation can
+                        safely ignore any unknown bits that are set.
+                        Bit 0:      All allocated bit.  If this bit is set then
+                                    backing file and COW bitmap will not be used,
+                                    and can read from or write to image file directly.
+
+                        Bits 1-63:  Reserved (set to 0)
+
+            44 - 47:    HEADER_SIZE
+                        The header field is variable-sized. This field indicates
+                        how many bytes will be used to store add-cow header.
+                        In add-cow v1, it is fixed to 4096.
+
+            48 - 63:    backing file format
+                        Format of backing file. It will be filled with 0 if
+                        backing file name offset is 0. If backing file name
+                        offset is non-empty, it must be non-empty. It is coded
+                        in free-form ASCII, and is not NUL-terminated. Zero
+                        padded on the right.
+
+            64 - 79:    image file format
+                        Format of image file. It must be non-empty. It is coded
+                        in free-form ASCII, and is not NUL-terminated. Zero
+                        padded on the right.
+
+            80 - [HEADER_SIZE - 1]:
+                        It is used to make sure COW bitmap field starts at the
+                        HEADER_SIZE byte, backing file name and image file name
+                        will be stored here. The bytes that are not pointing to
+                        backing file and image file names must be set to 0.
+
+== COW bitmap ==
+
+The "COW bitmap" field starts at offset HEADER_SIZE, stores a bitmap related to
+backing file and image file.  It is tracking whether the sector in image file
+is allocated or not.
+
+Each bit in the bitmap tracks one cluster's status. For example, if cluster
+bit is 16, then each bit tracks one cluster, (1 << 16) = 65536 bytes. The
+image file size is rounded up to cluster size (where any bytes in the
+last cluster that do not fit in the image are ignored), then if the
+number of clusters is not a multiple of 8, then remaining bits in the
+bitmap will be set to 0.
+
+The size of bitmap is calculated according to virtual size of image file, and
+the size of bitmap should be multiple of add-cow file's cluster size, the bits
+not used will be set to 0. Within each byte, the least significant bit covers
+the first cluster. Bit orders in one byte look like:
+ +----+----+----+----+----+----+----+----+
+ | b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 |
+ +----+----+----+----+----+----+----+----+
+
+If the bit is 0, it indicates the sector has not been allocated in image file,
+data should be loaded from backing file while reading; if the bit is 1, it
+indicates the related sector has been dirty, should be loaded from image file
+while reading. Writing to a sector causes the corresponding bit to be set to 1.
+If there is no backing file, or if the image file is larger than the backing
+file and the offset is beyond the end of the backing file, then the data should
+be read as all zero bytes instead.
+
+If raw image is not an even multiple of cluster bytes, bits that correspond to
+bytes beyond the raw file size in add-cow must be written as 0 and must be
+ignored when reading.
+
+Image file name and backing file name must NOT be the same, we prevent this
+while creating add-cow files via qemu-img. If image file name and backing file
+name are the same, the add-cow image must be treated as invalid.