Message ID | 1332860615-3047-2-git-send-email-kwolf@redhat.com |
---|---|
State | New |
On 03/27/2012 09:03 AM, Kevin Wolf wrote:
> This is the second draft for what I think could be added when we increase qcow2's
> version number to 3. This includes points that have been made by several people
> over the past few months. We're probably not going to implement this next week,
> but I think it's important to get discussions started early, so here it is.
>
> +If the version is 3 or higher, the header has the following additional fields.
> +For version 2, the values are assumed to be zero, unless specified otherwise
> +in the description of a field.
> +
> +         72 - 79:   incompatible_features
> +                    Bitmask of incompatible features. An implementation must
> +                    fail to open an image if an unknown bit is set.
> +
> +                    Bit 0:      The reference counts in the image file may be
> +                                inaccurate. Implementations must check/rebuild
> +                                them if they rely on them.
> +
> +                    Bit 1:      Enable subclusters. This affects the L2 table
> +                                format.
> +
> +                    Bits 2-31:  Reserved (set to 0)

Offsets 72-79 form 8 bytes, so this should say that bits 2-63 are reserved.

> +
> +         80 - 87:   compatible_features
> +                    Bitmask of compatible features. An implementation can
> +                    safely ignore any unknown bits that are set.
> +
> +                    Bits 0-31:  Reserved (set to 0)

Again, bits 0-63, based on offsets.

> +
> +         88 - 95:   autoclear_features
> +                    Bitmask of auto-clear features. An implementation may only
> +                    write to an image with unknown auto-clear features if it
> +                    clears the respective bits from this field first.
> +
> +                    Bits 0-31:  Reserved (set to 0)

And again.

> +
> +         96 - 99:   refcount_bits
> +                    Size of a reference count block entry in bits. For version 2
> +                    images, the size is always assumed to be 16 bits. The size
> +                    must be a power of two.
> +                    [ TODO: Define order in sub-byte sizes ]
> +
> +        100 - 103:  header_length
> +                    Length of the header structure in bytes. For version 2
> +                    images, the length is always assumed to be 72 bytes.

Might be a good idea to require this to be a multiple of 8, since both
72 and 104 qualify, and since header extensions are also required to be
padded out to multiples of 8.

> +== Feature name table ==
> +
> +A feature name table is an optional header extension that contains the name for
> +features used by the image. It can be used by applications that don't know
> +the respective feature (e.g. because the feature was introduced only later) to
> +display a useful error message.
> +
> +The number of entries in the feature name table is determined by the length of
> +the header extension data. Its entries look like this:
> +
> +    Byte       0:   Type of feature (select feature bitmap)
> +                        0: Incompatible feature
> +                        1: Compatible feature
> +                        2: Autoclear feature
> +
> +               1:   Bit number within the selected feature bitmap
> +
> +          2 - 47:   Feature name (padded with zeros, but not necessarily null
> +                    terminated if it has full length)

Semantic nit: The NUL character is all zeros; it is one byte in all
unibyte and multi-byte encodings, and the NUL wide character is the
all-zero wchar_t value; while 'null' refers to a pointer to nowhere.
Saying a string is null terminated is wrong, because you don't have a 4-
or 8-byte NULL pointer at the end of the string, just a one-byte NUL
character. Therefore, strings are nul-terminated, not null-terminated.

Is this extension capped at 48 bytes, or is it a repeating table of as
many 48-byte multiples as necessary to represent each feature name?
On 27.03.2012 18:25, Eric Blake wrote:
> On 03/27/2012 09:03 AM, Kevin Wolf wrote:
>> This is the second draft for what I think could be added when we increase qcow2's
>> version number to 3. This includes points that have been made by several people
>> over the past few months. We're probably not going to implement this next week,
>> but I think it's important to get discussions started early, so here it is.
>>
>
>> +If the version is 3 or higher, the header has the following additional fields.
>> +For version 2, the values are assumed to be zero, unless specified otherwise
>> +in the description of a field.
>> +
>> +         72 - 79:   incompatible_features
>> +                    Bitmask of incompatible features. An implementation must
>> +                    fail to open an image if an unknown bit is set.
>> +
>> +                    Bit 0:      The reference counts in the image file may be
>> +                                inaccurate. Implementations must check/rebuild
>> +                                them if they rely on them.
>> +
>> +                    Bit 1:      Enable subclusters. This affects the L2 table
>> +                                format.
>> +
>> +                    Bits 2-31:  Reserved (set to 0)
>
> Offsets 72-79 form 8 bytes, so this should say that bits 2-63 are reserved.

Thanks, good catch! This was a 32-bit field initially and when I updated it, I
forgot this.

>> +
>> +         96 - 99:   refcount_bits
>> +                    Size of a reference count block entry in bits. For version 2
>> +                    images, the size is always assumed to be 16 bits. The size
>> +                    must be a power of two.
>> +                    [ TODO: Define order in sub-byte sizes ]
>> +
>> +        100 - 103:  header_length
>> +                    Length of the header structure in bytes. For version 2
>> +                    images, the length is always assumed to be 72 bytes.
>
> Might be a good idea to require this to be a multiple of 8, since both
> 72 and 104 qualify, and since header extensions are also required to be
> padded out to multiples of 8.

Do you see any arguments for padding to multiples of 8 besides
consistency? If I did the format from scratch, without having to pay
attention to compatibility, I would drop the requirement even for header
extensions as I don't see what it buys us.

Consistency is important and certainly good enough to make me unsure
about this, but I don't like artificial restrictions either. If we had
another good reason, it would be easier for me to decide.

>> +== Feature name table ==
>> +
>> +A feature name table is an optional header extension that contains the name for
>> +features used by the image. It can be used by applications that don't know
>> +the respective feature (e.g. because the feature was introduced only later) to
>> +display a useful error message.
>> +
>> +The number of entries in the feature name table is determined by the length of
>> +the header extension data. Its entries look like this:
>> +
>> +    Byte       0:   Type of feature (select feature bitmap)
>> +                        0: Incompatible feature
>> +                        1: Compatible feature
>> +                        2: Autoclear feature
>> +
>> +               1:   Bit number within the selected feature bitmap
>> +
>> +          2 - 47:   Feature name (padded with zeros, but not necessarily null
>> +                    terminated if it has full length)
>
> Semantic nit: The NUL character is all zeros; it is one byte in all
> unibyte and multi-byte encodings, and the NUL wide character is the
> all-zero wchar_t value; while 'null' refers to a pointer to nowhere.
> Saying a string is null terminated is wrong, because you don't have a 4-
> or 8-byte NULL pointer at the end of the string, just a one-byte NUL
> character. Therefore, strings are nul-terminated, not null-terminated.

"null-terminated" is much more common. Google and Wikipedia are the
proof. ;-)

> Is this extension capped at 48 bytes, or is it a repeating table of as
> many 48-byte multiples as necessary to represent each feature name?

The latter. All feature names are in a single table in a single header
extension. Any suggestion how to clarify this? Would something like
"There shall be at most one feature name table header extension in an
image" be clear enough?

Kevin
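
The three feature bitmasks discussed in this message differ only in how a reader has to react to bits it does not know. The following Python sketch illustrates those rules under stated assumptions: the function name, the `KNOWN_*` masks and the `writable` parameter are hypothetical, and only bits 0 and 1 of incompatible_features are treated as known because they are the only ones the draft defines.

```python
# Hypothetical masks for an implementation that knows exactly the bits
# defined in the draft (bits 0 and 1 of incompatible_features, nothing else).
KNOWN_INCOMPATIBLE = (1 << 0) | (1 << 1)
KNOWN_AUTOCLEAR = 0
MASK64 = (1 << 64) - 1  # the fields are 8 bytes wide, so bits 0-63 exist


def check_features(incompatible, compatible, autoclear, writable):
    """Apply the draft's rules and return the autoclear value to keep.

    Raises ValueError if an unknown incompatible bit is set.
    """
    unknown = incompatible & ~KNOWN_INCOMPATIBLE & MASK64
    if unknown:
        raise ValueError("unknown incompatible feature bits: %#x" % unknown)
    # Unknown compatible bits can be safely ignored, so 'compatible'
    # is deliberately not inspected here.
    if writable:
        # Before writing, unknown auto-clear bits must be cleared.
        autoclear &= KNOWN_AUTOCLEAR
    return autoclear
```

For a read-only open the auto-clear field is returned unchanged, since the draft only requires clearing unknown bits before writing.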
On 04/02/2012 04:00 AM, Kevin Wolf wrote:
> On 27.03.2012 18:25, Eric Blake wrote:
>> On 03/27/2012 09:03 AM, Kevin Wolf wrote:
>>> This is the second draft for what I think could be added when we increase qcow2's
>>> version number to 3. This includes points that have been made by several people
>>> over the past few months. We're probably not going to implement this next week,
>>> but I think it's important to get discussions started early, so here it is.
>>>
>>
>>> +
>>> +        100 - 103:  header_length
>>> +                    Length of the header structure in bytes. For version 2
>>> +                    images, the length is always assumed to be 72 bytes.
>>
>> Might be a good idea to require this to be a multiple of 8, since both
>> 72 and 104 qualify, and since header extensions are also required to be
>> padded out to multiples of 8.
>
> Do you see any arguments for padding to multiples of 8 besides
> consistency?

Yes - void* on some platforms is 8 bytes, and having everything guarantee
8-byte alignment can make processing of headers more efficient when you
are reading things on natural alignments. Furthermore, guaranteeing
8-byte alignment buys us three bits that are always 0 but which can later
be converted to bit flags for future extensions; by requiring 8-byte
alignment, older parsers will reject the new bit flags (because it looks
like a non-multiple-of-8 length), while newer parsers will know that they
are bit flags and what those flags mean, as well as know to mask out
those bits when computing aligned size of the header.

> If I did the format from scratch, without having to pay
> attention to compatibility, I would drop the requirement even for header
> extensions as I don't see what it buys us.

It's always hard to predict what future extensions will look like, but I
argue in return that it is easier to start out strict and relax things in
the future than it is to start relaxed and then wish we could tighten it up.
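
To make the "three spare bits" argument concrete, here is a small Python sketch of how an old and a new parser would treat the same length value. It is purely illustrative: no such flags are defined anywhere in this thread, and the function name and the `flags_defined` switch are assumptions introduced for the example.

```python
def split_length(raw_length, flags_defined=False):
    """Split a length field into (aligned_length, flags).

    flags_defined=False models an older parser that insists on a
    multiple of 8; flags_defined=True models a hypothetical newer parser
    that interprets the low three bits as flags.
    """
    flags = raw_length & 0x7
    if not flags_defined:
        if flags:
            # Looks like a non-multiple-of-8 length: reject the image.
            raise ValueError("length %d is not a multiple of 8" % raw_length)
        return raw_length, 0
    # Newer parser: mask the flag bits out to get the aligned size.
    return raw_length & ~0x7, flags
```

With a stored value of 104 | 0b101, the old parser refuses the image while the new one sees a 104-byte header with flags 0b101.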
>
> Consistency is important and certainly good enough to make me unsure
> about this, but I don't like artificial restrictions either. If we had
> another good reason, it would be easier for me to decide.

If sizeof(void*) for natural alignment and the possibility of extension
to 3 bit flags per extension header don't convince you, then I won't
insist.

>> Semantic nit: The NUL character is all zeros; it is one byte in all
>> unibyte and multi-byte encodings, and the NUL wide character is the
>> all-zero wchar_t value; while 'null' refers to a pointer to nowhere.
>> Saying a string is null terminated is wrong, because you don't have a 4-
>> or 8-byte NULL pointer at the end of the string, just a one-byte NUL
>> character. Therefore, strings are nul-terminated, not null-terminated.
>
> "null-terminated" is much more common. Google and Wikipedia are the
> proof. ;-)

Unfortunately true :) But I'll quit bothering you about this one, as I'm
swimming against the current on that one.

>
>> Is this extension capped at 48 bytes, or is it a repeating table of as
>> many 48-byte multiples as necessary to represent each feature name?
>
> The latter. All feature names are in a single table in a single header
> extension. Any suggestion how to clarify this? Would something like
> "There shall be at most one feature name table header extension in an
> image" be clear enough?

Maybe:

A feature name table is an optional header extension that contains the name for
features used by the image. It can be used by applications that don't know
the respective feature (e.g. because the feature was introduced only later) to
display a useful error message. There can be at most one feature name table,
and within that table, each feature name may only appear once.

The number of entries (n) in the feature name table is determined by the length
of the header extension data. Its entries look like this:

    Byte  48*n + 0:   Type of feature (select feature bitmap)
                          0: Incompatible feature
                          1: Compatible feature
                          2: Autoclear feature

          48*n + 1:   Bit number within the selected feature bitmap

          48*n + 2 to 48*n + 47:
                      Feature name (padded with zeros, but not necessarily null
                      terminated if it has full length)

Do we also need to clarify that at offsets 48*n + 1, the bit number must
be 0-63 (and thus the upper two bits must be 0)?

Do we also want to enforce that the table is sorted (that is, given the
tuple <feature,bit> in bytes 0 and 1 of each entry, we want to require
that entry <0,0> appears before <0,1> appears before <1,0>)?
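
As an illustration of the repeating 48-byte layout, the following Python sketch walks a feature name table. It is a sketch under assumptions, not part of the proposal: the function name is made up, names are decoded as ASCII although the draft does not specify an encoding, and rejecting bit numbers above 63 is only the restriction asked about above, not something the draft requires. Sorting is not checked.

```python
ENTRY_SIZE = 48
FEATURE_TYPES = {0: "incompatible", 1: "compatible", 2: "autoclear"}


def parse_feature_name_table(data):
    """Return a list of (type, bit, name) tuples from the extension data."""
    if len(data) % ENTRY_SIZE != 0:
        raise ValueError("table length is not a multiple of 48 bytes")
    entries = []
    for start in range(0, len(data), ENTRY_SIZE):
        feature_type = data[start]      # byte 0: selects the bitmap
        bit = data[start + 1]           # byte 1: bit number in that bitmap
        if bit > 63:
            # Assumed restriction: the bitmaps are only 64 bits wide.
            raise ValueError("bit number %d out of range" % bit)
        raw_name = data[start + 2:start + ENTRY_SIZE]
        # The name is zero padded, but a full-length (46 byte) name has
        # no terminating zero byte, so split() may find nothing to cut.
        name = raw_name.split(b"\0", 1)[0].decode("ascii", errors="replace")
        entries.append((FEATURE_TYPES.get(feature_type, "unknown"), bit, name))
    return entries
```

The number of entries falls out of the data length divided by 48, which matches the statement that it is determined by the length of the header extension data.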
diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
index b6adcad..9a492cd 100644
--- a/docs/specs/qcow2.txt
+++ b/docs/specs/qcow2.txt
@@ -18,7 +18,7 @@ The first cluster of a qcow2 image contains the file header:
                     QCOW magic string ("QFI\xfb")
 
           4 - 7:    version
-                    Version number (only valid value is 2)
+                    Version number (valid values are 2 and 3)
 
           8 - 15:   backing_file_offset
                     Offset into the image file at which the backing file name
@@ -67,12 +67,53 @@ The first cluster of a qcow2 image contains the file header:
                     Offset into the image file at which the snapshot table
                     starts. Must be aligned to a cluster boundary.
 
+If the version is 3 or higher, the header has the following additional fields.
+For version 2, the values are assumed to be zero, unless specified otherwise
+in the description of a field.
+
+         72 - 79:   incompatible_features
+                    Bitmask of incompatible features. An implementation must
+                    fail to open an image if an unknown bit is set.
+
+                    Bit 0:      The reference counts in the image file may be
+                                inaccurate. Implementations must check/rebuild
+                                them if they rely on them.
+
+                    Bit 1:      Enable subclusters. This affects the L2 table
+                                format.
+
+                    Bits 2-31:  Reserved (set to 0)
+
+         80 - 87:   compatible_features
+                    Bitmask of compatible features. An implementation can
+                    safely ignore any unknown bits that are set.
+
+                    Bits 0-31:  Reserved (set to 0)
+
+         88 - 95:   autoclear_features
+                    Bitmask of auto-clear features. An implementation may only
+                    write to an image with unknown auto-clear features if it
+                    clears the respective bits from this field first.
+
+                    Bits 0-31:  Reserved (set to 0)
+
+         96 - 99:   refcount_bits
+                    Size of a reference count block entry in bits. For version 2
+                    images, the size is always assumed to be 16 bits. The size
+                    must be a power of two.
+                    [ TODO: Define order in sub-byte sizes ]
+
+        100 - 103:  header_length
+                    Length of the header structure in bytes. For version 2
+                    images, the length is always assumed to be 72 bytes.
+
 Directly after the image header, optional sections called header extensions can
 be stored. Each extension has a structure like the following:
 
     Byte  0 - 3:    Header extension type:
                         0x00000000 - End of the header extension area
                         0xE2792ACA - Backing file format name
+                        0x6803f857 - Feature name table
                         other      - Unknown header extension, can be safely
                                      ignored
 
@@ -84,8 +125,32 @@ be stored. Each extension has a structure like the following:
                     multiple of 8.
 
 The remaining space between the end of the header extension area and the end of
-the first cluster can be used for other data. Usually, the backing file name is
-stored there.
+the first cluster can be used for the backing file name. It is not allowed to
+store other data here, so that an implementation can safely modify the header
+and add extensions without harming data of compatible features that it
+doesn't support. Compatible features that need space for additional data can
+use a header extension.
+
+
+== Feature name table ==
+
+A feature name table is an optional header extension that contains the name for
+features used by the image. It can be used by applications that don't know
+the respective feature (e.g. because the feature was introduced only later) to
+display a useful error message.
+
+The number of entries in the feature name table is determined by the length of
+the header extension data. Its entries look like this:
+
+    Byte       0:   Type of feature (select feature bitmap)
+                        0: Incompatible feature
+                        1: Compatible feature
+                        2: Autoclear feature
+
+               1:   Bit number within the selected feature bitmap
+
+          2 - 47:   Feature name (padded with zeros, but not necessarily null
+                    terminated if it has full length)
 
 == Host cluster management ==
 
@@ -138,7 +203,8 @@ guest clusters to host clusters. They are called L1 and L2 table.
 
 The L1 table has a variable size (stored in the header) and may use multiple
 clusters, however it must be contiguous in the image file. L2 tables are
-exactly one cluster in size.
+exactly one cluster in size if subclusters are disabled, and two clusters if
+they are enabled.
 
 Given a offset into the virtual disk, the offset into the image file can be
 obtained as follows:
@@ -168,9 +234,40 @@ L1 table entry:
                     refcount is exactly one. This information is only accurate
                     in the active L1 table.
 
-L2 table entry (for normal clusters):
+L2 table entry:
 
-    Bit  0 - 8:     Reserved (set to 0)
+    Bit  0 - 61:    Cluster descriptor
+
+              62:   0 for standard clusters
+                    1 for compressed clusters
+
+              63:   0 for a cluster that is unused or requires COW, 1 if its
+                    refcount is exactly one. This information is only accurate
+                    in L2 tables that are reachable from the the active L1
+                    table.
+
+        64 - 127:   If subclusters are enabled, this contains a bitmask that
+                    describes the allocation status of all 32 subclusters (two
+                    bits for each). The first subcluster is represented by the
+                    LSB. The values for each subcluster are:
+
+                    0: Subcluster is unallocated
+                    1: Subcluster is allocated
+                    2: Subcluster is unallocated and reads as all zeros
+                       instead of referring to the backing file
+                    3: Reserved
+
+Standard Cluster Descriptor:
+
+    Bit       0:    If set to 1, the cluster reads as all zeros. The host
+                    cluster offset can be used to describe a preallocation,
+                    but it won't be used for reading data from this cluster,
+                    nor is data read from the backing file if the cluster is
+                    unallocated.
+
+                    With version 2, this is always 0.
+
+         1 - 8:     Reserved (set to 0)
 
          9 - 55:    Bits 9-55 of host cluster offset. Must be aligned to a
                     cluster boundary. If the offset is 0, the cluster is
@@ -178,29 +275,17 @@ L2 table entry (for normal clusters):
 
         56 - 61:    Reserved (set to 0)
 
-              62:   0 (this cluster is not compressed)
-
-              63:   0 for a cluster that is unused or requires COW, 1 if its
-                    refcount is exactly one. This information is only accurate
-                    in L2 tables that are reachable from the the active L1
-                    table.
 
-L2 table entry (for compressed clusters; x = 62 - (cluster_size - 8)):
+Compressed Clusters Descriptor (x = 62 - (cluster_size - 8)):
 
     Bit  0 - x:     Host cluster offset. This is usually _not_ aligned to a
                     cluster boundary!
 
        x+1 - 61:    Compressed size of the images in sectors of 512 bytes
 
-              62:   1 (this cluster is compressed using zlib)
-
-              63:   0 for a cluster that is unused or requires COW, 1 if its
-                    refcount is exactly one. This information is only accurate
-                    in L2 tables that are reachable from the the active L1
-                    table.
-
-If a cluster is unallocated, read requests shall read the data from the backing
-file. If there is no backing file or the backing file is smaller than the image,
+If a cluster or a subcluster is unallocated, read requests shall read the data
+from the backing file (except if bit 0 in the Standard Cluster Descriptor is
+set). If there is no backing file or the backing file is smaller than the image,
 they shall read zeros for all parts that are not covered by the backing file.
 
 
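
For readers who want to see the proposed L2 entry layout in action, here is a minimal Python sketch that decodes one 128-bit entry with subclusters enabled. Several points are assumptions rather than statements of the patch: the function and dictionary names are invented, bit 0 is taken to be the least significant bit of a big-endian 64-bit word, and bits 64-127 are taken to be a second big-endian 64-bit word that directly follows the first. Compressed cluster descriptors are not decoded because they depend on the cluster size.

```python
import struct

SUBCLUSTER_STATUS = {0: "unallocated", 1: "allocated", 2: "zero", 3: "reserved"}
HOST_OFFSET_MASK = 0x00FFFFFFFFFFFE00  # bits 9-55 of the descriptor


def decode_l2_entry(raw):
    """Decode a 16-byte L2 entry (descriptor word plus subcluster bitmap)."""
    entry, bitmap = struct.unpack(">QQ", raw)
    result = {
        "compressed": bool((entry >> 62) & 1),
        "copied": bool((entry >> 63) & 1),  # refcount is exactly one
    }
    if not result["compressed"]:
        # Standard Cluster Descriptor
        result["reads_as_zero"] = bool(entry & 1)
        result["host_offset"] = entry & HOST_OFFSET_MASK
    # Two bits per subcluster, first subcluster in the lowest bits.
    result["subclusters"] = [
        SUBCLUSTER_STATUS[(bitmap >> (2 * i)) & 3] for i in range(32)
    ]
    return result
```

A reader would then consult the per-subcluster status (or the whole-cluster zero flag) before deciding whether a read falls through to the backing file.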