
[RFC,v2] Specification for qcow2 version 3

Message ID 1309187514-26562-1-git-send-email-kwolf@redhat.com
State New

Commit Message

Kevin Wolf June 27, 2011, 3:11 p.m. UTC
This is the second draft for what I think could be added when we increase qcow2's
version number to 3. This includes points that have been made by several people
over the past few months. We're probably not going to implement this next week,
but I think it's important to get discussions started early, so here it is.

Changes implemented in this RFC:

- Added compatible/incompatible/auto-clear feature bits plus an optional
  feature name table to allow useful error messages even if an older version
  doesn't know some feature at all. (A sketch of the corresponding open-time
  check follows this list.)

- Added a dirty flag which indicates that the refcounts may not be accurate
  ("QED mode"). This means that we can save writes to the refcount table with
  cache=writethrough; it isn't really useful otherwise since the introduction
  of Qcow2Cache.

- Configurable refcount width. If you don't want to use internal snapshots,
  make refcounts one bit and save cache space and I/O.

- Added subclusters. This separates the COW size (one subcluster; I'm
  thinking of a 64k default size here) from the allocation size (one cluster,
  2M). Less fragmentation, less metadata, but still reasonable COW
  granularity.

  This also allows preallocating clusters, but none of their subclusters. You
  can have an image that is like raw + COW metadata, and you can also
  preallocate metadata for images with backing files.

- Zero cluster flags. This allows discard even with a backing file that doesn't
  contain zeros. It is also useful for copy-on-read/image streaming, as you'll
  want to keep sparseness without accessing the remote image for an unallocated
  cluster all the time.

- Fixed internal snapshot metadata to use 64 bit VM state size. You can't save
  a snapshot of a VM with >= 4 GB RAM today.
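
To make the semantics of the three feature bitmasks concrete, here is a
minimal sketch of the open-time check an implementation might perform. The
two incompatible bits mirror the draft spec below, but all names and the
overall structure are illustrative; error handling and the header writeback
are elided:

    #include <stdint.h>
    #include <stdio.h>

    /* Feature bits known to this (hypothetical) implementation */
    #define QCOW2_INCOMPAT_DIRTY        (1ULL << 0) /* refcounts may be stale */
    #define QCOW2_INCOMPAT_SUBCLUSTERS  (1ULL << 1) /* extended L2 format */
    #define QCOW2_INCOMPAT_KNOWN_MASK \
        (QCOW2_INCOMPAT_DIRTY | QCOW2_INCOMPAT_SUBCLUSTERS)
    #define QCOW2_AUTOCLEAR_KNOWN_MASK  0ULL

    /* Returns 0 on success, -1 if the image must not be opened. */
    static int qcow2_check_features(uint64_t incompatible, uint64_t *autoclear,
                                    int writable)
    {
        /* Any unknown incompatible bit: refuse to open the image. */
        if (incompatible & ~QCOW2_INCOMPAT_KNOWN_MASK) {
            fprintf(stderr, "unknown incompatible feature(s): %#llx\n",
                    (unsigned long long)(incompatible & ~QCOW2_INCOMPAT_KNOWN_MASK));
            return -1;
        }

        /* Unknown compatible bits are simply ignored. */

        /* Unknown auto-clear bits must be cleared (and the header written
         * back) before the image may be modified. */
        if (writable && (*autoclear & ~QCOW2_AUTOCLEAR_KNOWN_MASK)) {
            *autoclear &= QCOW2_AUTOCLEAR_KNOWN_MASK;
            /* ... rewrite the header here ... */
        }

        if (incompatible & QCOW2_INCOMPAT_DIRTY) {
            /* "QED mode": check/rebuild refcounts before trusting them. */
        }

        return 0;
    }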

Possible future additions:

- Add per-L2-table dirty flag to L1?
- Add per-refcount-block full flag to refcount table?
---
 docs/specs/qcow2.txt |  135 +++++++++++++++++++++++++++++++++++++++++---------
 1 files changed, 112 insertions(+), 23 deletions(-)

Comments

Frediano Ziglio June 28, 2011, 9:38 a.m. UTC | #1

Hi,
  thinking about image improvements, I would add:

- GUID for image and backing file
- relative path for backing file

This would help with finding images in a distributed environment or when
files are moved, e.g. gfs/nfs/ocfs mounted at different mount points, or a
backing file used as a template in an images directory that is later moved
somewhere else. Also, with a GUID a possible higher layer could manage a
GUID <-> image file database.

I was also thinking about a "backing file length" field to support
resizing, but this can probably be implemented with zero clusters. Assume
you have an image of 5 GB and create a new image with the first one as its
backing file; now resize the second image from 5 GB to 3 GB, then resize it
again (after some work) to 10 GB. The part from 3 GB to 5 GB should not be
read from the backing file.
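
To make this concrete against the draft v3 format from the patch below,
here is a rough sketch of how a shrink might mark the truncated range with
the proposed zero flag. The helper and its names are made up for
illustration; standard, non-compressed L2 entries are assumed, and writing
the table back is elided:

    #include <stdint.h>

    /* Draft v3: bit 0 of a standard L2 entry means "reads as all zeros
     * instead of referring to the backing file" while unallocated. */
    #define L2E_ZERO_FLAG  (1ULL << 0)

    /* On a shrink from 5 GB to 3 GB, mark the guest clusters between the new
     * and the old size, so that growing back to 10 GB later does not expose
     * stale backing file data for the 3-5 GB range. l2_table is an in-memory
     * copy of one L2 table covering the affected range. */
    static void mark_range_zero(uint64_t *l2_table, int first, int last)
    {
        for (int i = first; i <= last; i++) {
            /* Drop any host mapping; keep only the zero flag. */
            l2_table[i] = L2E_ZERO_FLAG;
        }
        /* ... write the table back and release the refcounts of any
         * clusters freed by the shrink ... */
    }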

Also, a bit in the L2 offset to say "there is no L2 table" because all
clusters covered by the L2 table are contiguous, so we avoid the L2 table
entirely. Obviously this requires an optimization step to detect or create
such a condition.

For checking, perhaps it would be helpful to save not only a flag but also
a size up to which the data is known to be ok (for instance, already
allocated and with refcounts saved correctly).

A possible optimization for refcounts would be to initialize them to 1
instead of 0. When clusters are allocated at the end of the file, this
would not require a refcount update, and it would be easy to check the
file size to see which clusters are marked as allocated but not present.

Fields for sectors and heads to support old CHS systems?

This mail sounds quite strange to me; I thought qed would be the future
of qcow2, but it seems I was quite wrong.

I think a big limitation of the current qed and qcow2 implementations is
the serialization of metadata updates (qcow2 uses synchronous operations
while qed uses a queue). I used the bonnie++ program to test speed, and
performance while allocating data is about 15-20% of that on already
allocated data. I'm working (in the little spare time I have) on improving
this. VirtualBox and ESX use large clusters (1 MB) to mitigate the
allocation/metadata problem. Perhaps raising the default cluster size
would help counter the widespread perception of bad qemu I/O performance.

Regards
  Frediano Ziglio
Stefan Hajnoczi Oct. 12, 2011, 12:51 p.m. UTC | #2
On Tue, Jun 28, 2011 at 10:38 AM, Frediano Ziglio <freddy77@gmail.com> wrote:
> I was also thinking about a "backing file length" field to support
> resizing, but this can probably be implemented with zero clusters. Assume
> you have an image of 5 GB and create a new image with the first one as its
> backing file; now resize the second image from 5 GB to 3 GB, then resize
> it again (after some work) to 10 GB. The part from 3 GB to 5 GB should not
> be read from the backing file.

Interesting idea.  One could argue either way.  When image file size
!= backing file size, you need to know what you are doing :).  I think
the case where the image is smaller than the backing file is rare, and
zeroing vs. exposing the backing file on resize isn't an obvious
choice.

> Also, a bit in the L2 offset to say "there is no L2 table" because all
> clusters covered by the L2 table are contiguous, so we avoid the L2 table
> entirely. Obviously this requires an optimization step to detect or create
> such a condition.

There are several reserved L1 entry bits which could be used to mark
this mode.  This mode severely restricts qcow2 features though: how
would snapshots and COW work?  Perhaps by breaking the huge cluster
back into an L2 table with individual clusters?  Backing files also
cannot be used - unless we extend the sub-clusters approach and also
keep a large bitmap with allocated/unallocated/zero information.

A mode like this could be used for best performance on local storage,
where efficient image transport (e.g. scp or http) is not required.
Actually, I think this is reasonable: we could use qemu-img convert to
produce a compact qcow2 for export and use the L2-less qcow2 for
running the actual VM.

Kevin: what do you think about fleshing out this mode instead of sub-clusters?

> This mail sounds quite strange to me; I thought qed would be the future
> of qcow2, but it seems I was quite wrong.

What it's called doesn't matter, but we need better metadata, and by
making qcow2v3 extensible we can now make improvements without losing
support for existing image files.

Stefan
Kevin Wolf Oct. 12, 2011, 1:31 p.m. UTC | #3
Am 12.10.2011 14:51, schrieb Stefan Hajnoczi:
>> Also, a bit in the L2 offset to say "there is no L2 table" because all
>> clusters covered by the L2 table are contiguous, so we avoid the L2 table
>> entirely. Obviously this requires an optimization step to detect or
>> create such a condition.
> 
> There are several reserved L1 entry bits which could be used to mark
> this mode.  This mode severely restricts qcow2 features though: how
> would snapshots and COW work?  Perhaps by breaking the huge cluster
> back into an L2 table with individual clusters?  Backing files also
> cannot be used - unless we extend the sub-clusters approach and also
> keep a large bitmap with allocated/unallocated/zero information.
> 
> A mode like this could be used for best performance on local storage,
> where efficient image transport (e.g. scp or http) is not required.
> Actually, I think this is reasonable: we could use qemu-img convert to
> produce a compact qcow2 for export and use the L2-less qcow2 for
> running the actual VM.
> 
> Kevin: what do you think about fleshing out this mode instead of sub-clusters?

I'm hesitant to add something like this, as it brings quite a bit of
complexity and I'm not sure there are practical use cases for it at all.

If you take the current cluster sizes, an L2 table contains 512 MB of
data (with 64 kB clusters, an L2 table holds 8192 eight-byte entries, and
8192 * 64 kB = 512 MB), so you would lose any sparseness. You would
probably already get full allocation just by creating a file system on
the image.

But even if you do have a use case where sparseness doesn't matter, the
effect is very much the same as allowing a 512 MB cluster size and not
changing any of the qcow2 internals.

(What would the use case be? Backing files or snapshots with a COW
granularity of 512 MB isn't going to fly. That leaves only something
like encryption.)

Kevin
Stefan Hajnoczi Oct. 12, 2011, 2:37 p.m. UTC | #4
On Wed, Oct 12, 2011 at 2:31 PM, Kevin Wolf <kwolf@redhat.com> wrote:
> I'm hesitant to add something like this, as it brings quite a bit of
> complexity and I'm not sure there are practical use cases for it at all.
>
> If you take the current cluster sizes, an L2 table contains 512 MB of
> data, so you would lose any sparseness. You would probably already get
> full allocation just by creating a file system on the image.
>
> But even if you do have a use case where sparseness doesn't matter, the
> effect is very much the same as allowing a 512 MB cluster size and not
> changing any of the qcow2 internals.

I guess I'm thinking of the 512 MB cluster size situation, because
we'd definitely want a COW bitmap in order to keep backing files and
sparseness.

> (What would the use case be? Backing files or snapshots with a COW
> granularity of 512 MB isn't going to fly. That leaves only something
> like encryption.)

COW granularity needs to stay at 64-256 kB since those are reasonable
request sizes for COW.

Stefan
Kevin Wolf Oct. 12, 2011, 2:58 p.m. UTC | #5
Am 12.10.2011 16:37, schrieb Stefan Hajnoczi:
> I guess I'm thinking of the 512 MB cluster size situation, because
> we'd definitely want a COW bitmap in order to keep backing files and
> sparseness.
> 
>> (What would the use case be? Backing files or snapshots with a COW
>> granularity of 512 MB isn't going to fly. That leaves only something
>> like encryption.)
> 
> COW granularity needs to stay at 64-256 kB since those are reasonable
> request sizes for COW.

But how do you do that without L2 tables? What you're describing
(different sizes for allocation and COW) is exactly what subclusters are
doing. I can't see how switching to 512 MB clusters and a single-level
table can make that work.

Kevin
Stefan Hajnoczi Oct. 13, 2011, 2:43 p.m. UTC | #6
On Wed, Oct 12, 2011 at 3:58 PM, Kevin Wolf <kwolf@redhat.com> wrote:
> Am 12.10.2011 16:37, schrieb Stefan Hajnoczi:
>> On Wed, Oct 12, 2011 at 2:31 PM, Kevin Wolf <kwolf@redhat.com> wrote:
>>> Am 12.10.2011 14:51, schrieb Stefan Hajnoczi:
>>>>> Also a bit in l2 offset to say "there is no l2 table" cause all
>>>>> clusters in l2 are contiguous so we avoid entirely l2. Obviously this
>>>>> require an optimization step to detect or create such condition.
>>>>
>>>> There are several reserved L1 entry bits which could be used to mark
>>>> this mode.  This mode severely restricts qcow2 features though: how
>>>> would snapshots and COW work?  Perhaps by breaking the huge cluster
>>>> back into an L2 table with individual clusters?  Backing files also
>>>> cannot be used - unless we extend the sub-clusters approach and also
>>>> keep a large bitmap with allocated/unallocated/zero information.
>>>>
>>>> A mode like this could be used for best performance on local storage,
>>>> where efficiently image transport (e.g. scp or http) is not required.
>>>> Actually I think this is reasonable, we could use qemu-img convert to
>>>> produce a compact qcow2 for export and use the L2-less qcow2 for
>>>> running the actual VM.
>>>>
>>>> Kevin: what do you think about fleshing out this mode instead of sub-clusters?
>>>
>>> I'm hesitant to something like this as it adds quite some complexity and
>>> I'm not sure if there are practical use cases for it at all.
>>>
>>> If you take the current cluster sizes, an L2 table contains 512 MB of
>>> data, so you would lose any sparseness. You would probably already get
>>> full allocation just by creating a file system on the image.
>>>
>>> But even if you do have a use case where sparseness doesn't matter, the
>>> effect is very much the same as allowing a 512 MB cluster size and not
>>> changing any of the qcow2 internals.
>>
>> I guess I'm thinking of the 512 MB cluster size situation, because
>> we'd definitely want a cow bitmap in order to keep backing files and
>> sparseness.
>>
>>> (What would the use case be? Backing files or snapshots with a COW
>>> granularity of 512 MB isn't going to fly. That leaves only something
>>> like encryption.)
>>
>> COW granularity needs to stay at 64-256 kb since those are reasonable
>> request sizes for COW.
>
> But how do you do that without L2 tables? What you're describing
> (different sizes for allocation and COW) is exactly what subclusters are
> doing. I can't see how switching to 512 MB clusters and a single-level
> table can make that work.

Yes, very large sub-clusters are likely to provide the best performance:

1. The refcounts are incremented in a single operation when the large
cluster is allocated.
2. COW still works at a smaller granularity, so allocating a large
cluster does not require zeroing data.
3. Writes simply need to update the COW bitmap; no refcount updates
are required. (A sketch of this follows below.)
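
Point 3 could look roughly like this against the draft extended L2 format
in the patch below (two status bits per subcluster in bits 64-127 of an
extended L2 entry, subcluster 0 at the LSB). The helper names are mine,
not part of the spec:

    #include <stdint.h>

    /* Subcluster status values from the draft spec, two bits each. */
    enum subcluster_status {
        SC_UNALLOCATED = 0,
        SC_ALLOCATED   = 1,
        SC_ZERO        = 2,  /* reads as zeros instead of the backing file */
        /* 3 is reserved */
    };

    static inline int get_subcluster_status(uint64_t bitmap, int sc)
    {
        return (bitmap >> (2 * sc)) & 3;
    }

    static inline uint64_t set_subcluster_status(uint64_t bitmap, int sc,
                                                 enum subcluster_status st)
    {
        bitmap &= ~(3ULL << (2 * sc));
        return bitmap | ((uint64_t)st << (2 * sc));
    }

    /* A write into an already allocated large cluster then only flips bits
     * in this bitmap; no refcount block is touched (point 3 above). */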

Stefan

Patch

diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
index 8fc3cb2..e4722bc 100644
--- a/docs/specs/qcow2.txt
+++ b/docs/specs/qcow2.txt
@@ -18,7 +18,7 @@  The first cluster of a qcow2 image contains the file header:
                     QCOW magic string ("QFI\xfb")
 
           4 -  7:   version
-                    Version number (only valid value is 2)
+                    Version number (valid values are 2 and 3)
 
           8 - 15:   backing_file_offset
                     Offset into the image file at which the backing file name
@@ -67,12 +67,53 @@  The first cluster of a qcow2 image contains the file header:
                     Offset into the image file at which the snapshot table
                     starts. Must be aligned to a cluster boundary.
 
+If the version is 3 or higher, the header has the following additional fields.
+For version 2, the values are assumed to be zero, unless specified otherwise
+in the description of a field.
+
+         72 -  79:  incompatible_features
+                    Bitmask of incompatible features. An implementation must
+                    fail to open an image if an unknown bit is set.
+
+                    Bit 0:      The reference counts in the image file may be
+                                inaccurate. Implementations must check/rebuild
+                                them if they rely on them.
+
+                    Bit 1:      Enable subclusters. This affects the L2 table
+                                format.
+
+                    Bits 2-31:  Reserved (set to 0)
+
+         80 -  87:  compatible_features
+                    Bitmask of compatible features. An implementation can
+                    safely ignore any unknown bits that are set.
+
+                    Bits 0-31:  Reserved (set to 0)
+
+         88 -  95:  autoclear_features
+                    Bitmask of auto-clear features. An implementation may only
+                    write to an image with unknown auto-clear features if it
+                    clears the respective bits from this field first.
+
+                    Bits 0-31:  Reserved (set to 0)
+
+         96 -  99:  refcount_bits
+                    Size of a reference count block entry in bits. For version 2
+                    images, the size is always assumed to be 16 bits. The size
+                    must be a power of two.
+                    [ TODO: Define order in sub-byte sizes ]
+
+        100 - 103:  header_length
+                    Length of the header structure in bytes. For version 2
+                    images, the length is always assumed to be 72 bytes.
+
 Directly after the image header, optional sections called header extensions can
 be stored. Each extension has a structure like the following:
 
     Byte  0 -  3:   Header extension type:
                         0x00000000 - End of the header extension area
                         0xE2792ACA - Backing file format name
+                        0x6803f857 - Feature name table
                         other      - Unknown header extension, can be safely
                                      ignored
 
@@ -84,8 +125,32 @@  be stored. Each extension has a structure like the following:
                     multiple of 8.
 
 The remaining space between the end of the header extension area and the end of
-the first cluster can be used for other data. Usually, the backing file name is
-stored there.
+the first cluster can be used for the backing file name. It is not allowed to
+store other data here, so that an implementation can safely modify the header
+and add extensions without harming data of compatible features that it
+doesn't support. Compatible features that need space for additional data can
+use a header extension.
+
+
+== Feature name table ==
+
+A feature name table is an optional header extension that contains the names
+of features used by the image. It can be used by applications that don't know
+the respective feature (e.g. because the feature was introduced only later) to
+display a useful error message.
+
+The number of entries in the feature name table is determined by the length of
+the header extension data. Its entries look like this:
+
+    Byte       0:   Type of feature (selects the feature bitmap)
+                        0: Incompatible feature
+                        1: Compatible feature
+                        2: Autoclear feature
+
+               1:   Bit number within the selected feature bitmap
+
+          2 - 47:   Feature name (padded with zeros, but not necessarily null
+                    terminated if it has full length)
 
 
 == Host cluster management ==
@@ -138,7 +203,8 @@  guest clusters to host clusters. They are called L1 and L2 table.
 
 The L1 table has a variable size (stored in the header) and may use multiple
 clusters, however it must be contiguous in the image file. L2 tables are
-exactly one cluster in size.
+exactly one cluster in size if subclusters are disabled, and two clusters if
+they are enabled.
 
 Given a offset into the virtual disk, the offset into the image file can be
 obtained as follows:
@@ -168,9 +234,38 @@  L1 table entry:
                     refcount is exactly one. This information is only accurate
                     in the active L1 table.
 
-L2 table entry (for normal clusters):
+L2 table entry:
 
-    Bit  0 -  8:    Reserved (set to 0)
+    Bit  0 -  61:   Cluster descriptor
+
+              62:   0 for standard clusters
+                    1 for compressed clusters
+
+              63:   0 for a cluster that is unused or requires COW, 1 if its
+                    refcount is exactly one. This information is only accurate
+                    in L2 tables that are reachable from the active L1
+                    table.
+
+        64 - 127:   If subclusters are enabled, this contains a bitmask that
+                    describes the allocation status of all 32 subclusters (two
+                    bits for each). The first subcluster is represented by the
+                    LSB. The values for each subcluster are:
+
+                     0: Subcluster is unallocated
+                     1: Subcluster is allocated
+                     2: Subcluster is unallocated and reads as all zeros
+                        instead of referring to the backing file
+                     3: Reserved
+
+Standard Cluster Descriptor:
+
+    Bit       0:    If set to 1, the cluster reads as all zeros instead of
+                    referring to the backing file if the (sub-)cluster is
+                    unallocated.
+
+                    With version 2, this is always 0.
+
+         1 -  8:    Reserved (set to 0)
 
          9 - 55:    Bits 9-55 of host cluster offset. Must be aligned to a
                     cluster boundary. If the offset is 0, the cluster is
@@ -178,29 +273,17 @@  L2 table entry (for normal clusters):
 
         56 - 61:    Reserved (set to 0)
 
-             62:    0 (this cluster is not compressed)
 
-             63:    0 for a cluster that is unused or requires COW, 1 if its
-                    refcount is exactly one. This information is only accurate
-                    in L2 tables that are reachable from the the active L1
-                    table.
-
-L2 table entry (for compressed clusters; x = 62 - (cluster_size - 8)):
+Compressed Clusters Descriptor (x = 62 - (cluster_size - 8)):
 
     Bit  0 -  x:    Host cluster offset. This is usually _not_ aligned to a
                     cluster boundary!
 
        x+1 - 61:    Compressed size of the images in sectors of 512 bytes
 
-             62:    1 (this cluster is compressed using zlib)
-
-             63:    0 for a cluster that is unused or requires COW, 1 if its
-                    refcount is exactly one. This information is only accurate
-                    in L2 tables that are reachable from the the active L1
-                    table.
-
-If a cluster is unallocated, read requests shall read the data from the backing
-file. If there is no backing file or the backing file is smaller than the image,
+If a cluster or a subcluster is unallocated, read requests shall read the data
+from the backing file (except if bit 0 in the Standard Cluster Descriptor is
+set). If there is no backing file or the backing file is smaller than the image,
 they shall read zeros for all parts that are not covered by the backing file.
 
 
@@ -253,7 +336,13 @@  Snapshot table entry:
         36 - 39:    Size of extra data in the table entry (used for future
                     extensions of the format)
 
-        variable:   Extra data for future extensions. Must be ignored.
+        variable:   Extra data for future extensions. Unknown fields must be
+                    ignored. Currently defined are (offset relative to snapshot
+                    table entry):
+
+                    Byte 40 - 47:   Size of the VM state in bytes. 0 if no VM
+                                    state is saved. If this field is present,
+                                    the 32-bit value in bytes 32-35 is ignored.
 
         variable:   Unique ID string for the snapshot (not null terminated)
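
To illustrate the feature name table layout defined above, here is a
minimal parser sketch. The struct and function names are illustrative, and
the extension payload (a multiple of 48 bytes, per the spec) is assumed to
have been read into memory already:

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* One 48-byte entry of the 0x6803f857 header extension. */
    struct feature_name {
        uint8_t type;     /* 0 = incompatible, 1 = compatible, 2 = autoclear */
        uint8_t bit;      /* bit number within the selected bitmap */
        char    name[46]; /* zero-padded, not necessarily NUL-terminated */
    };

    static void print_feature_names(const uint8_t *data, size_t len)
    {
        static const char *kind[] = { "incompatible", "compatible", "autoclear" };
        size_t n = len / sizeof(struct feature_name);

        for (size_t i = 0; i < n; i++) {
            struct feature_name f;
            memcpy(&f, data + i * sizeof(f), sizeof(f));
            if (f.type > 2) {
                continue; /* unknown bitmap selector, skip the entry */
            }
            /* The name may use all 46 bytes, so bound the printed length. */
            printf("%s bit %d: %.46s\n", kind[f.type], f.bit, f.name);
        }
    }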