[v2] Document qemu-img options data_file and data_file_raw

Message ID 20210430133452.253102-1-ckuehl@redhat.com
State New

Commit Message

Connor Kuehl April 30, 2021, 1:34 p.m. UTC
The contents of this patch were initially developed and posted by Han
Han[1]; however, it appears the original patch was not applied. Since
then, the relevant documentation has been moved and adapted to a new
format.

I've taken most of the original wording and tweaked it according to
some of the feedback from the original patch submission. I've also
adapted it to reStructuredText, which is the format the documentation
currently uses.

[1] https://lists.nongnu.org/archive/html/qemu-block/2019-10/msg01253.html

Fixes: https://bugzilla.redhat.com/1763105
Signed-off-by: Han Han <hhan@redhat.com>
Signed-off-by: Connor Kuehl <ckuehl@redhat.com>
---
Changes since v1:
  * Clarify different behaviors with these options when using qemu-img
    create vs amend (Max)
  * Touch on the negative case of how the file becomes inconsistent
    (John)

 docs/tools/qemu-img.rst | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)
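
The create vs. amend distinction mentioned in the changelog can be sketched as two command lines. This is only an illustration: the helper names `create_cmd` and `amend_cmd` and the file names are made up here, and the helpers merely build argument lists without running anything.

```python
import shlex


def create_cmd(image, data_file, size, raw=False):
    """Build a `qemu-img create` call that (re)creates the data file.

    Hypothetical helper: assumes file names without commas, since commas
    separate entries in the -o option list.
    """
    opts = f"data_file={data_file},data_file_raw={'on' if raw else 'off'}"
    return ["qemu-img", "create", "-f", "qcow2", "-o", opts, image, size]


def amend_cmd(image, data_file):
    """Build a `qemu-img amend` call that only updates the reference."""
    return ["qemu-img", "amend", "-f", "qcow2",
            "-o", f"data_file={data_file}", image]


# create: an existing disk.data would be overwritten
print(shlex.join(create_cmd("disk.qcow2", "disk.data", "1G", raw=True)))
# amend: the qcow2 file is pointed at a pre-existing data file
print(shlex.join(amend_cmd("disk.qcow2", "moved.data")))
```

Pass the lists to `subprocess.run` only after checking that the data file named in the `create` call holds nothing worth keeping.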

Comments

Max Reitz April 30, 2021, 2:45 p.m. UTC | #1
On 30.04.21 15:34, Connor Kuehl wrote:
> The contents of this patch were initially developed and posted by Han
> Han[1], however, it appears the original patch was not applied. Since
> then, the relevant documentation has been moved and adapted to a new
> format.
> 
> I've taken most of the original wording and tweaked it according to
> some of the feedback from the original patch submission. I've also
> adapted it to restructured text, which is the format the documentation
> currently uses.
> 
> [1] https://lists.nongnu.org/archive/html/qemu-block/2019-10/msg01253.html
> 
> Fixes: https://bugzilla.redhat.com/1763105
> Signed-off-by: Han Han <hhan@redhat.com>
> Signed-off-by: Connor Kuehl <ckuehl@redhat.com>
> ---
> Changes since v1:
>    * Clarify different behaviors with these options when using qemu-img
>      create vs amend (Max)
>    * Touch on the negative case of how the file becomes inconsistent
>      (John)
> 
>   docs/tools/qemu-img.rst | 20 ++++++++++++++++++++
>   1 file changed, 20 insertions(+)
> 
> diff --git a/docs/tools/qemu-img.rst b/docs/tools/qemu-img.rst
> index c9efcfaefc..87b4a65535 100644
> --- a/docs/tools/qemu-img.rst
> +++ b/docs/tools/qemu-img.rst
> @@ -866,6 +866,26 @@ Supported image file formats:
>       issue ``lsattr filename`` to check if the NOCOW flag is set or not
>       (Capital 'C' is NOCOW flag).
>   
> +  ``data_file``
> +    Filename where all guest data will be stored. If this option is used,
> +    the qcow2 file will only contain the image's metadata.
> +
> +    Note: Data loss will occur if the given filename already exists when
> +    using this option with ``qemu-img create`` since ``qemu-img`` will create
> +    the data file anew, overwriting the file's original contents. To simply
> +    update the reference to point to the given pre-existing file, use
> +    ``qemu-img amend``.
> +
> +  ``data_file_raw``
> +    If this option is set to ``on``, QEMU will always keep the external
> +    data file consistent as a standalone read-only raw image. It does
> +    this by forwarding updates through to the raw image in addition to
> +    updating the image metadata. If set to ``off``, QEMU will only
> +    update the image metadata without forwarding the changes through
> +    to the raw image. The default value is ``off``.

Hm, what updates and what changes?  I mean, the first part makes sense 
(the “It does this by...”), but the second part doesn’t.  qemu will 
still forward most writes to the data file.  (Not all, but most.)

(Also, a nitpick: with data_file_raw=off, the data file is not a raw 
image, yet the penultimate sentence still calls it one.)

When you write data to a qcow2 file with data_file, the data also goes 
to the data_file, most of the time.  The exception is when it can be 
handled with a metadata update, i.e. when it's a zero write or discard.

In addition, the fact that such updates (i.e. zero writes, I presume) do 
not reach the data file is usually a minor problem.  The real problem is 
that without data_file_raw, data clusters can be allocated anywhere in 
the data file, whereas with data_file_raw, they are allocated at their 
respective guest offset (i.e. the host offset always equals the guest 
offset).
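
To illustrate the placement difference, here is a toy Python model. It is not QEMU code: the class `ToyDataFile`, the 4-byte cluster size, and the dict standing in for the qcow2 cluster mapping are all assumptions made for the sketch, and it only handles whole, aligned clusters.

```python
CLUSTER = 4  # toy cluster size in bytes, chosen only for readability


class ToyDataFile:
    """Toy model of a qcow2 image with an external data file."""

    def __init__(self, size, raw):
        self.raw = raw
        self.mapping = {}  # guest cluster index -> host offset or "zero"
        # raw=on: the data file spans the whole virtual disk.
        # raw=off: the data file grows as clusters get allocated.
        self.data = bytearray(size if raw else 0)

    def write(self, guest_off, buf):
        """Write exactly one aligned cluster of guest data."""
        assert guest_off % CLUSTER == 0 and len(buf) == CLUSTER
        idx = guest_off // CLUSTER
        entry = self.mapping.get(idx)
        if self.raw:
            host_off = guest_off          # host offset == guest offset
        elif entry is None or entry == "zero":
            host_off = len(self.data)     # allocate wherever there is room
            self.data.extend(bytes(CLUSTER))
        else:
            host_off = entry              # reuse the existing allocation
        self.data[host_off:host_off + CLUSTER] = buf
        self.mapping[idx] = host_off

    def write_zeroes(self, guest_off):
        """Zero one aligned cluster."""
        idx = guest_off // CLUSTER
        if self.raw:
            # Forwarded to the data file so it stays a valid raw image.
            self.data[guest_off:guest_off + CLUSTER] = bytes(CLUSTER)
            self.mapping[idx] = guest_off
        else:
            # Metadata-only update; the old data stays in the data file.
            self.mapping[idx] = "zero"

    def guest_view(self, size):
        """Return what the guest reads, reconstructed via the mapping."""
        out = bytearray(size)
        for idx, entry in self.mapping.items():
            if entry != "zero":
                out[idx * CLUSTER:(idx + 1) * CLUSTER] = \
                    self.data[entry:entry + CLUSTER]
        return bytes(out)


def demo(raw):
    img = ToyDataFile(size=12, raw=raw)
    img.write(8, b"BBBB")
    img.write(0, b"AAAA")
    img.write_zeroes(8)
    return img


on, off = demo(raw=True), demo(raw=False)
print(bytes(on.data) == on.guest_view(12))   # data file reads as the disk
print(bytes(off.data), off.guest_view(12))   # data file needs the mapping
```

In the `raw=False` run the data file ends up holding the clusters in allocation order, including the stale cluster that was later zeroed, so it cannot be read on its own.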

I personally would have been fine with the first sentence, but if we 
want more of an explanation...  Perhaps:

<<EOF

If this option is set to ``on``, QEMU will always keep the external data 
file consistent as a standalone read-only raw image.

It does this by effectively forwarding all write accesses that happen to 
the qcow2 file to the raw data file, including their offsets. 
Therefore, data that is visible on the qcow2 node (i.e., to the guest) 
at some offset is visible at the same offset in the raw data file.

If this option is ``off``, QEMU will use the data file just to store 
data in an effectively arbitrary manner.  The file’s content will not 
make sense without the accompanying qcow2 metadata.  Where data is 
written will have no relation to its offset as seen by the guest, and 
some writes (specifically zero writes) may not be forwarded to the data 
file at all, but will only be handled by modifying qcow2 metadata.

In short: With data_file_raw, the data file reads as a valid raw VM 
image file.  Without it, its content can only be interpreted by reading 
the accompanying qcow2 metadata.

Note that this option only makes the data file valid as a read-only raw 
image.  You should not write to it, as this may effectively corrupt the 
qcow2 metadata (for example, dirty bitmaps may become out of sync).

EOF

This got longer than I wanted it to be.  Hm.  Anyway, what do you think?

Max

> +
> +    This option can only be enabled if ``data_file`` is set.
> +
>   ``Other``
>   
>     QEMU also supports various other image file formats for
>

Connor Kuehl May 3, 2021, 11:15 p.m. UTC | #2
On 4/30/21 9:45 AM, Max Reitz wrote:
>> +  ``data_file_raw``
>> +    If this option is set to ``on``, QEMU will always keep the external
>> +    data file consistent as a standalone read-only raw image. It does
>> +    this by forwarding updates through to the raw image in addition to
>> +    updating the image metadata. If set to ``off``, QEMU will only
>> +    update the image metadata without forwarding the changes through
>> +    to the raw image. The default value is ``off``.
> 
> Hm, what updates and what changes?  I mean, the first part makes sense (the “It does this by...”), but the second part doesn’t.  qemu will still forward most writes to the data file.  (Not all, but most.)
> 
> (Also, nit pick: With data_file_raw=off, the data file is not a raw image.  (You still call it that in the penultimate sentence.))
> When you write data to a qcow2 file with data_file, the data also goes to the data_file, most of the time.  The exception is when it can be handled with a metadata update, i.e. when it's a zero write or discard.
> 
> In addition, such updates (i.e. zero writes, I presume) not happening to the data file are usually a minor problem.  The real problem is that without data_file_raw, data clusters can be allocated anywhere in the data file, whereas with data_file_raw, they are allocated at their respective guest offset (i.e. the host offset always equals the guest offset).
> 
> I personally would have been fine with the first sentence, but if we want more of an explanation...  Perhaps:
> 
> <<EOF
> 
> If this option is set to ``on``, QEMU will always keep the external data file consistent as a standalone read-only raw image.
> 
> It does this by effectively forwarding all write accesses that happen to the qcow2 file to the raw data file, including their offsets. Therefore, data that is visible on the qcow2 node (i.e., to the guest) at some offset is visible at the same offset in the raw data file.
> 
> If this option is ``off``, QEMU will use the data file just to store data in an effectively arbitrary manner.  The file’s content will not make sense without the accompanying qcow2 metadata.  Where data is written will have no relation to its offset as seen by the guest, and some writes (specifically zero writes) may not be forwarded to the data file at all, but will only be handled by modifying qcow2 metadata.
> 
> In short: With data_file_raw, the data file reads as a valid raw VM image file.  Without it, its content can only be interpreted by reading the accompanying qcow2 metadata.
> 
> Note that this option only makes the data file valid as a read-only raw image.  You should not write to it, as this may effectively corrupt the qcow2 metadata (for example, dirty bitmaps may become out of sync).
> 
> EOF
> 
> This got longer than I wanted it to be.  Hm.  Anyway, what do you think?

I found it very helpful. I'll incorporate your explanation into the next
revision.

I'm wondering what the most appropriate trailer would be for the next
revision?

	Suggested-by: Max [..]
	Co-developed-by: Max [..]

Let me know if you have a strong preference, otherwise I'll go with
Suggested-by:

Thank you,

Connor

Max Reitz May 4, 2021, 7:46 a.m. UTC | #3
On 04.05.21 01:15, Connor Kuehl wrote:
> On 4/30/21 9:45 AM, Max Reitz wrote:
>>> +  ``data_file_raw``
>>> +    If this option is set to ``on``, QEMU will always keep the external
>>> +    data file consistent as a standalone read-only raw image. It does
>>> +    this by forwarding updates through to the raw image in addition to
>>> +    updating the image metadata. If set to ``off``, QEMU will only
>>> +    update the image metadata without forwarding the changes through
>>> +    to the raw image. The default value is ``off``.
>>
>> Hm, what updates and what changes?  I mean, the first part makes sense (the “It does this by...”), but the second part doesn’t.  qemu will still forward most writes to the data file.  (Not all, but most.)
>>
>> (Also, nit pick: With data_file_raw=off, the data file is not a raw image.  (You still call it that in the penultimate sentence.))
>> When you write data to a qcow2 file with data_file, the data also goes to the data_file, most of the time.  The exception is when it can be handled with a metadata update, i.e. when it's a zero write or discard.
>>
>> In addition, such updates (i.e. zero writes, I presume) not happening to the data file are usually a minor problem.  The real problem is that without data_file_raw, data clusters can be allocated anywhere in the data file, whereas with data_file_raw, they are allocated at their respective guest offset (i.e. the host offset always equals the guest offset).
>>
>> I personally would have been fine with the first sentence, but if we want more of an explanation...  Perhaps:
>>
>> <<EOF
>>
>> If this option is set to ``on``, QEMU will always keep the external data file consistent as a standalone read-only raw image.
>>
>> It does this by effectively forwarding all write accesses that happen to the qcow2 file to the raw data file, including their offsets. Therefore, data that is visible on the qcow2 node (i.e., to the guest) at some offset is visible at the same offset in the raw data file.
>>
>> If this option is ``off``, QEMU will use the data file just to store data in an effectively arbitrary manner.  The file’s content will not make sense without the accompanying qcow2 metadata.  Where data is written will have no relation to its offset as seen by the guest, and some writes (specifically zero writes) may not be forwarded to the data file at all, but will only be handled by modifying qcow2 metadata.
>>
>> In short: With data_file_raw, the data file reads as a valid raw VM image file.  Without it, its content can only be interpreted by reading the accompanying qcow2 metadata.
>>
>> Note that this option only makes the data file valid as a read-only raw image.  You should not write to it, as this may effectively corrupt the qcow2 metadata (for example, dirty bitmaps may become out of sync).
>>
>> EOF
>>
>> This got longer than I wanted it to be.  Hm.  Anyway, what do you think?
> 
> I found it very helpful. I'll incorporate your explanation into the next
> revision.
> 
> I'm wondering what the most appropriate trailer would be for the next
> revision?
> 
> 	Suggested-by: Max [..]
> 	Co-developed-by: Max [..]
> 
> Let me know if you have a strong preference, otherwise I'll go with
> Suggested-by:

I’m fine without any tag (if I merge this patch, it’ll get my S-o-b 
anyway :)), but if any, I’d probably go with a Suggested-by, yes.

Max

Patch

diff --git a/docs/tools/qemu-img.rst b/docs/tools/qemu-img.rst
index c9efcfaefc..87b4a65535 100644
--- a/docs/tools/qemu-img.rst
+++ b/docs/tools/qemu-img.rst
@@ -866,6 +866,26 @@  Supported image file formats:
     issue ``lsattr filename`` to check if the NOCOW flag is set or not
     (Capital 'C' is NOCOW flag).
 
+  ``data_file``
+    Filename where all guest data will be stored. If this option is used,
+    the qcow2 file will only contain the image's metadata.
+
+    Note: Data loss will occur if the given filename already exists when
+    using this option with ``qemu-img create`` since ``qemu-img`` will create
+    the data file anew, overwriting the file's original contents. To simply
+    update the reference to point to the given pre-existing file, use
+    ``qemu-img amend``.
+
+  ``data_file_raw``
+    If this option is set to ``on``, QEMU will always keep the external
+    data file consistent as a standalone read-only raw image. It does
+    this by forwarding updates through to the raw image in addition to
+    updating the image metadata. If set to ``off``, QEMU will only
+    update the image metadata without forwarding the changes through
+    to the raw image. The default value is ``off``.
+
+    This option can only be enabled if ``data_file`` is set.
+
 ``Other``
 
   QEMU also supports various other image file formats for