diff mbox

[RFC,2/2] block: Warn on insecure format probing

Message ID 1414512220-19058-3-git-send-email-armbru@redhat.com
State New
Headers show

Commit Message

Markus Armbruster Oct. 28, 2014, 4:03 p.m. UTC
If the user neglects to specify the image format, QEMU probes the
image to guess it automatically, for convenience.

Relying on format probing is insecure for raw images (CVE-2008-2004).
If the guest writes a suitable header to the device, the next probe
will recognize a format chosen by the guest.  A malicious guest can
abuse this to gain access to host files, e.g. by crafting a QCOW2
header with backing file /etc/shadow.

Commit 1e72d3b (April 2008) provided -drive parameter format to let
users disable probing.  Commit f965509 (March 2009) extended QCOW2 to
optionally store the backing file format, to let users disable backing
file probing.  QED has had a flag to suppress probing since the
beginning (2010), set whenever a raw backing file is assigned.

Despite all this work (and time!), we're still insecure by default.  I
think we're doing our users a disservice by sticking to the fatally
flawed probing.  "Broken by default" is just wrong, and "convenience"
is no excuse.

I believe we can retain 90% of the convenience without compromising
security by keying on image file name instead of image contents: if
the file name ends in .img or .iso, assume raw, if it ends in .qcow2,
assume qcow2, and so forth.

Naturally, this would break command lines where the filename doesn't
provide the correct clue.  So don't do it just yet, only warn if the
the change would lead to a different result.  Looks like this:

    qemu: -drive file=my.img: warning: insecure format probing of image 'my.img'
    To get rid of this warning, specify format=qcow2 explicitly, or change
    the image name to end with '.qcow2'

This should steer users away from insecure format probing.  After a
suitable grace period, we can hopefully drop format probing
alltogether.

Example 0: file=WHATEVER,format=F

Never warns, because the explicit format suppresses probing.

Example 1: file=FOO.img

Warns when probing of FOO.img results in anything but raw.  In
particular, it warns when the guest just p0wned you.

Example 2: file=FOO.qcow2 with backing file name FOO.img and no
backing image format.

Warns when probing of FOO.qcow2 results in anything but qcow2, or
probing of FOO.img results in anything but raw.

This patch is RFC because of open questions:

* Should tools warn, too?  Probing isn't insecure there, but a "this
  may pick a different format in the future" warning may be
  appropriate.

* I didn't specify recognized extensions for formats "bochs", "cloop",
  "parallels", "vpc", because I have no idea which ones to recognize.

Additionally, some tests still need to be updated.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
---
 block.c                   | 46 ++++++++++++++++++++++++++++++++++++++++++++--
 block/dmg.c               |  1 +
 block/qcow.c              |  1 +
 block/qcow2.c             |  1 +
 block/qed.c               |  1 +
 block/raw_bsd.c           |  1 +
 block/vdi.c               |  1 +
 block/vhdx.c              |  1 +
 block/vmdk.c              |  1 +
 include/block/block_int.h |  1 +
 10 files changed, 53 insertions(+), 2 deletions(-)

Comments

Eric Blake Oct. 28, 2014, 5:02 p.m. UTC | #1
On 10/28/2014 10:03 AM, Markus Armbruster wrote:
> If the user neglects to specify the image format, QEMU probes the
> image to guess it automatically, for convenience.
> 
> Relying on format probing is insecure for raw images (CVE-2008-2004).
> If the guest writes a suitable header to the device, the next probe
> will recognize a format chosen by the guest.  A malicious guest can
> abuse this to gain access to host files, e.g. by crafting a QCOW2
> header with backing file /etc/shadow.
> 
> Commit 1e72d3b (April 2008) provided -drive parameter format to let
> users disable probing.  Commit f965509 (March 2009) extended QCOW2 to
> optionally store the backing file format, to let users disable backing
> file probing.  QED has had a flag to suppress probing since the
> beginning (2010), set whenever a raw backing file is assigned.

May also be worth mentioning libvirt CVE-2010-2237, CVE-2010-2238,
CVE-2010-2239, when libvirt was (mostly) patched to fix probing mishaps
by defaulting to refusing to probe by default.  And as you say, we keep
discovering variations on the theme (as recently as this year, I patched
a bug where libvirt accidentally probes during a drive-mirror operation
even when the user had explicitly marked an image as raw, although it
was not given a CVE because of libvirt's default to refuse to probe).

> 
> Despite all this work (and time!), we're still insecure by default.  I
> think we're doing our users a disservice by sticking to the fatally
> flawed probing.  "Broken by default" is just wrong, and "convenience"
> is no excuse.
> 
> I believe we can retain 90% of the convenience without compromising
> security by keying on image file name instead of image contents: if
> the file name ends in .img or .iso, assume raw, if it ends in .qcow2,
> assume qcow2, and so forth.
> 
> Naturally, this would break command lines where the filename doesn't
> provide the correct clue.  So don't do it just yet, only warn if the
> the change would lead to a different result.  Looks like this:
> 
>     qemu: -drive file=my.img: warning: insecure format probing of image 'my.img'
>     To get rid of this warning, specify format=qcow2 explicitly, or change
>     the image name to end with '.qcow2'
> 
> This should steer users away from insecure format probing.  After a
> suitable grace period, we can hopefully drop format probing
> alltogether.

s/alltogether/altogether/

> 
> Example 0: file=WHATEVER,format=F
> 
> Never warns, because the explicit format suppresses probing.

Good, because this is what libvirt tries to use most of the time.

> 
> Example 1: file=FOO.img
> 
> Warns when probing of FOO.img results in anything but raw.  In
> particular, it warns when the guest just p0wned you.

Warning will be a false positive if the user misnamed their
intentionally-created qcow2 file, but that's okay; the warning message
tells them how to fix their setup.  However, I agree that most users
pick .img or .iso for raw files, so when the probe finds something other
than raw that the warning is worth it; better to catch the real problems
in spite of the false negatives.

> 
> Example 2: file=FOO.qcow2 with backing file name FOO.img and no
> backing image format.
> 
> Warns when probing of FOO.qcow2 results in anything but qcow2, or
> probing of FOO.img results in anything but raw.

It's nice that it only warns the user when the probe came back with
confusing results, while maintaining the convenience factor (no warning
if probe matched expectations).  But this means the user is set up for
late failure.

Would it be any smarter to teach tools to automatically write in a
backing format for any image that probed correctly where the backing
format had not been previously recorded?  That is, if we are opening
FOO.qcow2 for writing and it currently claims backing file BAR.img and
no format, then before closing FOO.qcow2 we also write the backing
format of raw into the metadata.  That way, the next time we open
FOO.qcow2, we no longer have to probe BAR.img, and are ensured that we
are reading the backing file with the same format as previously. (When
opening FOO.qcow2 read-only, we don't have that luxury).  In other
words, while we may still need to probe the TOP level file every time it
is opened (for convenience for users that don't type the format on the
command line every time), we should at LEAST minimize probing backing
files to the just the first try.

Not shown - what is the error message when a file extension that
normally indicates a non-raw file comes up raw?  That is, if 'FOO.qcow2'
does not actually contain qcow2, is that also worth a warning to the
user?  What happens where we don't have a file extension (such as when
using block device names to hold qcow2 files, where there is no '.' in
the name)?

> 
> This patch is RFC because of open questions:
> 
> * Should tools warn, too?  Probing isn't insecure there, but a "this
>   may pick a different format in the future" warning may be
>   appropriate.

Yes.  For precedent, libvirt can be considered a tool on images for
certain operations, and libvirt has been warning about probing since 2010.

Also, while I agree that any tool that operates on ONLY one layer of an
image, without ever trying to chase backing chains, can't be tricked
into opening wrong files, I'm not sure I agree with the claim that
"probing isn't insecure" -

> 
> * I didn't specify recognized extensions for formats "bochs", "cloop",
>   "parallels", "vpc", because I have no idea which ones to recognize.

Fair enough.  If there are common enough extensions for a given format,
I'm sure people can request they be recognized as future patches.

> 
> Additionally, some tests still need to be updated.
> 
> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> ---
>  block.c                   | 46 ++++++++++++++++++++++++++++++++++++++++++++--
>  block/dmg.c               |  1 +
>  block/qcow.c              |  1 +
>  block/qcow2.c             |  1 +
>  block/qed.c               |  1 +
>  block/raw_bsd.c           |  1 +
>  block/vdi.c               |  1 +
>  block/vhdx.c              |  1 +
>  block/vmdk.c              |  1 +
>  include/block/block_int.h |  1 +
>  10 files changed, 53 insertions(+), 2 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 8da6e61..7a1c592 100644
> --- a/block.c
> +++ b/block.c
> @@ -674,10 +674,37 @@ static BlockDriver *bdrv_probe_all(const uint8_t *buf, int buf_size,
>      return drv;
>  }
>  
> +/*
> + * Guess image format from file name @fname
> + */
> +static BlockDriver *bdrv_guess_format(const char *fname)
> +{
> +    BlockDriver *d;
> +    int i, ext;
> +    size_t fn_len = strlen(fname);
> +    size_t ext_len;
> +
> +    QLIST_FOREACH(d, &bdrv_drivers, list) {
> +        for (i = 0; i < ARRAY_SIZE(d->fname_ext) && d->fname_ext[i]; i++) {
> +            ext_len = strlen(d->fname_ext[i]);
> +            if (fn_len <= ext_len) {
> +                continue;
> +            }
> +            ext = fn_len - ext_len;
> +            if (fname[ext - 1] == '.'
> +                && !strcmp(fname + ext, d->fname_ext[i])) {
> +                return d;

What happens if more than one format tends to pick the same extension?
For example, would you consider '.qcow' a typical extension for qcow2
files, even though it would probably match the older qcow driver first?...

> +++ b/include/block/block_int.h
> @@ -91,6 +91,7 @@ struct BlockDriver {
>  
>      int (*bdrv_probe)(const uint8_t *buf, int buf_size, const char *filename);
>      int (*bdrv_probe_device)(const char *filename);
> +    const char *fname_ext[2];

...Answering myself - so far, you haven't included any collision extensions.

Overall, I like the idea of this patch, but still wonder if we can do
better with the auto-amending of writable image metadata to record
anything that we ever learned from an initial probe.
Jeff Cody Oct. 28, 2014, 6:29 p.m. UTC | #2
On Tue, Oct 28, 2014 at 11:02:56AM -0600, Eric Blake wrote:
> On 10/28/2014 10:03 AM, Markus Armbruster wrote:
> > If the user neglects to specify the image format, QEMU probes the
> > image to guess it automatically, for convenience.
> > 
> > Relying on format probing is insecure for raw images (CVE-2008-2004).
> > If the guest writes a suitable header to the device, the next probe
> > will recognize a format chosen by the guest.  A malicious guest can
> > abuse this to gain access to host files, e.g. by crafting a QCOW2
> > header with backing file /etc/shadow.
> > 
> > Commit 1e72d3b (April 2008) provided -drive parameter format to let
> > users disable probing.  Commit f965509 (March 2009) extended QCOW2 to
> > optionally store the backing file format, to let users disable backing
> > file probing.  QED has had a flag to suppress probing since the
> > beginning (2010), set whenever a raw backing file is assigned.
> 
> May also be worth mentioning libvirt CVE-2010-2237, CVE-2010-2238,
> CVE-2010-2239, when libvirt was (mostly) patched to fix probing mishaps
> by defaulting to refusing to probe by default.  And as you say, we keep
> discovering variations on the theme (as recently as this year, I patched
> a bug where libvirt accidentally probes during a drive-mirror operation
> even when the user had explicitly marked an image as raw, although it
> was not given a CVE because of libvirt's default to refuse to probe).
> 
> > 
> > Despite all this work (and time!), we're still insecure by default.  I
> > think we're doing our users a disservice by sticking to the fatally
> > flawed probing.  "Broken by default" is just wrong, and "convenience"
> > is no excuse.
> > 
> > I believe we can retain 90% of the convenience without compromising
> > security by keying on image file name instead of image contents: if
> > the file name ends in .img or .iso, assume raw, if it ends in .qcow2,
> > assume qcow2, and so forth.
> > 
> > Naturally, this would break command lines where the filename doesn't
> > provide the correct clue.  So don't do it just yet, only warn if the
> > the change would lead to a different result.  Looks like this:
> > 
> >     qemu: -drive file=my.img: warning: insecure format probing of image 'my.img'
> >     To get rid of this warning, specify format=qcow2 explicitly, or change
> >     the image name to end with '.qcow2'
> > 
> > This should steer users away from insecure format probing.  After a
> > suitable grace period, we can hopefully drop format probing
> > alltogether.
> 
> s/alltogether/altogether/
> 
> > 
> > Example 0: file=WHATEVER,format=F
> > 
> > Never warns, because the explicit format suppresses probing.
> 
> Good, because this is what libvirt tries to use most of the time.
> 
> > 
> > Example 1: file=FOO.img
> > 
> > Warns when probing of FOO.img results in anything but raw.  In
> > particular, it warns when the guest just p0wned you.
> 
> Warning will be a false positive if the user misnamed their
> intentionally-created qcow2 file, but that's okay; the warning message
> tells them how to fix their setup.  However, I agree that most users
> pick .img or .iso for raw files, so when the probe finds something other
> than raw that the warning is worth it; better to catch the real problems
> in spite of the false negatives.
> 
> > 
> > Example 2: file=FOO.qcow2 with backing file name FOO.img and no
> > backing image format.
> > 
> > Warns when probing of FOO.qcow2 results in anything but qcow2, or
> > probing of FOO.img results in anything but raw.
> 
> It's nice that it only warns the user when the probe came back with
> confusing results, while maintaining the convenience factor (no warning
> if probe matched expectations).  But this means the user is set up for
> late failure.
> 
> Would it be any smarter to teach tools to automatically write in a
> backing format for any image that probed correctly where the backing
> format had not been previously recorded?  That is, if we are opening
> FOO.qcow2 for writing and it currently claims backing file BAR.img and
> no format, then before closing FOO.qcow2 we also write the backing
> format of raw into the metadata.  That way, the next time we open
> FOO.qcow2, we no longer have to probe BAR.img, and are ensured that we
> are reading the backing file with the same format as previously. (When
> opening FOO.qcow2 read-only, we don't have that luxury).  In other
> words, while we may still need to probe the TOP level file every time it
> is opened (for convenience for users that don't type the format on the
> command line every time), we should at LEAST minimize probing backing
> files to the just the first try.
> 
> Not shown - what is the error message when a file extension that
> normally indicates a non-raw file comes up raw?  That is, if 'FOO.qcow2'
> does not actually contain qcow2, is that also worth a warning to the
> user?  What happens where we don't have a file extension (such as when
> using block device names to hold qcow2 files, where there is no '.' in
> the name)?
> 
> > 
> > This patch is RFC because of open questions:
> > 
> > * Should tools warn, too?  Probing isn't insecure there, but a "this
> >   may pick a different format in the future" warning may be
> >   appropriate.
> 
> Yes.  For precedent, libvirt can be considered a tool on images for
> certain operations, and libvirt has been warning about probing since 2010.
> 

I think at least the invocation 'qemu-img info' should be exempt from
the warning; doing a format probe is arguably part of its intended
usage.

> Also, while I agree that any tool that operates on ONLY one layer of an
> image, without ever trying to chase backing chains, can't be tricked
> into opening wrong files, I'm not sure I agree with the claim that
> "probing isn't insecure" -
> 

Maybe we should draw the distinction at tools that write data?
Without a guest running, a tool that simply reads files should be safe
to probe.

> > 
> > * I didn't specify recognized extensions for formats "bochs", "cloop",
> >   "parallels", "vpc", because I have no idea which ones to recognize.
> 
> Fair enough.  If there are common enough extensions for a given format,
> I'm sure people can request they be recognized as future patches.
> 
> > 
> > Additionally, some tests still need to be updated.
> > 
> > Signed-off-by: Markus Armbruster <armbru@redhat.com>
> > ---
> >  block.c                   | 46 ++++++++++++++++++++++++++++++++++++++++++++--
> >  block/dmg.c               |  1 +
> >  block/qcow.c              |  1 +
> >  block/qcow2.c             |  1 +
> >  block/qed.c               |  1 +
> >  block/raw_bsd.c           |  1 +
> >  block/vdi.c               |  1 +
> >  block/vhdx.c              |  1 +
> >  block/vmdk.c              |  1 +
> >  include/block/block_int.h |  1 +
> >  10 files changed, 53 insertions(+), 2 deletions(-)
> > 
> > diff --git a/block.c b/block.c
> > index 8da6e61..7a1c592 100644
> > --- a/block.c
> > +++ b/block.c
> > @@ -674,10 +674,37 @@ static BlockDriver *bdrv_probe_all(const uint8_t *buf, int buf_size,
> >      return drv;
> >  }
> >  
> > +/*
> > + * Guess image format from file name @fname
> > + */
> > +static BlockDriver *bdrv_guess_format(const char *fname)
> > +{
> > +    BlockDriver *d;
> > +    int i, ext;
> > +    size_t fn_len = strlen(fname);
> > +    size_t ext_len;
> > +
> > +    QLIST_FOREACH(d, &bdrv_drivers, list) {
> > +        for (i = 0; i < ARRAY_SIZE(d->fname_ext) && d->fname_ext[i]; i++) {
> > +            ext_len = strlen(d->fname_ext[i]);
> > +            if (fn_len <= ext_len) {
> > +                continue;
> > +            }
> > +            ext = fn_len - ext_len;
> > +            if (fname[ext - 1] == '.'
> > +                && !strcmp(fname + ext, d->fname_ext[i])) {
> > +                return d;
> 
> What happens if more than one format tends to pick the same extension?
> For example, would you consider '.qcow' a typical extension for qcow2
> files, even though it would probably match the older qcow driver first?...
>

I think this could arguably end up being the case for VHD and VHDX
(i.e., both using .vhd).

I guess the question is, in the case of common extensions, should the
priority be on the probe, or on the extension?  With the way the code
is written, the priority is all going to depend on the order the
driver is registered, so it may or may not warn.

Currently, the code does a format probe, does an independent extension
lookup, and checks if the two agree.  Instead, would it be sufficient
to do the format probe, and then just verify the detected driver has a
compatible extension name?

> > +++ b/include/block/block_int.h
> > @@ -91,6 +91,7 @@ struct BlockDriver {
> >  
> >      int (*bdrv_probe)(const uint8_t *buf, int buf_size, const char *filename);
> >      int (*bdrv_probe_device)(const char *filename);
> > +    const char *fname_ext[2];
> 
> ...Answering myself - so far, you haven't included any collision extensions.
> 
> Overall, I like the idea of this patch, but still wonder if we can do
> better with the auto-amending of writable image metadata to record
> anything that we ever learned from an initial probe.
> 
> -- 
> Eric Blake   eblake redhat com    +1-919-301-3266
> Libvirt virtualization library http://libvirt.org
>
Jeff Cody Oct. 28, 2014, 6:33 p.m. UTC | #3
On Tue, Oct 28, 2014 at 05:03:40PM +0100, Markus Armbruster wrote:
> If the user neglects to specify the image format, QEMU probes the
> image to guess it automatically, for convenience.
> 
> Relying on format probing is insecure for raw images (CVE-2008-2004).
> If the guest writes a suitable header to the device, the next probe
> will recognize a format chosen by the guest.  A malicious guest can
> abuse this to gain access to host files, e.g. by crafting a QCOW2
> header with backing file /etc/shadow.
> 
> Commit 1e72d3b (April 2008) provided -drive parameter format to let
> users disable probing.  Commit f965509 (March 2009) extended QCOW2 to
> optionally store the backing file format, to let users disable backing
> file probing.  QED has had a flag to suppress probing since the
> beginning (2010), set whenever a raw backing file is assigned.
> 
> Despite all this work (and time!), we're still insecure by default.  I
> think we're doing our users a disservice by sticking to the fatally
> flawed probing.  "Broken by default" is just wrong, and "convenience"
> is no excuse.
> 
> I believe we can retain 90% of the convenience without compromising
> security by keying on image file name instead of image contents: if
> the file name ends in .img or .iso, assume raw, if it ends in .qcow2,
> assume qcow2, and so forth.
> 
> Naturally, this would break command lines where the filename doesn't
> provide the correct clue.  So don't do it just yet, only warn if the
> the change would lead to a different result.  Looks like this:
> 
>     qemu: -drive file=my.img: warning: insecure format probing of image 'my.img'
>     To get rid of this warning, specify format=qcow2 explicitly, or change
>     the image name to end with '.qcow2'
> 
> This should steer users away from insecure format probing.  After a
> suitable grace period, we can hopefully drop format probing
> alltogether.
> 
> Example 0: file=WHATEVER,format=F
> 
> Never warns, because the explicit format suppresses probing.
> 
> Example 1: file=FOO.img
> 
> Warns when probing of FOO.img results in anything but raw.  In
> particular, it warns when the guest just p0wned you.
> 
> Example 2: file=FOO.qcow2 with backing file name FOO.img and no
> backing image format.
> 
> Warns when probing of FOO.qcow2 results in anything but qcow2, or
> probing of FOO.img results in anything but raw.
> 
> This patch is RFC because of open questions:
> 
> * Should tools warn, too?  Probing isn't insecure there, but a "this
>   may pick a different format in the future" warning may be
>   appropriate.
> 
> * I didn't specify recognized extensions for formats "bochs", "cloop",
>   "parallels", "vpc", because I have no idea which ones to recognize.
>

Format 'vpc' should probably recognize both extensions "vpc" and
"vhd".  The actual format is VHD, so most MS tools will probably
create files with .vhd extensions.

(This creates an overlap with vhdx; see my response to Eric's email).

> Additionally, some tests still need to be updated.
> 
> Signed-off-by: Markus Armbruster <armbru@redhat.com>


[ ...]

> diff --git a/block/vhdx.c b/block/vhdx.c
> index 12bfe75..d2c3a20 100644
> --- a/block/vhdx.c
> +++ b/block/vhdx.c
> @@ -1945,6 +1945,7 @@ static BlockDriver bdrv_vhdx = {
>      .format_name            = "vhdx",
>      .instance_size          = sizeof(BDRVVHDXState),
>      .bdrv_probe             = vhdx_probe,
> +    .fname_ext              = { "vhd" },

This should also have "vhdx", I think.  

>      .bdrv_open              = vhdx_open,
>      .bdrv_close             = vhdx_close,
>      .bdrv_reopen_prepare    = vhdx_reopen_prepare,
> diff --git a/block/vmdk.c b/block/vmdk.c
> index 673d3f5..91a42d2 100644
> --- a/block/vmdk.c
> +++ b/block/vmdk.c
> @@ -2225,6 +2225,7 @@ static BlockDriver bdrv_vmdk = {
>      .format_name                  = "vmdk",
>      .instance_size                = sizeof(BDRVVmdkState),
>      .bdrv_probe                   = vmdk_probe,
> +    .fname_ext                    = { "vdmk" },
>      .bdrv_open                    = vmdk_open,
>      .bdrv_check                   = vmdk_check,
>      .bdrv_reopen_prepare          = vmdk_reopen_prepare,
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 8898c6c..55609ae 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -91,6 +91,7 @@ struct BlockDriver {
>  
>      int (*bdrv_probe)(const uint8_t *buf, int buf_size, const char *filename);
>      int (*bdrv_probe_device)(const char *filename);
> +    const char *fname_ext[2];
>  
>      /* Any driver implementing this callback is expected to be able to handle
>       * NULL file names in its .bdrv_open() implementation */
> -- 
> 1.9.3
>
Eric Blake Oct. 28, 2014, 6:56 p.m. UTC | #4
On 10/28/2014 12:29 PM, Jeff Cody wrote:

>>> This patch is RFC because of open questions:
>>>
>>> * Should tools warn, too?  Probing isn't insecure there, but a "this
>>>   may pick a different format in the future" warning may be
>>>   appropriate.
>>
>> Yes.  For precedent, libvirt can be considered a tool on images for
>> certain operations, and libvirt has been warning about probing since 2010.
>>
> 
> I think at least the invocation 'qemu-img info' should be exempt from
> the warning; doing a format probe is arguably part of its intended
> usage.
> 
>> Also, while I agree that any tool that operates on ONLY one layer of an
>> image, without ever trying to chase backing chains, can't be tricked
>> into opening wrong files, I'm not sure I agree with the claim that
>> "probing isn't insecure" -
>>
> 
> Maybe we should draw the distinction at tools that write data?
> Without a guest running, a tool that simply reads files should be safe
> to probe.

Misprobing a top-level raw file as qcow2 can result in opening and
reading a backing file, even when the top-level file was opened with
read-only intent.  If the guest can stick some sort of /proc filesystem
name as a qcow2 backing file for interpretation for a bogus probe of a
raw file, you can result in hanging the process in trying to read the
backing file.  Even if you aren't leaking data about what was read, this
could still possibly constitute a denial of service attack.

I was about to propose these two rules as something I'd still feel more
comfortable with:

if it is the top-level file, then warn for read-write access doing a
probe where the probe differed from filename heuristics, be silent for
read-only access doing a probe (whether or not the file claims to have a
backing image)

if it is chasing the backing chain (necessarily read-only access of the
backing), then warn if the backing format was not specified and the
probe differs from filename heuristics

But that still has the drawback that if the backing file is some /proc
name that will cause the process to hang, you don't want to print the
message until after you read the file to discover that the probe
differed from heuristics, but it is doing the open/read that determines
the hang.  So I don't see an elegant way to break the chicken-and-egg
problem.


>> What happens if more than one format tends to pick the same extension?
>> For example, would you consider '.qcow' a typical extension for qcow2
>> files, even though it would probably match the older qcow driver first?...
>>
> 
> I think this could arguably end up being the case for VHD and VHDX
> (i.e., both using .vhd).
> 
> I guess the question is, in the case of common extensions, should the
> priority be on the probe, or on the extension?  With the way the code
> is written, the priority is all going to depend on the order the
> driver is registered, so it may or may not warn.

Technically, don't we correctly probe both VHD and VHDX files?  It is
only files that start out raw and later get mis-probed as non-raw where
we have an issue, so I'd rather treat the probe as accurate if it
matches a common extension for that format, and NOT treat the extension
as dictating a single required format.

> 
> Currently, the code does a format probe, does an independent extension
> lookup, and checks if the two agree.  Instead, would it be sufficient
> to do the format probe, and then just verify the detected driver has a
> compatible extension name?

Yes, that was what I was thinking as well.
Jeff Cody Oct. 28, 2014, 7:42 p.m. UTC | #5
On Tue, Oct 28, 2014 at 12:56:37PM -0600, Eric Blake wrote:
> On 10/28/2014 12:29 PM, Jeff Cody wrote:
> 
> >>> This patch is RFC because of open questions:
> >>>
> >>> * Should tools warn, too?  Probing isn't insecure there, but a "this
> >>>   may pick a different format in the future" warning may be
> >>>   appropriate.
> >>
> >> Yes.  For precedent, libvirt can be considered a tool on images for
> >> certain operations, and libvirt has been warning about probing since 2010.
> >>
> > 
> > I think at least the invocation 'qemu-img info' should be exempt from
> > the warning; doing a format probe is arguably part of its intended
> > usage.
> > 
> >> Also, while I agree that any tool that operates on ONLY one layer of an
> >> image, without ever trying to chase backing chains, can't be tricked
> >> into opening wrong files, I'm not sure I agree with the claim that
> >> "probing isn't insecure" -
> >>
> > 
> > Maybe we should draw the distinction at tools that write data?
> > Without a guest running, a tool that simply reads files should be safe
> > to probe.
> 
> Misprobing a top-level raw file as qcow2 can result in opening and
> reading a backing file, even when the top-level file was opened with
> read-only intent.  If the guest can stick some sort of /proc filesystem
> name as a qcow2 backing file for interpretation for a bogus probe of a
> raw file, you can result in hanging the process in trying to read the
> backing file.  Even if you aren't leaking data about what was read, this
> could still possibly constitute a denial of service attack.
>

True, but the warning doesn't prevent the probe.  My thinking is that
if I am running 'qemu-img info' without specifying a format, I
explicitly want the probe (how else to determine the format of a .img
file, or other generic file/device?)

But I am not hung up on this; a warning won't negate the usefulness of
'qemu-img info', so if others feel it is useful in that usage case, it
is OK with me.

> I was about to propose these two rules as something I'd still feel more
> comfortable with:
> 
> if it is the top-level file, then warn for read-write access doing a
> probe where the probe differed from filename heuristics, be silent for
> read-only access doing a probe (whether or not the file claims to have a
> backing image)
> 
> if it is chasing the backing chain (necessarily read-only access of the
> backing), then warn if the backing format was not specified and the
> probe differs from filename heuristics
>

It'd also be nice if there was something that indicated the tree depth
the warning was from - it may be confusing for the user if they run a
qemu command on 'image.qcow2', and get a warning because a backing
file image in the chain just had a generic '.img' extension.

> But that still has the drawback that if the backing file is some /proc
> name that will cause the process to hang, you don't want to print the
> message until after you read the file to discover that the probe
> differed from heuristics, but it is doing the open/read that determines
> the hang.  So I don't see an elegant way to break the chicken-and-egg
> problem.
> 
> 
> >> What happens if more than one format tends to pick the same extension?
> >> For example, would you consider '.qcow' a typical extension for qcow2
> >> files, even though it would probably match the older qcow driver first?...
> >>
> > 
> > I think this could arguably end up being the case for VHD and VHDX
> > (i.e., both using .vhd).
> > 
> > I guess the question is, in the case of common extensions, should the
> > priority be on the probe, or on the extension?  With the way the code
> > is written, the priority is all going to depend on the order the
> > driver is registered, so it may or may not warn.
> 
> Technically, don't we correctly probe both VHD and VHDX files?  It is
> only files that start out raw and later get mis-probed as non-raw where
> we have an issue, so I'd rather treat the probe as accurate if it
> matches a common extension for that format, and NOT treat the extension
> as dictating a single required format.
>
> > 
> > Currently, the code does a format probe, does an independent extension
> > lookup, and checks if the two agree.  Instead, would it be sufficient
> > to do the format probe, and then just verify the detected driver has a
> > compatible extension name?
> 
> Yes, that was what I was thinking as well.
>
Fam Zheng Oct. 29, 2014, 1:16 a.m. UTC | #6
On Tue, 10/28 17:03, Markus Armbruster wrote:
> diff --git a/block/vmdk.c b/block/vmdk.c
> index 673d3f5..91a42d2 100644
> --- a/block/vmdk.c
> +++ b/block/vmdk.c
> @@ -2225,6 +2225,7 @@ static BlockDriver bdrv_vmdk = {
>      .format_name                  = "vmdk",
>      .instance_size                = sizeof(BDRVVmdkState),
>      .bdrv_probe                   = vmdk_probe,
> +    .fname_ext                    = { "vdmk" },

s/vdmk/vmdk/

Fam

>      .bdrv_open                    = vmdk_open,
>      .bdrv_check                   = vmdk_check,
>      .bdrv_reopen_prepare          = vmdk_reopen_prepare,
Markus Armbruster Oct. 29, 2014, 6:32 a.m. UTC | #7
Fam Zheng <famz@redhat.com> writes:

> On Tue, 10/28 17:03, Markus Armbruster wrote:
>> diff --git a/block/vmdk.c b/block/vmdk.c
>> index 673d3f5..91a42d2 100644
>> --- a/block/vmdk.c
>> +++ b/block/vmdk.c
>> @@ -2225,6 +2225,7 @@ static BlockDriver bdrv_vmdk = {
>>      .format_name                  = "vmdk",
>>      .instance_size                = sizeof(BDRVVmdkState),
>>      .bdrv_probe                   = vmdk_probe,
>> +    .fname_ext                    = { "vdmk" },
>
> s/vdmk/vmdk/

D'oh!  Thanks :)

>
> Fam
>
>>      .bdrv_open                    = vmdk_open,
>>      .bdrv_check                   = vmdk_check,
>>      .bdrv_reopen_prepare          = vmdk_reopen_prepare,
Markus Armbruster Oct. 29, 2014, 6:37 a.m. UTC | #8
Jeff Cody <jcody@redhat.com> writes:

> On Tue, Oct 28, 2014 at 05:03:40PM +0100, Markus Armbruster wrote:
>> If the user neglects to specify the image format, QEMU probes the
>> image to guess it automatically, for convenience.
>> 
>> Relying on format probing is insecure for raw images (CVE-2008-2004).
>> If the guest writes a suitable header to the device, the next probe
>> will recognize a format chosen by the guest.  A malicious guest can
>> abuse this to gain access to host files, e.g. by crafting a QCOW2
>> header with backing file /etc/shadow.
>> 
>> Commit 1e72d3b (April 2008) provided -drive parameter format to let
>> users disable probing.  Commit f965509 (March 2009) extended QCOW2 to
>> optionally store the backing file format, to let users disable backing
>> file probing.  QED has had a flag to suppress probing since the
>> beginning (2010), set whenever a raw backing file is assigned.
>> 
>> Despite all this work (and time!), we're still insecure by default.  I
>> think we're doing our users a disservice by sticking to the fatally
>> flawed probing.  "Broken by default" is just wrong, and "convenience"
>> is no excuse.
>> 
>> I believe we can retain 90% of the convenience without compromising
>> security by keying on image file name instead of image contents: if
>> the file name ends in .img or .iso, assume raw, if it ends in .qcow2,
>> assume qcow2, and so forth.
>> 
>> Naturally, this would break command lines where the filename doesn't
>> provide the correct clue.  So don't do it just yet, only warn if the
>> the change would lead to a different result.  Looks like this:
>> 
>>     qemu: -drive file=my.img: warning: insecure format probing of image 'my.img'
>>     To get rid of this warning, specify format=qcow2 explicitly, or change
>>     the image name to end with '.qcow2'
>> 
>> This should steer users away from insecure format probing.  After a
>> suitable grace period, we can hopefully drop format probing
>> alltogether.
>> 
>> Example 0: file=WHATEVER,format=F
>> 
>> Never warns, because the explicit format suppresses probing.
>> 
>> Example 1: file=FOO.img
>> 
>> Warns when probing of FOO.img results in anything but raw.  In
>> particular, it warns when the guest just p0wned you.
>> 
>> Example 2: file=FOO.qcow2 with backing file name FOO.img and no
>> backing image format.
>> 
>> Warns when probing of FOO.qcow2 results in anything but qcow2, or
>> probing of FOO.img results in anything but raw.
>> 
>> This patch is RFC because of open questions:
>> 
>> * Should tools warn, too?  Probing isn't insecure there, but a "this
>>   may pick a different format in the future" warning may be
>>   appropriate.
>> 
>> * I didn't specify recognized extensions for formats "bochs", "cloop",
>>   "parallels", "vpc", because I have no idea which ones to recognize.
>>
>
> Format 'vpc' should probably recognize both extensions "vpc" and
> "vhd".  The actual format is VHD, so most MS tools will probably
> create files with .vhd extensions.
>
> (This creates an overlap with vhdx; see my response to Eric's email).

Going to discuss it there.

>> Additionally, some tests still need to be updated.
>> 
>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>
>
> [ ...]
>
>> diff --git a/block/vhdx.c b/block/vhdx.c
>> index 12bfe75..d2c3a20 100644
>> --- a/block/vhdx.c
>> +++ b/block/vhdx.c
>> @@ -1945,6 +1945,7 @@ static BlockDriver bdrv_vhdx = {
>>      .format_name            = "vhdx",
>>      .instance_size          = sizeof(BDRVVHDXState),
>>      .bdrv_probe             = vhdx_probe,
>> +    .fname_ext              = { "vhd" },
>
> This should also have "vhdx", I think.  

Okay.  I looked for confirmation in Wikipedia, and found:

    Hyper-V, like Microsoft Virtual Server and Windows Virtual PC, saves
    each guest OS to a single virtual hard disk file with the extension
    .VHD, except in Windows 8 and Windows Server 2012 where it can be
    the newer .vhdx.

https://en.wikipedia.org/wiki/Hyper-V#VHD_compatibility_with_Virtual_Server_2005_and_Virtual_PC_2004.2F2007

Makes me wonder whether .vhd is really used for both vhdx and vpc format
images.  What have you seen in the wild?

>>      .bdrv_open              = vhdx_open,
>>      .bdrv_close             = vhdx_close,
>>      .bdrv_reopen_prepare    = vhdx_reopen_prepare,
[...]
Markus Armbruster Oct. 29, 2014, 7:03 a.m. UTC | #9
Eric Blake <eblake@redhat.com> writes:

> On 10/28/2014 10:03 AM, Markus Armbruster wrote:
>> If the user neglects to specify the image format, QEMU probes the
>> image to guess it automatically, for convenience.
>> 
>> Relying on format probing is insecure for raw images (CVE-2008-2004).
>> If the guest writes a suitable header to the device, the next probe
>> will recognize a format chosen by the guest.  A malicious guest can
>> abuse this to gain access to host files, e.g. by crafting a QCOW2
>> header with backing file /etc/shadow.
>> 
>> Commit 1e72d3b (April 2008) provided -drive parameter format to let
>> users disable probing.  Commit f965509 (March 2009) extended QCOW2 to
>> optionally store the backing file format, to let users disable backing
>> file probing.  QED has had a flag to suppress probing since the
>> beginning (2010), set whenever a raw backing file is assigned.
>
> May also be worth mentioning libvirt CVE-2010-2237, CVE-2010-2238,
> CVE-2010-2239, when libvirt was (mostly) patched to fix probing mishaps
> by defaulting to refusing to probe by default.  And as you say, we keep
> discovering variations on the theme (as recently as this year, I patched
> a bug where libvirt accidentally probes during a drive-mirror operation
> even when the user had explicitly marked an image as raw, although it
> was not given a CVE because of libvirt's default to refuse to probe).

I'll work that in, thanks.

>> Despite all this work (and time!), we're still insecure by default.  I
>> think we're doing our users a disservice by sticking to the fatally
>> flawed probing.  "Broken by default" is just wrong, and "convenience"
>> is no excuse.
>> 
>> I believe we can retain 90% of the convenience without compromising
>> security by keying on image file name instead of image contents: if
>> the file name ends in .img or .iso, assume raw, if it ends in .qcow2,
>> assume qcow2, and so forth.
>> 
>> Naturally, this would break command lines where the filename doesn't
>> provide the correct clue.  So don't do it just yet, only warn if the
>> the change would lead to a different result.  Looks like this:
>> 
>>     qemu: -drive file=my.img: warning: insecure format probing of
>> image 'my.img'
>>     To get rid of this warning, specify format=qcow2 explicitly, or change
>>     the image name to end with '.qcow2'
>> 
>> This should steer users away from insecure format probing.  After a
>> suitable grace period, we can hopefully drop format probing
>> alltogether.
>
> s/alltogether/altogether/

Okay.

>> Example 0: file=WHATEVER,format=F
>> 
>> Never warns, because the explicit format suppresses probing.
>
> Good, because this is what libvirt tries to use most of the time.
>
>> 
>> Example 1: file=FOO.img
>> 
>> Warns when probing of FOO.img results in anything but raw.  In
>> particular, it warns when the guest just p0wned you.
>
> Warning will be a false positive if the user misnamed their
> intentionally-created qcow2 file, but that's okay; the warning message
> tells them how to fix their setup.

I consider that a feature :)

>                                     However, I agree that most users
> pick .img or .iso for raw files, so when the probe finds something other
> than raw that the warning is worth it; better to catch the real problems
> in spite of the false negatives.

Positives, you mean.

>> Example 2: file=FOO.qcow2 with backing file name FOO.img and no
>> backing image format.
>> 
>> Warns when probing of FOO.qcow2 results in anything but qcow2, or
>> probing of FOO.img results in anything but raw.
>
> It's nice that it only warns the user when the probe came back with
> confusing results, while maintaining the convenience factor (no warning
> if probe matched expectations).  But this means the user is set up for
> late failure.
>
> Would it be any smarter to teach tools to automatically write in a
> backing format for any image that probed correctly where the backing
> format had not been previously recorded?  That is, if we are opening
> FOO.qcow2 for writing and it currently claims backing file BAR.img and
> no format, then before closing FOO.qcow2 we also write the backing
> format of raw into the metadata.  That way, the next time we open
> FOO.qcow2, we no longer have to probe BAR.img, and are ensured that we
> are reading the backing file with the same format as previously. (When
> opening FOO.qcow2 read-only, we don't have that luxury).  In other
> words, while we may still need to probe the TOP level file every time it
> is opened (for convenience for users that don't type the format on the
> command line every time), we should at LEAST minimize probing backing
> files to the just the first try.

Three separate cases, I think:

1. Creating an image without an explicitly specified backing file format

   Currently, we don't store a backing file format then.  You're
   proposing to record the probed format.  Example: qemu-img create.

2. Changing an image's backing file without explicitly specifying format

   Currently, we don't store a backing file format then.  You're
   proposing to record the probed format.  Example: qemu-img rebase

3. Opening an image read/write

   We don't mess with the stored backing file format then.  You're
   proposing to record the probed format.

Opinions?

> Not shown - what is the error message when a file extension that
> normally indicates a non-raw file comes up raw?  That is, if 'FOO.qcow2'
> does not actually contain qcow2, is that also worth a warning to the
> user?

This is the first of two possible warnings in example 2.

>        What happens where we don't have a file extension (such as when
> using block device names to hold qcow2 files, where there is no '.' in
> the name)?

Such names give no clue (in coding terms: extensions are matched
including the '.').  Therefore, the user has to specify the format.

We could use stat() in addition to the filename.  Say, block and
character devices are assumed raw, everything else is guessed from the
filename.  That way, the user has to specify the format only when he
uses devices in unusual ways.  Opinions?

>> This patch is RFC because of open questions:
>> 
>> * Should tools warn, too?  Probing isn't insecure there, but a "this
>>   may pick a different format in the future" warning may be
>>   appropriate.
>
> Yes.  For precedent, libvirt can be considered a tool on images for
> certain operations, and libvirt has been warning about probing since 2010.
>
> Also, while I agree that any tool that operates on ONLY one layer of an
> image, without ever trying to chase backing chains, can't be tricked
> into opening wrong files, I'm not sure I agree with the claim that
> "probing isn't insecure" -

I used a narrow definition of "insecure" here, namely the exact
CVE-2008-2004.  I shouldn't.

>> * I didn't specify recognized extensions for formats "bochs", "cloop",
>>   "parallels", "vpc", because I have no idea which ones to recognize.
>
> Fair enough.  If there are common enough extensions for a given format,
> I'm sure people can request they be recognized as future patches.

Or in review of this one :)

>> Additionally, some tests still need to be updated.
>> 
>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>> ---
>>  block.c                   | 46 ++++++++++++++++++++++++++++++++++++++++++++--
>>  block/dmg.c               |  1 +
>>  block/qcow.c              |  1 +
>>  block/qcow2.c             |  1 +
>>  block/qed.c               |  1 +
>>  block/raw_bsd.c           |  1 +
>>  block/vdi.c               |  1 +
>>  block/vhdx.c              |  1 +
>>  block/vmdk.c              |  1 +
>>  include/block/block_int.h |  1 +
>>  10 files changed, 53 insertions(+), 2 deletions(-)
>> 
>> diff --git a/block.c b/block.c
>> index 8da6e61..7a1c592 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -674,10 +674,37 @@ static BlockDriver *bdrv_probe_all(const uint8_t *buf, int buf_size,
>>      return drv;
>>  }
>>  
>> +/*
>> + * Guess image format from file name @fname
>> + */
>> +static BlockDriver *bdrv_guess_format(const char *fname)
>> +{
>> +    BlockDriver *d;
>> +    int i, ext;
>> +    size_t fn_len = strlen(fname);
>> +    size_t ext_len;
>> +
>> +    QLIST_FOREACH(d, &bdrv_drivers, list) {
>> +        for (i = 0; i < ARRAY_SIZE(d->fname_ext) && d->fname_ext[i]; i++) {
>> +            ext_len = strlen(d->fname_ext[i]);
>> +            if (fn_len <= ext_len) {
>> +                continue;
>> +            }
>> +            ext = fn_len - ext_len;
>> +            if (fname[ext - 1] == '.'
>> +                && !strcmp(fname + ext, d->fname_ext[i])) {
>> +                return d;
>
> What happens if more than one format tends to pick the same extension?
> For example, would you consider '.qcow' a typical extension for qcow2
> files, even though it would probably match the older qcow driver first?...

First one wins.  Therefore, adding the same extension to multiple
formats make no sense with this code.

>> +++ b/include/block/block_int.h
>> @@ -91,6 +91,7 @@ struct BlockDriver {
>>  
>>      int (*bdrv_probe)(const uint8_t *buf, int buf_size, const char
>> *filename);
>>      int (*bdrv_probe_device)(const char *filename);
>> +    const char *fname_ext[2];
>
> ...Answering myself - so far, you haven't included any collision extensions.
>
> Overall, I like the idea of this patch, but still wonder if we can do
> better with the auto-amending of writable image metadata to record
> anything that we ever learned from an initial probe.

Thanks for your thoughtful review!
Markus Armbruster Oct. 29, 2014, 7:22 a.m. UTC | #10
Eric Blake <eblake@redhat.com> writes:

> On 10/28/2014 12:29 PM, Jeff Cody wrote:
[...]
>>> What happens if more than one format tends to pick the same extension?
>>> For example, would you consider '.qcow' a typical extension for qcow2
>>> files, even though it would probably match the older qcow driver first?...
>>>
>> 
>> I think this could arguably end up being the case for VHD and VHDX
>> (i.e., both using .vhd).
>> 
>> I guess the question is, in the case of common extensions, should the
>> priority be on the probe, or on the extension?  With the way the code
>> is written, the priority is all going to depend on the order the
>> driver is registered, so it may or may not warn.
>
> Technically, don't we correctly probe both VHD and VHDX files?  It is
> only files that start out raw and later get mis-probed as non-raw where
> we have an issue, so I'd rather treat the probe as accurate if it
> matches a common extension for that format, and NOT treat the extension
> as dictating a single required format.
>
>> 
>> Currently, the code does a format probe, does an independent extension
>> lookup, and checks if the two agree.  Instead, would it be sufficient
>> to do the format probe, and then just verify the detected driver has a
>> compatible extension name?
>
> Yes, that was what I was thinking as well.

I designed the code with the eventual removal of probing in mind:

    This should steer users away from insecure format probing.  After a
    suitable grace period, we can hopefully drop format probing
    alltogether.

The warning triggers on insecure probing.  That's one view of it.  The
other view is it triggers when potentially insecure probing and our new
secure guessing come up with different answers.  It serves as warning of
future behavioral change.

If we only check the extension matches the probing, we gain support for
ambiguous file name extensions, but make the future change warning
incomplete.  That's bad.

If .vhd can really be two formats so different we actually want to
separate block drivers for them, we want something more sophisticated
than my patch.  Need to think.
Markus Armbruster Oct. 29, 2014, 7:36 a.m. UTC | #11
Jeff Cody <jcody@redhat.com> writes:

> On Tue, Oct 28, 2014 at 12:56:37PM -0600, Eric Blake wrote:
>> On 10/28/2014 12:29 PM, Jeff Cody wrote:
>> 
>> >>> This patch is RFC because of open questions:
>> >>>
>> >>> * Should tools warn, too?  Probing isn't insecure there, but a "this
>> >>>   may pick a different format in the future" warning may be
>> >>>   appropriate.
>> >>
>> >> Yes.  For precedent, libvirt can be considered a tool on images for
>> >> certain operations, and libvirt has been warning about probing since 2010.
>> >>
>> > 
>> > I think at least the invocation 'qemu-img info' should be exempt from
>> > the warning; doing a format probe is arguably part of its intended
>> > usage.
>> > 
>> >> Also, while I agree that any tool that operates on ONLY one layer of an
>> >> image, without ever trying to chase backing chains, can't be tricked
>> >> into opening wrong files, I'm not sure I agree with the claim that
>> >> "probing isn't insecure" -
>> >>
>> > 
>> > Maybe we should draw the distinction at tools that write data?
>> > Without a guest running, a tool that simply reads files should be safe
>> > to probe.
>> 
>> Misprobing a top-level raw file as qcow2 can result in opening and
>> reading a backing file, even when the top-level file was opened with
>> read-only intent.  If the guest can stick some sort of /proc filesystem
>> name as a qcow2 backing file for interpretation for a bogus probe of a
>> raw file, you can result in hanging the process in trying to read the
>> backing file.  Even if you aren't leaking data about what was read, this
>> could still possibly constitute a denial of service attack.
>>
>
> True, but the warning doesn't prevent the probe.  My thinking is that
> if I am running 'qemu-img info' without specifying a format, I
> explicitly want the probe (how else to determine the format of a .img
> file, or other generic file/device?)
>
> But I am not hung up on this; a warning won't negate the usefulness of
> 'qemu-img info', so if others feel it is useful in that usage case, it
> is OK with me.

As far as I can tell, "qemu-img info" doesn't probe the backing file.

I'd prefer not to warn there.

>> I was about to propose these two rules as something I'd still feel more
>> comfortable with:
>> 
>> if it is the top-level file, then warn for read-write access doing a
>> probe where the probe differed from filename heuristics, be silent for
>> read-only access doing a probe (whether or not the file claims to have a
>> backing image)
>> 
>> if it is chasing the backing chain (necessarily read-only access of the
>> backing), then warn if the backing format was not specified and the
>> probe differs from filename heuristics

Have you considered the "warn of future change" role?

> It'd also be nice if there was something that indicated the tree depth
> the warning was from - it may be confusing for the user if they run a
> qemu command on 'image.qcow2', and get a warning because a backing
> file image in the chain just had a generic '.img' extension.

This is how it looks now:

    qemu: -drive file=flawed.img,if=none: warning: insecure format probing of image 'flawed.img'
    To get rid of this warning, specify format=qcow2 explicitly, or change
    the image name to end with '.qcow2'
    qemu: -drive file=flawed.img,if=none: warning: insecure format probing of image 'backing.img'
    To get rid of this warning, specify format=qcow2 explicitly, or change
    the image name to end with '.qcow2'

Would be less clear with a differently named backing file.  Could you
sketch what you'd like to see?

>> But that still has the drawback that if the backing file is some /proc
>> name that will cause the process to hang, you don't want to print the
>> message until after you read the file to discover that the probe
>> differed from heuristics, but it is doing the open/read that determines
>> the hang.  So I don't see an elegant way to break the chicken-and-egg
>> problem.

Probing needs to die.  Leave it to file(1).

[...]
Max Reitz Oct. 29, 2014, 8:25 a.m. UTC | #12
On 2014-10-29 at 08:36, Markus Armbruster wrote:
> Jeff Cody <jcody@redhat.com> writes:
>
>> On Tue, Oct 28, 2014 at 12:56:37PM -0600, Eric Blake wrote:
>>> On 10/28/2014 12:29 PM, Jeff Cody wrote:
>>>
>>>>>> This patch is RFC because of open questions:
>>>>>>
>>>>>> * Should tools warn, too?  Probing isn't insecure there, but a "this
>>>>>>    may pick a different format in the future" warning may be
>>>>>>    appropriate.
>>>>> Yes.  For precedent, libvirt can be considered a tool on images for
>>>>> certain operations, and libvirt has been warning about probing since 2010.
>>>>>
>>>> I think at least the invocation 'qemu-img info' should be exempt from
>>>> the warning; doing a format probe is arguably part of its intended
>>>> usage.
>>>>
>>>>> Also, while I agree that any tool that operates on ONLY one layer of an
>>>>> image, without ever trying to chase backing chains, can't be tricked
>>>>> into opening wrong files, I'm not sure I agree with the claim that
>>>>> "probing isn't insecure" -
>>>>>
>>>> Maybe we should draw the distinction at tools that write data?
>>>> Without a guest running, a tool that simply reads files should be safe
>>>> to probe.
>>> Misprobing a top-level raw file as qcow2 can result in opening and
>>> reading a backing file, even when the top-level file was opened with
>>> read-only intent.  If the guest can stick some sort of /proc filesystem
>>> name as a qcow2 backing file for interpretation for a bogus probe of a
>>> raw file, you can result in hanging the process in trying to read the
>>> backing file.  Even if you aren't leaking data about what was read, this
>>> could still possibly constitute a denial of service attack.
>>>
>> True, but the warning doesn't prevent the probe.  My thinking is that
>> if I am running 'qemu-img info' without specifying a format, I
>> explicitly want the probe (how else to determine the format of a .img
>> file, or other generic file/device?)
>>
>> But I am not hung up on this; a warning won't negate the usefulness of
>> 'qemu-img info', so if others feel it is useful in that usage case, it
>> is OK with me.
> As far as I can tell, "qemu-img info" doesn't probe the backing file.
>
> I'd prefer not to warn there.

Except for when you're using --backing-chain.

I don't really see the point in warning because qemu-img acts with the 
privileges of the invoking user and only passes data to that user, so 
there should not be any security issues here. However, we may want to 
warn anyway just so the user knows that he/she should rename the image file.

So for me it comes down to what is easier, and I think just always 
emitting the warning is easier.

Max

>>> I was about to propose these two rules as something I'd still feel more
>>> comfortable with:
>>>
>>> if it is the top-level file, then warn for read-write access doing a
>>> probe where the probe differed from filename heuristics, be silent for
>>> read-only access doing a probe (whether or not the file claims to have a
>>> backing image)
>>>
>>> if it is chasing the backing chain (necessarily read-only access of the
>>> backing), then warn if the backing format was not specified and the
>>> probe differs from filename heuristics
> Have you considered the "warn of future change" role?
>
>> It'd also be nice if there was something that indicated the tree depth
>> the warning was from - it may be confusing for the user if they run a
>> qemu command on 'image.qcow2', and get a warning because a backing
>> file image in the chain just had a generic '.img' extension.
> This is how it looks now:
>
>      qemu: -drive file=flawed.img,if=none: warning: insecure format probing of image 'flawed.img'
>      To get rid of this warning, specify format=qcow2 explicitly, or change
>      the image name to end with '.qcow2'
>      qemu: -drive file=flawed.img,if=none: warning: insecure format probing of image 'backing.img'
>      To get rid of this warning, specify format=qcow2 explicitly, or change
>      the image name to end with '.qcow2'
>
> Would be less clear with a differently named backing file.  Could you
> sketch what you'd like to see?
>
>>> But that still has the drawback that if the backing file is some /proc
>>> name that will cause the process to hang, you don't want to print the
>>> message until after you read the file to discover that the probe
>>> differed from heuristics, but it is doing the open/read that determines
>>> the hang.  So I don't see an elegant way to break the chicken-and-egg
>>> problem.
> Probing needs to die.  Leave it to file(1).
>
> [...]
>
Kevin Wolf Oct. 29, 2014, 10:12 a.m. UTC | #13
Am 28.10.2014 um 17:03 hat Markus Armbruster geschrieben:
> If the user neglects to specify the image format, QEMU probes the
> image to guess it automatically, for convenience.
> 
> Relying on format probing is insecure for raw images (CVE-2008-2004).
> If the guest writes a suitable header to the device, the next probe
> will recognize a format chosen by the guest.  A malicious guest can
> abuse this to gain access to host files, e.g. by crafting a QCOW2
> header with backing file /etc/shadow.
> 
> Commit 1e72d3b (April 2008) provided -drive parameter format to let
> users disable probing.  Commit f965509 (March 2009) extended QCOW2 to
> optionally store the backing file format, to let users disable backing
> file probing.  QED has had a flag to suppress probing since the
> beginning (2010), set whenever a raw backing file is assigned.
> 
> Despite all this work (and time!), we're still insecure by default.  I
> think we're doing our users a disservice by sticking to the fatally
> flawed probing.  "Broken by default" is just wrong, and "convenience"
> is no excuse.
> 
> I believe we can retain 90% of the convenience without compromising
> security by keying on image file name instead of image contents: if
> the file name ends in .img or .iso, assume raw, if it ends in .qcow2,
> assume qcow2, and so forth.
> 
> Naturally, this would break command lines where the filename doesn't
> provide the correct clue.  So don't do it just yet, only warn if the
> the change would lead to a different result.  Looks like this:
> 
>     qemu: -drive file=my.img: warning: insecure format probing of image 'my.img'
>     To get rid of this warning, specify format=qcow2 explicitly, or change
>     the image name to end with '.qcow2'
> 
> This should steer users away from insecure format probing.  After a
> suitable grace period, we can hopefully drop format probing
> alltogether.
> 
> Example 0: file=WHATEVER,format=F
> 
> Never warns, because the explicit format suppresses probing.
> 
> Example 1: file=FOO.img
> 
> Warns when probing of FOO.img results in anything but raw.  In
> particular, it warns when the guest just p0wned you.
> 
> Example 2: file=FOO.qcow2 with backing file name FOO.img and no
> backing image format.
> 
> Warns when probing of FOO.qcow2 results in anything but qcow2, or
> probing of FOO.img results in anything but raw.
> 
> This patch is RFC because of open questions:
> 
> * Should tools warn, too?  Probing isn't insecure there, but a "this
>   may pick a different format in the future" warning may be
>   appropriate.
> 
> * I didn't specify recognized extensions for formats "bochs", "cloop",
>   "parallels", "vpc", because I have no idea which ones to recognize.
> 
> Additionally, some tests still need to be updated.
> 
> Signed-off-by: Markus Armbruster <armbru@redhat.com>

I still don't like this approach very much.

The first of the reasons, and arguably the weakest one, is simply that I
aesthetically dislike to encode semantics in filenames by relying on
filename extensions. Feels like I'm being moved back to DOS times.


The second one, though, is probably a show stopper for me. You assume
that there even is a filename that could have an extension, and that it
is passed in the legacy filename argument instead of the QDict. With your
patches:

$ qemu-img create -f qcow2 /tmp/test.img 64M
Formatting '/tmp/test.img', fmt=qcow2 size=67108864 encryption=off cluster_size=65536 lazy_refcounts=off 
$ qemu-system-x86_64 -drive file=/tmp/test.img
qemu-system-x86_64: -drive file=/tmp/test.img: warning: insecure format probing of image '/tmp/test.img'
To get rid of this warning, specify format=qcow2 explicitly, or change
the image name to end with '.qcow2'
$ x86_64-softmmu/qemu-system-x86_64 -drive file.filename=/tmp/test.img
Speicherzugriffsfehler (Speicherabzug geschrieben)

Oops, dereferencing a NULL pointer. Now, that's surely fixable, but what
I originally wanted to demonstrate is that you won't output a warning
there. Even that would still be fixable, by adding code into raw-posix,
but what do you do with '-drive file.driver=nbd,file.host=localhost'?

And how does your patch help for '-drive blkverify:hacked.img:good.img'?


Third, this is only a warning. In this form it doesn't help you much.
You only see the message when it's already too late, and you might not
see it at all because it can disappear somewhere deep in log files if
there is some scripting around qemu (we're only talking about
non-libvirt cases here anyway because libvirt always is explicit about
the format).

I know that your commit message says that you want to make it an error
eventually, but we all know qemu's policies regarding compatibility, so:
No, it won't happen, because there is no way to make it compatible.


Instead, let me try once more to sell my old proposal [1] from the
thread you mentioned:

> What if we let the raw driver know that it was probed and then it
> enables a check that returns -EIO for any write on the first 2k if that
> write would make the image look like a different format?

Attacks the problem where it arises instead of trying to detect the
outcome of it, and works in whatever way it is nested in the BDS graph
and whatever way is used to address the image file.

Kevin

[1] https://lists.nongnu.org/archive/html/qemu-devel/2014-08/msg02548.html
Markus Armbruster Oct. 29, 2014, 1:54 p.m. UTC | #14
Kevin Wolf <kwolf@redhat.com> writes:

> Am 28.10.2014 um 17:03 hat Markus Armbruster geschrieben:
>> If the user neglects to specify the image format, QEMU probes the
>> image to guess it automatically, for convenience.
>> 
>> Relying on format probing is insecure for raw images (CVE-2008-2004).
>> If the guest writes a suitable header to the device, the next probe
>> will recognize a format chosen by the guest.  A malicious guest can
>> abuse this to gain access to host files, e.g. by crafting a QCOW2
>> header with backing file /etc/shadow.
>> 
>> Commit 1e72d3b (April 2008) provided -drive parameter format to let
>> users disable probing.  Commit f965509 (March 2009) extended QCOW2 to
>> optionally store the backing file format, to let users disable backing
>> file probing.  QED has had a flag to suppress probing since the
>> beginning (2010), set whenever a raw backing file is assigned.
>> 
>> Despite all this work (and time!), we're still insecure by default.  I
>> think we're doing our users a disservice by sticking to the fatally
>> flawed probing.  "Broken by default" is just wrong, and "convenience"
>> is no excuse.
>> 
>> I believe we can retain 90% of the convenience without compromising
>> security by keying on image file name instead of image contents: if
>> the file name ends in .img or .iso, assume raw, if it ends in .qcow2,
>> assume qcow2, and so forth.
>> 
>> Naturally, this would break command lines where the filename doesn't
>> provide the correct clue.  So don't do it just yet, only warn if the
>> the change would lead to a different result.  Looks like this:
>> 
>>     qemu: -drive file=my.img: warning: insecure format probing of image 'my.img'
>>     To get rid of this warning, specify format=qcow2 explicitly, or change
>>     the image name to end with '.qcow2'
>> 
>> This should steer users away from insecure format probing.  After a
>> suitable grace period, we can hopefully drop format probing
>> alltogether.
>> 
>> Example 0: file=WHATEVER,format=F
>> 
>> Never warns, because the explicit format suppresses probing.
>> 
>> Example 1: file=FOO.img
>> 
>> Warns when probing of FOO.img results in anything but raw.  In
>> particular, it warns when the guest just p0wned you.
>> 
>> Example 2: file=FOO.qcow2 with backing file name FOO.img and no
>> backing image format.
>> 
>> Warns when probing of FOO.qcow2 results in anything but qcow2, or
>> probing of FOO.img results in anything but raw.
>> 
>> This patch is RFC because of open questions:
>> 
>> * Should tools warn, too?  Probing isn't insecure there, but a "this
>>   may pick a different format in the future" warning may be
>>   appropriate.
>> 
>> * I didn't specify recognized extensions for formats "bochs", "cloop",
>>   "parallels", "vpc", because I have no idea which ones to recognize.
>> 
>> Additionally, some tests still need to be updated.
>> 
>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>
> I still don't like this approach very much.
>
> The first of the reasons, and arguably the weakest one, is simply that I
> aesthetically dislike to encode semantics in filenames by relying on
> filename extensions. Feels like I'm being moved back to DOS times.

It's the least bad solution so far, in my opinion.

For what it's worth, gcc decides what to do with a file based on its
file name extension, too.

> The second one, though, is probably a show stopper for me. You assume
> that there even is a filename that could have an extension, and that it
> is passed in the legacy filename argument instead of the QDict. With your
> patches:
>
> $ qemu-img create -f qcow2 /tmp/test.img 64M
> Formatting '/tmp/test.img', fmt=qcow2 size=67108864 encryption=off
> cluster_size=65536 lazy_refcounts=off
> $ qemu-system-x86_64 -drive file=/tmp/test.img
> qemu-system-x86_64: -drive file=/tmp/test.img: warning: insecure
> format probing of image '/tmp/test.img'
> To get rid of this warning, specify format=qcow2 explicitly, or change
> the image name to end with '.qcow2'
> $ x86_64-softmmu/qemu-system-x86_64 -drive file.filename=/tmp/test.img
> Speicherzugriffsfehler (Speicherabzug geschrieben)
>
> Oops, dereferencing a NULL pointer. Now, that's surely fixable, but what
> I originally wanted to demonstrate is that you won't output a warning
> there. Even that would still be fixable, by adding code into raw-posix,
> but what do you do with '-drive file.driver=nbd,file.host=localhost'?
>
> And how does your patch help for '-drive blkverify:hacked.img:good.img'?

I'll reply to this as soon as I've had time to think.

> Third, this is only a warning. In this form it doesn't help you much.
> You only see the message when it's already too late, and you might not
> see it at all because it can disappear somewhere deep in log files if
> there is some scripting around qemu (we're only talking about
> non-libvirt cases here anyway because libvirt always is explicit about
> the format).
>
> I know that your commit message says that you want to make it an error
> eventually, but we all know qemu's policies regarding compatibility, so:
> No, it won't happen, because there is no way to make it compatible.

I disagree.  "Backwards compatibility über alles" would be idiotic
policy, like pretty much every "über alles" is.  Letting backwards
compatibility win over making things secure by default has brought a
neverending chain of CVEs and bugs upon us.  Eric listed some.

No matter what we do, we're inconveniencing some users.

If we fix the user interface design flaw, some existing usage will
break.

If we do nothing, we expose users to a security risk they actively need
to opt out of.  Even the users who know that very well don't get it
right every time.  See again Eric's message for the libvirt-related CVEs
and bugs.

I'm arguing for limiting the inconvenience in scope and in time.  Time
means we actually fix the stupid thing instead of dragging it along
forever.  Scope means we try to keep as much usage as possible
unchanged.

> Instead, let me try once more to sell my old proposal [1] from the
> thread you mentioned:
>
>> What if we let the raw driver know that it was probed and then it
>> enables a check that returns -EIO for any write on the first 2k if that
>> write would make the image look like a different format?
>
> Attacks the problem where it arises instead of trying to detect the
> outcome of it, and works in whatever way it is nested in the BDS graph
> and whatever way is used to address the image file.
>
> Kevin
>
> [1] https://lists.nongnu.org/archive/html/qemu-devel/2014-08/msg02548.html

Arbitrarily breaking the virtual hardware that way is incompatible, too.
It's a bigger no-no for me than changing user interface corner cases or
deciding what to do with a file based on its file name extension.

Anthony tried something similar (commit 79368c8), but couldn't get it
right (commit 8b33d9e).
Stefan Hajnoczi Oct. 29, 2014, 3:34 p.m. UTC | #15
On Wed, Oct 29, 2014 at 02:54:32PM +0100, Markus Armbruster wrote:
> Kevin Wolf <kwolf@redhat.com> writes:
> 
> > Am 28.10.2014 um 17:03 hat Markus Armbruster geschrieben:
> >> If the user neglects to specify the image format, QEMU probes the
> >> image to guess it automatically, for convenience.
> >> 
> >> Relying on format probing is insecure for raw images (CVE-2008-2004).
> >> If the guest writes a suitable header to the device, the next probe
> >> will recognize a format chosen by the guest.  A malicious guest can
> >> abuse this to gain access to host files, e.g. by crafting a QCOW2
> >> header with backing file /etc/shadow.
> >> 
> >> Commit 1e72d3b (April 2008) provided -drive parameter format to let
> >> users disable probing.  Commit f965509 (March 2009) extended QCOW2 to
> >> optionally store the backing file format, to let users disable backing
> >> file probing.  QED has had a flag to suppress probing since the
> >> beginning (2010), set whenever a raw backing file is assigned.
> >> 
> >> Despite all this work (and time!), we're still insecure by default.  I
> >> think we're doing our users a disservice by sticking to the fatally
> >> flawed probing.  "Broken by default" is just wrong, and "convenience"
> >> is no excuse.
> >> 
> >> I believe we can retain 90% of the convenience without compromising
> >> security by keying on image file name instead of image contents: if
> >> the file name ends in .img or .iso, assume raw, if it ends in .qcow2,
> >> assume qcow2, and so forth.
> >> 
> >> Naturally, this would break command lines where the filename doesn't
> >> provide the correct clue.  So don't do it just yet, only warn if the
> >> the change would lead to a different result.  Looks like this:
> >> 
> >>     qemu: -drive file=my.img: warning: insecure format probing of image 'my.img'
> >>     To get rid of this warning, specify format=qcow2 explicitly, or change
> >>     the image name to end with '.qcow2'
> >> 
> >> This should steer users away from insecure format probing.  After a
> >> suitable grace period, we can hopefully drop format probing
> >> alltogether.
> >> 
> >> Example 0: file=WHATEVER,format=F
> >> 
> >> Never warns, because the explicit format suppresses probing.
> >> 
> >> Example 1: file=FOO.img
> >> 
> >> Warns when probing of FOO.img results in anything but raw.  In
> >> particular, it warns when the guest just p0wned you.
> >> 
> >> Example 2: file=FOO.qcow2 with backing file name FOO.img and no
> >> backing image format.
> >> 
> >> Warns when probing of FOO.qcow2 results in anything but qcow2, or
> >> probing of FOO.img results in anything but raw.
> >> 
> >> This patch is RFC because of open questions:
> >> 
> >> * Should tools warn, too?  Probing isn't insecure there, but a "this
> >>   may pick a different format in the future" warning may be
> >>   appropriate.
> >> 
> >> * I didn't specify recognized extensions for formats "bochs", "cloop",
> >>   "parallels", "vpc", because I have no idea which ones to recognize.
> >> 
> >> Additionally, some tests still need to be updated.
> >> 
> >> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> >
> > I still don't like this approach very much.
> >
> > The first of the reasons, and arguably the weakest one, is simply that I
> > aesthetically dislike to encode semantics in filenames by relying on
> > filename extensions. Feels like I'm being moved back to DOS times.
> 
> It's the least bad solution so far, in my opinion.
> 
> For what it's worth, gcc decides what to do with a file based on its
> file name extension, too.
> 
> > The second one, though, is probably a show stopper for me. You assume
> > that there even is a filename that could have an extension, and that it
> > is passed in the legacy filename argument instead of the QDict. With your
> > patches:
> >
> > $ qemu-img create -f qcow2 /tmp/test.img 64M
> > Formatting '/tmp/test.img', fmt=qcow2 size=67108864 encryption=off
> > cluster_size=65536 lazy_refcounts=off
> > $ qemu-system-x86_64 -drive file=/tmp/test.img
> > qemu-system-x86_64: -drive file=/tmp/test.img: warning: insecure
> > format probing of image '/tmp/test.img'
> > To get rid of this warning, specify format=qcow2 explicitly, or change
> > the image name to end with '.qcow2'
> > $ x86_64-softmmu/qemu-system-x86_64 -drive file.filename=/tmp/test.img
> > Speicherzugriffsfehler (Speicherabzug geschrieben)
> >
> > Oops, dereferencing a NULL pointer. Now, that's surely fixable, but what
> > I originally wanted to demonstrate is that you won't output a warning
> > there. Even that would still be fixable, by adding code into raw-posix,
> > but what do you do with '-drive file.driver=nbd,file.host=localhost'?
> >
> > And how does your patch help for '-drive blkverify:hacked.img:good.img'?
> 
> I'll reply to this as soon as I've had time to think.

If you are using fancy command-lines, you need to use format=.

The probing feature is really useful for -hda test.qcow2.  Anything
fancier requires enough knowledge (and typing on the command-line) that
format=qcow2 really isn't too much to ask for.

> > Instead, let me try once more to sell my old proposal [1] from the
> > thread you mentioned:
> >
> >> What if we let the raw driver know that it was probed and then it
> >> enables a check that returns -EIO for any write on the first 2k if that
> >> write would make the image look like a different format?
> >
> > Attacks the problem where it arises instead of trying to detect the
> > outcome of it, and works in whatever way it is nested in the BDS graph
> > and whatever way is used to address the image file.

I think this is too clever.  It's another thing to debug if a guest
starts hitting EIO.

My opinion on probing is: it's ugly but let's leave it for QEMU 3.0 at
which point we implement Markus solution with exit(1).

In the meantime the CVE has been known for a long time so vulnerable
users (VM hosting, cloud, etc) have the information they need.  Many are
automatically protected by libvirt.

Stefan
Markus Armbruster Oct. 30, 2014, 9:07 a.m. UTC | #16
Stefan Hajnoczi <stefanha@gmail.com> writes:

> On Wed, Oct 29, 2014 at 02:54:32PM +0100, Markus Armbruster wrote:
>> Kevin Wolf <kwolf@redhat.com> writes:
>> 
>> > Am 28.10.2014 um 17:03 hat Markus Armbruster geschrieben:
>> >> If the user neglects to specify the image format, QEMU probes the
>> >> image to guess it automatically, for convenience.
>> >> 
>> >> Relying on format probing is insecure for raw images (CVE-2008-2004).
>> >> If the guest writes a suitable header to the device, the next probe
>> >> will recognize a format chosen by the guest.  A malicious guest can
>> >> abuse this to gain access to host files, e.g. by crafting a QCOW2
>> >> header with backing file /etc/shadow.
>> >> 
>> >> Commit 1e72d3b (April 2008) provided -drive parameter format to let
>> >> users disable probing.  Commit f965509 (March 2009) extended QCOW2 to
>> >> optionally store the backing file format, to let users disable backing
>> >> file probing.  QED has had a flag to suppress probing since the
>> >> beginning (2010), set whenever a raw backing file is assigned.
>> >> 
>> >> Despite all this work (and time!), we're still insecure by default.  I
>> >> think we're doing our users a disservice by sticking to the fatally
>> >> flawed probing.  "Broken by default" is just wrong, and "convenience"
>> >> is no excuse.
>> >> 
>> >> I believe we can retain 90% of the convenience without compromising
>> >> security by keying on image file name instead of image contents: if
>> >> the file name ends in .img or .iso, assume raw, if it ends in .qcow2,
>> >> assume qcow2, and so forth.
>> >> 
>> >> Naturally, this would break command lines where the filename doesn't
>> >> provide the correct clue.  So don't do it just yet, only warn if the
>> >> the change would lead to a different result.  Looks like this:
>> >> 
>> >>     qemu: -drive file=my.img: warning: insecure format probing of image 'my.img'
>> >>     To get rid of this warning, specify format=qcow2 explicitly, or change
>> >>     the image name to end with '.qcow2'
>> >> 
>> >> This should steer users away from insecure format probing.  After a
>> >> suitable grace period, we can hopefully drop format probing
>> >> alltogether.
>> >> 
>> >> Example 0: file=WHATEVER,format=F
>> >> 
>> >> Never warns, because the explicit format suppresses probing.
>> >> 
>> >> Example 1: file=FOO.img
>> >> 
>> >> Warns when probing of FOO.img results in anything but raw.  In
>> >> particular, it warns when the guest just p0wned you.
>> >> 
>> >> Example 2: file=FOO.qcow2 with backing file name FOO.img and no
>> >> backing image format.
>> >> 
>> >> Warns when probing of FOO.qcow2 results in anything but qcow2, or
>> >> probing of FOO.img results in anything but raw.
>> >> 
>> >> This patch is RFC because of open questions:
>> >> 
>> >> * Should tools warn, too?  Probing isn't insecure there, but a "this
>> >>   may pick a different format in the future" warning may be
>> >>   appropriate.
>> >> 
>> >> * I didn't specify recognized extensions for formats "bochs", "cloop",
>> >>   "parallels", "vpc", because I have no idea which ones to recognize.
>> >> 
>> >> Additionally, some tests still need to be updated.
>> >> 
>> >> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>> >
>> > I still don't like this approach very much.
>> >
>> > The first of the reasons, and arguably the weakest one, is simply that I
>> > aesthetically dislike to encode semantics in filenames by relying on
>> > filename extensions. Feels like I'm being moved back to DOS times.
>> 
>> It's the least bad solution so far, in my opinion.
>> 
>> For what it's worth, gcc decides what to do with a file based on its
>> file name extension, too.
>> 
>> > The second one, though, is probably a show stopper for me. You assume
>> > that there even is a filename that could have an extension, and that it
>> > is passed in the legacy filename argument instead of the QDict. With your
>> > patches:
>> >
>> > $ qemu-img create -f qcow2 /tmp/test.img 64M
>> > Formatting '/tmp/test.img', fmt=qcow2 size=67108864 encryption=off
>> > cluster_size=65536 lazy_refcounts=off
>> > $ qemu-system-x86_64 -drive file=/tmp/test.img
>> > qemu-system-x86_64: -drive file=/tmp/test.img: warning: insecure
>> > format probing of image '/tmp/test.img'
>> > To get rid of this warning, specify format=qcow2 explicitly, or change
>> > the image name to end with '.qcow2'
>> > $ x86_64-softmmu/qemu-system-x86_64 -drive file.filename=/tmp/test.img
>> > Speicherzugriffsfehler (Speicherabzug geschrieben)
>> >
>> > Oops, dereferencing a NULL pointer. Now, that's surely fixable, but what
>> > I originally wanted to demonstrate is that you won't output a warning
>> > there. Even that would still be fixable, by adding code into raw-posix,
>> > but what do you do with '-drive file.driver=nbd,file.host=localhost'?

You're right, my patch's use of file name is naive.

Related bug:

static int dmg_probe(const uint8_t *buf, int buf_size, const char *filename)
{
    int len;

    if (!filename) {
        return 0;
    }

    len = strlen(filename);
    if (len > 4 && !strcmp(filename + len - 4, ".dmg")) {
        return 2;
    }
    return 0;
}

With -drive if=none,file=foo.dmg the probe succeeds, and the image is
opened with driver bdrv_dmg.  With file.filename=foo.dmg, the probe
fails, and the image is opened with driver bdrv_raw.  Oops.

Two ideas for fixing this come to mind:

* Probing must not use the file name.  Remove the temptation by dropping
  the parameter from bdrv_probe().  Change dmg_probe() to probe the
  image contents instead.  Backward incompatible.  I suspect that could
  be just fine with you in this case ;-P

* Probing may use the file name.  Fix the code to pass the file name to
  ->bdrv_probe() whenever there is one.  There certainly is one when we
  got our file descriptor from open(), i.e. in the simple case.

  Whenever we pass a file name to ->bdrv_probe(), we can just as well
  pass it to bdrv_guess_format(), too.  It must still be prepared for a
  null name.

  There is a file name in your file.filename=/tmp/test.img example.

  There is none in your file.driver=nbd,file.host=localhost' example,
  because there we get the file descriptor from socket().

  If you do fancy things like nbd, you need to specify format= to avoid
  the warning.  I agree with Stefan: it's tolerable price to pay for
  "secure by default".

>> > And how does your patch help for '-drive blkverify:hacked.img:good.img'?

My patch puts "guess format from image file name" right next to "guess
format from image contents" (a.k.a. probing).  Therefore, it runs
*exactly* when we probe an image.

Your example:

    $ qemu-img create -f qcow2 good.img 1m
    Formatting 'good.img', fmt=qcow2 size=1048576 encryption=off cluster_size=65536 lazy_refcounts=off 
    $ cp good.img hacked.img
    $ qemu -S -display none -monitor stdio -drive if=none,file=blkverify:hacked.img:good.img
    qemu: -drive if=none,file=blkverify:hacked.img:good.img: warning: insecure format probing of image 'good.img'
    To get rid of this warning, specify format=qcow2 explicitly, or change
    the image name to end with '.qcow2'
    blkverify: read sector_num=0 nb_sectors=4 contents mismatch in sector 0

Only good.img is probed.  Why?  Let's examine how this thing gets
opened.

First bdrv_open() has
    filename = "blkverify:hacked.img:good.img"
    options = {}
    flags = BDRV_O_RDWR | BDRV_O_CACHE_WB
    drv = NULL

    It calls itself recursively via bdrv_open_image():
        Same parameters, except flags |= BDRV_O_ALLOW_RDWR |
        BDRV_O_UNMAP | BDRV_O_PROTOCOL.

        This parses the file name, and calls itself again, via
        bdrv_open_common(), blkverify_open(), bdrv_open_image(), two
        times.

            First:
                filename = "hacked.img"
                options = {}
                flags = as above

                bdrv_open() changes options to
                driver=file,filename=hacked.img, then opens hacked.img
                with driver "file", i.e. *without* probing.

            Next:
                filename = "good.img"
                options = {}
                flags = as above, less BDRV_O_PROTOCOL

                Once again, this calls itself, via bdrv_open_image():
                    Same parameters, except BDRV_O_PROTOCOL is back
            
                    bdrv_open() changes options to
                    driver=file,filename=good.img, then opens
                    good.img with driver "file", i.e. *without*
                    probing.

                Then it probes the image contents, resulting in driver
                bdrv_qcow2.  My patch's code runs, and emits the
                warning.

                Then it calls bdrv_open_common() to put the qcow2 BDS on
                top of the file BDS.  If it had a backing file, we'd
                call bdrv_open() some more.

        Back in the outermost bdrv_open().  It attempts to probe the
        image contents.  Fails reading the contents, because the two
        children of the blkverify BDSs have different contents: good.img
        is read with qcow2 over file, while hacked.img is read with just
        file.

        If I create hacked.img with

            dd if=/dev/zero of=hacked.img bs=4M count=1

        instead, then the read succeeds, and probing yields bdrv_raw.
        My patch's code runs, guesses bdrv_raw, and doesn't warn.

Quite a jungle.

>> I'll reply to this as soon as I've had time to think.
>
> If you are using fancy command-lines, you need to use format=.
>
> The probing feature is really useful for -hda test.qcow2.  Anything
> fancier requires enough knowledge (and typing on the command-line) that
> format=qcow2 really isn't too much to ask for.
>
>> > Instead, let me try once more to sell my old proposal [1] from the
>> > thread you mentioned:
>> >
>> >> What if we let the raw driver know that it was probed and then it
>> >> enables a check that returns -EIO for any write on the first 2k if that
>> >> write would make the image look like a different format?
>> >
>> > Attacks the problem where it arises instead of trying to detect the
>> > outcome of it, and works in whatever way it is nested in the BDS graph
>> > and whatever way is used to address the image file.
>
> I think this is too clever.  It's another thing to debug if a guest
> starts hitting EIO.
>
> My opinion on probing is: it's ugly but let's leave it for QEMU 3.0 at
> which point we implement Markus solution with exit(1).

I regard my patch as a necessary preliminary step for that.  Warn now,
change behavior a couple of releases later.  When exactly is debatable.

> In the meantime the CVE has been known for a long time so vulnerable
> users (VM hosting, cloud, etc) have the information they need.  Many are
> automatically protected by libvirt.

The warning hopefully helps libvirt developers with keeping libvirt
users fully protected.
Max Reitz Oct. 30, 2014, 9:08 a.m. UTC | #17
On 2014-10-28 at 17:03, Markus Armbruster wrote:
> If the user neglects to specify the image format, QEMU probes the
> image to guess it automatically, for convenience.
>
> Relying on format probing is insecure for raw images (CVE-2008-2004).
> If the guest writes a suitable header to the device, the next probe
> will recognize a format chosen by the guest.  A malicious guest can
> abuse this to gain access to host files, e.g. by crafting a QCOW2
> header with backing file /etc/shadow.
>
> Commit 1e72d3b (April 2008) provided -drive parameter format to let
> users disable probing.  Commit f965509 (March 2009) extended QCOW2 to
> optionally store the backing file format, to let users disable backing
> file probing.  QED has had a flag to suppress probing since the
> beginning (2010), set whenever a raw backing file is assigned.
>
> Despite all this work (and time!), we're still insecure by default.  I
> think we're doing our users a disservice by sticking to the fatally
> flawed probing.  "Broken by default" is just wrong, and "convenience"
> is no excuse.
>
> I believe we can retain 90% of the convenience without compromising
> security by keying on image file name instead of image contents: if
> the file name ends in .img or .iso, assume raw, if it ends in .qcow2,
> assume qcow2, and so forth.
>
> Naturally, this would break command lines where the filename doesn't
> provide the correct clue.  So don't do it just yet, only warn if the
> the change would lead to a different result.  Looks like this:
>
>      qemu: -drive file=my.img: warning: insecure format probing of image 'my.img'
>      To get rid of this warning, specify format=qcow2 explicitly, or change
>      the image name to end with '.qcow2'
>
> This should steer users away from insecure format probing.  After a
> suitable grace period, we can hopefully drop format probing
> alltogether.
>
> Example 0: file=WHATEVER,format=F
>
> Never warns, because the explicit format suppresses probing.
>
> Example 1: file=FOO.img
>
> Warns when probing of FOO.img results in anything but raw.  In
> particular, it warns when the guest just p0wned you.
>
> Example 2: file=FOO.qcow2 with backing file name FOO.img and no
> backing image format.
>
> Warns when probing of FOO.qcow2 results in anything but qcow2, or
> probing of FOO.img results in anything but raw.
>
> This patch is RFC because of open questions:
>
> * Should tools warn, too?  Probing isn't insecure there, but a "this
>    may pick a different format in the future" warning may be
>    appropriate.
>
> * I didn't specify recognized extensions for formats "bochs", "cloop",
>    "parallels", "vpc", because I have no idea which ones to recognize.
>
> Additionally, some tests still need to be updated.
>
> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> ---
>   block.c                   | 46 ++++++++++++++++++++++++++++++++++++++++++++--
>   block/dmg.c               |  1 +
>   block/qcow.c              |  1 +
>   block/qcow2.c             |  1 +
>   block/qed.c               |  1 +
>   block/raw_bsd.c           |  1 +
>   block/vdi.c               |  1 +
>   block/vhdx.c              |  1 +
>   block/vmdk.c              |  1 +
>   include/block/block_int.h |  1 +
>   10 files changed, 53 insertions(+), 2 deletions(-)


So I guess it's my turn to give yet another opinion (or just something 
in between of what has been already said).

First, I'm fine with this patch, or at least the idea as there were yet 
some quirks.

But I oppose turning the warning into an error eventually. I want to be 
able to use -hda $filename and it should work. As Kevin said, a warning 
alone will not help a lot, though. I don't know about that, I think it 
should work. This warning should only appear when you're using qemu 
directly because management tools should already use -drive with format= 
or driver=; and if you're using qemu directly you should be watching stderr.

The only way I'd be fine with turning this into an error would be to 
make an exception for -hda and the like. There were already plans of 
introducing pseudo block drivers for format and protocol probing; if we 
do that we can use those block drivers for -hda. We should still emit 
the warning, but in my opinion it should never be an error with -hda, 
-cdrom etc.. For me, this is not about backwards compatibility but 
because I'm using -hda myself.

On the other hand, I'm very fine with Kevin's proposal. Bootloaders 
normally don't change that often and there's not that much variety 
(aside from hand-written hobby stuff) so we should be able to regularly 
make sure that nothing breaks. I'd like to increase the entropy, though, 
so we should take another look into the block drivers' probing functions 
and make them more specific, if possible. I don't feel like testing four 
bytes is enough (although in practice it should be).

Also, I like Kevin's proposal/Anthony's approach a lot more because of 
its principle. If a guest can overwrite the beginning of the image so it 
looks like an image format, that's the real bug. Afterwards, anyone will 
recognize that image as non-raw and they'd be correct.

Max
Stefan Hajnoczi Oct. 30, 2014, 9:24 a.m. UTC | #18
On Thu, Oct 30, 2014 at 10:07:26AM +0100, Markus Armbruster wrote:
> Stefan Hajnoczi <stefanha@gmail.com> writes:
> 
> > On Wed, Oct 29, 2014 at 02:54:32PM +0100, Markus Armbruster wrote:
> >> Kevin Wolf <kwolf@redhat.com> writes:
> >> 
> >> > Am 28.10.2014 um 17:03 hat Markus Armbruster geschrieben:
> >> > Instead, let me try once more to sell my old proposal [1] from the
> >> > thread you mentioned:
> >> >
> >> >> What if we let the raw driver know that it was probed and then it
> >> >> enables a check that returns -EIO for any write on the first 2k if that
> >> >> write would make the image look like a different format?
> >> >
> >> > Attacks the problem where it arises instead of trying to detect the
> >> > outcome of it, and works in whatever way it is nested in the BDS graph
> >> > and whatever way is used to address the image file.
> >
> > I think this is too clever.  It's another thing to debug if a guest
> > starts hitting EIO.
> >
> > My opinion on probing is: it's ugly but let's leave it for QEMU 3.0 at
> > which point we implement Markus solution with exit(1).
> 
> I regard my patch as a necessary preliminary step for that.  Warn now,
> change behavior a couple of releases later.  When exactly is debatable.
> 
> > In the meantime the CVE has been known for a long time so vulnerable
> > users (VM hosting, cloud, etc) have the information they need.  Many are
> > automatically protected by libvirt.
> 
> The warning hopefully helps libvirt developers with keeping libvirt
> users fully protected.

I'm happy with this approach (haven't reviewed the patches in detail
yet).

Stefan
Stefan Hajnoczi Oct. 30, 2014, 9:27 a.m. UTC | #19
On Thu, Oct 30, 2014 at 10:08:46AM +0100, Max Reitz wrote:
> Also, I like Kevin's proposal/Anthony's approach a lot more because of its
> principle. If a guest can overwrite the beginning of the image so it looks
> like an image format, that's the real bug. Afterwards, anyone will recognize
> that image as non-raw and they'd be correct.

No, it is not a guest bug.

The guest may legitimately use raw devices that contain image format
data.  Imagine tools similar to libguestfs.

It's perfectly okay for them to lay out image format data onto a raw
device.

Probing is the problem, not putting image format data onto a raw device.

Stefan
Kevin Wolf Oct. 30, 2014, 9:30 a.m. UTC | #20
Am 29.10.2014 um 14:54 hat Markus Armbruster geschrieben:
> Kevin Wolf <kwolf@redhat.com> writes:
> > Instead, let me try once more to sell my old proposal [1] from the
> > thread you mentioned:
> >
> >> What if we let the raw driver know that it was probed and then it
> >> enables a check that returns -EIO for any write on the first 2k if that
> >> write would make the image look like a different format?
> >
> > Attacks the problem where it arises instead of trying to detect the
> > outcome of it, and works in whatever way it is nested in the BDS graph
> > and whatever way is used to address the image file.
> >
> > Kevin
> >
> > [1] https://lists.nongnu.org/archive/html/qemu-devel/2014-08/msg02548.html
> 
> Arbitrarily breaking the virtual hardware that way is incompatible, too.
> It's a bigger no-no for me than changing user interface corner cases or
> deciding what to do with a file based on its file name extension.

It is arbitrarily breaking theoretical cases, while your patch is less
arbitrarily breaking common cases. I strongly prefer the former.

Nobody will ever want to write a qcow2 header into their boot sector for
just running a guest:

0:   51                      push   %cx
1:   46                      inc    %si
2:   49                      dec    %cx
3:   fb                      sti

This code doesn't make any sense. It's similar for other formats. And if
they really have some odd reason to do that, the fix is easy: Either
don't use raw, or pass an explicit format=raw.

> Anthony tried something similar (commit 79368c8), but couldn't get it
> right (commit 8b33d9e).

The discussion back then: http://patchwork.ozlabs.org/patch/58980/

The problem with Anthony's code was that he didn't handle a qiov
correctly that had unaligned members. With today's block layer, this is
not a big deal to implement correctly. We're running coroutines instead
of AIO callbacks and we don't have to do all the manual qiov fixing
magic that Anthony had in his patch, util/iov.c provides all you need.

I'll send out an RFC series that implements this.

Kevin
Kevin Wolf Oct. 30, 2014, 9:36 a.m. UTC | #21
Am 30.10.2014 um 10:27 hat Stefan Hajnoczi geschrieben:
> On Thu, Oct 30, 2014 at 10:08:46AM +0100, Max Reitz wrote:
> > Also, I like Kevin's proposal/Anthony's approach a lot more because of its
> > principle. If a guest can overwrite the beginning of the image so it looks
> > like an image format, that's the real bug. Afterwards, anyone will recognize
> > that image as non-raw and they'd be correct.
> 
> No, it is not a guest bug.

No, but it is a host bug. When probed, this is not content that raw can
reliably store.

> The guest may legitimately use raw devices that contain image format
> data.  Imagine tools similar to libguestfs.
> 
> It's perfectly okay for them to lay out image format data onto a raw
> device.
> 
> Probing is the problem, not putting image format data onto a raw device.

Agreed, that's why any restrictions only apply when probing was used to
detect a raw image. If you want to do anything exotic like storing a
qcow2 image for nested virt on a disk that is a raw image in the host,
then making sure to pass format=raw shouldn't be too much.

Kevin
Markus Armbruster Oct. 30, 2014, 12:19 p.m. UTC | #22
Stefan Hajnoczi <stefanha@redhat.com> writes:

> On Thu, Oct 30, 2014 at 10:07:26AM +0100, Markus Armbruster wrote:
>> Stefan Hajnoczi <stefanha@gmail.com> writes:
>> 
>> > On Wed, Oct 29, 2014 at 02:54:32PM +0100, Markus Armbruster wrote:
>> >> Kevin Wolf <kwolf@redhat.com> writes:
>> >> 
>> >> > Am 28.10.2014 um 17:03 hat Markus Armbruster geschrieben:
>> >> > Instead, let me try once more to sell my old proposal [1] from the
>> >> > thread you mentioned:
>> >> >
>> >> >> What if we let the raw driver know that it was probed and then it
>> >> >> enables a check that returns -EIO for any write on the first 2k if that
>> >> >> write would make the image look like a different format?
>> >> >
>> >> > Attacks the problem where it arises instead of trying to detect the
>> >> > outcome of it, and works in whatever way it is nested in the BDS graph
>> >> > and whatever way is used to address the image file.
>> >
>> > I think this is too clever.  It's another thing to debug if a guest
>> > starts hitting EIO.
>> >
>> > My opinion on probing is: it's ugly but let's leave it for QEMU 3.0 at
>> > which point we implement Markus solution with exit(1).
>> 
>> I regard my patch as a necessary preliminary step for that.  Warn now,
>> change behavior a couple of releases later.  When exactly is debatable.
>> 
>> > In the meantime the CVE has been known for a long time so vulnerable
>> > users (VM hosting, cloud, etc) have the information they need.  Many are
>> > automatically protected by libvirt.
>> 
>> The warning hopefully helps libvirt developers with keeping libvirt
>> users fully protected.
>
> I'm happy with this approach (haven't reviewed the patches in detail
> yet).

PATCH 1/2 is fully baked, but it's also trivial, and got plenty of
review already.

PATCH 2/2 isn't baked, yet, and I think I know what needs to be done.  I
guess your review cycles are better spent elsewhere.
Markus Armbruster Oct. 30, 2014, 12:49 p.m. UTC | #23
Kevin Wolf <kwolf@redhat.com> writes:

> Am 29.10.2014 um 14:54 hat Markus Armbruster geschrieben:
>> Kevin Wolf <kwolf@redhat.com> writes:
>> > Instead, let me try once more to sell my old proposal [1] from the
>> > thread you mentioned:
>> >
>> >> What if we let the raw driver know that it was probed and then it
>> >> enables a check that returns -EIO for any write on the first 2k if that
>> >> write would make the image look like a different format?
>> >
>> > Attacks the problem where it arises instead of trying to detect the
>> > outcome of it, and works in whatever way it is nested in the BDS graph
>> > and whatever way is used to address the image file.
>> >
>> > Kevin
>> >
>> > [1] https://lists.nongnu.org/archive/html/qemu-devel/2014-08/msg02548.html
>> 
>> Arbitrarily breaking the virtual hardware that way is incompatible, too.
>> It's a bigger no-no for me than changing user interface corner cases or
>> deciding what to do with a file based on its file name extension.
>
> It is arbitrarily breaking theoretical cases, while your patch is less
> arbitrarily breaking common cases. I strongly prefer the former.

I challenge "theoretical" as well as "common".

For "theoretical", see below.

As to "common": are unrecognizable image names really common?  I doubt
it.  If I'm wrong, I expect users to complain about my warning, and then
we can reconsider.

> Nobody will ever want to write a qcow2 header into their boot sector for
> just running a guest:
>
> 0:   51                      push   %cx
> 1:   46                      inc    %si
> 2:   49                      dec    %cx
> 3:   fb                      sti
>
> This code doesn't make any sense. It's similar for other formats. And if
> they really have some odd reason to do that, the fix is easy: Either
> don't use raw, or pass an explicit format=raw.

A disk needn't start with a PC-style boot sector.  Even on a PC.  Not
every disk needs to be bootable, or even partitioned.  Databases exist
that like to work on raw devices.  Giving them /dev/sdb instead of a
whole-disk partition /dev/sdb1 may not be the smartest thing to do, but
it *works* with real hardware, so why not with virtual hardware?

You either have to prevent *any* writing of the first 2048 bytes (the
part that can be examined by a bdrv_probe() method, or your have to
prevent writing anything a probe recognizes, or the user has to specify
the format explicitly.

If you do the former, you're way outside the realm of "theoretical".

If you do the latter, the degree of "theoreticalness" depends on the
union of the patterns you prevent.  Issues:

1. Anthony's method of checking a few known signatures is fragile.  The
only obviously robust way to test "is probing going to find something
other than 'raw'?" is running the probes.  Feasible.

2. Whether the union of patterns qualifies as "theoretical" for all our
targets is not obvious, and whether it'll remain "theoretical" for all
future formats and target machines even less.

3. A write access that works just fine in one QEMU version can be
rejected by another version that recognizes more formats.  Or probes the
same formats differently.

4. Rejecting a write fails *late*, and looks like hardware failure to
the guest.  With imperfect guest software, this risks data corruption.

(4) is a deal breaker for me.

(3) adds a cherry on top.  I care about command line compatibility, but
I care about guest ABI stability a whole lot more.

I'm utterly unwilling to risk breaking a running guest's hardware just
so that users can continue to call their qcow2 images
"I want my pony.dammit".

>> Anthony tried something similar (commit 79368c8), but couldn't get it
>> right (commit 8b33d9e).
>
> The discussion back then: http://patchwork.ozlabs.org/patch/58980/
>
> The problem with Anthony's code was that he didn't handle a qiov
> correctly that had unaligned members. With today's block layer, this is
> not a big deal to implement correctly. We're running coroutines instead
> of AIO callbacks and we don't have to do all the manual qiov fixing
> magic that Anthony had in his patch, util/iov.c provides all you need.
>
> I'll send out an RFC series that implements this.

I'm strongly opposed to this idea.
Markus Armbruster Oct. 30, 2014, 1:02 p.m. UTC | #24
Max Reitz <mreitz@redhat.com> writes:

> So I guess it's my turn to give yet another opinion (or just something
> in between of what has been already said).
>
> First, I'm fine with this patch, or at least the idea as there were
> yet some quirks.

Yes, the patch has (fixable) issues.  It's really just a sketch that
happens to work in common cases :)

> But I oppose turning the warning into an error eventually. I want to
> be able to use -hda $filename and it should work. As Kevin said, a
> warning alone will not help a lot, though. I don't know about that, I
> think it should work. This warning should only appear when you're
> using qemu directly because management tools should already use -drive
> with format= or driver=; and if you're using qemu directly you should
> be watching stderr.
>
> The only way I'd be fine with turning this into an error would be to
> make an exception for -hda and the like. There were already plans of
> introducing pseudo block drivers for format and protocol probing; if
> we do that we can use those block drivers for -hda. We should still
> emit the warning, but in my opinion it should never be an error with
> -hda, -cdrom etc.. For me, this is not about backwards compatibility
> but because I'm using -hda myself.

Examples of usage that is just fine (no warning):

* -hda test.qcow2

  Fine as long as test.qcow2 is really qcow2 (as it should!), and
  either specifies a backing format (as it arguably should), or the
  backing file name is sane.

* -hda disk.img

  Fine as long as disk.img is really a disk image (as it should).

* -hda /dev/mapper/vg0-virtdisk

  Fine as long as the logical volume is raw.  Not actually implemented
  in my patch, but it's easy enough to do.

Can you give me a few examples from your usage that triggers the
warning?

[...]
Jeff Cody Oct. 30, 2014, 1:52 p.m. UTC | #25
On Wed, Oct 29, 2014 at 07:37:02AM +0100, Markus Armbruster wrote:
> Jeff Cody <jcody@redhat.com> writes:
> 
> > On Tue, Oct 28, 2014 at 05:03:40PM +0100, Markus Armbruster wrote:
> >> If the user neglects to specify the image format, QEMU probes the
> >> image to guess it automatically, for convenience.
> >> 
> >> Relying on format probing is insecure for raw images (CVE-2008-2004).
> >> If the guest writes a suitable header to the device, the next probe
> >> will recognize a format chosen by the guest.  A malicious guest can
> >> abuse this to gain access to host files, e.g. by crafting a QCOW2
> >> header with backing file /etc/shadow.
> >> 
> >> Commit 1e72d3b (April 2008) provided -drive parameter format to let
> >> users disable probing.  Commit f965509 (March 2009) extended QCOW2 to
> >> optionally store the backing file format, to let users disable backing
> >> file probing.  QED has had a flag to suppress probing since the
> >> beginning (2010), set whenever a raw backing file is assigned.
> >> 
> >> Despite all this work (and time!), we're still insecure by default.  I
> >> think we're doing our users a disservice by sticking to the fatally
> >> flawed probing.  "Broken by default" is just wrong, and "convenience"
> >> is no excuse.
> >> 
> >> I believe we can retain 90% of the convenience without compromising
> >> security by keying on image file name instead of image contents: if
> >> the file name ends in .img or .iso, assume raw, if it ends in .qcow2,
> >> assume qcow2, and so forth.
> >> 
> >> Naturally, this would break command lines where the filename doesn't
> >> provide the correct clue.  So don't do it just yet, only warn if the
> >> the change would lead to a different result.  Looks like this:
> >> 
> >>     qemu: -drive file=my.img: warning: insecure format probing of image 'my.img'
> >>     To get rid of this warning, specify format=qcow2 explicitly, or change
> >>     the image name to end with '.qcow2'
> >> 
> >> This should steer users away from insecure format probing.  After a
> >> suitable grace period, we can hopefully drop format probing
> >> alltogether.
> >> 
> >> Example 0: file=WHATEVER,format=F
> >> 
> >> Never warns, because the explicit format suppresses probing.
> >> 
> >> Example 1: file=FOO.img
> >> 
> >> Warns when probing of FOO.img results in anything but raw.  In
> >> particular, it warns when the guest just p0wned you.
> >> 
> >> Example 2: file=FOO.qcow2 with backing file name FOO.img and no
> >> backing image format.
> >> 
> >> Warns when probing of FOO.qcow2 results in anything but qcow2, or
> >> probing of FOO.img results in anything but raw.
> >> 
> >> This patch is RFC because of open questions:
> >> 
> >> * Should tools warn, too?  Probing isn't insecure there, but a "this
> >>   may pick a different format in the future" warning may be
> >>   appropriate.
> >> 
> >> * I didn't specify recognized extensions for formats "bochs", "cloop",
> >>   "parallels", "vpc", because I have no idea which ones to recognize.
> >>
> >
> > Format 'vpc' should probably recognize both extensions "vpc" and
> > "vhd".  The actual format is VHD, so most MS tools will probably
> > create files with .vhd extensions.
> >
> > (This creates an overlap with vhdx; see my response to Eric's email).
> 
> Going to discuss it there.
> 
> >> Additionally, some tests still need to be updated.
> >> 
> >> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> >
> >
> > [ ...]
> >
> >> diff --git a/block/vhdx.c b/block/vhdx.c
> >> index 12bfe75..d2c3a20 100644
> >> --- a/block/vhdx.c
> >> +++ b/block/vhdx.c
> >> @@ -1945,6 +1945,7 @@ static BlockDriver bdrv_vhdx = {
> >>      .format_name            = "vhdx",
> >>      .instance_size          = sizeof(BDRVVHDXState),
> >>      .bdrv_probe             = vhdx_probe,
> >> +    .fname_ext              = { "vhd" },
> >
> > This should also have "vhdx", I think.  
> 
> Okay.  I looked for confirmation in Wikipedia, and found:
> 
>     Hyper-V, like Microsoft Virtual Server and Windows Virtual PC, saves
>     each guest OS to a single virtual hard disk file with the extension
>     .VHD, except in Windows 8 and Windows Server 2012 where it can be
>     the newer .vhdx.
> 
> https://en.wikipedia.org/wiki/Hyper-V#VHD_compatibility_with_Virtual_Server_2005_and_Virtual_PC_2004.2F2007
> 
> Makes me wonder whether .vhd is really used for both vhdx and vpc format
> images.  What have you seen in the wild?
>

I need to resurrect my Windows Server Hyper-V test machine, and see
what it generates by default.  Most likely '.vhdx'

However, even so, it seems entirely plausible that a 4-letter
extension may end up represented as a 3-digit extension, and be .vhd,
even if that is not the 'official' name.

> >>      .bdrv_open              = vhdx_open,
> >>      .bdrv_close             = vhdx_close,
> >>      .bdrv_reopen_prepare    = vhdx_reopen_prepare,
> [...]
Jeff Cody Oct. 30, 2014, 1:58 p.m. UTC | #26
On Wed, Oct 29, 2014 at 08:22:16AM +0100, Markus Armbruster wrote:
> Eric Blake <eblake@redhat.com> writes:
> 
> > On 10/28/2014 12:29 PM, Jeff Cody wrote:
> [...]
> >>> What happens if more than one format tends to pick the same extension?
> >>> For example, would you consider '.qcow' a typical extension for qcow2
> >>> files, even though it would probably match the older qcow driver first?...
> >>>
> >> 
> >> I think this could arguably end up being the case for VHD and VHDX
> >> (i.e., both using .vhd).
> >> 
> >> I guess the question is, in the case of common extensions, should the
> >> priority be on the probe, or on the extension?  With the way the code
> >> is written, the priority is all going to depend on the order the
> >> driver is registered, so it may or may not warn.
> >
> > Technically, don't we correctly probe both VHD and VHDX files?  It is
> > only files that start out raw and later get mis-probed as non-raw where
> > we have an issue, so I'd rather treat the probe as accurate if it
> > matches a common extension for that format, and NOT treat the extension
> > as dictating a single required format.
> >
> >> 
> >> Currently, the code does a format probe, does an independent extension
> >> lookup, and checks if the two agree.  Instead, would it be sufficient
> >> to do the format probe, and then just verify the detected driver has a
> >> compatible extension name?
> >
> > Yes, that was what I was thinking as well.
> 
> I designed the code with the eventual removal of probing in mind:
> 
>     This should steer users away from insecure format probing.  After a
>     suitable grace period, we can hopefully drop format probing
>     alltogether.
>

Not for 'qemu-img info', or similar commands.  I can see the file name
extension, but sometimes I may want to be able to determine what a
.img file actually contains.


> The warning triggers on insecure probing.  That's one view of it.  The
> other view is it triggers when potentially insecure probing and our new
> secure guessing come up with different answers.  It serves as warning of
> future behavioral change.
> 
> If we only check the extension matches the probing, we gain support for
> ambiguous file name extensions, but make the future change warning
> incomplete.  That's bad.
> 
> If .vhd can really be two formats so different we actually want to
> separate block drivers for them, we want something more sophisticated
> than my patch.  Need to think.
Richard W.M. Jones Oct. 30, 2014, 8:45 p.m. UTC | #27
Can you add something like:

  -drive ...,format=unsafe-probe

so it does the probing anyway, even though we know it's unsafe?

This will minimize the churn needed in libguestfs to make this work.

Rich.
Stefan Hajnoczi Oct. 31, 2014, 11:19 a.m. UTC | #28
On Thu, Oct 30, 2014 at 01:49:22PM +0100, Markus Armbruster wrote:
> Kevin Wolf <kwolf@redhat.com> writes:
> 
> > Am 29.10.2014 um 14:54 hat Markus Armbruster geschrieben:
> >> Anthony tried something similar (commit 79368c8), but couldn't get it
> >> right (commit 8b33d9e).
> >
> > The discussion back then: http://patchwork.ozlabs.org/patch/58980/
> >
> > The problem with Anthony's code was that he didn't handle a qiov
> > correctly that had unaligned members. With today's block layer, this is
> > not a big deal to implement correctly. We're running coroutines instead
> > of AIO callbacks and we don't have to do all the manual qiov fixing
> > magic that Anthony had in his patch, util/iov.c provides all you need.
> >
> > I'll send out an RFC series that implements this.
> 
> I'm strongly opposed to this idea.

Me too.  Meddling with the guest is wrong and just adds a new problem!

It is perfectly okay for backup appliances, nested virtualization, or
disk management appliances to write image format headers to raw disks.

I'm am 100% against trying to detect the guest writing image format
headers to raw disks.

Probing is not a guest problem.  It is a QEMU problem.  Fix QEMU, don't
cripple the guest.

Stefan
Stefan Hajnoczi Oct. 31, 2014, 11:24 a.m. UTC | #29
On Thu, Oct 30, 2014 at 10:36:35AM +0100, Kevin Wolf wrote:
> Am 30.10.2014 um 10:27 hat Stefan Hajnoczi geschrieben:
> > The guest may legitimately use raw devices that contain image format
> > data.  Imagine tools similar to libguestfs.
> > 
> > It's perfectly okay for them to lay out image format data onto a raw
> > device.
> > 
> > Probing is the problem, not putting image format data onto a raw device.
> 
> Agreed, that's why any restrictions only apply when probing was used to
> detect a raw image. If you want to do anything exotic like storing a
> qcow2 image for nested virt on a disk that is a raw image in the host,
> then making sure to pass format=raw shouldn't be too much.

Because at that point the solution is way over-engineered.

Probing checks should be in the QEMU command-line code, not sprinkled
across the codebase and even at run-time.

Isn't Markus approach much simpler and cleaner?

Stefan
Kevin Wolf Oct. 31, 2014, 11:56 a.m. UTC | #30
Am 31.10.2014 um 12:24 hat Stefan Hajnoczi geschrieben:
> On Thu, Oct 30, 2014 at 10:36:35AM +0100, Kevin Wolf wrote:
> > Am 30.10.2014 um 10:27 hat Stefan Hajnoczi geschrieben:
> > > The guest may legitimately use raw devices that contain image format
> > > data.  Imagine tools similar to libguestfs.
> > > 
> > > It's perfectly okay for them to lay out image format data onto a raw
> > > device.
> > > 
> > > Probing is the problem, not putting image format data onto a raw device.
> > 
> > Agreed, that's why any restrictions only apply when probing was used to
> > detect a raw image. If you want to do anything exotic like storing a
> > qcow2 image for nested virt on a disk that is a raw image in the host,
> > then making sure to pass format=raw shouldn't be too much.
> 
> Because at that point the solution is way over-engineered.
> 
> Probing checks should be in the QEMU command-line code, not sprinkled
> across the codebase and even at run-time.
> 
> Isn't Markus approach much simpler and cleaner?

I don't think so. My code isn't "sprinkled across the codebase", it has
the checks right where the problem arises, in the raw block driver.

It's with Markus's approach that we'll have to have code in many
different places as I showed. Its fundamental assumption that there is
always a filename string and the filename isn't passed in some QDict
option is simply wrong. Specifying the image is driver-dependent and
therefore you'd have to add functionality to each driver in order to get
the filename extension (or the information that there isn't anything
close enough to a filename).

The only argument brought up so far that I can reasonably buy is that
in the unlikely case of the restrictions applying it may be surprising
for the user to see requests failing. To address this, we could print
a warning when an image is opened in the "restricted raw" mode. This
way the user knows what's going on, and at the same time we still
effectively protect them instead of only printing a warning without real
protection.

Kevin
Eric Blake Oct. 31, 2014, 10:45 p.m. UTC | #31
On 10/30/2014 06:49 AM, Markus Armbruster wrote:

> You either have to prevent *any* writing of the first 2048 bytes (the
> part that can be examined by a bdrv_probe() method, or your have to
> prevent writing anything a probe recognizes, or the user has to specify
> the format explicitly.
> 
> If you do the former, you're way outside the realm of "theoretical".
> 
> If you do the latter, the degree of "theoreticalness" depends on the
> union of the patterns you prevent.  Issues:
> 
> 1. Anthony's method of checking a few known signatures is fragile.  The
> only obviously robust way to test "is probing going to find something
> other than 'raw'?" is running the probes.  Feasible.
> 
> 2. Whether the union of patterns qualifies as "theoretical" for all our
> targets is not obvious, and whether it'll remain "theoretical" for all
> future formats and target machines even less.

This one scares me.  The proof of concept patch you posted tests whether
a write to the first sector would result in the sector matching a
_currently known probe_ for the file formats that were compiled in at
configure time to the currently running binary.  But this is NOT the set
of all possible binary formats that may be introduced in the future.  By
extrapolation, if we pursue write blocking, then the only SAFE way to is
to reject ALL writes to the first sector, because we can't know which
writes will match some theoretical future probe - but by the time you
get to that point, then we no longer match bare metal (rejecting ALL
writes to the first sector is ridiculous).

> 
> 3. A write access that works just fine in one QEMU version can be
> rejected by another version that recognizes more formats.  Or probes the
> same formats differently.

This is just a manifestation of my observation above, which can be
easily triggered now by having two qemu binaries built from the same
source but different compile options.

> 
> 4. Rejecting a write fails *late*, and looks like hardware failure to
> the guest.  With imperfect guest software, this risks data corruption.
> 
> (4) is a deal breaker for me.

It is mitigated somewhat by the fact that the proposed patch will NEVER
reject writes for a management app that always correctly passes in a
format and thus avoids probes.  But that is still fragile.
Markus Armbruster Nov. 3, 2014, 8:07 a.m. UTC | #32
Jeff Cody <jcody@redhat.com> writes:

> On Wed, Oct 29, 2014 at 08:22:16AM +0100, Markus Armbruster wrote:
>> Eric Blake <eblake@redhat.com> writes:
>> 
>> > On 10/28/2014 12:29 PM, Jeff Cody wrote:
>> [...]
>> >>> What happens if more than one format tends to pick the same extension?
>> >>> For example, would you consider '.qcow' a typical extension for qcow2
>> >>> files, even though it would probably match the older qcow driver first?...
>> >>>
>> >> 
>> >> I think this could arguably end up being the case for VHD and VHDX
>> >> (i.e., both using .vhd).
>> >> 
>> >> I guess the question is, in the case of common extensions, should the
>> >> priority be on the probe, or on the extension?  With the way the code
>> >> is written, the priority is all going to depend on the order the
>> >> driver is registered, so it may or may not warn.
>> >
>> > Technically, don't we correctly probe both VHD and VHDX files?  It is
>> > only files that start out raw and later get mis-probed as non-raw where
>> > we have an issue, so I'd rather treat the probe as accurate if it
>> > matches a common extension for that format, and NOT treat the extension
>> > as dictating a single required format.
>> >
>> >> 
>> >> Currently, the code does a format probe, does an independent extension
>> >> lookup, and checks if the two agree.  Instead, would it be sufficient
>> >> to do the format probe, and then just verify the detected driver has a
>> >> compatible extension name?
>> >
>> > Yes, that was what I was thinking as well.
>> 
>> I designed the code with the eventual removal of probing in mind:
>> 
>>     This should steer users away from insecure format probing.  After a
>>     suitable grace period, we can hopefully drop format probing
>>     alltogether.
>>
>
> Not for 'qemu-img info', or similar commands.  I can see the file name
> extension, but sometimes I may want to be able to determine what a
> .img file actually contains.

That's a file(1) job.  But since the security argument doesn't apply to
"qemu-img info", regressing its functionality would be hard to justify.
In short, you're right.
Markus Armbruster Nov. 3, 2014, 8:11 a.m. UTC | #33
Jeff Cody <jcody@redhat.com> writes:

> On Wed, Oct 29, 2014 at 07:37:02AM +0100, Markus Armbruster wrote:
>> Jeff Cody <jcody@redhat.com> writes:
>> 
>> > On Tue, Oct 28, 2014 at 05:03:40PM +0100, Markus Armbruster wrote:
>> >> If the user neglects to specify the image format, QEMU probes the
>> >> image to guess it automatically, for convenience.
>> >> 
>> >> Relying on format probing is insecure for raw images (CVE-2008-2004).
>> >> If the guest writes a suitable header to the device, the next probe
>> >> will recognize a format chosen by the guest.  A malicious guest can
>> >> abuse this to gain access to host files, e.g. by crafting a QCOW2
>> >> header with backing file /etc/shadow.
>> >> 
>> >> Commit 1e72d3b (April 2008) provided -drive parameter format to let
>> >> users disable probing.  Commit f965509 (March 2009) extended QCOW2 to
>> >> optionally store the backing file format, to let users disable backing
>> >> file probing.  QED has had a flag to suppress probing since the
>> >> beginning (2010), set whenever a raw backing file is assigned.
>> >> 
>> >> Despite all this work (and time!), we're still insecure by default.  I
>> >> think we're doing our users a disservice by sticking to the fatally
>> >> flawed probing.  "Broken by default" is just wrong, and "convenience"
>> >> is no excuse.
>> >> 
>> >> I believe we can retain 90% of the convenience without compromising
>> >> security by keying on image file name instead of image contents: if
>> >> the file name ends in .img or .iso, assume raw, if it ends in .qcow2,
>> >> assume qcow2, and so forth.
>> >> 
>> >> Naturally, this would break command lines where the filename doesn't
>> >> provide the correct clue.  So don't do it just yet, only warn if the
>> >> the change would lead to a different result.  Looks like this:
>> >> 
>> >>     qemu: -drive file=my.img: warning: insecure format probing of image 'my.img'
>> >>     To get rid of this warning, specify format=qcow2 explicitly, or change
>> >>     the image name to end with '.qcow2'
>> >> 
>> >> This should steer users away from insecure format probing.  After a
>> >> suitable grace period, we can hopefully drop format probing
>> >> alltogether.
>> >> 
>> >> Example 0: file=WHATEVER,format=F
>> >> 
>> >> Never warns, because the explicit format suppresses probing.
>> >> 
>> >> Example 1: file=FOO.img
>> >> 
>> >> Warns when probing of FOO.img results in anything but raw.  In
>> >> particular, it warns when the guest just p0wned you.
>> >> 
>> >> Example 2: file=FOO.qcow2 with backing file name FOO.img and no
>> >> backing image format.
>> >> 
>> >> Warns when probing of FOO.qcow2 results in anything but qcow2, or
>> >> probing of FOO.img results in anything but raw.
>> >> 
>> >> This patch is RFC because of open questions:
>> >> 
>> >> * Should tools warn, too?  Probing isn't insecure there, but a "this
>> >>   may pick a different format in the future" warning may be
>> >>   appropriate.
>> >> 
>> >> * I didn't specify recognized extensions for formats "bochs", "cloop",
>> >>   "parallels", "vpc", because I have no idea which ones to recognize.
>> >>
>> >
>> > Format 'vpc' should probably recognize both extensions "vpc" and
>> > "vhd".  The actual format is VHD, so most MS tools will probably
>> > create files with .vhd extensions.
>> >
>> > (This creates an overlap with vhdx; see my response to Eric's email).
>> 
>> Going to discuss it there.
>> 
>> >> Additionally, some tests still need to be updated.
>> >> 
>> >> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>> >
>> >
>> > [ ...]
>> >
>> >> diff --git a/block/vhdx.c b/block/vhdx.c
>> >> index 12bfe75..d2c3a20 100644
>> >> --- a/block/vhdx.c
>> >> +++ b/block/vhdx.c
>> >> @@ -1945,6 +1945,7 @@ static BlockDriver bdrv_vhdx = {
>> >>      .format_name            = "vhdx",
>> >>      .instance_size          = sizeof(BDRVVHDXState),
>> >>      .bdrv_probe             = vhdx_probe,
>> >> +    .fname_ext              = { "vhd" },
>> >
>> > This should also have "vhdx", I think.  
>> 
>> Okay.  I looked for confirmation in Wikipedia, and found:
>> 
>>     Hyper-V, like Microsoft Virtual Server and Windows Virtual PC, saves
>>     each guest OS to a single virtual hard disk file with the extension
>>     .VHD, except in Windows 8 and Windows Server 2012 where it can be
>>     the newer .vhdx.
>> 
>> https://en.wikipedia.org/wiki/Hyper-V#VHD_compatibility_with_Virtual_Server_2005_and_Virtual_PC_2004.2F2007
>> 
>> Makes me wonder whether .vhd is really used for both vhdx and vpc format
>> images.  What have you seen in the wild?
>>
>
> I need to resurrect my Windows Server Hyper-V test machine, and see
> what it generates by default.  Most likely '.vhdx'
>
> However, even so, it seems entirely plausible that a 4-letter
> extension may end up represented as a 3-digit extension, and be .vhd,
> even if that is not the 'official' name.

If .vhd turns out to be ambiguous in practice, we need to deal with it.
I got an idea how, but I'd prefer to discuss it after we figured out how
to address our "inscure by default" issue.


>> >>      .bdrv_open              = vhdx_open,
>> >>      .bdrv_close             = vhdx_close,
>> >>      .bdrv_reopen_prepare    = vhdx_reopen_prepare,
>> [...]
Markus Armbruster Nov. 3, 2014, 8:15 a.m. UTC | #34
Eric Blake <eblake@redhat.com> writes:

> On 10/30/2014 06:49 AM, Markus Armbruster wrote:
>
>> You either have to prevent *any* writing of the first 2048 bytes (the
>> part that can be examined by a bdrv_probe() method, or your have to
>> prevent writing anything a probe recognizes, or the user has to specify
>> the format explicitly.
>> 
>> If you do the former, you're way outside the realm of "theoretical".
>> 
>> If you do the latter, the degree of "theoreticalness" depends on the
>> union of the patterns you prevent.  Issues:
>> 
>> 1. Anthony's method of checking a few known signatures is fragile.  The
>> only obviously robust way to test "is probing going to find something
>> other than 'raw'?" is running the probes.  Feasible.
>> 
>> 2. Whether the union of patterns qualifies as "theoretical" for all our
>> targets is not obvious, and whether it'll remain "theoretical" for all
>> future formats and target machines even less.
>
> This one scares me.  The proof of concept patch you posted tests whether
> a write to the first sector would result in the sector matching a
> _currently known probe_ for the file formats that were compiled in at
> configure time to the currently running binary.  But this is NOT the set
> of all possible binary formats that may be introduced in the future.  By
> extrapolation, if we pursue write blocking, then the only SAFE way to is
> to reject ALL writes to the first sector, because we can't know which
> writes will match some theoretical future probe -

Yes!

>                                                   but by the time you
> get to that point, then we no longer match bare metal (rejecting ALL
> writes to the first sector is ridiculous).
>
>> 3. A write access that works just fine in one QEMU version can be
>> rejected by another version that recognizes more formats.  Or probes the
>> same formats differently.
>
> This is just a manifestation of my observation above, which can be
> easily triggered now by having two qemu binaries built from the same
> source but different compile options.

Throw in loadable modules, and it may even depend on runtime
configuration.

>> 4. Rejecting a write fails *late*, and looks like hardware failure to
>> the guest.  With imperfect guest software, this risks data corruption.
>> 
>> (4) is a deal breaker for me.
>
> It is mitigated somewhat by the fact that the proposed patch will NEVER
> reject writes for a management app that always correctly passes in a
> format and thus avoids probes.  But that is still fragile.

If I was content with a "specify format= explicitly and you'll be fine"
argument, we wouldn't have this discussion :)
Markus Armbruster Nov. 3, 2014, 8:18 a.m. UTC | #35
"Richard W.M. Jones" <rjones@redhat.com> writes:

> Can you add something like:
>
>   -drive ...,format=unsafe-probe
>
> so it does the probing anyway, even though we know it's unsafe?
>
> This will minimize the churn needed in libguestfs to make this work.

Retaining the insecure old default behavior as an explicit option makes
sense to me.
Max Reitz Nov. 3, 2014, 8:44 a.m. UTC | #36
On 2014-10-30 at 14:02, Markus Armbruster wrote:
> Max Reitz <mreitz@redhat.com> writes:
>
>> So I guess it's my turn to give yet another opinion (or just something
>> in between of what has been already said).
>>
>> First, I'm fine with this patch, or at least the idea as there were
>> yet some quirks.
> Yes, the patch has (fixable) issues.  It's really just a sketch that
> happens to work in common cases :)
>
>> But I oppose turning the warning into an error eventually. I want to
>> be able to use -hda $filename and it should work. As Kevin said, a
>> warning alone will not help a lot, though. I don't know about that, I
>> think it should work. This warning should only appear when you're
>> using qemu directly because management tools should already use -drive
>> with format= or driver=; and if you're using qemu directly you should
>> be watching stderr.
>>
>> The only way I'd be fine with turning this into an error would be to
>> make an exception for -hda and the like. There were already plans of
>> introducing pseudo block drivers for format and protocol probing; if
>> we do that we can use those block drivers for -hda. We should still
>> emit the warning, but in my opinion it should never be an error with
>> -hda, -cdrom etc.. For me, this is not about backwards compatibility
>> but because I'm using -hda myself.
> Examples of usage that is just fine (no warning):
>
> * -hda test.qcow2
>
>    Fine as long as test.qcow2 is really qcow2 (as it should!), and
>    either specifies a backing format (as it arguably should), or the
>    backing file name is sane.
>
> * -hda disk.img
>
>    Fine as long as disk.img is really a disk image (as it should).
>
> * -hda /dev/mapper/vg0-virtdisk
>
>    Fine as long as the logical volume is raw.  Not actually implemented
>    in my patch, but it's easy enough to do.
>
> Can you give me a few examples from your usage that triggers the
> warning?

You're right, I mistook making it an error if extension and probed 
format do not match for disabling probing completely. In that case I'm 
withdrawing my objection, though I still don't object Kevin's proposal 
either.

Max
Markus Armbruster Nov. 3, 2014, 8:54 a.m. UTC | #37
Kevin Wolf <kwolf@redhat.com> writes:

> Am 31.10.2014 um 12:24 hat Stefan Hajnoczi geschrieben:
>> On Thu, Oct 30, 2014 at 10:36:35AM +0100, Kevin Wolf wrote:
>> > Am 30.10.2014 um 10:27 hat Stefan Hajnoczi geschrieben:
>> > > The guest may legitimately use raw devices that contain image format
>> > > data.  Imagine tools similar to libguestfs.
>> > > 
>> > > It's perfectly okay for them to lay out image format data onto a raw
>> > > device.
>> > > 
>> > > Probing is the problem, not putting image format data onto a raw device.
>> > 
>> > Agreed, that's why any restrictions only apply when probing was used to
>> > detect a raw image. If you want to do anything exotic like storing a
>> > qcow2 image for nested virt on a disk that is a raw image in the host,
>> > then making sure to pass format=raw shouldn't be too much.
>> 
>> Because at that point the solution is way over-engineered.
>> 
>> Probing checks should be in the QEMU command-line code, not sprinkled
>> across the codebase and even at run-time.
>> 
>> Isn't Markus approach much simpler and cleaner?
>
> I don't think so. My code isn't "sprinkled across the codebase", it has
> the checks right where the problem arises, in the raw block driver.

Actually, that's not where the problem is.

The guest writing funny things to its disks is *not* the problem, it's
perfectly normal operation.  Probing untrusted images is the problem!

> It's with Markus's approach that we'll have to have code in many
> different places as I showed. Its fundamental assumption that there is
> always a filename string and the filename isn't passed in some QDict
> option is simply wrong. Specifying the image is driver-dependent and
> therefore you'd have to add functionality to each driver in order to get
> the filename extension (or the information that there isn't anything
> close enough to a filename).

The RFC patch is incomplete.  Can't say how much code recording the
filename correctly will take until I've done it.

I suspect the lack of an image filename already leads to rotten error
messages in places.

> The only argument brought up so far that I can reasonably buy is that
> in the unlikely case of the restrictions applying it may be surprising
> for the user to see requests failing. To address this, we could print
> a warning when an image is opened in the "restricted raw" mode. This
> way the user knows what's going on,

Improves the virtual hardware from "crippled" to "openly crippled", and
my opinion on it from "I hate it" to "I hate it slightly less" :)

>                                     and at the same time we still
> effectively protect them instead of only printing a warning without real
> protection.

This isn't real protection, either.

If the user runs with format=raw, the guest can write whatever it likes,
and that's a feature.  If the user forgets format=raw just once, he's
p0wned.  That's because your "protection" does not address the real
problem (probing of untrusted images), but tries to prevent it by
refusing to created "bad" untrusted images, but can do so only in
special configurations, and not at all for images written by something
else.

Unlike your proposal, mine does attack the real problem, and thus *can*
provide real protection.  Not immediately because we want to avoid
aprupt change, but eventually.  Here's how:

1. Initially, we warn on insecure usage.  This doesn't protect users,
but it guides them towards protecting themselves.

2. Eventually, we make the warning an error.  This *does* protect users,
regardless of how they use or have used QEMU, *except* when they
explicitly ask for insecure probing with format=insecure-probe (assuming
we provide that option).
Max Reitz Nov. 3, 2014, 9:11 a.m. UTC | #38
On 2014-11-03 at 09:54, Markus Armbruster wrote:
> Kevin Wolf <kwolf@redhat.com> writes:
>
>> Am 31.10.2014 um 12:24 hat Stefan Hajnoczi geschrieben:
>>> On Thu, Oct 30, 2014 at 10:36:35AM +0100, Kevin Wolf wrote:
>>>> Am 30.10.2014 um 10:27 hat Stefan Hajnoczi geschrieben:
>>>>> The guest may legitimately use raw devices that contain image format
>>>>> data.  Imagine tools similar to libguestfs.
>>>>>
>>>>> It's perfectly okay for them to lay out image format data onto a raw
>>>>> device.
>>>>>
>>>>> Probing is the problem, not putting image format data onto a raw device.
>>>> Agreed, that's why any restrictions only apply when probing was used to
>>>> detect a raw image. If you want to do anything exotic like storing a
>>>> qcow2 image for nested virt on a disk that is a raw image in the host,
>>>> then making sure to pass format=raw shouldn't be too much.
>>> Because at that point the solution is way over-engineered.
>>>
>>> Probing checks should be in the QEMU command-line code, not sprinkled
>>> across the codebase and even at run-time.
>>>
>>> Isn't Markus approach much simpler and cleaner?
>> I don't think so. My code isn't "sprinkled across the codebase", it has
>> the checks right where the problem arises, in the raw block driver.
> Actually, that's not where the problem is.
>
> The guest writing funny things to its disks is *not* the problem, it's
> perfectly normal operation.  Probing untrusted images is the problem!

I don't think making your hard disk a qcow2 image is a perfectly normal 
operation, but perhaps that's just me.

I still think writing to sector 0 is not a perfectly normal operation, 
but rarely ever happens (at least for x86[1]), and if it does, there's 
not that much variety to what is written there.

[1] I'm open for arguments about how on other platforms there's not 
necessarily a boot loader in sector 0 of an image and the host is 
therefore actually completely free to write whatever it likes and 
there's no telling of what it will be.

>> It's with Markus's approach that we'll have to have code in many
>> different places as I showed. Its fundamental assumption that there is
>> always a filename string and the filename isn't passed in some QDict
>> option is simply wrong. Specifying the image is driver-dependent and
>> therefore you'd have to add functionality to each driver in order to get
>> the filename extension (or the information that there isn't anything
>> close enough to a filename).
> The RFC patch is incomplete.  Can't say how much code recording the
> filename correctly will take until I've done it.
>
> I suspect the lack of an image filename already leads to rotten error
> messages in places.
>
>> The only argument brought up so far that I can reasonably buy is that
>> in the unlikely case of the restrictions applying it may be surprising
>> for the user to see requests failing. To address this, we could print
>> a warning when an image is opened in the "restricted raw" mode. This
>> way the user knows what's going on,
> Improves the virtual hardware from "crippled" to "openly crippled", and
> my opinion on it from "I hate it" to "I hate it slightly less" :)
>
>>                                      and at the same time we still
>> effectively protect them instead of only printing a warning without real
>> protection.
> This isn't real protection, either.
>
> If the user runs with format=raw, the guest can write whatever it likes,
> and that's a feature.  If the user forgets format=raw just once, he's
> p0wned.  That's because your "protection" does not address the real
> problem (probing of untrusted images),

Well, that's your definition of the real problem. I think making a raw 
image effectively a non-raw image (according to what file(1) says) is a 
real problem as well. Detecting an image as qcow2 when file(1) says it 
is indeed qcow2 is not that much of a problem, in my very humble opinion.

Probably the real real problem is that qemu supports raw images at all, 
in contrast to other VM software (which support it for CD and floppy at 
the most). Hey, let's just add a flat mode to qcow2 and get rid of raw 
altogether! (that was a joke)

> but tries to prevent it by
> refusing to created "bad" untrusted images, but can do so only in
> special configurations, and not at all for images written by something
> else.
>
> Unlike your proposal, mine does attack the real problem, and thus *can*
> provide real protection.

It's what you define to be the real problem. I for one think the 
definition of what a raw image is is the real problem. When does a raw 
image cease to be a raw image? My definition would be that it ceases to 
be raw when sufficiently complex probing detects another format. Your 
definition seems to be that it's always raw when the user wants it to be 
raw, and sometimes the user doesn't know that he/she wants it to be raw 
(because of legacy issues).

Also, I think it's perfectly valid to prevent a guest from writing 
things to the first sector which this sufficiently complex probing(TM) 
detects to be some image format. You don't think so. That's why you say 
probing is the real problem, whereas I say allowing the guest to write a 
fake image header is the real problem (not the least because it may fool 
other tools as well).

So for me, your patch only mitigates the problem but does not fix it. 
Mitigating it is better than doing nothing, however, which is why I'm 
fine with it.

Max

> Not immediately because we want to avoid
> aprupt change, but eventually.  Here's how:
>
> 1. Initially, we warn on insecure usage.  This doesn't protect users,
> but it guides them towards protecting themselves.
>
> 2. Eventually, we make the warning an error.  This *does* protect users,
> regardless of how they use or have used QEMU, *except* when they
> explicitly ask for insecure probing with format=insecure-probe (assuming
> we provide that option).
Kevin Wolf Nov. 3, 2014, 10:25 a.m. UTC | #39
Am 03.11.2014 um 09:54 hat Markus Armbruster geschrieben:
> Kevin Wolf <kwolf@redhat.com> writes:
> 
> > Am 31.10.2014 um 12:24 hat Stefan Hajnoczi geschrieben:
> >> On Thu, Oct 30, 2014 at 10:36:35AM +0100, Kevin Wolf wrote:
> >> > Am 30.10.2014 um 10:27 hat Stefan Hajnoczi geschrieben:
> >> > > The guest may legitimately use raw devices that contain image format
> >> > > data.  Imagine tools similar to libguestfs.
> >> > > 
> >> > > It's perfectly okay for them to lay out image format data onto a raw
> >> > > device.
> >> > > 
> >> > > Probing is the problem, not putting image format data onto a raw device.
> >> > 
> >> > Agreed, that's why any restrictions only apply when probing was used to
> >> > detect a raw image. If you want to do anything exotic like storing a
> >> > qcow2 image for nested virt on a disk that is a raw image in the host,
> >> > then making sure to pass format=raw shouldn't be too much.
> >> 
> >> Because at that point the solution is way over-engineered.
> >> 
> >> Probing checks should be in the QEMU command-line code, not sprinkled
> >> across the codebase and even at run-time.
> >> 
> >> Isn't Markus approach much simpler and cleaner?
> >
> > I don't think so. My code isn't "sprinkled across the codebase", it has
> > the checks right where the problem arises, in the raw block driver.
> 
> Actually, that's not where the problem is.
> 
> The guest writing funny things to its disks is *not* the problem, it's
> perfectly normal operation.  Probing untrusted images is the problem!

I think we all know it, but let's clearly state the real problem once
again to remind us what we're trying to do on a high level. The security
problem we're talking about is that some image formats can contain file
names, and the content of these files can be exposed (in some cases even
r/w) to the guest.

Therefore not probing, but using untrusted images is the root problem -
but it's not one that we can fix. If the users downloads (or otherwise
obtains) an untrusted image with a bad backing file name, and of course
the correct file extension, we don't have a chance to protect them.

What we can do something about is a special case of this, where a
malicious guest tries to turn a (trusted) image into something that
isn't trustworthy any more. In my book, the bad action is modifying the
image so that it looks like a different format with a bad header, it's
not treating an image file like what it looks like. If you disable
probing in qemu, you still allow qemu guests to write bad headers that
can fool other tools.

The problem really isn't all the non-raw formats that have a header
which can contain file names. The problem is with raw which doesn't have
a header and allows the guest to turn its image file into any other file
type it wants and can potentially trick the user. raw is really a bad
format (or actually, it's a non-format), and if I could choose I
wouldn't support it, or at least only r/o. We should instead have used a
format with a header and then the raw file shifted by some offset. It's
too late for that, though.

But coming back to your specific statement: Probing images isn't the
problem. The guest writing funny things isn't the problem either. Using
a format that can't reliably represent those funny things is the
problem. And if the format can't reliably represent something, it should
return an error when you try to do it anyway. If you need the feature of
storing funny things, use a format that supports it reliably. (And
calling explicit format=raw reliable isn't quite right either, but it's
a compromise I'd be willing to make.)

> > It's with Markus's approach that we'll have to have code in many
> > different places as I showed. Its fundamental assumption that there is
> > always a filename string and the filename isn't passed in some QDict
> > option is simply wrong. Specifying the image is driver-dependent and
> > therefore you'd have to add functionality to each driver in order to get
> > the filename extension (or the information that there isn't anything
> > close enough to a filename).
> 
> The RFC patch is incomplete.  Can't say how much code recording the
> filename correctly will take until I've done it.

And will you be sure that you'll have caught all places?

> I suspect the lack of an image filename already leads to rotten error
> messages in places.

Yes, we're in the middle of a conversion. I fully expect some messages
to be broken at the moment.

> > The only argument brought up so far that I can reasonably buy is that
> > in the unlikely case of the restrictions applying it may be surprising
> > for the user to see requests failing. To address this, we could print
> > a warning when an image is opened in the "restricted raw" mode. This
> > way the user knows what's going on,
> 
> Improves the virtual hardware from "crippled" to "openly crippled", and
> my opinion on it from "I hate it" to "I hate it slightly less" :)
> 
> >                                     and at the same time we still
> > effectively protect them instead of only printing a warning without real
> > protection.
> 
> This isn't real protection, either.
> 
> If the user runs with format=raw, the guest can write whatever it likes,
> and that's a feature.  If the user forgets format=raw just once, he's
> p0wned.  That's because your "protection" does not address the real
> problem (probing of untrusted images), but tries to prevent it by
> refusing to created "bad" untrusted images, but can do so only in
> special configurations, and not at all for images written by something
> else.

If you want to fully resolve the problem of trusted images turning into
something bad, you need to forbid raw. Period.

But we probably agree that we can't do that. What's left is that we can
try to make it less likely to happen.

> Unlike your proposal, mine does attack the real problem, and thus *can*
> provide real protection.  Not immediately because we want to avoid
> aprupt change, but eventually.

Your patch isn't attacking the real problem (using bad images). It also
isn't attacking the full special case (bad guest turning trusted images
into something bad). It's only attacking a special case of the special
case (turning a trusted image into something bad, and then exploiting
the bad image with qemu).

> Here's how:
> 
> 1. Initially, we warn on insecure usage.  This doesn't protect users,
> but it guides them towards protecting themselves.
> 
> 2. Eventually, we make the warning an error.  This *does* protect users,
> regardless of how they use or have used QEMU, *except* when they
> explicitly ask for insecure probing with format=insecure-probe (assuming
> we provide that option).

"Eventually" leaves users vulnerable for another year or two, even in
your special case of the special case.

And it will actually hit some users in practice with false positives (I
can remember BZs with command lines where qcow2 images don't have a
.qcow2 extension; and for LVs it's true anyway). I've yet to see
examples where my "restricted raw" patch would actually trigger in
practice outside of constructed nested virt examples with whole disks
formatted as qcow2 (seriously...) or actual attacks.

Kevin
Kevin Wolf Nov. 3, 2014, 11 a.m. UTC | #40
Am 30.10.2014 um 13:49 hat Markus Armbruster geschrieben:
> Kevin Wolf <kwolf@redhat.com> writes:
> 
> > Am 29.10.2014 um 14:54 hat Markus Armbruster geschrieben:
> >> Kevin Wolf <kwolf@redhat.com> writes:
> >> > Instead, let me try once more to sell my old proposal [1] from the
> >> > thread you mentioned:
> >> >
> >> >> What if we let the raw driver know that it was probed and then it
> >> >> enables a check that returns -EIO for any write on the first 2k if that
> >> >> write would make the image look like a different format?
> >> >
> >> > Attacks the problem where it arises instead of trying to detect the
> >> > outcome of it, and works in whatever way it is nested in the BDS graph
> >> > and whatever way is used to address the image file.
> >> >
> >> > Kevin
> >> >
> >> > [1] https://lists.nongnu.org/archive/html/qemu-devel/2014-08/msg02548.html
> >> 
> >> Arbitrarily breaking the virtual hardware that way is incompatible, too.
> >> It's a bigger no-no for me than changing user interface corner cases or
> >> deciding what to do with a file based on its file name extension.
> >
> > It is arbitrarily breaking theoretical cases, while your patch is less
> > arbitrarily breaking common cases. I strongly prefer the former.
> 
> I challenge "theoretical" as well as "common".
> 
> For "theoretical", see below.
> 
> As to "common": are unrecognizable image names really common?  I doubt
> it.  If I'm wrong, I expect users to complain about my warning, and then
> we can reconsider.

As I said in my other mail, I remember seeing BZs with command lines
that don't have a file extension. Block devices are affected anyway.

Your RFC also misses the extension .raw that I personally use
occasionally, so I would get the warning even though the image name
isn't "unrecognizable" at all. And other people probably have other
extensions in use that we don't know of.

> > Nobody will ever want to write a qcow2 header into their boot sector for
> > just running a guest:
> >
> > 0:   51                      push   %cx
> > 1:   46                      inc    %si
> > 2:   49                      dec    %cx
> > 3:   fb                      sti
> >
> > This code doesn't make any sense. It's similar for other formats. And if
> > they really have some odd reason to do that, the fix is easy: Either
> > don't use raw, or pass an explicit format=raw.
> 
> A disk needn't start with a PC-style boot sector.  Even on a PC.  Not
> every disk needs to be bootable, or even partitioned.  Databases exist
> that like to work on raw devices.  Giving them /dev/sdb instead of a
> whole-disk partition /dev/sdb1 may not be the smartest thing to do, but
> it *works* with real hardware, so why not with virtual hardware?

Oh, it does. Just use a format that can represent this reliably (and as
a compromise, we're going to declare explicitly requested raw as such -
again, see my other mail for details).

> You either have to prevent *any* writing of the first 2048 bytes (the
> part that can be examined by a bdrv_probe() method, or your have to
> prevent writing anything a probe recognizes, or the user has to specify
> the format explicitly.
> 
> If you do the former, you're way outside the realm of "theoretical".

Right, that's not an option.

> If you do the latter, the degree of "theoreticalness" depends on the
> union of the patterns you prevent.  Issues:
> 
> 1. Anthony's method of checking a few known signatures is fragile.  The
> only obviously robust way to test "is probing going to find something
> other than 'raw'?" is running the probes.  Feasible.

Yes, this is what I've been talking about.

> 2. Whether the union of patterns qualifies as "theoretical" for all our
> targets is not obvious, and whether it'll remain "theoretical" for all
> future formats and target machines even less.

At least we don't have examples of it happening yet. And even if it
happens to become not theoretical at some point, the only thing that
happens is that they need to add format=raw - something that we already
know is sure to happen with your approach.

Once we get to know of such cases, we can probably even resolve them by
improving the probing function of the problematic format.

> 3. A write access that works just fine in one QEMU version can be
> rejected by another version that recognizes more formats.  Or probes the
> same formats differently.

True. We're only sure to protect this binary, and we have good chances
of protecting some other qemu versions and some other tools, too.

If the attacker has a time machine and knows what the next installed
qemu version will recognize (or if they exploit a file format that is
not a disk image, with a program other than qemu), we don't fully
protect the user. We'd have to completely forbid raw for that.

> 4. Rejecting a write fails *late*, and looks like hardware failure to
> the guest.  With imperfect guest software, this risks data corruption.

Okay. So your common case is that you have a badly written database that
gets the full unpartitioned disk assigned and doesn't start to write at
the first block, but randomly, and when it finally tries to use block 0,
it gets deadly confused by the I/O error so that it will break the data
that it has already stored elsewhere.

Yeah right.

And you probably deserved it then for using such software. It would have
happened anyway sooner or later.

Kevin
Kevin Wolf Nov. 3, 2014, 11:13 a.m. UTC | #41
Am 31.10.2014 um 23:45 hat Eric Blake geschrieben:
> On 10/30/2014 06:49 AM, Markus Armbruster wrote:
> 
> > You either have to prevent *any* writing of the first 2048 bytes (the
> > part that can be examined by a bdrv_probe() method, or your have to
> > prevent writing anything a probe recognizes, or the user has to specify
> > the format explicitly.
> > 
> > If you do the former, you're way outside the realm of "theoretical".
> > 
> > If you do the latter, the degree of "theoreticalness" depends on the
> > union of the patterns you prevent.  Issues:
> > 
> > 1. Anthony's method of checking a few known signatures is fragile.  The
> > only obviously robust way to test "is probing going to find something
> > other than 'raw'?" is running the probes.  Feasible.
> > 
> > 2. Whether the union of patterns qualifies as "theoretical" for all our
> > targets is not obvious, and whether it'll remain "theoretical" for all
> > future formats and target machines even less.
> 
> This one scares me.  The proof of concept patch you posted tests whether
> a write to the first sector would result in the sector matching a
> _currently known probe_ for the file formats that were compiled in at
> configure time to the currently running binary.  But this is NOT the set
> of all possible binary formats that may be introduced in the future.  By
> extrapolation, if we pursue write blocking, then the only SAFE way to is
> to reject ALL writes to the first sector, because we can't know which
> writes will match some theoretical future probe - but by the time you
> get to that point, then we no longer match bare metal (rejecting ALL
> writes to the first sector is ridiculous).

There is no absolute security without forbidding raw. Who says that the
next format doesn't have its magic in sector 42?

You are right that if an attacker knows which new format supporting
backing files we'll have in the next version, and the user uses probed
raw despite the warning (which means they are not using libvirt), the
attacker can write the header of the new image format now and hope that
the user leaves it alone until the update happens at some point in the
future. Then the malicious guest can access that one file, but not
obtain access to the next one (because the new checks are in place
then).

I don't think this is a relevant attack vector. It's probably much
easier to get the user to run an untrusted image than converting a
trusted image into a time bomb and waiting potentially for months or
years for it to explode.

Kevin
Stefan Hajnoczi Nov. 3, 2014, 3:05 p.m. UTC | #42
On Mon, Nov 03, 2014 at 11:25:10AM +0100, Kevin Wolf wrote:
> Am 03.11.2014 um 09:54 hat Markus Armbruster geschrieben:
> > Kevin Wolf <kwolf@redhat.com> writes:
> > 
> > > Am 31.10.2014 um 12:24 hat Stefan Hajnoczi geschrieben:
> > >> On Thu, Oct 30, 2014 at 10:36:35AM +0100, Kevin Wolf wrote:
> > >> > Am 30.10.2014 um 10:27 hat Stefan Hajnoczi geschrieben:
> > >> > > The guest may legitimately use raw devices that contain image format
> > >> > > data.  Imagine tools similar to libguestfs.
> > >> > > 
> > >> > > It's perfectly okay for them to lay out image format data onto a raw
> > >> > > device.
> > >> > > 
> > >> > > Probing is the problem, not putting image format data onto a raw device.
> > >> > 
> > >> > Agreed, that's why any restrictions only apply when probing was used to
> > >> > detect a raw image. If you want to do anything exotic like storing a
> > >> > qcow2 image for nested virt on a disk that is a raw image in the host,
> > >> > then making sure to pass format=raw shouldn't be too much.
> > >> 
> > >> Because at that point the solution is way over-engineered.
> > >> 
> > >> Probing checks should be in the QEMU command-line code, not sprinkled
> > >> across the codebase and even at run-time.
> > >> 
> > >> Isn't Markus approach much simpler and cleaner?
> > >
> > > I don't think so. My code isn't "sprinkled across the codebase", it has
> > > the checks right where the problem arises, in the raw block driver.
> > 
> > Actually, that's not where the problem is.
> > 
> > The guest writing funny things to its disks is *not* the problem, it's
> > perfectly normal operation.  Probing untrusted images is the problem!
> 
> I think we all know it, but let's clearly state the real problem once
> again to remind us what we're trying to do on a high level. The security
> problem we're talking about is that some image formats can contain file
> names, and the content of these files can be exposed (in some cases even
> r/w) to the guest.

That's too specific.  There are other cases:

Imagine you are paying for a 10 GB disk KVM virtual server with a
hosting provider.

The hosting provider did not configure KVM correctly and they use
probing.  You can write a qcow2 header with a 100 GB disk size to the
raw disk and reboot the VM.

Suddenly your guest has access to 100 GB of disk space even though you
are only paying for 10 GB!

> Therefore not probing, but using untrusted images is the root problem -
> but it's not one that we can fix. If the users downloads (or otherwise
> obtains) an untrusted image with a bad backing file name, and of course
> the correct file extension, we don't have a chance to protect them.

This is driving the discussion into the wrong direction.  You are right
that QEMU cannot solve that but it's also a well-known problem that
OpenStack and co solved a long time ago.

> What we can do something about is a special case of this, where a
> malicious guest tries to turn a (trusted) image into something that
> isn't trustworthy any more. In my book, the bad action is modifying the
> image so that it looks like a different format with a bad header, it's
> not treating an image file like what it looks like. If you disable
> probing in qemu, you still allow qemu guests to write bad headers that
> can fool other tools.
> 
> The problem really isn't all the non-raw formats that have a header
> which can contain file names. The problem is with raw which doesn't have
> a header and allows the guest to turn its image file into any other file
> type it wants and can potentially trick the user. raw is really a bad
> format (or actually, it's a non-format), and if I could choose I
> wouldn't support it, or at least only r/o. We should instead have used a
> format with a header and then the raw file shifted by some offset. It's
> too late for that, though.
> 
> But coming back to your specific statement: Probing images isn't the
> problem. The guest writing funny things isn't the problem either. Using
> a format that can't reliably represent those funny things is the
> problem. And if the format can't reliably represent something, it should
> return an error when you try to do it anyway. If you need the feature of
> storing funny things, use a format that supports it reliably. (And
> calling explicit format=raw reliable isn't quite right either, but it's
> a compromise I'd be willing to make.)

I don't follow this logic.  You end up with "raw is bad because it
doesn't have a header".  That doesn't get us anywhere.

(There are other formats that can also be confused.  For example, you
can create a qcow2 file that is also a vpc file because vpc allows files
that have only a footer.)

Fact remains that you must not probe untrusted guest disk images.

The argument that there might not be a traditional filename doesn't make
sense to me.  When there is no filename the command-line is already
sufficiently complex and usage is fancy enough that probing adds no
convenience, the user can just specify the format.

Anyway, does this sound reasonable:

In QEMU 3.0, require the format= option for -drive.  Keep probing the
way it is for non-drive options because they are used for convenience by
local users.

Stefan
Max Reitz Nov. 3, 2014, 3:13 p.m. UTC | #43
On 2014-11-03 at 16:05, Stefan Hajnoczi wrote:
> On Mon, Nov 03, 2014 at 11:25:10AM +0100, Kevin Wolf wrote:
>> Am 03.11.2014 um 09:54 hat Markus Armbruster geschrieben:
>>> Kevin Wolf <kwolf@redhat.com> writes:
>>>
>>>> Am 31.10.2014 um 12:24 hat Stefan Hajnoczi geschrieben:
>>>>> On Thu, Oct 30, 2014 at 10:36:35AM +0100, Kevin Wolf wrote:
>>>>>> Am 30.10.2014 um 10:27 hat Stefan Hajnoczi geschrieben:
>>>>>>> The guest may legitimately use raw devices that contain image format
>>>>>>> data.  Imagine tools similar to libguestfs.
>>>>>>>
>>>>>>> It's perfectly okay for them to lay out image format data onto a raw
>>>>>>> device.
>>>>>>>
>>>>>>> Probing is the problem, not putting image format data onto a raw device.
>>>>>> Agreed, that's why any restrictions only apply when probing was used to
>>>>>> detect a raw image. If you want to do anything exotic like storing a
>>>>>> qcow2 image for nested virt on a disk that is a raw image in the host,
>>>>>> then making sure to pass format=raw shouldn't be too much.
>>>>> Because at that point the solution is way over-engineered.
>>>>>
>>>>> Probing checks should be in the QEMU command-line code, not sprinkled
>>>>> across the codebase and even at run-time.
>>>>>
>>>>> Isn't Markus approach much simpler and cleaner?
>>>> I don't think so. My code isn't "sprinkled across the codebase", it has
>>>> the checks right where the problem arises, in the raw block driver.
>>> Actually, that's not where the problem is.
>>>
>>> The guest writing funny things to its disks is *not* the problem, it's
>>> perfectly normal operation.  Probing untrusted images is the problem!
>> I think we all know it, but let's clearly state the real problem once
>> again to remind us what we're trying to do on a high level. The security
>> problem we're talking about is that some image formats can contain file
>> names, and the content of these files can be exposed (in some cases even
>> r/w) to the guest.
> That's too specific.  There are other cases:
>
> Imagine you are paying for a 10 GB disk KVM virtual server with a
> hosting provider.
>
> The hosting provider did not configure KVM correctly and they use
> probing.  You can write a qcow2 header with a 100 GB disk size to the
> raw disk and reboot the VM.
>
> Suddenly your guest has access to 100 GB of disk space even though you
> are only paying for 10 GB!
>
>> Therefore not probing, but using untrusted images is the root problem -
>> but it's not one that we can fix. If the users downloads (or otherwise
>> obtains) an untrusted image with a bad backing file name, and of course
>> the correct file extension, we don't have a chance to protect them.
> This is driving the discussion into the wrong direction.  You are right
> that QEMU cannot solve that but it's also a well-known problem that
> OpenStack and co solved a long time ago.
>
>> What we can do something about is a special case of this, where a
>> malicious guest tries to turn a (trusted) image into something that
>> isn't trustworthy any more. In my book, the bad action is modifying the
>> image so that it looks like a different format with a bad header, it's
>> not treating an image file like what it looks like. If you disable
>> probing in qemu, you still allow qemu guests to write bad headers that
>> can fool other tools.
>>
>> The problem really isn't all the non-raw formats that have a header
>> which can contain file names. The problem is with raw which doesn't have
>> a header and allows the guest to turn its image file into any other file
>> type it wants and can potentially trick the user. raw is really a bad
>> format (or actually, it's a non-format), and if I could choose I
>> wouldn't support it, or at least only r/o. We should instead have used a
>> format with a header and then the raw file shifted by some offset. It's
>> too late for that, though.
>>
>> But coming back to your specific statement: Probing images isn't the
>> problem. The guest writing funny things isn't the problem either. Using
>> a format that can't reliably represent those funny things is the
>> problem. And if the format can't reliably represent something, it should
>> return an error when you try to do it anyway. If you need the feature of
>> storing funny things, use a format that supports it reliably. (And
>> calling explicit format=raw reliable isn't quite right either, but it's
>> a compromise I'd be willing to make.)
> I don't follow this logic.  You end up with "raw is bad because it
> doesn't have a header".  That doesn't get us anywhere.

It's true anyway. Just because an argument is not constructive does not 
mean it's bad, as long as it's true.

> (There are other formats that can also be confused.  For example, you
> can create a qcow2 file that is also a vpc file because vpc allows files
> that have only a footer.)

Then vpc is bad, too.

> Fact remains that you must not probe untrusted guest disk images.

Nobody wants to completely abolish probing. Markus only wants to use the 
extension as a second source of information, basically.

> The argument that there might not be a traditional filename doesn't make
> sense to me.  When there is no filename the command-line is already
> sufficiently complex and usage is fancy enough that probing adds no
> convenience, the user can just specify the format.

Apropos "the cloud provider did not configure KVM correctly"?

> Anyway, does this sound reasonable:
>
> In QEMU 3.0, require the format= option for -drive.  Keep probing the
> way it is for non-drive options because they are used for convenience by
> local users.

To me it does, yes.

Max
Markus Armbruster Nov. 4, 2014, 9:34 a.m. UTC | #44
Max Reitz <mreitz@redhat.com> writes:

> On 2014-11-03 at 09:54, Markus Armbruster wrote:
>> Kevin Wolf <kwolf@redhat.com> writes:
>>
>>> Am 31.10.2014 um 12:24 hat Stefan Hajnoczi geschrieben:
>>>> On Thu, Oct 30, 2014 at 10:36:35AM +0100, Kevin Wolf wrote:
>>>>> Am 30.10.2014 um 10:27 hat Stefan Hajnoczi geschrieben:
>>>>>> The guest may legitimately use raw devices that contain image format
>>>>>> data.  Imagine tools similar to libguestfs.
>>>>>>
>>>>>> It's perfectly okay for them to lay out image format data onto a raw
>>>>>> device.
>>>>>>
>>>>>> Probing is the problem, not putting image format data onto a raw device.
>>>>> Agreed, that's why any restrictions only apply when probing was used to
>>>>> detect a raw image. If you want to do anything exotic like storing a
>>>>> qcow2 image for nested virt on a disk that is a raw image in the host,
>>>>> then making sure to pass format=raw shouldn't be too much.
>>>> Because at that point the solution is way over-engineered.
>>>>
>>>> Probing checks should be in the QEMU command-line code, not sprinkled
>>>> across the codebase and even at run-time.
>>>>
>>>> Isn't Markus approach much simpler and cleaner?
>>> I don't think so. My code isn't "sprinkled across the codebase", it has
>>> the checks right where the problem arises, in the raw block driver.
>> Actually, that's not where the problem is.
>>
>> The guest writing funny things to its disks is *not* the problem, it's
>> perfectly normal operation.  Probing untrusted images is the problem!
>
> I don't think making your hard disk a qcow2 image is a perfectly
> normal operation, but perhaps that's just me.
>
> I still think writing to sector 0 is not a perfectly normal operation,
> but rarely ever happens (at least for x86[1]), and if it does, there's
> not that much variety to what is written there.
>
> [1] I'm open for arguments about how on other platforms there's not
> necessarily a boot loader in sector 0 of an image and the host is
> therefore actually completely free to write whatever it likes and
> there's no telling of what it will be.
>
>>> It's with Markus's approach that we'll have to have code in many
>>> different places as I showed. Its fundamental assumption that there is
>>> always a filename string and the filename isn't passed in some QDict
>>> option is simply wrong. Specifying the image is driver-dependent and
>>> therefore you'd have to add functionality to each driver in order to get
>>> the filename extension (or the information that there isn't anything
>>> close enough to a filename).
>> The RFC patch is incomplete.  Can't say how much code recording the
>> filename correctly will take until I've done it.
>>
>> I suspect the lack of an image filename already leads to rotten error
>> messages in places.
>>
>>> The only argument brought up so far that I can reasonably buy is that
>>> in the unlikely case of the restrictions applying it may be surprising
>>> for the user to see requests failing. To address this, we could print
>>> a warning when an image is opened in the "restricted raw" mode. This
>>> way the user knows what's going on,
>> Improves the virtual hardware from "crippled" to "openly crippled", and
>> my opinion on it from "I hate it" to "I hate it slightly less" :)
>>
>>>                                      and at the same time we still
>>> effectively protect them instead of only printing a warning without real
>>> protection.
>> This isn't real protection, either.
>>
>> If the user runs with format=raw, the guest can write whatever it likes,
>> and that's a feature.  If the user forgets format=raw just once, he's
>> p0wned.  That's because your "protection" does not address the real
>> problem (probing of untrusted images),
>
> Well, that's your definition of the real problem. I think making a raw
> image effectively a non-raw image (according to what file(1) says) is
> a real problem as well. Detecting an image as qcow2 when file(1) says
> it is indeed qcow2 is not that much of a problem, in my very humble
> opinion.
>
> Probably the real real problem is that qemu supports raw images at
> all, in contrast to other VM software (which support it for CD and
> floppy at the most). Hey, let's just add a flat mode to qcow2 and get
> rid of raw altogether! (that was a joke)
>
>> but tries to prevent it by
>> refusing to created "bad" untrusted images, but can do so only in
>> special configurations, and not at all for images written by something
>> else.
>>
>> Unlike your proposal, mine does attack the real problem, and thus *can*
>> provide real protection.
>
> It's what you define to be the real problem. I for one think the
> definition of what a raw image is is the real problem. When does a raw
> image cease to be a raw image? My definition would be that it ceases
> to be raw when sufficiently complex probing detects another
> format. Your definition seems to be that it's always raw when the user
> wants it to be raw, and sometimes the user doesn't know that he/she
> wants it to be raw (because of legacy issues).
>
> Also, I think it's perfectly valid to prevent a guest from writing
> things to the first sector which this sufficiently complex probing(TM)
> detects to be some image format. You don't think so. That's why you
> say probing is the real problem, whereas I say allowing the guest to
> write a fake image header is the real problem (not the least because
> it may fool other tools as well).
>
> So for me, your patch only mitigates the problem but does not fix
> it. Mitigating it is better than doing nothing, however, which is why
> I'm fine with it.

You're right, the *root* problem is the raw vs. other image formats
ambiguity.

This is an instance of the general raw data file vs. structured data
file ambiguity.  Whether a file contains raw data or not only the user
can know.  Users generally keep track of what's what by giving their
files suitable names.

The root problem becomes a QEMU security problem only because (a) we
support both raw and image formats, (b) we second-guess the user by
default, and (c) our guess can be controlled by a malicious guest.

We can try to fix it by attacking one or more of the three conditions:

(a) Outlaw raw images.  Nobody is seriously suggesting that.

(b) Make the second-guessing opt-in rather than opt-out.  This is my
approach.

(c) Prevent the guest from affecting the second-guessing.  This is
Kevin's approach.

I'll discuss these in more detail in a separate message.

>> Not immediately because we want to avoid
>> aprupt change, but eventually.  Here's how:
>>
>> 1. Initially, we warn on insecure usage.  This doesn't protect users,
>> but it guides them towards protecting themselves.
>>
>> 2. Eventually, we make the warning an error.  This *does* protect users,
>> regardless of how they use or have used QEMU, *except* when they
>> explicitly ask for insecure probing with format=insecure-probe (assuming
>> we provide that option).
Markus Armbruster Nov. 4, 2014, 9:36 a.m. UTC | #45
Kevin Wolf <kwolf@redhat.com> writes:

> Am 31.10.2014 um 23:45 hat Eric Blake geschrieben:
>> On 10/30/2014 06:49 AM, Markus Armbruster wrote:
>> 
>> > You either have to prevent *any* writing of the first 2048 bytes (the
>> > part that can be examined by a bdrv_probe() method, or your have to
>> > prevent writing anything a probe recognizes, or the user has to specify
>> > the format explicitly.
>> > 
>> > If you do the former, you're way outside the realm of "theoretical".
>> > 
>> > If you do the latter, the degree of "theoreticalness" depends on the
>> > union of the patterns you prevent.  Issues:
>> > 
>> > 1. Anthony's method of checking a few known signatures is fragile.  The
>> > only obviously robust way to test "is probing going to find something
>> > other than 'raw'?" is running the probes.  Feasible.
>> > 
>> > 2. Whether the union of patterns qualifies as "theoretical" for all our
>> > targets is not obvious, and whether it'll remain "theoretical" for all
>> > future formats and target machines even less.
>> 
>> This one scares me.  The proof of concept patch you posted tests whether
>> a write to the first sector would result in the sector matching a
>> _currently known probe_ for the file formats that were compiled in at
>> configure time to the currently running binary.  But this is NOT the set
>> of all possible binary formats that may be introduced in the future.  By
>> extrapolation, if we pursue write blocking, then the only SAFE way to is
>> to reject ALL writes to the first sector, because we can't know which
>> writes will match some theoretical future probe - but by the time you
>> get to that point, then we no longer match bare metal (rejecting ALL
>> writes to the first sector is ridiculous).
>
> There is no absolute security without forbidding raw. Who says that the
> next format doesn't have its magic in sector 42?

Correct.

> You are right that if an attacker knows which new format supporting
> backing files we'll have in the next version, and the user uses probed
> raw despite the warning (which means they are not using libvirt), the
> attacker can write the header of the new image format now and hope that
> the user leaves it alone until the update happens at some point in the
> future. Then the malicious guest can access that one file, but not
> obtain access to the next one (because the new checks are in place
> then).
>
> I don't think this is a relevant attack vector. It's probably much
> easier to get the user to run an untrusted image than converting a
> trusted image into a time bomb and waiting potentially for months or
> years for it to explode.

The threat from using untrusted images with embedded filenames is real.

Regardless, I wouldn't discount the (different) threat from guests
exploiting "future" probes so cavalierly.  If your host is running a
stable distro, its future is easy to know.  Even the point of time when
it gets upgraded can be predictable.

In general, I recommend against leaving security holes in place just
because you think nobody could be bothered to exploit them :)

Security is not an absolute that cannot be trumped by any other
consideration.  I'm perfectly happy to consider usability and
compatibility issues, as well as implementation complexity.  I'm much
less willing to accept a "come on, nobody is going to try *that*"
argument.
Markus Armbruster Nov. 4, 2014, 9:39 a.m. UTC | #46
Kevin Wolf <kwolf@redhat.com> writes:

> Am 30.10.2014 um 13:49 hat Markus Armbruster geschrieben:
>> Kevin Wolf <kwolf@redhat.com> writes:
>> 
>> > Am 29.10.2014 um 14:54 hat Markus Armbruster geschrieben:
>> >> Kevin Wolf <kwolf@redhat.com> writes:
>> >> > Instead, let me try once more to sell my old proposal [1] from the
>> >> > thread you mentioned:
>> >> >
>> >> >> What if we let the raw driver know that it was probed and then it
>> >> >> enables a check that returns -EIO for any write on the first 2k if that
>> >> >> write would make the image look like a different format?
>> >> >
>> >> > Attacks the problem where it arises instead of trying to detect the
>> >> > outcome of it, and works in whatever way it is nested in the BDS graph
>> >> > and whatever way is used to address the image file.
>> >> >
>> >> > Kevin
>> >> >
>> >> > [1]
>> >> > https://lists.nongnu.org/archive/html/qemu-devel/2014-08/msg02548.html
>> >> 
>> >> Arbitrarily breaking the virtual hardware that way is incompatible, too.
>> >> It's a bigger no-no for me than changing user interface corner cases or
>> >> deciding what to do with a file based on its file name extension.
>> >
>> > It is arbitrarily breaking theoretical cases, while your patch is less
>> > arbitrarily breaking common cases. I strongly prefer the former.
>> 
>> I challenge "theoretical" as well as "common".
>> 
>> For "theoretical", see below.
>> 
>> As to "common": are unrecognizable image names really common?  I doubt
>> it.  If I'm wrong, I expect users to complain about my warning, and then
>> we can reconsider.
>
> As I said in my other mail, I remember seeing BZs with command lines
> that don't have a file extension. Block devices are affected anyway.
>
> Your RFC also misses the extension .raw that I personally use
> occasionally, so I would get the warning even though the image name
> isn't "unrecognizable" at all. And other people probably have other
> extensions in use that we don't know of.
>
>> > Nobody will ever want to write a qcow2 header into their boot sector for
>> > just running a guest:
>> >
>> > 0:   51                      push   %cx
>> > 1:   46                      inc    %si
>> > 2:   49                      dec    %cx
>> > 3:   fb                      sti
>> >
>> > This code doesn't make any sense. It's similar for other formats. And if
>> > they really have some odd reason to do that, the fix is easy: Either
>> > don't use raw, or pass an explicit format=raw.
>> 
>> A disk needn't start with a PC-style boot sector.  Even on a PC.  Not
>> every disk needs to be bootable, or even partitioned.  Databases exist
>> that like to work on raw devices.  Giving them /dev/sdb instead of a
>> whole-disk partition /dev/sdb1 may not be the smartest thing to do, but
>> it *works* with real hardware, so why not with virtual hardware?
>
> Oh, it does. Just use a format that can represent this reliably (and as
> a compromise, we're going to declare explicitly requested raw as such -
> again, see my other mail for details).

This is "opt in safety".  I dislike that for the same reason I dislike
"opt in security".

Insecure: we risk guest escaping isolation.

Unsafe: we risk virtual hardware breakage.

>> You either have to prevent *any* writing of the first 2048 bytes (the
>> part that can be examined by a bdrv_probe() method, or your have to
>> prevent writing anything a probe recognizes, or the user has to specify
>> the format explicitly.
>> 
>> If you do the former, you're way outside the realm of "theoretical".
>
> Right, that's not an option.
>
>> If you do the latter, the degree of "theoreticalness" depends on the
>> union of the patterns you prevent.  Issues:
>> 
>> 1. Anthony's method of checking a few known signatures is fragile.  The
>> only obviously robust way to test "is probing going to find something
>> other than 'raw'?" is running the probes.  Feasible.
>
> Yes, this is what I've been talking about.
>
>> 2. Whether the union of patterns qualifies as "theoretical" for all our
>> targets is not obvious, and whether it'll remain "theoretical" for all
>> future formats and target machines even less.
>
> At least we don't have examples of it happening yet. And even if it
> happens to become not theoretical at some point, the only thing that
> happens is that they need to add format=raw - something that we already
> know is sure to happen with your approach.
>
> Once we get to know of such cases, we can probably even resolve them by
> improving the probing function of the problematic format.

Your proposal "improves" things from "insecure by default" to "now less
insecure, but additionally a bit unsafe".

Your proposed remedy for "unsafe" seems to be "if it breaks, you get to
keep the pieces, but you may ask us to narrow the conditions for
breakage so it wouldn't have broken for you if you had imported the
changed version from the future" ;)

>> 3. A write access that works just fine in one QEMU version can be
>> rejected by another version that recognizes more formats.  Or probes the
>> same formats differently.
>
> True. We're only sure to protect this binary, and we have good chances
> of protecting some other qemu versions and some other tools, too.
>
> If the attacker has a time machine and knows what the next installed
> qemu version will recognize (or if they exploit a file format that is
> not a disk image, with a program other than qemu), we don't fully
> protect the user. We'd have to completely forbid raw for that.
>
>> 4. Rejecting a write fails *late*, and looks like hardware failure to
>> the guest.  With imperfect guest software, this risks data corruption.
>
> Okay. So your common case is that you have a badly written database that
> gets the full unpartitioned disk assigned and doesn't start to write at
> the first block, but randomly, and when it finally tries to use block 0,
> it gets deadly confused by the I/O error so that it will break the data
> that it has already stored elsewhere.
>
> Yeah right.
>
> And you probably deserved it then for using such software. It would have
> happened anyway sooner or later.

Since you fail writes to sector 0 only when a "bad" pattern is written,
your hypothesis that any software I should be using surely writes sector
0 early is irrelevant.

"Common" doesn't matter either (and I never claimed it was common).  If
I deploy software that uses whole disks, I'd consider disks that can
fail writes to sector 0, even though the actual hardware is in good
working order, defective, and I wouldn't buy them[*].  Regardless of how
common the failure was.  Fortunately, no such disks exist.  Now you're
proposing to create some, with a special "do not break" switch (factory
default "do break").  Your excuse for doing so seems to be:

(a) If I'm doing something *that* outlandish, I should know which disks
to buy (i.e. anything but virtual raw images), or where to find the "do
not break" switch (i.e. use format=raw).

(b) If the breakage causes me harm, I deserve it, because my software
sucks.

For me, (a) is debatable, but (b) is simply bad attitude.  Let's not go
there.


[*] In fact, I'd consider them defective and wouldn't buy them no matter
what software I plan to use.
Markus Armbruster Nov. 4, 2014, 9:42 a.m. UTC | #47
Stefan Hajnoczi <stefanha@redhat.com> writes:

> On Mon, Nov 03, 2014 at 11:25:10AM +0100, Kevin Wolf wrote:
>> Am 03.11.2014 um 09:54 hat Markus Armbruster geschrieben:
>> > Kevin Wolf <kwolf@redhat.com> writes:
>> > 
>> > > Am 31.10.2014 um 12:24 hat Stefan Hajnoczi geschrieben:
>> > >> On Thu, Oct 30, 2014 at 10:36:35AM +0100, Kevin Wolf wrote:
>> > >> > Am 30.10.2014 um 10:27 hat Stefan Hajnoczi geschrieben:
>> > >> > > The guest may legitimately use raw devices that contain image format
>> > >> > > data.  Imagine tools similar to libguestfs.
>> > >> > > 
>> > >> > > It's perfectly okay for them to lay out image format data onto a raw
>> > >> > > device.
>> > >> > > 
>> > >> > > Probing is the problem, not putting image format data onto
>> > >> > > a raw device.
>> > >> > 
>> > >> > Agreed, that's why any restrictions only apply when probing was used to
>> > >> > detect a raw image. If you want to do anything exotic like storing a
>> > >> > qcow2 image for nested virt on a disk that is a raw image in the host,
>> > >> > then making sure to pass format=raw shouldn't be too much.
>> > >> 
>> > >> Because at that point the solution is way over-engineered.
>> > >> 
>> > >> Probing checks should be in the QEMU command-line code, not sprinkled
>> > >> across the codebase and even at run-time.
>> > >> 
>> > >> Isn't Markus approach much simpler and cleaner?
>> > >
>> > > I don't think so. My code isn't "sprinkled across the codebase", it has
>> > > the checks right where the problem arises, in the raw block driver.
>> > 
>> > Actually, that's not where the problem is.
>> > 
>> > The guest writing funny things to its disks is *not* the problem, it's
>> > perfectly normal operation.  Probing untrusted images is the problem!
>> 
>> I think we all know it, but let's clearly state the real problem once
>> again to remind us what we're trying to do on a high level. The security
>> problem we're talking about is that some image formats can contain file
>> names, and the content of these files can be exposed (in some cases even
>> r/w) to the guest.
>
> That's too specific.  There are other cases:
>
> Imagine you are paying for a 10 GB disk KVM virtual server with a
> hosting provider.
>
> The hosting provider did not configure KVM correctly and they use
> probing.  You can write a qcow2 header with a 100 GB disk size to the
> raw disk and reboot the VM.
>
> Suddenly your guest has access to 100 GB of disk space even though you
> are only paying for 10 GB!

Yes.

>> Therefore not probing, but using untrusted images is the root problem -
>> but it's not one that we can fix. If the users downloads (or otherwise
>> obtains) an untrusted image with a bad backing file name, and of course
>> the correct file extension, we don't have a chance to protect them.
>
> This is driving the discussion into the wrong direction.  You are right
> that QEMU cannot solve that but it's also a well-known problem that
> OpenStack and co solved a long time ago.

I agree.  Conflating different threat vectors isn't going to help.
I think we got two here:

1. Trusting data from untrusted sources

   As Stefan said, this is a very general problem we can't solve in
   QEMU.  But it's also a well-known problem.

2. Letting the guest turn untrusted data into trusted data

   Unlike 1., this is a very specific problem, which *can* be solved in
   QEMU, at some cost.

   It's a guest isolation failure.  Since isolating guests is one of the
   points of virtualization, I believe users can reasonably expect
   working isolation.  Thus, isolation failure is an unexpected rather
   than a well-known problem.

Working definition of "trusted image" in this context: you trust the
image meta-data, but not the (guest-controlled) image contents.

Example: in a trusted QCOW2 image, you trust everything but the contents
of the actual data blocks.

Example: in a "trusted" raw image, you can trust nothing but file name
and attributes.

>> What we can do something about is a special case of this, where a
>> malicious guest tries to turn a (trusted) image into something that
>> isn't trustworthy any more. In my book, the bad action is modifying the
>> image so that it looks like a different format with a bad header, it's
>> not treating an image file like what it looks like.

I acknowledge your point of view that modifying the image in certain
ways is a bad action.  I just don't share it.  In my view, the guest can
legitimately write whatever it pleases.  Making it work with proper
isolation is our job.

>>                                                     If you disable
>> probing in qemu, you still allow qemu guests to write bad headers that
>> can fool other tools.

Yes.  Both your and my patch allow that.

Your patch prevents it sometimes, for some patterns.  When it does, it
protects all future users of the raw image from this particular bad
pattern.

My patch doesn't prevent it.  Instead, it immunizes QEMU against writes.
Only protects QEMU.  But then only QEMU has an untrusted guest inside.

>> The problem really isn't all the non-raw formats that have a header
>> which can contain file names. The problem is with raw which doesn't have
>> a header and allows the guest to turn its image file into any other file
>> type it wants and can potentially trick the user. raw is really a bad
>> format (or actually, it's a non-format), and if I could choose I
>> wouldn't support it, or at least only r/o. We should instead have used a
>> format with a header and then the raw file shifted by some offset. It's
>> too late for that, though.

Too late indeed.

>> But coming back to your specific statement: Probing images isn't the
>> problem. The guest writing funny things isn't the problem either. Using
>> a format that can't reliably represent those funny things is the
>> problem. And if the format can't reliably represent something, it should
>> return an error when you try to do it anyway. If you need the feature of
>> storing funny things, use a format that supports it reliably. (And
>> calling explicit format=raw reliable isn't quite right either, but it's
>> a compromise I'd be willing to make.)
>
> I don't follow this logic.  You end up with "raw is bad because it
> doesn't have a header".  That doesn't get us anywhere.
>
> (There are other formats that can also be confused.  For example, you
> can create a qcow2 file that is also a vpc file because vpc allows files
> that have only a footer.)

That way is madness.

> Fact remains that you must not probe untrusted guest disk images.
>
> The argument that there might not be a traditional filename doesn't make
> sense to me.  When there is no filename the command-line is already
> sufficiently complex and usage is fancy enough that probing adds no
> convenience, the user can just specify the format.

Yes.  Unknown file name can and should fail gracefully with my solution:
ask the user to specify a format.  Code to make the file name known in
more cases may be nice to have anyway.

> Anyway, does this sound reasonable:
>
> In QEMU 3.0, require the format= option for -drive.  Keep probing the
> way it is for non-drive options because they are used for convenience by
> local users.

We've discussed the problem in some depth, and several solutions have
been proposed.  I'll try to describe the solution space in a separate
message.
Kevin Wolf Nov. 4, 2014, 10:11 a.m. UTC | #48
Am 03.11.2014 um 16:05 hat Stefan Hajnoczi geschrieben:
> On Mon, Nov 03, 2014 at 11:25:10AM +0100, Kevin Wolf wrote:
> > Am 03.11.2014 um 09:54 hat Markus Armbruster geschrieben:
> > > Kevin Wolf <kwolf@redhat.com> writes:
> > > 
> > > > Am 31.10.2014 um 12:24 hat Stefan Hajnoczi geschrieben:
> > > >> On Thu, Oct 30, 2014 at 10:36:35AM +0100, Kevin Wolf wrote:
> > > >> > Am 30.10.2014 um 10:27 hat Stefan Hajnoczi geschrieben:
> > > >> > > The guest may legitimately use raw devices that contain image format
> > > >> > > data.  Imagine tools similar to libguestfs.
> > > >> > > 
> > > >> > > It's perfectly okay for them to lay out image format data onto a raw
> > > >> > > device.
> > > >> > > 
> > > >> > > Probing is the problem, not putting image format data onto a raw device.
> > > >> > 
> > > >> > Agreed, that's why any restrictions only apply when probing was used to
> > > >> > detect a raw image. If you want to do anything exotic like storing a
> > > >> > qcow2 image for nested virt on a disk that is a raw image in the host,
> > > >> > then making sure to pass format=raw shouldn't be too much.
> > > >> 
> > > >> Because at that point the solution is way over-engineered.
> > > >> 
> > > >> Probing checks should be in the QEMU command-line code, not sprinkled
> > > >> across the codebase and even at run-time.
> > > >> 
> > > >> Isn't Markus approach much simpler and cleaner?
> > > >
> > > > I don't think so. My code isn't "sprinkled across the codebase", it has
> > > > the checks right where the problem arises, in the raw block driver.
> > > 
> > > Actually, that's not where the problem is.
> > > 
> > > The guest writing funny things to its disks is *not* the problem, it's
> > > perfectly normal operation.  Probing untrusted images is the problem!
> > 
> > I think we all know it, but let's clearly state the real problem once
> > again to remind us what we're trying to do on a high level. The security
> > problem we're talking about is that some image formats can contain file
> > names, and the content of these files can be exposed (in some cases even
> > r/w) to the guest.
> 
> That's too specific.  There are other cases:
> 
> Imagine you are paying for a 10 GB disk KVM virtual server with a
> hosting provider.
> 
> The hosting provider did not configure KVM correctly and they use
> probing.  You can write a qcow2 header with a 100 GB disk size to the
> raw disk and reboot the VM.
> 
> Suddenly your guest has access to 100 GB of disk space even though you
> are only paying for 10 GB!

Good point. It doesn't change the conclusions, because we're still
talking about a trusted image being changed in a way that it isn't
trustworthy any mroe. But it's good to keep in mind that there are more
cases than the classical backing file thing.

> > Therefore not probing, but using untrusted images is the root problem -
> > but it's not one that we can fix. If the users downloads (or otherwise
> > obtains) an untrusted image with a bad backing file name, and of course
> > the correct file extension, we don't have a chance to protect them.
> 
> This is driving the discussion into the wrong direction.  You are right
> that QEMU cannot solve that but it's also a well-known problem that
> OpenStack and co solved a long time ago.

It wasn't driving in the wrong direction, just an attempt to start at
the root, even if that is outside of our realm.

Anyway, OpenStack may have solved something, but then, libvirt also
always specifies the format. People affected by the problem don't use
libvirt or OpenStack.

> > What we can do something about is a special case of this, where a
> > malicious guest tries to turn a (trusted) image into something that
> > isn't trustworthy any more. In my book, the bad action is modifying the
> > image so that it looks like a different format with a bad header, it's
> > not treating an image file like what it looks like. If you disable
> > probing in qemu, you still allow qemu guests to write bad headers that
> > can fool other tools.
> > 
> > The problem really isn't all the non-raw formats that have a header
> > which can contain file names. The problem is with raw which doesn't have
> > a header and allows the guest to turn its image file into any other file
> > type it wants and can potentially trick the user. raw is really a bad
> > format (or actually, it's a non-format), and if I could choose I
> > wouldn't support it, or at least only r/o. We should instead have used a
> > format with a header and then the raw file shifted by some offset. It's
> > too late for that, though.
> > 
> > But coming back to your specific statement: Probing images isn't the
> > problem. The guest writing funny things isn't the problem either. Using
> > a format that can't reliably represent those funny things is the
> > problem. And if the format can't reliably represent something, it should
> > return an error when you try to do it anyway. If you need the feature of
> > storing funny things, use a format that supports it reliably. (And
> > calling explicit format=raw reliable isn't quite right either, but it's
> > a compromise I'd be willing to make.)
> 
> I don't follow this logic.  You end up with "raw is bad because it
> doesn't have a header".  That doesn't get us anywhere.

I think it's still important to acknowledge. It tells us that the
root of the problem is with raw.

> (There are other formats that can also be confused.  For example, you
> can create a qcow2 file that is also a vpc file because vpc allows files
> that have only a footer.)

Yes, VHD Fixed is a problematic format, too. Fortunately we've never
probed for it.

> Fact remains that you must not probe untrusted guest disk images.

It's great that you know the facts, but where is the reasoning behind
it? This doesn't follow from any of the above.

You must not _use_ untrusted images, no matter whether they are probed
or not. But the problem we're discussing here isn't about untrusted
images, it's about trusted images that get manipulated.

> The argument that there might not be a traditional filename doesn't make
> sense to me.  When there is no filename the command-line is already
> sufficiently complex and usage is fancy enough that probing adds no
> convenience, the user can just specify the format.

-hda nbd://localhost
-drive file=nbd://localhost,format=raw

Almost double the length, and I don't see anything fancy in the first
line.

> Anyway, does this sound reasonable:
> 
> In QEMU 3.0, require the format= option for -drive.  Keep probing the
> way it is for non-drive options because they are used for convenience by
> local users.

And being hacked while using -hda is better in which way?

Kevin
Kevin Wolf Nov. 4, 2014, 10:32 a.m. UTC | #49
Am 04.11.2014 um 10:36 hat Markus Armbruster geschrieben:
> Kevin Wolf <kwolf@redhat.com> writes:
> 
> > Am 31.10.2014 um 23:45 hat Eric Blake geschrieben:
> >> On 10/30/2014 06:49 AM, Markus Armbruster wrote:
> >> 
> >> > You either have to prevent *any* writing of the first 2048 bytes (the
> >> > part that can be examined by a bdrv_probe() method, or your have to
> >> > prevent writing anything a probe recognizes, or the user has to specify
> >> > the format explicitly.
> >> > 
> >> > If you do the former, you're way outside the realm of "theoretical".
> >> > 
> >> > If you do the latter, the degree of "theoreticalness" depends on the
> >> > union of the patterns you prevent.  Issues:
> >> > 
> >> > 1. Anthony's method of checking a few known signatures is fragile.  The
> >> > only obviously robust way to test "is probing going to find something
> >> > other than 'raw'?" is running the probes.  Feasible.
> >> > 
> >> > 2. Whether the union of patterns qualifies as "theoretical" for all our
> >> > targets is not obvious, and whether it'll remain "theoretical" for all
> >> > future formats and target machines even less.
> >> 
> >> This one scares me.  The proof of concept patch you posted tests whether
> >> a write to the first sector would result in the sector matching a
> >> _currently known probe_ for the file formats that were compiled in at
> >> configure time to the currently running binary.  But this is NOT the set
> >> of all possible binary formats that may be introduced in the future.  By
> >> extrapolation, if we pursue write blocking, then the only SAFE way to is
> >> to reject ALL writes to the first sector, because we can't know which
> >> writes will match some theoretical future probe - but by the time you
> >> get to that point, then we no longer match bare metal (rejecting ALL
> >> writes to the first sector is ridiculous).
> >
> > There is no absolute security without forbidding raw. Who says that the
> > next format doesn't have its magic in sector 42?
> 
> Correct.
> 
> > You are right that if an attacker knows which new format supporting
> > backing files we'll have in the next version, and the user uses probed
> > raw despite the warning (which means they are not using libvirt), the
> > attacker can write the header of the new image format now and hope that
> > the user leaves it alone until the update happens at some point in the
> > future. Then the malicious guest can access that one file, but not
> > obtain access to the next one (because the new checks are in place
> > then).
> >
> > I don't think this is a relevant attack vector. It's probably much
> > easier to get the user to run an untrusted image than converting a
> > trusted image into a time bomb and waiting potentially for months or
> > years for it to explode.
> 
> The threat from using untrusted images with embedded filenames is real.
> 
> Regardless, I wouldn't discount the (different) threat from guests
> exploiting "future" probes so cavalierly.  If your host is running a
> stable distro, its future is easy to know.  Even the point of time when
> it gets upgraded can be predictable.
> 
> In general, I recommend against leaving security holes in place just
> because you think nobody could be bothered to exploit them :)
> 
> Security is not an absolute that cannot be trumped by any other
> consideration.  I'm perfectly happy to consider usability and
> compatibility issues, as well as implementation complexity.  I'm much
> less willing to accept a "come on, nobody is going to try *that*"
> argument.

No, I didn't mean to make a "come on, nobody is going to try *that*"
argument.

I just can think of more attacks that we won't avoid with either
approach. For example, what if another program on the host is
exploitable and you just need a file with the right content to attack
it? A raw image gives you what you need. Nobody is going to try *that*
then, if you're content with allowing this?

The only full solution is forbidding raw. We have already agreed that
this isn't an option. Now we have to draw a line somewhere.

Kevin
Stefan Hajnoczi Nov. 4, 2014, 3:25 p.m. UTC | #50
On Tue, Nov 04, 2014 at 11:11:33AM +0100, Kevin Wolf wrote:
> Am 03.11.2014 um 16:05 hat Stefan Hajnoczi geschrieben:
> > The argument that there might not be a traditional filename doesn't make
> > sense to me.  When there is no filename the command-line is already
> > sufficiently complex and usage is fancy enough that probing adds no
> > convenience, the user can just specify the format.
> 
> -hda nbd://localhost
> -drive file=nbd://localhost,format=raw
> 
> Almost double the length, and I don't see anything fancy in the first
> line.
> 
> > Anyway, does this sound reasonable:
> > 
> > In QEMU 3.0, require the format= option for -drive.  Keep probing the
> > way it is for non-drive options because they are used for convenience by
> > local users.
> 
> And being hacked while using -hda is better in which way?

Markus is proposing that we look at the filename extension.  In that
case QEMU cannot be tricked by the contents of a raw image.

That makes -hda perfectly safe although there are cases where QEMU
doesn't know what to do and requires format=.

I do worry that changing QEMU's probing behavior drastically can lead to
consistencies where libvirt does its own probing :(.  Haven't thought
through the bug scenarios but that could be a security problem in
itself.

Stefan
Kevin Wolf Nov. 4, 2014, 3:37 p.m. UTC | #51
Am 04.11.2014 um 16:25 hat Stefan Hajnoczi geschrieben:
> On Tue, Nov 04, 2014 at 11:11:33AM +0100, Kevin Wolf wrote:
> > Am 03.11.2014 um 16:05 hat Stefan Hajnoczi geschrieben:
> > > The argument that there might not be a traditional filename doesn't make
> > > sense to me.  When there is no filename the command-line is already
> > > sufficiently complex and usage is fancy enough that probing adds no
> > > convenience, the user can just specify the format.
> > 
> > -hda nbd://localhost
> > -drive file=nbd://localhost,format=raw
> > 
> > Almost double the length, and I don't see anything fancy in the first
> > line.
> > 
> > > Anyway, does this sound reasonable:
> > > 
> > > In QEMU 3.0, require the format= option for -drive.  Keep probing the
> > > way it is for non-drive options because they are used for convenience by
> > > local users.
> > 
> > And being hacked while using -hda is better in which way?
> 
> Markus is proposing that we look at the filename extension.  In that
> case QEMU cannot be tricked by the contents of a raw image.
> 
> That makes -hda perfectly safe although there are cases where QEMU
> doesn't know what to do and requires format=.

Wait, by "keep probing the way it is" you mean implementing one of the
other proposals? So you're only suggesting being stricter on -drive as
an additional measure?

> I do worry that changing QEMU's probing behavior drastically can lead to
> consistencies where libvirt does its own probing :(.  Haven't thought
> through the bug scenarios but that could be a security problem in
> itself.

Hm... In which cases does libvirt probe the image format? And is it even
consistent with qemu today?

If you can get libvirt to explicitly pass the wrong format=... option
because it did its own probing, we have a problem no matter what we
change in qemu.

Kevin
Jeff Cody Nov. 4, 2014, 4:09 p.m. UTC | #52
On Tue, Nov 04, 2014 at 10:39:36AM +0100, Markus Armbruster wrote:
> Kevin Wolf <kwolf@redhat.com> writes:
> 
> > Am 30.10.2014 um 13:49 hat Markus Armbruster geschrieben:
> >> Kevin Wolf <kwolf@redhat.com> writes:
> >> 
> >> > Am 29.10.2014 um 14:54 hat Markus Armbruster geschrieben:
> >> >> Kevin Wolf <kwolf@redhat.com> writes:
> >> >> > Instead, let me try once more to sell my old proposal [1] from the
> >> >> > thread you mentioned:
> >> >> >
> >> >> >> What if we let the raw driver know that it was probed and then it
> >> >> >> enables a check that returns -EIO for any write on the first 2k if that
> >> >> >> write would make the image look like a different format?
> >> >> >
> >> >> > Attacks the problem where it arises instead of trying to detect the
> >> >> > outcome of it, and works in whatever way it is nested in the BDS graph
> >> >> > and whatever way is used to address the image file.
> >> >> >
> >> >> > Kevin
> >> >> >
> >> >> > [1]
> >> >> > https://lists.nongnu.org/archive/html/qemu-devel/2014-08/msg02548.html
> >> >> 
> >> >> Arbitrarily breaking the virtual hardware that way is incompatible, too.
> >> >> It's a bigger no-no for me than changing user interface corner cases or
> >> >> deciding what to do with a file based on its file name extension.
> >> >
> >> > It is arbitrarily breaking theoretical cases, while your patch is less
> >> > arbitrarily breaking common cases. I strongly prefer the former.
> >> 
> >> I challenge "theoretical" as well as "common".
> >> 
> >> For "theoretical", see below.
> >> 
> >> As to "common": are unrecognizable image names really common?  I doubt
> >> it.  If I'm wrong, I expect users to complain about my warning, and then
> >> we can reconsider.
> >
> > As I said in my other mail, I remember seeing BZs with command lines
> > that don't have a file extension. Block devices are affected anyway.
> >
> > Your RFC also misses the extension .raw that I personally use
> > occasionally, so I would get the warning even though the image name
> > isn't "unrecognizable" at all. And other people probably have other
> > extensions in use that we don't know of.
> >
> >> > Nobody will ever want to write a qcow2 header into their boot sector for
> >> > just running a guest:
> >> >
> >> > 0:   51                      push   %cx
> >> > 1:   46                      inc    %si
> >> > 2:   49                      dec    %cx
> >> > 3:   fb                      sti
> >> >
> >> > This code doesn't make any sense. It's similar for other formats. And if
> >> > they really have some odd reason to do that, the fix is easy: Either
> >> > don't use raw, or pass an explicit format=raw.
> >> 
> >> A disk needn't start with a PC-style boot sector.  Even on a PC.  Not
> >> every disk needs to be bootable, or even partitioned.  Databases exist
> >> that like to work on raw devices.  Giving them /dev/sdb instead of a
> >> whole-disk partition /dev/sdb1 may not be the smartest thing to do, but
> >> it *works* with real hardware, so why not with virtual hardware?
> >
> > Oh, it does. Just use a format that can represent this reliably (and as
> > a compromise, we're going to declare explicitly requested raw as such -
> > again, see my other mail for details).
> 
> This is "opt in safety".  I dislike that for the same reason I dislike
> "opt in security".
> 
> Insecure: we risk guest escaping isolation.
> 
> Unsafe: we risk virtual hardware breakage.
> 
> >> You either have to prevent *any* writing of the first 2048 bytes (the
> >> part that can be examined by a bdrv_probe() method, or your have to
> >> prevent writing anything a probe recognizes, or the user has to specify
> >> the format explicitly.
> >> 
> >> If you do the former, you're way outside the realm of "theoretical".
> >
> > Right, that's not an option.
> >
> >> If you do the latter, the degree of "theoreticalness" depends on the
> >> union of the patterns you prevent.  Issues:
> >> 
> >> 1. Anthony's method of checking a few known signatures is fragile.  The
> >> only obviously robust way to test "is probing going to find something
> >> other than 'raw'?" is running the probes.  Feasible.
> >
> > Yes, this is what I've been talking about.
> >
> >> 2. Whether the union of patterns qualifies as "theoretical" for all our
> >> targets is not obvious, and whether it'll remain "theoretical" for all
> >> future formats and target machines even less.
> >
> > At least we don't have examples of it happening yet. And even if it
> > happens to become not theoretical at some point, the only thing that
> > happens is that they need to add format=raw - something that we already
> > know is sure to happen with your approach.
> >
> > Once we get to know of such cases, we can probably even resolve them by
> > improving the probing function of the problematic format.
> 
> Your proposal "improves" things from "insecure by default" to "now less
> insecure, but additionally a bit unsafe".
> 
> Your proposed remedy for "unsafe" seems to be "if it breaks, you get to
> keep the pieces, but you may ask us to narrow the conditions for
> breakage so it wouldn't have broken for you if you had imported the
> changed version from the future" ;)
> 
> >> 3. A write access that works just fine in one QEMU version can be
> >> rejected by another version that recognizes more formats.  Or probes the
> >> same formats differently.
> >
> > True. We're only sure to protect this binary, and we have good chances
> > of protecting some other qemu versions and some other tools, too.
> >
> > If the attacker has a time machine and knows what the next installed
> > qemu version will recognize (or if they exploit a file format that is
> > not a disk image, with a program other than qemu), we don't fully
> > protect the user. We'd have to completely forbid raw for that.
> >
> >> 4. Rejecting a write fails *late*, and looks like hardware failure to
> >> the guest.  With imperfect guest software, this risks data corruption.
> >
> > Okay. So your common case is that you have a badly written database that
> > gets the full unpartitioned disk assigned and doesn't start to write at
> > the first block, but randomly, and when it finally tries to use block 0,
> > it gets deadly confused by the I/O error so that it will break the data
> > that it has already stored elsewhere.
> >
> > Yeah right.
> >
> > And you probably deserved it then for using such software. It would have
> > happened anyway sooner or later.
> 
> Since you fail writes to sector 0 only when a "bad" pattern is written,
> your hypothesis that any software I should be using surely writes sector
> 0 early is irrelevant.
> 
> "Common" doesn't matter either (and I never claimed it was common).  If
> I deploy software that uses whole disks, I'd consider disks that can
> fail writes to sector 0, even though the actual hardware is in good
> working order, defective, and I wouldn't buy them[*].  Regardless of how
> common the failure was.  Fortunately, no such disks exist.  Now you're
> proposing to create some, with a special "do not break" switch (factory
> default "do break").  Your excuse for doing so seems to be:
> 
> (a) If I'm doing something *that* outlandish, I should know which disks
> to buy (i.e. anything but virtual raw images), or where to find the "do
> not break" switch (i.e. use format=raw).
> 
> (b) If the breakage causes me harm, I deserve it, because my software
> sucks.
> 
> For me, (a) is debatable, but (b) is simply bad attitude.  Let's not go
> there.
> 
> 
> [*] In fact, I'd consider them defective and wouldn't buy them no matter
> what software I plan to use.

The ultimate goal (correct me if I am wrong) of your patch is to get
rid of content probing during open, and rely solely on filename
extension.

That leaves us open to guest data corruption in the opposite way -
identifying a qcow2 image as a raw image.  If I have qcow2 image named
mydisk.img, and QEMU 'detects' it as raw, then that means the first
guest data write will corrupt the image (especially to sector 0).

So if we are not specifying the image format, and allowing QEMU to
determine the format, Kevin's "restricted-raw" approach actually seems
safer to me, and less likely to break the guest.

The other option is to just remove any sort of probing/format
autodetection (aside from qemu-img info) completely.  This may break
scripts, but switching to filename extensions as a detection mechanism
may break some scripts anyway.
Markus Armbruster Nov. 5, 2014, 7:58 a.m. UTC | #53
Kevin Wolf <kwolf@redhat.com> writes:

> Am 04.11.2014 um 10:36 hat Markus Armbruster geschrieben:
>> Kevin Wolf <kwolf@redhat.com> writes:
>> 
>> > Am 31.10.2014 um 23:45 hat Eric Blake geschrieben:
>> >> On 10/30/2014 06:49 AM, Markus Armbruster wrote:
>> >> 
>> >> > You either have to prevent *any* writing of the first 2048 bytes (the
>> >> > part that can be examined by a bdrv_probe() method, or your have to
>> >> > prevent writing anything a probe recognizes, or the user has to specify
>> >> > the format explicitly.
>> >> > 
>> >> > If you do the former, you're way outside the realm of "theoretical".
>> >> > 
>> >> > If you do the latter, the degree of "theoreticalness" depends on the
>> >> > union of the patterns you prevent.  Issues:
>> >> > 
>> >> > 1. Anthony's method of checking a few known signatures is fragile.  The
>> >> > only obviously robust way to test "is probing going to find something
>> >> > other than 'raw'?" is running the probes.  Feasible.
>> >> > 
>> >> > 2. Whether the union of patterns qualifies as "theoretical" for all our
>> >> > targets is not obvious, and whether it'll remain "theoretical" for all
>> >> > future formats and target machines even less.
>> >> 
>> >> This one scares me.  The proof of concept patch you posted tests whether
>> >> a write to the first sector would result in the sector matching a
>> >> _currently known probe_ for the file formats that were compiled in at
>> >> configure time to the currently running binary.  But this is NOT the set
>> >> of all possible binary formats that may be introduced in the future.  By
>> >> extrapolation, if we pursue write blocking, then the only SAFE way to is
>> >> to reject ALL writes to the first sector, because we can't know which
>> >> writes will match some theoretical future probe - but by the time you
>> >> get to that point, then we no longer match bare metal (rejecting ALL
>> >> writes to the first sector is ridiculous).
>> >
>> > There is no absolute security without forbidding raw. Who says that the
>> > next format doesn't have its magic in sector 42?
>> 
>> Correct.
>> 
>> > You are right that if an attacker knows which new format supporting
>> > backing files we'll have in the next version, and the user uses probed
>> > raw despite the warning (which means they are not using libvirt), the
>> > attacker can write the header of the new image format now and hope that
>> > the user leaves it alone until the update happens at some point in the
>> > future. Then the malicious guest can access that one file, but not
>> > obtain access to the next one (because the new checks are in place
>> > then).
>> >
>> > I don't think this is a relevant attack vector. It's probably much
>> > easier to get the user to run an untrusted image than converting a
>> > trusted image into a time bomb and waiting potentially for months or
>> > years for it to explode.
>> 
>> The threat from using untrusted images with embedded filenames is real.
>> 
>> Regardless, I wouldn't discount the (different) threat from guests
>> exploiting "future" probes so cavalierly.  If your host is running a
>> stable distro, its future is easy to know.  Even the point of time when
>> it gets upgraded can be predictable.
>> 
>> In general, I recommend against leaving security holes in place just
>> because you think nobody could be bothered to exploit them :)
>> 
>> Security is not an absolute that cannot be trumped by any other
>> consideration.  I'm perfectly happy to consider usability and
>> compatibility issues, as well as implementation complexity.  I'm much
>> less willing to accept a "come on, nobody is going to try *that*"
>> argument.
>
> No, I didn't mean to make a "come on, nobody is going to try *that*"
> argument.
>
> I just can think of more attacks that we won't avoid with either
> approach. For example, what if another program on the host is
> exploitable and you just need a file with the right content to attack
> it? A raw image gives you what you need. Nobody is going to try *that*
> then, if you're content with allowing this?

When I give a raw image to an untrusted guest, it's entire contents
becomes untrusted.  I should treat it with the same caution as any other
untrusted data.

This is not the only way to put files of untrusted data on your system.
That it is one may be a bit more surprising than for downloading tools,
though.

We can't stop users from using untrusted data carelessly.

What we *can* do is making it easy for users to treat untrusted data
with the proper care as far as the software we're producing is
concerned.

The most obvious and simple way to use images with our software does not
treat untrusted data with the proper care.  This is also the way our
documentation advertises.

Worse, exercising proper care *consistently* is hard.  I couldn't use
raw images securely in all contexts myself, not without a lot more
staring at our source code.  And even then I wouldn't bet my own money
on it.  Libvirt has had a hard time doing it, with multiple failures,
including recent ones.

> The only full solution is forbidding raw. We have already agreed that
> this isn't an option. Now we have to draw a line somewhere.

I'm drawing the line around the software we actually maintain.  Make it
treat raw image contents as untrusted unless told otherwise by the user.
If somebody else's software doesn't, then that's regrettable, but we
can't fix that in QEMU.

Since you've been attacking this line vigorously, I'm drawing a second
line behind it, in bright red: secure usage of our software should not
be hard.

You can make me sacrifice my "secure by default" requirement (against my
better judgement), but you can't make me compromise on my "secure should
not be hard" requirement.
Markus Armbruster Nov. 5, 2014, 8:05 a.m. UTC | #54
Jeff Cody <jcody@redhat.com> writes:

> On Tue, Nov 04, 2014 at 10:39:36AM +0100, Markus Armbruster wrote:
>> Kevin Wolf <kwolf@redhat.com> writes:
>> 
>> > Am 30.10.2014 um 13:49 hat Markus Armbruster geschrieben:
>> >> Kevin Wolf <kwolf@redhat.com> writes:
>> >> 
>> >> > Am 29.10.2014 um 14:54 hat Markus Armbruster geschrieben:
>> >> >> Kevin Wolf <kwolf@redhat.com> writes:
>> >> >> > Instead, let me try once more to sell my old proposal [1] from the
>> >> >> > thread you mentioned:
>> >> >> >
>> >> >> >> What if we let the raw driver know that it was probed and then it
>> >> >> >> enables a check that returns -EIO for any write on the
>> >> >> >> first 2k if that
>> >> >> >> write would make the image look like a different format?
>> >> >> >
>> >> >> > Attacks the problem where it arises instead of trying to detect the
>> >> >> > outcome of it, and works in whatever way it is nested in the BDS graph
>> >> >> > and whatever way is used to address the image file.
>> >> >> >
>> >> >> > Kevin
>> >> >> >
>> >> >> > [1]
>> >> >> > https://lists.nongnu.org/archive/html/qemu-devel/2014-08/msg02548.html
>> >> >> 
>> >> >> Arbitrarily breaking the virtual hardware that way is incompatible, too.
>> >> >> It's a bigger no-no for me than changing user interface corner cases or
>> >> >> deciding what to do with a file based on its file name extension.
>> >> >
>> >> > It is arbitrarily breaking theoretical cases, while your patch is less
>> >> > arbitrarily breaking common cases. I strongly prefer the former.
>> >> 
>> >> I challenge "theoretical" as well as "common".
>> >> 
>> >> For "theoretical", see below.
>> >> 
>> >> As to "common": are unrecognizable image names really common?  I doubt
>> >> it.  If I'm wrong, I expect users to complain about my warning, and then
>> >> we can reconsider.
>> >
>> > As I said in my other mail, I remember seeing BZs with command lines
>> > that don't have a file extension. Block devices are affected anyway.
>> >
>> > Your RFC also misses the extension .raw that I personally use
>> > occasionally, so I would get the warning even though the image name
>> > isn't "unrecognizable" at all. And other people probably have other
>> > extensions in use that we don't know of.
>> >
>> >> > Nobody will ever want to write a qcow2 header into their boot sector for
>> >> > just running a guest:
>> >> >
>> >> > 0:   51                      push   %cx
>> >> > 1:   46                      inc    %si
>> >> > 2:   49                      dec    %cx
>> >> > 3:   fb                      sti
>> >> >
>> >> > This code doesn't make any sense. It's similar for other formats. And if
>> >> > they really have some odd reason to do that, the fix is easy: Either
>> >> > don't use raw, or pass an explicit format=raw.
>> >> 
>> >> A disk needn't start with a PC-style boot sector.  Even on a PC.  Not
>> >> every disk needs to be bootable, or even partitioned.  Databases exist
>> >> that like to work on raw devices.  Giving them /dev/sdb instead of a
>> >> whole-disk partition /dev/sdb1 may not be the smartest thing to do, but
>> >> it *works* with real hardware, so why not with virtual hardware?
>> >
>> > Oh, it does. Just use a format that can represent this reliably (and as
>> > a compromise, we're going to declare explicitly requested raw as such -
>> > again, see my other mail for details).
>> 
>> This is "opt in safety".  I dislike that for the same reason I dislike
>> "opt in security".
>> 
>> Insecure: we risk guest escaping isolation.
>> 
>> Unsafe: we risk virtual hardware breakage.
>> 
>> >> You either have to prevent *any* writing of the first 2048 bytes (the
>> >> part that can be examined by a bdrv_probe() method, or your have to
>> >> prevent writing anything a probe recognizes, or the user has to specify
>> >> the format explicitly.
>> >> 
>> >> If you do the former, you're way outside the realm of "theoretical".
>> >
>> > Right, that's not an option.
>> >
>> >> If you do the latter, the degree of "theoreticalness" depends on the
>> >> union of the patterns you prevent.  Issues:
>> >> 
>> >> 1. Anthony's method of checking a few known signatures is fragile.  The
>> >> only obviously robust way to test "is probing going to find something
>> >> other than 'raw'?" is running the probes.  Feasible.
>> >
>> > Yes, this is what I've been talking about.
>> >
>> >> 2. Whether the union of patterns qualifies as "theoretical" for all our
>> >> targets is not obvious, and whether it'll remain "theoretical" for all
>> >> future formats and target machines even less.
>> >
>> > At least we don't have examples of it happening yet. And even if it
>> > happens to become not theoretical at some point, the only thing that
>> > happens is that they need to add format=raw - something that we already
>> > know is sure to happen with your approach.
>> >
>> > Once we get to know of such cases, we can probably even resolve them by
>> > improving the probing function of the problematic format.
>> 
>> Your proposal "improves" things from "insecure by default" to "now less
>> insecure, but additionally a bit unsafe".
>> 
>> Your proposed remedy for "unsafe" seems to be "if it breaks, you get to
>> keep the pieces, but you may ask us to narrow the conditions for
>> breakage so it wouldn't have broken for you if you had imported the
>> changed version from the future" ;)
>> 
>> >> 3. A write access that works just fine in one QEMU version can be
>> >> rejected by another version that recognizes more formats.  Or probes the
>> >> same formats differently.
>> >
>> > True. We're only sure to protect this binary, and we have good chances
>> > of protecting some other qemu versions and some other tools, too.
>> >
>> > If the attacker has a time machine and knows what the next installed
>> > qemu version will recognize (or if they exploit a file format that is
>> > not a disk image, with a program other than qemu), we don't fully
>> > protect the user. We'd have to completely forbid raw for that.
>> >
>> >> 4. Rejecting a write fails *late*, and looks like hardware failure to
>> >> the guest.  With imperfect guest software, this risks data corruption.
>> >
>> > Okay. So your common case is that you have a badly written database that
>> > gets the full unpartitioned disk assigned and doesn't start to write at
>> > the first block, but randomly, and when it finally tries to use block 0,
>> > it gets deadly confused by the I/O error so that it will break the data
>> > that it has already stored elsewhere.
>> >
>> > Yeah right.
>> >
>> > And you probably deserved it then for using such software. It would have
>> > happened anyway sooner or later.
>> 
>> Since you fail writes to sector 0 only when a "bad" pattern is written,
>> your hypothesis that any software I should be using surely writes sector
>> 0 early is irrelevant.
>> 
>> "Common" doesn't matter either (and I never claimed it was common).  If
>> I deploy software that uses whole disks, I'd consider disks that can
>> fail writes to sector 0, even though the actual hardware is in good
>> working order, defective, and I wouldn't buy them[*].  Regardless of how
>> common the failure was.  Fortunately, no such disks exist.  Now you're
>> proposing to create some, with a special "do not break" switch (factory
>> default "do break").  Your excuse for doing so seems to be:
>> 
>> (a) If I'm doing something *that* outlandish, I should know which disks
>> to buy (i.e. anything but virtual raw images), or where to find the "do
>> not break" switch (i.e. use format=raw).
>> 
>> (b) If the breakage causes me harm, I deserve it, because my software
>> sucks.
>> 
>> For me, (a) is debatable, but (b) is simply bad attitude.  Let's not go
>> there.
>> 
>> 
>> [*] In fact, I'd consider them defective and wouldn't buy them no matter
>> what software I plan to use.
>
> The ultimate goal (correct me if I am wrong) of your patch is to get
> rid of content probing during open, and rely solely on filename
> extension.

The goal is to separate raw and non-raw images sufficiently so we can
exercise the proper care with untrusted raw image contents.

Ditching probing is just a means.  I readily admit I rather like the
means, since I never liked probing.

> That leaves us open to guest data corruption in the opposite way -
> identifying a qcow2 image as a raw image.  If I have qcow2 image named
> mydisk.img, and QEMU 'detects' it as raw, then that means the first
> guest data write will corrupt the image (especially to sector 0).

I worked this issue into my "Image probing: how it can be insecure, and
what we could do about it" treatise.

> So if we are not specifying the image format, and allowing QEMU to
> determine the format, Kevin's "restricted-raw" approach actually seems
> safer to me, and less likely to break the guest.

Relative likelihoods are of course debatable.

> The other option is to just remove any sort of probing/format
> autodetection (aside from qemu-img info) completely.  This may break
> scripts, but switching to filename extensions as a detection mechanism
> may break some scripts anyway.

It breaks a whole lot more, and with fewer workarounds.  Fine with me,
but Kevin hates it.
Max Reitz Nov. 5, 2014, 8:09 a.m. UTC | #55
On 2014-11-05 at 09:05, Markus Armbruster wrote:
> Jeff Cody <jcody@redhat.com> writes:
>
>> On Tue, Nov 04, 2014 at 10:39:36AM +0100, Markus Armbruster wrote:
>>> Kevin Wolf <kwolf@redhat.com> writes:
>>>
>>>> Am 30.10.2014 um 13:49 hat Markus Armbruster geschrieben:
>>>>> Kevin Wolf <kwolf@redhat.com> writes:
>>>>>
>>>>>> Am 29.10.2014 um 14:54 hat Markus Armbruster geschrieben:
>>>>>>> Kevin Wolf <kwolf@redhat.com> writes:
>>>>>>>> Instead, let me try once more to sell my old proposal [1] from the
>>>>>>>> thread you mentioned:
>>>>>>>>
>>>>>>>>> What if we let the raw driver know that it was probed and then it
>>>>>>>>> enables a check that returns -EIO for any write on the
>>>>>>>>> first 2k if that
>>>>>>>>> write would make the image look like a different format?
>>>>>>>> Attacks the problem where it arises instead of trying to detect the
>>>>>>>> outcome of it, and works in whatever way it is nested in the BDS graph
>>>>>>>> and whatever way is used to address the image file.
>>>>>>>>
>>>>>>>> Kevin
>>>>>>>>
>>>>>>>> [1]
>>>>>>>> https://lists.nongnu.org/archive/html/qemu-devel/2014-08/msg02548.html
>>>>>>> Arbitrarily breaking the virtual hardware that way is incompatible, too.
>>>>>>> It's a bigger no-no for me than changing user interface corner cases or
>>>>>>> deciding what to do with a file based on its file name extension.
>>>>>> It is arbitrarily breaking theoretical cases, while your patch is less
>>>>>> arbitrarily breaking common cases. I strongly prefer the former.
>>>>> I challenge "theoretical" as well as "common".
>>>>>
>>>>> For "theoretical", see below.
>>>>>
>>>>> As to "common": are unrecognizable image names really common?  I doubt
>>>>> it.  If I'm wrong, I expect users to complain about my warning, and then
>>>>> we can reconsider.
>>>> As I said in my other mail, I remember seeing BZs with command lines
>>>> that don't have a file extension. Block devices are affected anyway.
>>>>
>>>> Your RFC also misses the extension .raw that I personally use
>>>> occasionally, so I would get the warning even though the image name
>>>> isn't "unrecognizable" at all. And other people probably have other
>>>> extensions in use that we don't know of.
>>>>
>>>>>> Nobody will ever want to write a qcow2 header into their boot sector for
>>>>>> just running a guest:
>>>>>>
>>>>>> 0:   51                      push   %cx
>>>>>> 1:   46                      inc    %si
>>>>>> 2:   49                      dec    %cx
>>>>>> 3:   fb                      sti
>>>>>>
>>>>>> This code doesn't make any sense. It's similar for other formats. And if
>>>>>> they really have some odd reason to do that, the fix is easy: Either
>>>>>> don't use raw, or pass an explicit format=raw.
>>>>> A disk needn't start with a PC-style boot sector.  Even on a PC.  Not
>>>>> every disk needs to be bootable, or even partitioned.  Databases exist
>>>>> that like to work on raw devices.  Giving them /dev/sdb instead of a
>>>>> whole-disk partition /dev/sdb1 may not be the smartest thing to do, but
>>>>> it *works* with real hardware, so why not with virtual hardware?
>>>> Oh, it does. Just use a format that can represent this reliably (and as
>>>> a compromise, we're going to declare explicitly requested raw as such -
>>>> again, see my other mail for details).
>>> This is "opt in safety".  I dislike that for the same reason I dislike
>>> "opt in security".
>>>
>>> Insecure: we risk guest escaping isolation.
>>>
>>> Unsafe: we risk virtual hardware breakage.
>>>
>>>>> You either have to prevent *any* writing of the first 2048 bytes (the
>>>>> part that can be examined by a bdrv_probe() method, or your have to
>>>>> prevent writing anything a probe recognizes, or the user has to specify
>>>>> the format explicitly.
>>>>>
>>>>> If you do the former, you're way outside the realm of "theoretical".
>>>> Right, that's not an option.
>>>>
>>>>> If you do the latter, the degree of "theoreticalness" depends on the
>>>>> union of the patterns you prevent.  Issues:
>>>>>
>>>>> 1. Anthony's method of checking a few known signatures is fragile.  The
>>>>> only obviously robust way to test "is probing going to find something
>>>>> other than 'raw'?" is running the probes.  Feasible.
>>>> Yes, this is what I've been talking about.
>>>>
>>>>> 2. Whether the union of patterns qualifies as "theoretical" for all our
>>>>> targets is not obvious, and whether it'll remain "theoretical" for all
>>>>> future formats and target machines even less.
>>>> At least we don't have examples of it happening yet. And even if it
>>>> happens to become not theoretical at some point, the only thing that
>>>> happens is that they need to add format=raw - something that we already
>>>> know is sure to happen with your approach.
>>>>
>>>> Once we get to know of such cases, we can probably even resolve them by
>>>> improving the probing function of the problematic format.
>>> Your proposal "improves" things from "insecure by default" to "now less
>>> insecure, but additionally a bit unsafe".
>>>
>>> Your proposed remedy for "unsafe" seems to be "if it breaks, you get to
>>> keep the pieces, but you may ask us to narrow the conditions for
>>> breakage so it wouldn't have broken for you if you had imported the
>>> changed version from the future" ;)
>>>
>>>>> 3. A write access that works just fine in one QEMU version can be
>>>>> rejected by another version that recognizes more formats.  Or probes the
>>>>> same formats differently.
>>>> True. We're only sure to protect this binary, and we have good chances
>>>> of protecting some other qemu versions and some other tools, too.
>>>>
>>>> If the attacker has a time machine and knows what the next installed
>>>> qemu version will recognize (or if they exploit a file format that is
>>>> not a disk image, with a program other than qemu), we don't fully
>>>> protect the user. We'd have to completely forbid raw for that.
>>>>
>>>>> 4. Rejecting a write fails *late*, and looks like hardware failure to
>>>>> the guest.  With imperfect guest software, this risks data corruption.
>>>> Okay. So your common case is that you have a badly written database that
>>>> gets the full unpartitioned disk assigned and doesn't start to write at
>>>> the first block, but randomly, and when it finally tries to use block 0,
>>>> it gets deadly confused by the I/O error so that it will break the data
>>>> that it has already stored elsewhere.
>>>>
>>>> Yeah right.
>>>>
>>>> And you probably deserved it then for using such software. It would have
>>>> happened anyway sooner or later.
>>> Since you fail writes to sector 0 only when a "bad" pattern is written,
>>> your hypothesis that any software I should be using surely writes sector
>>> 0 early is irrelevant.
>>>
>>> "Common" doesn't matter either (and I never claimed it was common).  If
>>> I deploy software that uses whole disks, I'd consider disks that can
>>> fail writes to sector 0, even though the actual hardware is in good
>>> working order, defective, and I wouldn't buy them[*].  Regardless of how
>>> common the failure was.  Fortunately, no such disks exist.  Now you're
>>> proposing to create some, with a special "do not break" switch (factory
>>> default "do break").  Your excuse for doing so seems to be:
>>>
>>> (a) If I'm doing something *that* outlandish, I should know which disks
>>> to buy (i.e. anything but virtual raw images), or where to find the "do
>>> not break" switch (i.e. use format=raw).
>>>
>>> (b) If the breakage causes me harm, I deserve it, because my software
>>> sucks.
>>>
>>> For me, (a) is debatable, but (b) is simply bad attitude.  Let's not go
>>> there.
>>>
>>>
>>> [*] In fact, I'd consider them defective and wouldn't buy them no matter
>>> what software I plan to use.
>> The ultimate goal (correct me if I am wrong) of your patch is to get
>> rid of content probing during open, and rely solely on filename
>> extension.
> The goal is to separate raw and non-raw images sufficiently so we can
> exercise the proper care with untrusted raw image contents.
>
> Ditching probing is just a means.  I readily admit I rather like the
> means, since I never liked probing.

As I said before, I hope you won't ditch probing, because I'd strongly 
oppose that. I am however completely fine with adding more entropy to 
probing, i.e. using the filename extension as an additional source of 
information.

>> That leaves us open to guest data corruption in the opposite way -
>> identifying a qcow2 image as a raw image.  If I have qcow2 image named
>> mydisk.img, and QEMU 'detects' it as raw, then that means the first
>> guest data write will corrupt the image (especially to sector 0).
> I worked this issue into my "Image probing: how it can be insecure, and
> what we could do about it" treatise.
>
>> So if we are not specifying the image format, and allowing QEMU to
>> determine the format, Kevin's "restricted-raw" approach actually seems
>> safer to me, and less likely to break the guest.
> Relative likelihoods are of course debatable.
>
>> The other option is to just remove any sort of probing/format
>> autodetection (aside from qemu-img info) completely.  This may break
>> scripts, but switching to filename extensions as a detection mechanism
>> may break some scripts anyway.
> It breaks a whole lot more, and with fewer workarounds.  Fine with me,
> but Kevin hates it.

It would break my heart, too. (SCNR)

Max
Markus Armbruster Nov. 5, 2014, 8:39 a.m. UTC | #56
Kevin Wolf <kwolf@redhat.com> writes:

> Am 04.11.2014 um 16:25 hat Stefan Hajnoczi geschrieben:
>> On Tue, Nov 04, 2014 at 11:11:33AM +0100, Kevin Wolf wrote:
>> > Am 03.11.2014 um 16:05 hat Stefan Hajnoczi geschrieben:
>> > > The argument that there might not be a traditional filename doesn't make
>> > > sense to me.  When there is no filename the command-line is already
>> > > sufficiently complex and usage is fancy enough that probing adds no
>> > > convenience, the user can just specify the format.
>> > 
>> > -hda nbd://localhost
>> > -drive file=nbd://localhost,format=raw
>> > 
>> > Almost double the length, and I don't see anything fancy in the first
>> > line.
>> > 
>> > > Anyway, does this sound reasonable:
>> > > 
>> > > In QEMU 3.0, require the format= option for -drive.  Keep probing the
>> > > way it is for non-drive options because they are used for convenience by
>> > > local users.
>> > 
>> > And being hacked while using -hda is better in which way?
>> 
>> Markus is proposing that we look at the filename extension.  In that
>> case QEMU cannot be tricked by the contents of a raw image.
>> 
>> That makes -hda perfectly safe although there are cases where QEMU
>> doesn't know what to do and requires format=.
>
> Wait, by "keep probing the way it is" you mean implementing one of the
> other proposals? So you're only suggesting being stricter on -drive as
> an additional measure?
>
>> I do worry that changing QEMU's probing behavior drastically can lead to
>> consistencies where libvirt does its own probing :(.  Haven't thought
>> through the bug scenarios but that could be a security problem in
>> itself.
>
> Hm... In which cases does libvirt probe the image format? And is it even
> consistent with qemu today?

I had a quick look at the source.  Eric, please correct
misunderstandings.

Enumation type virStorageFileProbeFormat enumerates supported formats.
It has pseudo-formats VIR_STORAGE_FILE_AUTO, VIR_STORAGE_FILE_AUTO_SAFE.

I don't understand VIR_STORAGE_FILE_AUTO_SAFE offhand.

VIR_STORAGE_FILE_AUTO means probing.  Its use appears to be deprecated.
Actual probing happens in virStorageFileProbeFormatFromBuf():

    For all formats:
        if magic and version match, pick this format

    If some magic matched, but not the version: warn

    For all formats:
        if file name extension matches, pick this format

    Pick raw.

The formats' magic, version and extension are defined in fileTypeInfo[].

If I remember correctly, libvirt has its own probing because running an
external program just to probe is too slow.

Another reason for having its own probing might be providing a secure
replacement for QEMU's insecure probing.

> If you can get libvirt to explicitly pass the wrong format=... option
> because it did its own probing, we have a problem no matter what we
> change in qemu.

Yes, but that would be a libvirt problem.  No excuse for us to ignore
our own problems.
Eric Blake Nov. 5, 2014, 10:18 a.m. UTC | #57
On 11/05/2014 09:39 AM, Markus Armbruster wrote:

>> Hm... In which cases does libvirt probe the image format? And is it even
>> consistent with qemu today?
> 
> I had a quick look at the source.  Eric, please correct
> misunderstandings.
> 
> Enumation type virStorageFileProbeFormat enumerates supported formats.
> It has pseudo-formats VIR_STORAGE_FILE_AUTO, VIR_STORAGE_FILE_AUTO_SAFE.

VIR_STORAGE_FILE_AUTO is the format that libvirt assigns an image where
the libvirt XML was not explicit.  VIR_STORAGE_FILE_AUTO_SAFE is what
the image gets reassigned to for QED (as the only format where the
backing format is mandatory as part of the backing chain), which
basically says: trust the backing chain if the XML omitted a type,
instead of doing the normal rules of auto.

> 
> I don't understand VIR_STORAGE_FILE_AUTO_SAFE offhand.
> 
> VIR_STORAGE_FILE_AUTO means probing.  Its use appears to be deprecated.

Close.  It actually means: "the user did not specify a format in their
XML, so it is now up to a configuration knob whether we will probe or
whether we will forcefully error out and tell the user to fix their
XML".  The configuration knob (allow_disk_format_probing in
/etc/libvirt/qemu.conf) defaults to 0 by default (error out and tell the
user to fix their XML) but can be overridden to 1 by someone that knows
what they are doing (probes are allowed at the user's own risk).

> Actual probing happens in virStorageFileProbeFormatFromBuf():
> 
>     For all formats:
>         if magic and version match, pick this format
> 
>     If some magic matched, but not the version: warn
> 
>     For all formats:
>         if file name extension matches, pick this format
> 
>     Pick raw.
> 
> The formats' magic, version and extension are defined in fileTypeInfo[].
> 
> If I remember correctly, libvirt has its own probing because running an
> external program just to probe is too slow.

Correct.  And while the libvirt probing was originally modeled after
qemu, the two approaches have probably diverged a bit over time.  An
obvious difference: qemu only probes 512? bytes (maybe 4096?), but
libvirt probes deep enough to determine the .iso file format (a 5-byte
magic number at decimal offset 32769).  See VIR_STORAGE_MAX_HEADER
0x8200 in src/util/virstoragefile.h.

> 
> Another reason for having its own probing might be providing a secure
> replacement for QEMU's insecure probing.

While that may be a possible outcome, I'm not sure it was an intentional
design.

> 
>> If you can get libvirt to explicitly pass the wrong format=... option
>> because it did its own probing, we have a problem no matter what we
>> change in qemu.
> 
> Yes, but that would be a libvirt problem.  No excuse for us to ignore
> our own problems.

Correct - for the purposes of this thread, we need only figure out how
to make qemu closer to being secure by default.
diff mbox

Patch

diff --git a/block.c b/block.c
index 8da6e61..7a1c592 100644
--- a/block.c
+++ b/block.c
@@ -674,10 +674,37 @@  static BlockDriver *bdrv_probe_all(const uint8_t *buf, int buf_size,
     return drv;
 }
 
+/*
+ * Guess image format from file name @fname
+ */
+static BlockDriver *bdrv_guess_format(const char *fname)
+{
+    BlockDriver *d;
+    int i, ext;
+    size_t fn_len = strlen(fname);
+    size_t ext_len;
+
+    QLIST_FOREACH(d, &bdrv_drivers, list) {
+        for (i = 0; i < ARRAY_SIZE(d->fname_ext) && d->fname_ext[i]; i++) {
+            ext_len = strlen(d->fname_ext[i]);
+            if (fn_len <= ext_len) {
+                continue;
+            }
+            ext = fn_len - ext_len;
+            if (fname[ext - 1] == '.'
+                && !strcmp(fname + ext, d->fname_ext[i])) {
+                return d;
+            }
+        }
+    }
+
+    return NULL;
+}
+
 static int find_image_format(BlockDriverState *bs, const char *filename,
                              BlockDriver **pdrv, Error **errp)
 {
-    BlockDriver *drv;
+    BlockDriver *drv, *secure_drv;
     uint8_t buf[2048];
     int ret = 0;
 
@@ -704,8 +731,23 @@  static int find_image_format(BlockDriverState *bs, const char *filename,
     if (!drv) {
         error_setg(errp, "Could not determine image format: No compatible "
                    "driver found");
-        ret = -ENOENT;
+        *pdrv = NULL;
+        return -ENOENT;
     }
+
+    secure_drv = bdrv_guess_format(filename);
+    if (use_bdrv_whitelist && drv != secure_drv) {
+        error_report("warning: insecure format probing of image '%s'",
+                     filename);
+        error_printf("To get rid of this warning, specify format=%s"
+                     " explicitly", drv->format_name);
+        if (drv->fname_ext[0]) {
+            error_printf(", or change\nthe image name to end with '.%s'",
+                         drv->fname_ext[0]);
+        }
+        error_printf("\n");
+    }
+
     *pdrv = drv;
     return ret;
 }
diff --git a/block/dmg.c b/block/dmg.c
index e455886..9f65f94 100644
--- a/block/dmg.c
+++ b/block/dmg.c
@@ -443,6 +443,7 @@  static BlockDriver bdrv_dmg = {
     .format_name    = "dmg",
     .instance_size  = sizeof(BDRVDMGState),
     .bdrv_probe     = dmg_probe,
+    .fname_ext      = { "dmg" },
     .bdrv_open      = dmg_open,
     .bdrv_read      = dmg_co_read,
     .bdrv_close     = dmg_close,
diff --git a/block/qcow.c b/block/qcow.c
index ece2269..268b248 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -956,6 +956,7 @@  static BlockDriver bdrv_qcow = {
     .format_name	= "qcow",
     .instance_size	= sizeof(BDRVQcowState),
     .bdrv_probe		= qcow_probe,
+    .fname_ext          = { "qcow" },
     .bdrv_open		= qcow_open,
     .bdrv_close		= qcow_close,
     .bdrv_reopen_prepare    = qcow_reopen_prepare,
diff --git a/block/qcow2.c b/block/qcow2.c
index d031515..31ff872 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2660,6 +2660,7 @@  static BlockDriver bdrv_qcow2 = {
     .format_name        = "qcow2",
     .instance_size      = sizeof(BDRVQcowState),
     .bdrv_probe         = qcow2_probe,
+    .fname_ext          = { "qcow2" },
     .bdrv_open          = qcow2_open,
     .bdrv_close         = qcow2_close,
     .bdrv_reopen_prepare  = qcow2_reopen_prepare,
diff --git a/block/qed.c b/block/qed.c
index 80f18d8..e1cbb94 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -1659,6 +1659,7 @@  static BlockDriver bdrv_qed = {
     .supports_backing         = true,
 
     .bdrv_probe               = bdrv_qed_probe,
+    .fname_ext                = { "qed" },
     .bdrv_rebind              = bdrv_qed_rebind,
     .bdrv_open                = bdrv_qed_open,
     .bdrv_close               = bdrv_qed_close,
diff --git a/block/raw_bsd.c b/block/raw_bsd.c
index 401b967..b429e66 100644
--- a/block/raw_bsd.c
+++ b/block/raw_bsd.c
@@ -176,6 +176,7 @@  static int raw_probe(const uint8_t *buf, int buf_size, const char *filename)
 static BlockDriver bdrv_raw = {
     .format_name          = "raw",
     .bdrv_probe           = &raw_probe,
+    .fname_ext            = { "img", "iso" },
     .bdrv_reopen_prepare  = &raw_reopen_prepare,
     .bdrv_open            = &raw_open,
     .bdrv_close           = &raw_close,
diff --git a/block/vdi.c b/block/vdi.c
index 19701ee..6c915cb 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -854,6 +854,7 @@  static BlockDriver bdrv_vdi = {
     .format_name = "vdi",
     .instance_size = sizeof(BDRVVdiState),
     .bdrv_probe = vdi_probe,
+    .fname_ext = { "vdi" },
     .bdrv_open = vdi_open,
     .bdrv_close = vdi_close,
     .bdrv_reopen_prepare = vdi_reopen_prepare,
diff --git a/block/vhdx.c b/block/vhdx.c
index 12bfe75..d2c3a20 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -1945,6 +1945,7 @@  static BlockDriver bdrv_vhdx = {
     .format_name            = "vhdx",
     .instance_size          = sizeof(BDRVVHDXState),
     .bdrv_probe             = vhdx_probe,
+    .fname_ext              = { "vhd" },
     .bdrv_open              = vhdx_open,
     .bdrv_close             = vhdx_close,
     .bdrv_reopen_prepare    = vhdx_reopen_prepare,
diff --git a/block/vmdk.c b/block/vmdk.c
index 673d3f5..91a42d2 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -2225,6 +2225,7 @@  static BlockDriver bdrv_vmdk = {
     .format_name                  = "vmdk",
     .instance_size                = sizeof(BDRVVmdkState),
     .bdrv_probe                   = vmdk_probe,
+    .fname_ext                    = { "vdmk" },
     .bdrv_open                    = vmdk_open,
     .bdrv_check                   = vmdk_check,
     .bdrv_reopen_prepare          = vmdk_reopen_prepare,
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 8898c6c..55609ae 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -91,6 +91,7 @@  struct BlockDriver {
 
     int (*bdrv_probe)(const uint8_t *buf, int buf_size, const char *filename);
     int (*bdrv_probe_device)(const char *filename);
+    const char *fname_ext[2];
 
     /* Any driver implementing this callback is expected to be able to handle
      * NULL file names in its .bdrv_open() implementation */