diff mbox series

[2/2] qemu-img: Add dd seek= option

Message ID 20180815025614.53588-3-eblake@redhat.com
State New
Headers show
Series Improve qemu-img dd | expand

Commit Message

Eric Blake Aug. 15, 2018, 2:56 a.m. UTC
For feature parity with dd, we want to be able to specify
the offset within the output file, just as we can specify
the offset for the input (in particular, this makes copying
a subset range of guest-visible bytes from one file to
another much easier).

The code style for 'qemu-img dd' was pretty hard to read;
unfortunately this patch focuses only on adding the new
feature in the existing style rather than trying to improve
the overall flow, other than switching octal constants to
hex.  Oh well.

Also, switch the test to use an offset of 0 instead of 1,
to test skip= and seek= on their own; as it is, this is
effectively quadrupling the test runtime, which starts
to make this test borderline on whether it should still
belong to './check -g quick'.  And I didn't bother to
reindent the test shell code for the new nested loop.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 qemu-img.c                 |  41 ++++--
 tests/qemu-iotests/160     |  12 +-
 tests/qemu-iotests/160.out | 304 +++++++++++++++++++++++++++++++++++++++++++--
 3 files changed, 336 insertions(+), 21 deletions(-)

Comments

Max Reitz Aug. 16, 2018, 2:20 a.m. UTC | #1
On 2018-08-15 04:56, Eric Blake wrote:
> For feature parity with dd, we want to be able to specify
> the offset within the output file, just as we can specify
> the offset for the input (in particular, this makes copying
> a subset range of guest-visible bytes from one file to
> another much easier).

In my opinion, we do not want feature parity with dd.  What we do want
is feature parity with convert.

> The code style for 'qemu-img dd' was pretty hard to read;
> unfortunately this patch focuses only on adding the new
> feature in the existing style rather than trying to improve
> the overall flow, other than switching octal constants to
> hex.  Oh well.

No, the real issue is that dd is still not implemented just as a
frontend to convert.  Which it should be.  I'm not sure dd was a very
good idea from the start, and now it should ideally be a frontend to
convert.

(My full opinion on the matter: dd has a horrible interface.  I don't
quite see why we replicated that inside qemu-img.  Also, if you want to
use dd, why not use qemu-nbd + Linux nbd device + real dd?)

((That gave me a good idea.  Actually, it's probably not such a good
idea, but I guess I'll do it in my spare time anyway.  A qemu-img fuse
might be nice which represents an image as a raw image at some mount
point.  Benefits over qemu-nbd: (1) You don't need root, (2) you don't
need to type modprobe nbd.))

> Also, switch the test to use an offset of 0 instead of 1,
> to test skip= and seek= on their own; as it is, this is
> effectively quadrupling the test runtime, which starts
> to make this test borderline on whether it should still
> belong to './check -g quick'.  And I didn't bother to
> reindent the test shell code for the new nested loop.

In my opinion, it should no longer belong to quick.  It takes 8 s on my
tmpfs.  My border is somewhere around 2 or 3; and I haven't yet decided
whether that's on tmpfs or SSD.

> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>  qemu-img.c                 |  41 ++++--
>  tests/qemu-iotests/160     |  12 +-
>  tests/qemu-iotests/160.out | 304 +++++++++++++++++++++++++++++++++++++++++++--
>  3 files changed, 336 insertions(+), 21 deletions(-)
> 
> diff --git a/qemu-img.c b/qemu-img.c
> index d72f0f0ec94..ee01a18f331 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c

[...]

> @@ -4574,7 +4592,14 @@ static int img_dd(int argc, char **argv)
>          size = dd.count * in.bsz;
>      }
> 
> -    qemu_opt_set_number(opts, BLOCK_OPT_SIZE, size, &error_abort);
> +    if (dd.flags & C_SEEK && out.offset * out.bsz > INT64_MAX - size) {

What about overflows in out.offset * out.bsz?

> +        error_report("Seek too large for '%s'", out.filename);
> +        ret = -1;
> +        goto out;

Real dd doesn't seem to error out (it just reports an error).  I don't
know whether that makes any difference, though.

The test looks good to me.

Max

> +    }
> +    seek = out.offset * out.bsz;
> +
> +    qemu_opt_set_number(opts, BLOCK_OPT_SIZE, size + seek, &error_abort);
>      end = size + in_pos;
> 
>      ret = bdrv_create(drv, out.filename, opts, &local_err);
Eric Blake Aug. 16, 2018, 2:39 a.m. UTC | #2
On 08/15/2018 09:20 PM, Max Reitz wrote:
> On 2018-08-15 04:56, Eric Blake wrote:
>> For feature parity with dd, we want to be able to specify
>> the offset within the output file, just as we can specify
>> the offset for the input (in particular, this makes copying
>> a subset range of guest-visible bytes from one file to
>> another much easier).
> 
> In my opinion, we do not want feature parity with dd.  What we do want
> is feature parity with convert.

Well, convert is lacking a way to specify a subset of one file to move 
to a (possibly different) subset of the other.  I'm fine if we want to 
enhance convert to do the things that right now require a dd-alike 
interface (namely, limiting the copying to less than the full file, and 
choosing the offset at which to start [before this patch] or write to 
[with this patch]).

If convert were more powerful, I'd be fine dropping 'qemu-img dd' after 
a proper deprecation period.

> 
>> The code style for 'qemu-img dd' was pretty hard to read;
>> unfortunately this patch focuses only on adding the new
>> feature in the existing style rather than trying to improve
>> the overall flow, other than switching octal constants to
>> hex.  Oh well.
> 
> No, the real issue is that dd is still not implemented just as a
> frontend to convert.  Which it should be.  I'm not sure dd was a very
> good idea from the start, and now it should ideally be a frontend to
> convert.
> 
> (My full opinion on the matter: dd has a horrible interface.  I don't
> quite see why we replicated that inside qemu-img.  Also, if you want to
> use dd, why not use qemu-nbd + Linux nbd device + real dd?)

Because of performance: qemu-nbd + Linux nbd device + real dd is one 
more layer of data copying (each write() from dd goes to kernel, then is 
sent to qemu-nbd in userspace as a socket message before being sent back 
to the kernel to actually write() to the final destination) compared to 
just doing it all in one process (write() lands in the final destination 
with no further user space bouncing).  And because the additional steps 
to set it up are awkward (see my other email where I rant about losing 
the better part of today to realizing that 'dd ...; qemu-nbd -d 
/dev/nbd1' loses data if you omit conv=fdatasync).

> 
> ((That gave me a good idea.  Actually, it's probably not such a good
> idea, but I guess I'll do it in my spare time anyway.  A qemu-img fuse
> might be nice which represents an image as a raw image at some mount
> point.  Benefits over qemu-nbd: (1) You don't need root, (2) you don't
> need to type modprobe nbd.))

So the kernel->userspace translation would be happening via the FUSE 
interface instead of the NBD interface.  Data still bounces around just 
as much, but it might be a fun project.  Does fuse behave well when 
serving exactly one file at the mountpoint, rather than the more typical 
file system rooted in a directory?  NBD at least has the benefit of 
claiming to be a block device all along, rather than complicating the 
user interface with POSIX file system rules (which you'll be bending, 
because you are serving exactly one file instead of a system).

> 
>> Also, switch the test to use an offset of 0 instead of 1,
>> to test skip= and seek= on their own; as it is, this is
>> effectively quadrupling the test runtime, which starts
>> to make this test borderline on whether it should still
>> belong to './check -g quick'.  And I didn't bother to
>> reindent the test shell code for the new nested loop.
> 
> In my opinion, it should no longer belong to quick.  It takes 8 s on my
> tmpfs.  My border is somewhere around 2 or 3; and I haven't yet decided
> whether that's on tmpfs or SSD.

I took 4 iterations pre-patch, to 8 iterations after patch 1, to 32 
iterations with this patch; my observed times went from 1s to 2s to 7s 
on SSD ext4. Yeah, for v2, I'll drop it from quick.

>> @@ -4574,7 +4592,14 @@ static int img_dd(int argc, char **argv)
>>           size = dd.count * in.bsz;
>>       }
>>
>> -    qemu_opt_set_number(opts, BLOCK_OPT_SIZE, size, &error_abort);
>> +    if (dd.flags & C_SEEK && out.offset * out.bsz > INT64_MAX - size) {
> 
> What about overflows in out.offset * out.bsz?

I've had enough of my eyes bleeding on all the code repeatedly scaling 
things. For v2, I'm strongly considering a cleanup patch that reads all 
input, then scales all values into bytes, and THEN performs any 
additional math in a single unit, just so the additions become easier to 
reason about.

> 
>> +        error_report("Seek too large for '%s'", out.filename);
>> +        ret = -1;
>> +        goto out;
> 
> Real dd doesn't seem to error out (it just reports an error).  I don't
> know whether that makes any difference, though.

But where does the data get written if you can't actually seek that far 
into the file?

> 
> The test looks good to me.

Other than my creative indentation levels ;)

> 
> Max
> 
>> +    }
>> +    seek = out.offset * out.bsz;
>> +
>> +    qemu_opt_set_number(opts, BLOCK_OPT_SIZE, size + seek, &error_abort);
>>       end = size + in_pos;
>>
>>       ret = bdrv_create(drv, out.filename, opts, &local_err);
>
Eric Blake Aug. 16, 2018, 2:49 a.m. UTC | #3
On 08/15/2018 09:39 PM, Eric Blake wrote:

>> (My full opinion on the matter: dd has a horrible interface.  I don't
>> quite see why we replicated that inside qemu-img.  Also, if you want to
>> use dd, why not use qemu-nbd + Linux nbd device + real dd?)
> 
> Because of performance: qemu-nbd + Linux nbd device + real dd is one 
> more layer of data copying (each write() from dd goes to kernel, then is 
> sent to qemu-nbd in userspace as a socket message before being sent back 
> to the kernel to actually write() to the final destination) compared to 
> just doing it all in one process (write() lands in the final destination 
> with no further user space bouncing).  And because the additional steps 
> to set it up are awkward (see my other email where I rant about losing 
> the better part of today to realizing that 'dd ...; qemu-nbd -d 
> /dev/nbd1' loses data if you omit conv=fdatasync).

Oh, and because the kernel NBD module still doesn't know how to do 
sparse writes or expose the location of holes. In the kernel source, see 
drivers/block/nbd.c and weep at how far it lags behind qemu-nbd features 
that at least 'qemu-img convert' can take advantage of, but aren't 
present via /dev/nbd*
Max Reitz Aug. 16, 2018, 2:49 a.m. UTC | #4
On 2018-08-16 04:39, Eric Blake wrote:
> On 08/15/2018 09:20 PM, Max Reitz wrote:
>> On 2018-08-15 04:56, Eric Blake wrote:
>>> For feature parity with dd, we want to be able to specify
>>> the offset within the output file, just as we can specify
>>> the offset for the input (in particular, this makes copying
>>> a subset range of guest-visible bytes from one file to
>>> another much easier).
>>
>> In my opinion, we do not want feature parity with dd.  What we do want
>> is feature parity with convert.
> 
> Well, convert is lacking a way to specify a subset of one file to move
> to a (possibly different) subset of the other.  I'm fine if we want to
> enhance convert to do the things that right now require a dd-alike
> interface (namely, limiting the copying to less than the full file, and
> choosing the offset at which to start [before this patch] or write to
> [with this patch]).

Yes, I would want that.

> If convert were more powerful, I'd be fine dropping 'qemu-img dd' after
> a proper deprecation period.

Technically it has those features already, with the raw block driver's
offset and size parameters.

>>> The code style for 'qemu-img dd' was pretty hard to read;
>>> unfortunately this patch focuses only on adding the new
>>> feature in the existing style rather than trying to improve
>>> the overall flow, other than switching octal constants to
>>> hex.  Oh well.
>>
>> No, the real issue is that dd is still not implemented just as a
>> frontend to convert.  Which it should be.  I'm not sure dd was a very
>> good idea from the start, and now it should ideally be a frontend to
>> convert.
>>
>> (My full opinion on the matter: dd has a horrible interface.  I don't
>> quite see why we replicated that inside qemu-img.  Also, if you want to
>> use dd, why not use qemu-nbd + Linux nbd device + real dd?)
> 
> Because of performance: qemu-nbd + Linux nbd device + real dd is one
> more layer of data copying (each write() from dd goes to kernel, then is
> sent to qemu-nbd in userspace as a socket message before being sent back
> to the kernel to actually write() to the final destination) compared to
> just doing it all in one process (write() lands in the final destination
> with no further user space bouncing).  And because the additional steps
> to set it up are awkward (see my other email where I rant about losing
> the better part of today to realizing that 'dd ...; qemu-nbd -d
> /dev/nbd1' loses data if you omit conv=fdatasync).

I can see the sync problems, but is the performance really that much worse?

>> ((That gave me a good idea.  Actually, it's probably not such a good
>> idea, but I guess I'll do it in my spare time anyway.  A qemu-img fuse
>> might be nice which represents an image as a raw image at some mount
>> point.  Benefits over qemu-nbd: (1) You don't need root, (2) you don't
>> need to type modprobe nbd.))
> 
> So the kernel->userspace translation would be happening via the FUSE
> interface instead of the NBD interface.  Data still bounces around just
> as much, but it might be a fun project.  Does fuse behave well when
> serving exactly one file at the mountpoint, rather than the more typical
> file system rooted in a directory?  NBD at least has the benefit of
> claiming to be a block device all along, rather than complicating the
> user interface with POSIX file system rules (which you'll be bending,
> because you are serving exactly one file instead of a system).

Well, but I can just pretend my FUSE file is a block device, no?

Also, I just discovered something really interesting: FUSE allows you to
specify a single file as a mountpoint.

And you know what?  You can open the original file before you replace it
by the FUSE "filesystem".

So my fun interface is going to looks like this:

$ qemu-img fuse foo.qcow2

And then your foo.qcow2 is a raw image until the next "fusermount -u
foo.qcow2"!  Isn't that fun?

>>> Also, switch the test to use an offset of 0 instead of 1,
>>> to test skip= and seek= on their own; as it is, this is
>>> effectively quadrupling the test runtime, which starts
>>> to make this test borderline on whether it should still
>>> belong to './check -g quick'.  And I didn't bother to
>>> reindent the test shell code for the new nested loop.
>>
>> In my opinion, it should no longer belong to quick.  It takes 8 s on my
>> tmpfs.  My border is somewhere around 2 or 3; and I haven't yet decided
>> whether that's on tmpfs or SSD.
> 
> I took 4 iterations pre-patch, to 8 iterations after patch 1, to 32
> iterations with this patch; my observed times went from 1s to 2s to 7s
> on SSD ext4. Yeah, for v2, I'll drop it from quick.

Thanks!

>>> @@ -4574,7 +4592,14 @@ static int img_dd(int argc, char **argv)
>>>           size = dd.count * in.bsz;
>>>       }
>>>
>>> -    qemu_opt_set_number(opts, BLOCK_OPT_SIZE, size, &error_abort);
>>> +    if (dd.flags & C_SEEK && out.offset * out.bsz > INT64_MAX - size) {
>>
>> What about overflows in out.offset * out.bsz?
> 
> I've had enough of my eyes bleeding on all the code repeatedly scaling
> things. For v2, I'm strongly considering a cleanup patch that reads all
> input, then scales all values into bytes, and THEN performs any
> additional math in a single unit, just so the additions become easier to
> reason about.

Haha.  I won't object.

>>> +        error_report("Seek too large for '%s'", out.filename);
>>> +        ret = -1;
>>> +        goto out;
>>
>> Real dd doesn't seem to error out (it just reports an error).  I don't
>> know whether that makes any difference, though.
> 
> But where does the data get written if you can't actually seek that far
> into the file?

Well, the stats printed say it doesn't write anything.  So that's why I
don't know whether it makes any difference.

>>
>> The test looks good to me.
> 
> Other than my creative indentation levels ;)

I like them.

I mean, usually I just don't indent anything when adding to a test case.
 I do it like this:


Original code:

...
qemu-img (something) $TEST_IMG
...


Post-my-patch:

for opt in x y; do
...
qemu-img (something) $opt $TEST_IMG
...
done


And I know I'm not the only one.  So, yeah, I liked your more creative
solution.

Max
Eric Blake Aug. 16, 2018, 2:57 a.m. UTC | #5
On 08/15/2018 09:49 PM, Max Reitz wrote:

>>> In my opinion, we do not want feature parity with dd.  What we do want
>>> is feature parity with convert.
>>
>> Well, convert is lacking a way to specify a subset of one file to move
>> to a (possibly different) subset of the other.  I'm fine if we want to
>> enhance convert to do the things that right now require a dd-alike
>> interface (namely, limiting the copying to less than the full file, and
>> choosing the offset at which to start [before this patch] or write to
>> [with this patch]).
> 
> Yes, I would want that.
> 
>> If convert were more powerful, I'd be fine dropping 'qemu-img dd' after
>> a proper deprecation period.
> 
> Technically it has those features already, with the raw block driver's
> offset and size parameters.

Perhaps so, but it will be an interesting exercise in rewriting the 
shorthand nbd://host:port/export into the proper longhand driver syntax.


>>
>> Because of performance: qemu-nbd + Linux nbd device + real dd is one
>> more layer of data copying (each write() from dd goes to kernel, then is
>> sent to qemu-nbd in userspace as a socket message before being sent back
>> to the kernel to actually write() to the final destination) compared to
>> just doing it all in one process (write() lands in the final destination
>> with no further user space bouncing).  And because the additional steps
>> to set it up are awkward (see my other email where I rant about losing
>> the better part of today to realizing that 'dd ...; qemu-nbd -d
>> /dev/nbd1' loses data if you omit conv=fdatasync).
> 
> I can see the sync problems, but is the performance really that much worse?

When you don't have sparse file support, reading or writing large blocks 
of zeroes really is worse over /dev/nbd* than over a server/client pair 
that know how to do it efficiently.  But for non-sparse data, I don't 
know if a benchmark would be able to consistently note a difference 
(might be a fun benchmark for someone to try, but not high on my current 
to-do list).
Max Reitz Aug. 16, 2018, 3 a.m. UTC | #6
On 2018-08-16 04:57, Eric Blake wrote:
> On 08/15/2018 09:49 PM, Max Reitz wrote:
> 
>>>> In my opinion, we do not want feature parity with dd.  What we do want
>>>> is feature parity with convert.
>>>
>>> Well, convert is lacking a way to specify a subset of one file to move
>>> to a (possibly different) subset of the other.  I'm fine if we want to
>>> enhance convert to do the things that right now require a dd-alike
>>> interface (namely, limiting the copying to less than the full file, and
>>> choosing the offset at which to start [before this patch] or write to
>>> [with this patch]).
>>
>> Yes, I would want that.
>>
>>> If convert were more powerful, I'd be fine dropping 'qemu-img dd' after
>>> a proper deprecation period.
>>
>> Technically it has those features already, with the raw block driver's
>> offset and size parameters.
> 
> Perhaps so, but it will be an interesting exercise in rewriting the
> shorthand nbd://host:port/export into the proper longhand driver syntax.

Don't dare me! :-)

>>> Because of performance: qemu-nbd + Linux nbd device + real dd is one
>>> more layer of data copying (each write() from dd goes to kernel, then is
>>> sent to qemu-nbd in userspace as a socket message before being sent back
>>> to the kernel to actually write() to the final destination) compared to
>>> just doing it all in one process (write() lands in the final destination
>>> with no further user space bouncing).  And because the additional steps
>>> to set it up are awkward (see my other email where I rant about losing
>>> the better part of today to realizing that 'dd ...; qemu-nbd -d
>>> /dev/nbd1' loses data if you omit conv=fdatasync).
>>
>> I can see the sync problems, but is the performance really that much
>> worse?
> 
> When you don't have sparse file support, reading or writing large blocks
> of zeroes really is worse over /dev/nbd* than over a server/client pair
> that know how to do it efficiently.  But for non-sparse data, I don't
> know if a benchmark would be able to consistently note a difference
> (might be a fun benchmark for someone to try, but not high on my current
> to-do list).

Hm.  Yeah.  Well, for me, it remains that it would be better to have a
good way of exposing image contents to all of the rest of the system and
all of the nice tools there already are instead of re-implementing them
in qemu.

Max
Kevin Wolf Aug. 16, 2018, 7:15 a.m. UTC | #7
Am 16.08.2018 um 04:49 hat Max Reitz geschrieben:
> On 2018-08-16 04:39, Eric Blake wrote:
> > If convert were more powerful, I'd be fine dropping 'qemu-img dd' after
> > a proper deprecation period.
> 
> Technically it has those features already, with the raw block driver's
> offset and size parameters.

In a way, yes. It requires you to precreate the target image, though,
because --target-image-opts requires -n.

We'll probably want to fix something in this area anyway because in the
common case, you already need those options to even precreate the image,
and qemu-img create doesn't support open options for the protocol layer
either (unlike blockdev-create).

> >> ((That gave me a good idea.  Actually, it's probably not such a good
> >> idea, but I guess I'll do it in my spare time anyway.  A qemu-img fuse
> >> might be nice which represents an image as a raw image at some mount
> >> point.  Benefits over qemu-nbd: (1) You don't need root, (2) you don't
> >> need to type modprobe nbd.))
> > 
> > So the kernel->userspace translation would be happening via the FUSE
> > interface instead of the NBD interface.  Data still bounces around just
> > as much, but it might be a fun project.  Does fuse behave well when
> > serving exactly one file at the mountpoint, rather than the more typical
> > file system rooted in a directory?  NBD at least has the benefit of
> > claiming to be a block device all along, rather than complicating the
> > user interface with POSIX file system rules (which you'll be bending,
> > because you are serving exactly one file instead of a system).

To make it more real, you could expose a snapshots/ subdirectory. Or
maybe better not.

> Well, but I can just pretend my FUSE file is a block device, no?
> 
> Also, I just discovered something really interesting: FUSE allows you to
> specify a single file as a mountpoint.
> 
> And you know what?  You can open the original file before you replace it
> by the FUSE "filesystem".
> 
> So my fun interface is going to looks like this:
> 
> $ qemu-img fuse foo.qcow2
> 
> And then your foo.qcow2 is a raw image until the next "fusermount -u
> foo.qcow2"!  Isn't that fun?

Yes, sounds fun. Until you realise that I'd actually like to merge
something like this when it exists. ;-)

For a more serious implementation, maybe it should even be a separate
qemu-fuse rather than a qemu-img subcommand.

Kevin
Max Reitz Aug. 17, 2018, 7:22 p.m. UTC | #8
On 2018-08-16 09:15, Kevin Wolf wrote:
> Am 16.08.2018 um 04:49 hat Max Reitz geschrieben:
>> On 2018-08-16 04:39, Eric Blake wrote:
>>> If convert were more powerful, I'd be fine dropping 'qemu-img dd' after
>>> a proper deprecation period.
>>
>> Technically it has those features already, with the raw block driver's
>> offset and size parameters.
> 
> In a way, yes. It requires you to precreate the target image, though,
> because --target-image-opts requires -n.
> 
> We'll probably want to fix something in this area anyway because in the
> common case, you already need those options to even precreate the image,
> and qemu-img create doesn't support open options for the protocol layer
> either (unlike blockdev-create).
> 
>>>> ((That gave me a good idea.  Actually, it's probably not such a good
>>>> idea, but I guess I'll do it in my spare time anyway.  A qemu-img fuse
>>>> might be nice which represents an image as a raw image at some mount
>>>> point.  Benefits over qemu-nbd: (1) You don't need root, (2) you don't
>>>> need to type modprobe nbd.))
>>>
>>> So the kernel->userspace translation would be happening via the FUSE
>>> interface instead of the NBD interface.  Data still bounces around just
>>> as much, but it might be a fun project.  Does fuse behave well when
>>> serving exactly one file at the mountpoint, rather than the more typical
>>> file system rooted in a directory?  NBD at least has the benefit of
>>> claiming to be a block device all along, rather than complicating the
>>> user interface with POSIX file system rules (which you'll be bending,
>>> because you are serving exactly one file instead of a system).
> 
> To make it more real, you could expose a snapshots/ subdirectory. Or
> maybe better not.
> 
>> Well, but I can just pretend my FUSE file is a block device, no?
>>
>> Also, I just discovered something really interesting: FUSE allows you to
>> specify a single file as a mountpoint.
>>
>> And you know what?  You can open the original file before you replace it
>> by the FUSE "filesystem".
>>
>> So my fun interface is going to looks like this:
>>
>> $ qemu-img fuse foo.qcow2
>>
>> And then your foo.qcow2 is a raw image until the next "fusermount -u
>> foo.qcow2"!  Isn't that fun?
> 
> Yes, sounds fun. Until you realise that I'd actually like to merge
> something like this when it exists. ;-)

Well, for reference:

https://git.xanclic.moe/XanClic/qemu/commits/branch/qemu-img-fuse

(The main issue now is that you don't see FUSE error messages,
because...  Well.  FUSE likes to fork the daemon process itself, but for
some reason that doesn't really work with qemu.  I don't really want to
investigate that, instead I rather like to declare it broken and launch
FUSE in foreground mode, and then do the forking ourselves (like
qemu-nbd does).  But then we lose all error output from FUSE, because we
have to close stderr before we get to the main loop...

(I suppose I can just look into what fuse_main() does exactly and
replicate that.  The doc says using fuse_main() is just lazy anyway. O:-P))

> For a more serious implementation, maybe it should even be a separate
> qemu-fuse rather than a qemu-img subcommand.

True.  It doesn't really fit well into qemu-img.

(To me, qemu-img is about exposing block layer features without the user
having to use qemu proper.  Exporting an image over FUSE really is not a
block layer feature.)

Max
Fam Zheng Aug. 20, 2018, 2:07 a.m. UTC | #9
On Thu, 08/16 04:20, Max Reitz wrote:
> No, the real issue is that dd is still not implemented just as a
> frontend to convert.  Which it should be.  I'm not sure dd was a very
> good idea from the start, and now it should ideally be a frontend to
> convert.
> 
> (My full opinion on the matter: dd has a horrible interface.  I don't
> quite see why we replicated that inside qemu-img.  Also, if you want to
> use dd, why not use qemu-nbd + Linux nbd device + real dd?)

The intention is that dd is a familiar interface and allows for operating on
portions of images. It is much more convenient than "qemu-nbd + Linux nbd + dd"
and a bit more convenient than "booting a Linux VM, attaching the image as a
virtual disk, then use dd in the guest". More so when writing tests.

Fam
Max Reitz Aug. 20, 2018, 12:20 p.m. UTC | #10
On 2018-08-20 04:07, Fam Zheng wrote:
> On Thu, 08/16 04:20, Max Reitz wrote:
>> No, the real issue is that dd is still not implemented just as a
>> frontend to convert.  Which it should be.  I'm not sure dd was a very
>> good idea from the start, and now it should ideally be a frontend to
>> convert.
>>
>> (My full opinion on the matter: dd has a horrible interface.  I don't
>> quite see why we replicated that inside qemu-img.  Also, if you want to
>> use dd, why not use qemu-nbd + Linux nbd device + real dd?)
> 
> The intention is that dd is a familiar interface and allows for operating on
> portions of images. It is much more convenient than "qemu-nbd + Linux nbd + dd"
> and a bit more convenient than "booting a Linux VM, attaching the image as a
> virtual disk, then use dd in the guest". More so when writing tests.

This is my fault, but frankly, since I always get seek and skip mixed
up, whenever I use dd for anything but copying a whole image, I have to
look into the man page anyway, so the advantage over modprobe nbd &&
qemu-nbd -c /dev/nbd0 is gone.

And in my opinion the fact that it is a familiar interface doesn't make
it less of a horrible interface.

My main issue is that it's built right into qemu-img where it shouldn't
be.  We have convert, you can now even access portions of images with
the raw driver, so it really should have been just a script on top.

Max
diff mbox series

Patch

diff --git a/qemu-img.c b/qemu-img.c
index d72f0f0ec94..ee01a18f331 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -195,7 +195,8 @@  static void QEMU_NORETURN help(void)
            "  'count=N' copy only N input blocks\n"
            "  'if=FILE' read from FILE\n"
            "  'of=FILE' write to FILE\n"
-           "  'skip=N' skip N bs-sized blocks at the start of input\n";
+           "  'skip=N' skip N bs-sized blocks at the start of input\n"
+           "  'seek=N' skip N bs-sized blocks at the start of output\n";

     printf("%s\nSupported formats:", help_msg);
     bdrv_iterate_format(format_print, NULL);
@@ -4296,11 +4297,12 @@  out:
     return 0;
 }

-#define C_BS      01
-#define C_COUNT   02
-#define C_IF      04
-#define C_OF      010
-#define C_SKIP    020
+#define C_BS      0x1
+#define C_COUNT   0x2
+#define C_IF      0x4
+#define C_OF      0x8
+#define C_SKIP    0x10
+#define C_SEEK    0x20

 struct DdInfo {
     unsigned int flags;
@@ -4383,6 +4385,20 @@  static int img_dd_skip(const char *arg,
     return 0;
 }

+static int img_dd_seek(const char *arg,
+                       struct DdIo *in, struct DdIo *out,
+                       struct DdInfo *dd)
+{
+    out->offset = cvtnum(arg);
+
+    if (out->offset < 0) {
+        error_report("invalid number: '%s'", arg);
+        return 1;
+    }
+
+    return 0;
+}
+
 static int img_dd(int argc, char **argv)
 {
     int ret = 0;
@@ -4399,6 +4415,7 @@  static int img_dd(int argc, char **argv)
     const char *fmt = NULL;
     int64_t size = 0;
     int64_t block_count = 0, out_pos, in_pos, end;
+    int64_t seek = 0;
     bool force_share = false;
     struct DdInfo dd = {
         .flags = 0,
@@ -4423,6 +4440,7 @@  static int img_dd(int argc, char **argv)
         { "if", img_dd_if, C_IF },
         { "of", img_dd_of, C_OF },
         { "skip", img_dd_skip, C_SKIP },
+        { "seek", img_dd_seek, C_SEEK },
         { NULL, NULL, 0 }
     };
     const struct option long_options[] = {
@@ -4574,7 +4592,14 @@  static int img_dd(int argc, char **argv)
         size = dd.count * in.bsz;
     }

-    qemu_opt_set_number(opts, BLOCK_OPT_SIZE, size, &error_abort);
+    if (dd.flags & C_SEEK && out.offset * out.bsz > INT64_MAX - size) {
+        error_report("Seek too large for '%s'", out.filename);
+        ret = -1;
+        goto out;
+    }
+    seek = out.offset * out.bsz;
+
+    qemu_opt_set_number(opts, BLOCK_OPT_SIZE, size + seek, &error_abort);
     end = size + in_pos;

     ret = bdrv_create(drv, out.filename, opts, &local_err);
@@ -4617,7 +4642,7 @@  static int img_dd(int argc, char **argv)
         }
         in_pos += in_ret;

-        out_ret = blk_pwrite(blk2, out_pos, in.buf, in_ret, 0);
+        out_ret = blk_pwrite(blk2, out_pos + seek, in.buf, in_ret, 0);

         if (out_ret < 0) {
             error_report("error while writing to output image file: %s",
diff --git a/tests/qemu-iotests/160 b/tests/qemu-iotests/160
index 48380a3aafc..0096911e75e 100755
--- a/tests/qemu-iotests/160
+++ b/tests/qemu-iotests/160
@@ -1,6 +1,6 @@ 
 #! /bin/bash
 #
-# qemu-img dd test for the skip option
+# qemu-img dd test for the skip/seek option
 #
 # Copyright (C) 2016 Reda Sallahi
 #
@@ -41,10 +41,11 @@  _supported_fmt raw
 _supported_proto file
 _supported_os Linux

-TEST_SKIP_BLOCKS="1 2 30 30K"
+TEST_SKIP_BLOCKS="0 2 30 30K"

 for skip in $TEST_SKIP_BLOCKS; do
   for count in '' 'count=1 '; do
+   for seek in $TEST_SKIP_BLOCKS; do
     echo
     echo "== Creating image =="

@@ -54,18 +55,19 @@  for skip in $TEST_SKIP_BLOCKS; do
     $QEMU_IO -c "write -P 0xa 24 512k" "$TEST_IMG" | _filter_qemu_io

     echo
-    echo "== Converting the image with dd with ${count}skip=$skip =="
+    echo "== Converting the image with dd with ${count}skip=$skip seek=$seek =="

-    $QEMU_IMG dd if="$TEST_IMG" of="$TEST_IMG.out" $count skip="$skip" -O "$IMGFMT" \
+    $QEMU_IMG dd if="$TEST_IMG" of="$TEST_IMG.out" $count skip="$skip" seek="$seek" -O "$IMGFMT" \
         2> /dev/null
     TEST_IMG="$TEST_IMG.out" _check_test_img
-    dd if="$TEST_IMG" of="$TEST_IMG.out.dd" $count skip="$skip" status=none
+    dd if="$TEST_IMG" of="$TEST_IMG.out.dd" $count skip="$skip" seek="$seek" status=none

     echo
     echo "== Compare the images with qemu-img compare =="

     $QEMU_IMG compare "$TEST_IMG.out.dd" "$TEST_IMG.out"
     rm "$TEST_IMG.out.dd"
+   done
   done
 done

diff --git a/tests/qemu-iotests/160.out b/tests/qemu-iotests/160.out
index 6147a8493d6..b93ecad05cf 100644
--- a/tests/qemu-iotests/160.out
+++ b/tests/qemu-iotests/160.out
@@ -6,7 +6,7 @@  No errors were found on the image.
 wrote 524288/524288 bytes at offset 24
 512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)

-== Converting the image with dd with skip=1 ==
+== Converting the image with dd with skip=0 seek=0 ==
 No errors were found on the image.

 == Compare the images with qemu-img compare ==
@@ -18,7 +18,7 @@  No errors were found on the image.
 wrote 524288/524288 bytes at offset 24
 512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)

-== Converting the image with dd with count=1 skip=1 ==
+== Converting the image with dd with skip=0 seek=2 ==
 No errors were found on the image.

 == Compare the images with qemu-img compare ==
@@ -30,7 +30,7 @@  No errors were found on the image.
 wrote 524288/524288 bytes at offset 24
 512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)

-== Converting the image with dd with skip=2 ==
+== Converting the image with dd with skip=0 seek=30 ==
 No errors were found on the image.

 == Compare the images with qemu-img compare ==
@@ -42,7 +42,7 @@  No errors were found on the image.
 wrote 524288/524288 bytes at offset 24
 512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)

-== Converting the image with dd with count=1 skip=2 ==
+== Converting the image with dd with skip=0 seek=30K ==
 No errors were found on the image.

 == Compare the images with qemu-img compare ==
@@ -54,7 +54,7 @@  No errors were found on the image.
 wrote 524288/524288 bytes at offset 24
 512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)

-== Converting the image with dd with skip=30 ==
+== Converting the image with dd with count=1 skip=0 seek=0 ==
 No errors were found on the image.

 == Compare the images with qemu-img compare ==
@@ -66,7 +66,7 @@  No errors were found on the image.
 wrote 524288/524288 bytes at offset 24
 512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)

-== Converting the image with dd with count=1 skip=30 ==
+== Converting the image with dd with count=1 skip=0 seek=2 ==
 No errors were found on the image.

 == Compare the images with qemu-img compare ==
@@ -78,7 +78,7 @@  No errors were found on the image.
 wrote 524288/524288 bytes at offset 24
 512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)

-== Converting the image with dd with skip=30K ==
+== Converting the image with dd with count=1 skip=0 seek=30 ==
 No errors were found on the image.

 == Compare the images with qemu-img compare ==
@@ -90,7 +90,295 @@  No errors were found on the image.
 wrote 524288/524288 bytes at offset 24
 512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)

-== Converting the image with dd with count=1 skip=30K ==
+== Converting the image with dd with count=1 skip=0 seek=30K ==
+No errors were found on the image.
+
+== Compare the images with qemu-img compare ==
+Images are identical.
+
+== Creating image ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+No errors were found on the image.
+wrote 524288/524288 bytes at offset 24
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== Converting the image with dd with skip=2 seek=0 ==
+No errors were found on the image.
+
+== Compare the images with qemu-img compare ==
+Images are identical.
+
+== Creating image ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+No errors were found on the image.
+wrote 524288/524288 bytes at offset 24
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== Converting the image with dd with skip=2 seek=2 ==
+No errors were found on the image.
+
+== Compare the images with qemu-img compare ==
+Images are identical.
+
+== Creating image ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+No errors were found on the image.
+wrote 524288/524288 bytes at offset 24
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== Converting the image with dd with skip=2 seek=30 ==
+No errors were found on the image.
+
+== Compare the images with qemu-img compare ==
+Images are identical.
+
+== Creating image ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+No errors were found on the image.
+wrote 524288/524288 bytes at offset 24
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== Converting the image with dd with skip=2 seek=30K ==
+No errors were found on the image.
+
+== Compare the images with qemu-img compare ==
+Images are identical.
+
+== Creating image ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+No errors were found on the image.
+wrote 524288/524288 bytes at offset 24
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== Converting the image with dd with count=1 skip=2 seek=0 ==
+No errors were found on the image.
+
+== Compare the images with qemu-img compare ==
+Images are identical.
+
+== Creating image ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+No errors were found on the image.
+wrote 524288/524288 bytes at offset 24
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== Converting the image with dd with count=1 skip=2 seek=2 ==
+No errors were found on the image.
+
+== Compare the images with qemu-img compare ==
+Images are identical.
+
+== Creating image ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+No errors were found on the image.
+wrote 524288/524288 bytes at offset 24
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== Converting the image with dd with count=1 skip=2 seek=30 ==
+No errors were found on the image.
+
+== Compare the images with qemu-img compare ==
+Images are identical.
+
+== Creating image ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+No errors were found on the image.
+wrote 524288/524288 bytes at offset 24
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== Converting the image with dd with count=1 skip=2 seek=30K ==
+No errors were found on the image.
+
+== Compare the images with qemu-img compare ==
+Images are identical.
+
+== Creating image ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+No errors were found on the image.
+wrote 524288/524288 bytes at offset 24
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== Converting the image with dd with skip=30 seek=0 ==
+No errors were found on the image.
+
+== Compare the images with qemu-img compare ==
+Images are identical.
+
+== Creating image ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+No errors were found on the image.
+wrote 524288/524288 bytes at offset 24
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== Converting the image with dd with skip=30 seek=2 ==
+No errors were found on the image.
+
+== Compare the images with qemu-img compare ==
+Images are identical.
+
+== Creating image ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+No errors were found on the image.
+wrote 524288/524288 bytes at offset 24
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== Converting the image with dd with skip=30 seek=30 ==
+No errors were found on the image.
+
+== Compare the images with qemu-img compare ==
+Images are identical.
+
+== Creating image ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+No errors were found on the image.
+wrote 524288/524288 bytes at offset 24
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== Converting the image with dd with skip=30 seek=30K ==
+No errors were found on the image.
+
+== Compare the images with qemu-img compare ==
+Images are identical.
+
+== Creating image ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+No errors were found on the image.
+wrote 524288/524288 bytes at offset 24
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== Converting the image with dd with count=1 skip=30 seek=0 ==
+No errors were found on the image.
+
+== Compare the images with qemu-img compare ==
+Images are identical.
+
+== Creating image ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+No errors were found on the image.
+wrote 524288/524288 bytes at offset 24
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== Converting the image with dd with count=1 skip=30 seek=2 ==
+No errors were found on the image.
+
+== Compare the images with qemu-img compare ==
+Images are identical.
+
+== Creating image ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+No errors were found on the image.
+wrote 524288/524288 bytes at offset 24
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== Converting the image with dd with count=1 skip=30 seek=30 ==
+No errors were found on the image.
+
+== Compare the images with qemu-img compare ==
+Images are identical.
+
+== Creating image ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+No errors were found on the image.
+wrote 524288/524288 bytes at offset 24
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== Converting the image with dd with count=1 skip=30 seek=30K ==
+No errors were found on the image.
+
+== Compare the images with qemu-img compare ==
+Images are identical.
+
+== Creating image ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+No errors were found on the image.
+wrote 524288/524288 bytes at offset 24
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== Converting the image with dd with skip=30K seek=0 ==
+No errors were found on the image.
+
+== Compare the images with qemu-img compare ==
+Images are identical.
+
+== Creating image ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+No errors were found on the image.
+wrote 524288/524288 bytes at offset 24
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== Converting the image with dd with skip=30K seek=2 ==
+No errors were found on the image.
+
+== Compare the images with qemu-img compare ==
+Images are identical.
+
+== Creating image ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+No errors were found on the image.
+wrote 524288/524288 bytes at offset 24
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== Converting the image with dd with skip=30K seek=30 ==
+No errors were found on the image.
+
+== Compare the images with qemu-img compare ==
+Images are identical.
+
+== Creating image ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+No errors were found on the image.
+wrote 524288/524288 bytes at offset 24
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== Converting the image with dd with skip=30K seek=30K ==
+No errors were found on the image.
+
+== Compare the images with qemu-img compare ==
+Images are identical.
+
+== Creating image ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+No errors were found on the image.
+wrote 524288/524288 bytes at offset 24
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== Converting the image with dd with count=1 skip=30K seek=0 ==
+No errors were found on the image.
+
+== Compare the images with qemu-img compare ==
+Images are identical.
+
+== Creating image ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+No errors were found on the image.
+wrote 524288/524288 bytes at offset 24
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== Converting the image with dd with count=1 skip=30K seek=2 ==
+No errors were found on the image.
+
+== Compare the images with qemu-img compare ==
+Images are identical.
+
+== Creating image ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+No errors were found on the image.
+wrote 524288/524288 bytes at offset 24
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== Converting the image with dd with count=1 skip=30K seek=30 ==
+No errors were found on the image.
+
+== Compare the images with qemu-img compare ==
+Images are identical.
+
+== Creating image ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+No errors were found on the image.
+wrote 524288/524288 bytes at offset 24
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== Converting the image with dd with count=1 skip=30K seek=30K ==
 No errors were found on the image.

 == Compare the images with qemu-img compare ==