diff mbox series

qemu-img: Enable BDRV_REQ_MAY_UNMAP in convert

Message ID 20190323225948.19737-1-nirsof@gmail.com
State New
Headers show
Series qemu-img: Enable BDRV_REQ_MAY_UNMAP in convert | expand

Commit Message

Nir Soffer March 23, 2019, 10:59 p.m. UTC
With Kevin's "block: Fix slow pre-zeroing in qemu-img convert"[1]
we skip the pre zero step called like this:

    blk_make_zero(s->target, BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK)

And we write zeroes later using:

    blk_co_pwrite_zeroes(s->target,
                         sector_num << BDRV_SECTOR_BITS,
                         n << BDRV_SECTOR_BITS, 0);

Since we use flags=0, this is translated to NBD_CMD_WRITE_ZEROES with
NBD_CMD_FLAG_NO_HOLE flag, which cause the NBD server to allocated space
instead of punching a hole.

Here is an example failure:

$ dd if=/dev/urandom of=src.img bs=1M count=5
$ truncate -s 50m src.img
$ truncate -s 50m dst.img
$ nbdkit -f -v -e '' -U nbd.sock file file=dst.img

$ ./qemu-img convert -n src.img nbd:unix:nbd.sock

We can see in nbdkit log that it received the NBD_CMD_FLAG_NO_HOLE
(may_trim=0):

nbdkit: file[1]: debug: newstyle negotiation: flags: export 0x4d
nbdkit: file[1]: debug: pwrite count=2097152 offset=0
nbdkit: file[1]: debug: pwrite count=2097152 offset=2097152
nbdkit: file[1]: debug: pwrite count=1048576 offset=4194304
nbdkit: file[1]: debug: zero count=33554432 offset=5242880 may_trim=0
nbdkit: file[1]: debug: zero count=13631488 offset=38797312 may_trim=0
nbdkit: file[1]: debug: flush

And the image became fully allocated:

$ qemu-img info dst.img
virtual size: 50M (52428800 bytes)
disk size: 50M

With this change we see that nbdkit did not receive the
NBD_CMD_FLAG_NO_HOLE (may_trim=1):

nbdkit: file[1]: debug: newstyle negotiation: flags: export 0x4d
nbdkit: file[1]: debug: pwrite count=2097152 offset=0
nbdkit: file[1]: debug: pwrite count=2097152 offset=2097152
nbdkit: file[1]: debug: pwrite count=1048576 offset=4194304
nbdkit: file[1]: debug: zero count=33554432 offset=5242880 may_trim=1
nbdkit: file[1]: debug: zero count=13631488 offset=38797312 may_trim=1
nbdkit: file[1]: debug: flush

And the file is sparse as expected:

$ qemu-img info dst.img
virtual size: 50M (52428800 bytes)
disk size: 5.0M

Tested on top of Kevin patches:
http://lists.nongnu.org/archive/html/qemu-block/2019-03/msg00755.html

I'm not sure this change is correct for all cases, posting for
discussion.

[1] http://lists.nongnu.org/archive/html/qemu-block/2019-03/msg00761.html

Signed-off-by: Nir Soffer <nirsof@gmail.com>
---
 qemu-img.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Eric Blake March 25, 2019, 2:30 p.m. UTC | #1
On 3/23/19 5:59 PM, Nir Soffer wrote:
> With Kevin's "block: Fix slow pre-zeroing in qemu-img convert"[1]
> we skip the pre zero step called like this:
> 
>     blk_make_zero(s->target, BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK)
> 
> And we write zeroes later using:
> 
>     blk_co_pwrite_zeroes(s->target,
>                          sector_num << BDRV_SECTOR_BITS,
>                          n << BDRV_SECTOR_BITS, 0);
> 
> Since we use flags=0, this is translated to NBD_CMD_WRITE_ZEROES with
> NBD_CMD_FLAG_NO_HOLE flag, which cause the NBD server to allocated space
> instead of punching a hole.
> 

> 
> $ qemu-img info dst.img
> virtual size: 50M (52428800 bytes)
> disk size: 5.0M

Up to here makes sense for the commit message...

> 
> Tested on top of Kevin patches:
> http://lists.nongnu.org/archive/html/qemu-block/2019-03/msg00755.html
> 
> I'm not sure this change is correct for all cases, posting for
> discussion.

...this part probably belongs...

> 
> [1] http://lists.nongnu.org/archive/html/qemu-block/2019-03/msg00761.html
> 
> Signed-off-by: Nir Soffer <nirsof@gmail.com>
> ---


...here, after the ---, as it is useful for reviewers but not needed for
the permanent git log.

>  qemu-img.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/qemu-img.c b/qemu-img.c
> index 8ee63daeae..ca9deb3758 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c
> @@ -1752,11 +1752,12 @@ static int coroutine_fn convert_co_write(ImgConvertState *s, int64_t sector_num,
>                  assert(!s->target_has_backing);
>                  break;
>              }
>              ret = blk_co_pwrite_zeroes(s->target,
>                                         sector_num << BDRV_SECTOR_BITS,
> -                                       n << BDRV_SECTOR_BITS, 0);
> +                                       n << BDRV_SECTOR_BITS,
> +                                       BDRV_REQ_MAY_UNMAP);

The idea makes sense to me. Maybe Kevin will come up with a
counterargument why we don't want convert to allow holes by default, but
unless we have a strong argument against it, it looks like a performance
improvement worth having in 4.0.

Reviewed-by: Eric Blake <eblake@redhat.com>

>              if (ret < 0) {
>                  return ret;
>              }
>              break;
>          }
>
diff mbox series

Patch

diff --git a/qemu-img.c b/qemu-img.c
index 8ee63daeae..ca9deb3758 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1752,11 +1752,12 @@  static int coroutine_fn convert_co_write(ImgConvertState *s, int64_t sector_num,
                 assert(!s->target_has_backing);
                 break;
             }
             ret = blk_co_pwrite_zeroes(s->target,
                                        sector_num << BDRV_SECTOR_BITS,
-                                       n << BDRV_SECTOR_BITS, 0);
+                                       n << BDRV_SECTOR_BITS,
+                                       BDRV_REQ_MAY_UNMAP);
             if (ret < 0) {
                 return ret;
             }
             break;
         }