Patchwork [v2] qemu-img: Implement 'diff' operation.

login
register
mail settings
Submitter Richard W.M. Jones
Date May 17, 2012, 2:57 p.m.
Message ID <1337266628-3588-1-git-send-email-rjones@redhat.com>
Download mbox | patch
Permalink /patch/159942/
State New
Headers show

Comments

Richard W.M. Jones - May 17, 2012, 2:57 p.m.
From: "Richard W.M. Jones" <rjones@redhat.com>

This produces a qcow2 file which is the difference between
two disk images.  ie, if:

  base.img     - is a disk image (in any format)
  modified.img - is base.img, copied and modified

then:

  qemu-img diff -b base.img modified.img diff.qcow2

creates 'diff.qcow2' which contains the differences between 'base.img'
and 'modified.img'.  Note that 'diff.qcow2' has 'base.img' as its
backing file.

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Cc: Matthew Booth <mbooth@redhat.com>
Cc: Pablo Iranzo Gómez <Pablo.Iranzo@redhat.com>
Cc: Tomas Von Veschler <tvvcox@redhat.com>
---
 qemu-img-cmds.hx |    6 ++
 qemu-img.c       |  162 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 qemu-img.texi    |   22 ++++++++
 3 files changed, 190 insertions(+)
Eric Blake - May 17, 2012, 3:15 p.m.
On 05/17/2012 08:57 AM, Richard W.M. Jones wrote:
> From: "Richard W.M. Jones" <rjones@redhat.com>
> 
> This produces a qcow2 file which is the difference between
> two disk images.  ie, if:
> 
>   base.img     - is a disk image (in any format)
>   modified.img - is base.img, copied and modified
> 
> then:
> 
>   qemu-img diff -b base.img modified.img diff.qcow2
> 
> creates 'diff.qcow2' which contains the differences between 'base.img'
> and 'modified.img'.  Note that 'diff.qcow2' has 'base.img' as its
> backing file.
> 

> +++ b/qemu-img-cmds.hx
> @@ -33,6 +33,12 @@ STEXI
>  @item convert [-c] [-p] [-f @var{fmt}] [-t @var{cache}] [-O @var{output_fmt}] [-o @var{options}] [-s @var{snapshot_name}] [-S @var{sparse_size}] @var{filename} [@var{filename2} [...]] @var{output_filename}
>  ETEXI
>  
> +DEF("diff", img_diff,
> +    "diff [-F backing_fmt] -b backing_file [-f fmt] [-O output_fmt] [-o options] filename output_filename")
> +STEXI
> +@item diff [-F @var{backing_fmt}] -b @var{backing_file} [-f @var{fmt}] [-O @var{output_fmt}] [-o @var{options}] @var{filename} @var{output_filename}

Why the difference in ordering between -o and -O?


> +
> +    if (argc - optind != 2) {
> +        error_report("The input and output filenames must be supplied");
> +        return 1;

Error is misleading if argc > optind+2.


> +
> +    if (fmt_out == NULL || fmt_out[0] == '\0') {
> +        fmt_out = "qcow2";
> +    }

So output defaults to qcow2, and input (both base and modified) default
to probing.  Works for me.

> +
> +    bdrv_get_geometry(bs_base, &num_sectors);
> +    bdrv_get_geometry(bs_modified, &modified_num_sectors);
> +    if (num_sectors != modified_num_sectors) {
> +        error_report("Number of sectors in backing and source must be the same");
> +        goto out2;
> +    }

I can live with an initial implementation that is strict, where a later
patch relaxes things to allow a modified image that has been enlarged.

> +
> +    /* Output image. */
> +    ret = bdrv_img_create(out, fmt_out,
> +                          /* base file becomes the new backing file */
> +                          base, fmt_base,
> +                          options,
> +                          num_sectors * BDRV_SECTOR_SIZE, BDRV_O_FLAGS);

I still think this should be modified_num_sectors - for now, the two
values are equal, but if we relax the error check up above, then you
really do want to go with the output file the same size as the modified
file.

It looks like you did indeed address most of my comments on v1.
Richard W.M. Jones - May 17, 2012, 3:27 p.m.
On Thu, May 17, 2012 at 09:15:08AM -0600, Eric Blake wrote:
> On 05/17/2012 08:57 AM, Richard W.M. Jones wrote:
> > +DEF("diff", img_diff,
> > +    "diff [-F backing_fmt] -b backing_file [-f fmt] [-O output_fmt] [-o options] filename output_filename")
> > +STEXI
> > +@item diff [-F @var{backing_fmt}] -b @var{backing_file} [-f @var{fmt}] [-O @var{output_fmt}] [-o @var{options}] @var{filename} @var{output_filename}
> 
> Why the difference in ordering between -o and -O?

I though it looked more logical, but I can swap them around too if you
think it's better.

> > +
> > +    if (argc - optind != 2) {
> > +        error_report("The input and output filenames must be supplied");
> > +        return 1;
> 
> Error is misleading if argc > optind+2.

OK will fix.

I'll rework the num_sectors thing in v3.

Rich.

Patch

diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index 49dce7c..00eef96 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -33,6 +33,12 @@  STEXI
 @item convert [-c] [-p] [-f @var{fmt}] [-t @var{cache}] [-O @var{output_fmt}] [-o @var{options}] [-s @var{snapshot_name}] [-S @var{sparse_size}] @var{filename} [@var{filename2} [...]] @var{output_filename}
 ETEXI
 
+DEF("diff", img_diff,
+    "diff [-F backing_fmt] -b backing_file [-f fmt] [-O output_fmt] [-o options] filename output_filename")
+STEXI
+@item diff [-F @var{backing_fmt}] -b @var{backing_file} [-f @var{fmt}] [-O @var{output_fmt}] [-o @var{options}] @var{filename} @var{output_filename}
+ETEXI
+
 DEF("info", img_info,
     "info [-f fmt] filename")
 STEXI
diff --git a/qemu-img.c b/qemu-img.c
index c8a70ff..ccf7541 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1533,6 +1533,168 @@  out:
     return 0;
 }
 
+static int img_diff(int argc, char **argv)
+{
+    /* qemu-img diff -b base modified out */
+    BlockDriverState *bs_base, *bs_modified, *bs_out;
+    const char *fmt_base, *base,
+        *fmt_modified, *modified,
+        *fmt_out, *out;
+    char *options;
+    int c, ret = 0;
+    uint64_t num_sectors, modified_num_sectors;
+    uint64_t sector;
+    int n;
+    uint8_t *buf_base;
+    uint8_t *buf_modified;
+
+    /* Parse commandline parameters */
+    fmt_base = NULL;
+    fmt_modified = NULL;
+    fmt_out = NULL;
+    base = NULL;
+    options = NULL;
+    for(;;) {
+        c = getopt(argc, argv, "b:hf:F:O:o:");
+        if (c == -1) {
+            break;
+        }
+        switch(c) {
+        case '?':
+        case 'h':
+            help();
+            return 0;
+        case 'f':
+            fmt_modified = optarg;
+            break;
+        case 'F':
+            fmt_base = optarg;
+            break;
+        case 'b':
+            base = optarg;
+            break;
+        case 'O':
+            fmt_out = optarg;
+            break;
+        case 'o':
+            options = optarg;
+            break;
+        }
+    }
+
+    if (base == NULL) {
+        error_report("The -b (backing filename) option must be supplied");
+        return 1;
+    }
+
+    if (argc - optind != 2) {
+        error_report("The input and output filenames must be supplied");
+        return 1;
+    }
+    modified = argv[optind++];
+    out = argv[optind++];
+
+    if (fmt_out == NULL || fmt_out[0] == '\0') {
+        fmt_out = "qcow2";
+    }
+
+    if (options && !strcmp(options, "?")) {
+        ret = print_block_option_help(out, fmt_out);
+        return 1;
+    }
+
+    /* Open the input images. */
+    bs_base = bdrv_new_open(base, fmt_base, BDRV_O_FLAGS);
+    if (!bs_base) {
+        return 1;
+    }
+
+    bs_modified = bdrv_new_open(modified, fmt_modified, BDRV_O_FLAGS);
+    if (!bs_modified) {
+        return 1;
+    }
+
+    bdrv_get_geometry(bs_base, &num_sectors);
+    bdrv_get_geometry(bs_modified, &modified_num_sectors);
+    if (num_sectors != modified_num_sectors) {
+        error_report("Number of sectors in backing and source must be the same");
+        goto out2;
+    }
+
+    /* Output image. */
+    ret = bdrv_img_create(out, fmt_out,
+                          /* base file becomes the new backing file */
+                          base, fmt_base,
+                          options,
+                          num_sectors * BDRV_SECTOR_SIZE, BDRV_O_FLAGS);
+    if (ret != 0) {
+        goto out2;
+    }
+    bs_out = bdrv_new_open(out, fmt_out, BDRV_O_RDWR);
+
+    buf_base = qemu_blockalign(bs_base, IO_BUF_SIZE);
+    buf_modified = qemu_blockalign(bs_modified, IO_BUF_SIZE);
+
+    for (sector = 0; sector < num_sectors; sector += n) {
+        /* How many sectors can we handle with the next read? */
+        if (sector + (IO_BUF_SIZE / BDRV_SECTOR_SIZE) <= num_sectors) {
+            n = IO_BUF_SIZE / BDRV_SECTOR_SIZE;
+        } else {
+            n = num_sectors - sector;
+        }
+
+        /* Read input files and compare. */
+        ret = bdrv_read(bs_base, sector, buf_base, n);
+        if (ret < 0) {
+            error_report("error while reading from backing file");
+            goto out;
+        }
+
+        ret = bdrv_read(bs_modified, sector, buf_modified, n);
+        if (ret < 0) {
+            error_report("error while reading from input file");
+            goto out;
+        }
+
+        /* If they differ, we need to write to the differences file. */
+        uint64_t written = 0;
+
+        while (written < n) {
+            int pnum;
+
+            if (compare_sectors(buf_base + written * BDRV_SECTOR_SIZE,
+                                buf_modified + written * BDRV_SECTOR_SIZE,
+                                n - written, &pnum)) {
+                ret = bdrv_write(bs_out, sector + written,
+                                 buf_modified + written * BDRV_SECTOR_SIZE,
+                                 pnum);
+                if (ret < 0) {
+                    error_report("Error while writing to output file: %s",
+                                 strerror(-ret));
+                    goto out;
+                }
+            }
+
+            written += pnum;
+        }
+    }
+
+    qemu_vfree(buf_base);
+    qemu_vfree(buf_modified);
+
+ out:
+    /* Cleanup */
+    bdrv_delete(bs_out);
+ out2:
+    bdrv_delete(bs_base);
+    bdrv_delete(bs_modified);
+
+    if (ret) {
+        return 1;
+    }
+    return 0;
+}
+
 static int img_resize(int argc, char **argv)
 {
     int c, ret, relative;
diff --git a/qemu-img.texi b/qemu-img.texi
index b2ca3a5..718d302 100644
--- a/qemu-img.texi
+++ b/qemu-img.texi
@@ -114,6 +114,28 @@  created as a copy on write image of the specified base image; the
 @var{backing_file} should have the same content as the input's base image,
 however the path, image format, etc may differ.
 
+@item diff [-F @var{backing_fmt}] -b @var{backing_file} [-f @var{fmt}] [-O @var{output_fmt}] [-o @var{options}] @var{filename} @var{output_filename}
+
+Create a new disk image (@var{output_filename}) which contains
+the differences between @var{backing_file} and @var{filename}.
+This is useful if you have cloned a disk image by copying it,
+and you want to get back to a thin image on top of a common base.
+
+Typical usage is:
+
+@code{qemu-img diff -b base.img modified.img diff.qcow2}
+
+The @var{backing_file} and @var{filename} must have the same
+virtual disk size, but may be in different formats.
+
+The format of @var{output_file} must be one that supports backing
+files (@code{qcow2} or @code{qed}).  If not specified,
+@var{output_fmt} defaults to @code{qcow2}.
+@var{options} can be used to specify options for the output file.
+
+@var{output_file} will have @var{backing_file} set as its backing
+file.
+
 @item info [-f @var{fmt}] @var{filename}
 
 Give information about the disk image @var{filename}. Use it in