Patchwork [3/5] qcow2: Options to enable discard for freed clusters

Submitter Kevin Wolf
Date June 13, 2013, 11:47 a.m.
Message ID <1371124063-12971-4-git-send-email-kwolf@redhat.com>
Permalink /patch/251060/
State New

Comments

Kevin Wolf - June 13, 2013, 11:47 a.m.
Deleted snapshots are discarded in the image file by default. Guest
discard requests take their default from the -drive discard=... option.
Discards for all other places that free clusters must be enabled
explicitly.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2-refcount.c |  5 +++++
 block/qcow2.c          | 26 ++++++++++++++++++++++++++
 block/qcow2.h          |  5 +++++
 3 files changed, 36 insertions(+)
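
As a rough illustration of the defaults described in the commit message, here is a minimal standalone C sketch. It does not use QEMU's headers: `struct bool_opt`, `opt_get_bool()`, `resolve_discard_passthrough()` and `O_UNMAP_FLAG` are hypothetical stand-ins for qemu_opt_get_bool() and BDRV_O_UNMAP, and the enum ordering is an assumption, since the QCOW2_DISCARD_* values are not defined in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the QCOW2_DISCARD_* categories; order is assumed. */
enum discard_type {
    DISCARD_NEVER = 0,
    DISCARD_ALWAYS,
    DISCARD_REQUEST,
    DISCARD_SNAPSHOT,
    DISCARD_OTHER,
    DISCARD_MAX
};

/* Stand-in for BDRV_O_UNMAP; the numeric value is made up. */
#define O_UNMAP_FLAG 0x4000

/* A boolean option that may be left unset by the user. */
struct bool_opt {
    bool is_set;
    bool value;
};

struct discard_opts {
    struct bool_opt request;   /* models pass_discard_request */
    struct bool_opt snapshot;  /* models pass_discard_snapshot */
    struct bool_opt other;     /* models pass_discard_other */
};

/* Return the user's value if given, otherwise the supplied default. */
static bool opt_get_bool(struct bool_opt opt, bool defval)
{
    return opt.is_set ? opt.value : defval;
}

/* Fill the per-category table using the defaults from the commit message. */
static void resolve_discard_passthrough(bool out[DISCARD_MAX],
                                        const struct discard_opts *o,
                                        int flags)
{
    assert(out != NULL && o != NULL);

    out[DISCARD_NEVER] = false;
    out[DISCARD_ALWAYS] = true;
    /* Guest requests follow -drive discard=... unless overridden. */
    out[DISCARD_REQUEST] = opt_get_bool(o->request,
                                        (flags & O_UNMAP_FLAG) != 0);
    /* Snapshot deletion discards by default. */
    out[DISCARD_SNAPSHOT] = opt_get_bool(o->snapshot, true);
    /* Everything else must be enabled explicitly. */
    out[DISCARD_OTHER] = opt_get_bool(o->other, false);
}
```

With no options set and the unmap flag clear, only the snapshot category (and the fixed "always" entry) ends up enabled in this model.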
Paolo Bonzini - June 13, 2013, 10:10 p.m.
On 13/06/2013 07:47, Kevin Wolf wrote:
> +    s->discard_passthrough[QCOW2_DISCARD_NEVER] = false,
> +    s->discard_passthrough[QCOW2_DISCARD_ALWAYS] = true,
> +    s->discard_passthrough[QCOW2_DISCARD_REQUEST] =
> +        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_REQUEST,
> +                          flags & BDRV_O_UNMAP),

I think there should not be two ways to enable it, it is confusing.

> +    s->discard_passthrough[QCOW2_DISCARD_SNAPSHOT] =
> +        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_SNAPSHOT, true),
> +    s->discard_passthrough[QCOW2_DISCARD_OTHER] =
> +        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_OTHER, false),

Please document the defaults in qcow2_runtime_opts.  (BTW, what is the
rationale?)

Paolo
Kevin Wolf - June 14, 2013, 8:31 a.m.
On 14.06.2013 at 00:10, Paolo Bonzini wrote:
> On 13/06/2013 07:47, Kevin Wolf wrote:
> > +    s->discard_passthrough[QCOW2_DISCARD_NEVER] = false,
> > +    s->discard_passthrough[QCOW2_DISCARD_ALWAYS] = true,
> > +    s->discard_passthrough[QCOW2_DISCARD_REQUEST] =
> > +        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_REQUEST,
> > +                          flags & BDRV_O_UNMAP),
> 
> I think there should not be two ways to enable it, it is confusing.

Hm, yes... But it's also confusing to have qcow2 provide an incomplete
set of categories. Maybe we shouldn't have introduced -drive discard=...
as a global option to begin with.

> > +    s->discard_passthrough[QCOW2_DISCARD_SNAPSHOT] =
> > +        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_SNAPSHOT, true),
> > +    s->discard_passthrough[QCOW2_DISCARD_OTHER] =
> > +        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_OTHER, false),
> 
> Please document the defaults in qcow2_runtime_opts.  (BTW, what is the
> rationale?)

The idea was that discard is slow and therefore disabled by default,
except when you're doing an expensive snapshot operation that can
potentially free a lot of space at once with not too many requests, so
there it's enabled. And if you said -drive discard=on, you obviously
want guest requests to take effect.

We could let QCOW2_OPT_DISCARD_OTHER default to BDRV_O_UNMAP as well if
you prefer.

Kevin
Paolo Bonzini - June 14, 2013, 2:16 p.m.
On 14/06/2013 04:31, Kevin Wolf wrote:
> > > +    s->discard_passthrough[QCOW2_DISCARD_NEVER] = false,
> > > +    s->discard_passthrough[QCOW2_DISCARD_ALWAYS] = true,
> > > +    s->discard_passthrough[QCOW2_DISCARD_REQUEST] =
> > > +        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_REQUEST,
> > > +                          flags & BDRV_O_UNMAP),
> >
> > I think there should not be two ways to enable it, it is confusing.
>
> Hm, yes... But it's also confusing to have qcow2 provide an incomplete
> set of categories. Maybe we shouldn't have introduced -drive discard=...
> as a global option to begin with.
>
> > > +    s->discard_passthrough[QCOW2_DISCARD_SNAPSHOT] =
> > > +        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_SNAPSHOT, true),
> > > +    s->discard_passthrough[QCOW2_DISCARD_OTHER] =
> > > +        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_OTHER, false),
> >
> > Please document the defaults in qcow2_runtime_opts.  (BTW, what is the
> > rationale?)
>
> The idea was that discard is slow and therefore disabled by default,
> except when you're doing an expensive snapshot operation that can
> potentially free a lot of space at once with not too many requests, so
> there it's enabled. And if you said -drive discard=on, you obviously
> want guest requests to take effect.
> 
> We could let QCOW2_OPT_DISCARD_OTHER default to BDRV_O_UNMAP as well if
> you prefer.

It looks like QCOW2_OPT_DISCARD_OTHER is a rare case, so I don't mind
leaving it as default to false.  It won't waste more than a few clusters.

In the end discard_snapshot and discard_other should rarely be needed in
practice, so I don't think having discard=... is a mistake.  Too many
knobs won't really be needed.

In fact, perhaps we do not need discard_snapshot and discard_request,
only discard_other.  discard_snapshot can be replaced by
file.discard=ignore, discard_request by discard=unmap.

Paolo
Kevin Wolf - June 14, 2013, 2:31 p.m.
On 14.06.2013 at 16:16, Paolo Bonzini wrote:
> On 14/06/2013 04:31, Kevin Wolf wrote:
> > > > +    s->discard_passthrough[QCOW2_DISCARD_NEVER] = false,
> > > > +    s->discard_passthrough[QCOW2_DISCARD_ALWAYS] = true,
> > > > +    s->discard_passthrough[QCOW2_DISCARD_REQUEST] =
> > > > +        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_REQUEST,
> > > > +                          flags & BDRV_O_UNMAP),
> > >
> > > I think there should not be two ways to enable it, it is confusing.
> >
> > Hm, yes... But it's also confusing to have qcow2 provide an incomplete
> > set of categories. Maybe we shouldn't have introduced -drive discard=...
> > as a global option to begin with.
> >
> > > > +    s->discard_passthrough[QCOW2_DISCARD_SNAPSHOT] =
> > > > +        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_SNAPSHOT, true),
> > > > +    s->discard_passthrough[QCOW2_DISCARD_OTHER] =
> > > > +        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_OTHER, false),
> > >
> > > Please document the defaults in qcow2_runtime_opts.  (BTW, what is the
> > > rationale?)
> >
> > The idea was that discard is slow and therefore disabled by default,
> > except when you're doing an expensive snapshot operation that can
> > potentially free a lot of space at once with not too many requests, so
> > there it's enabled. And if you said -drive discard=on, you obviously
> > want guest requests to take effect.
> > 
> > We could let QCOW2_OPT_DISCARD_OTHER default to BDRV_O_UNMAP as well if
> > you prefer.
> 
> It looks like QCOW2_OPT_DISCARD_OTHER is a rare case, so I don't mind
> leaving it as default to false.  It won't waste more than a few clusters.

Yes, it's generally relatively rare, like growing L1 or refcount table.
There is one case where it should trigger a lot, though: Overwriting
clusters of a compressed image.

Hm, though actually it doesn't make a lot of sense there. The freed
cluster will immediately be used by the next write. Maybe COW should
actually be QCOW2_DISCARD_NEVER...

> In the end discard_snapshot and discard_other should rarely be needed in
> practice, so I don't think having discard=... is a mistake.  Too many
> knobs won't really be needed.
> 
> In fact, perhaps we do not need discard_snapshot and discard_request,
> only discard_other.  discard_snapshot can be replaced by
> file.discard=ignore, discard_request by discard=unmap.

This is only true if you rule out some combination as useless. For
example you would say that if you want to process guest requests, you
always want to have snapshots discarded as well. You also assume that
nobody wants the current behaviour (free clusters in qcow2 metadata, but
don't send discards to raw-posix).

Isn't this assuming a bit too much?

To be clear, I don't expect these knobs to be used much either, but I
have some feeling that some people (including us while debugging or
asking questions) may be glad later to have such low-level options that
control each layer separately.

Kevin
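
To make the per-call-site categories discussed above a little more concrete, here is a small self-contained model of the check added to update_refcount(). It is only a sketch: `struct image`, `put_cluster()` and the `D_*` enum are hypothetical names, the eight-cluster refcount array is arbitrary, and a counter stands in for the real bdrv_discard() call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the QCOW2_DISCARD_* categories; order is assumed. */
enum discard_type { D_NEVER, D_ALWAYS, D_REQUEST, D_SNAPSHOT, D_OTHER, D_MAX };

#define NUM_CLUSTERS 8

/* Toy image state: a tiny refcount table plus the per-category switches. */
struct image {
    uint16_t refcount[NUM_CLUSTERS];
    bool passthrough[D_MAX];
    int discards_sent;  /* stands in for calls to bdrv_discard() */
};

/*
 * Drop one reference to a cluster. The caller names the reason via
 * 'type'; a discard is only sent when the cluster becomes free AND
 * that category is enabled.
 */
static void put_cluster(struct image *img, unsigned idx,
                        enum discard_type type)
{
    assert(idx < NUM_CLUSTERS && img->refcount[idx] > 0);

    if (--img->refcount[idx] == 0 && img->passthrough[type]) {
        img->discards_sent++;
    }
}
```

In this model a call site that expects the freed cluster to be reused right away, such as the compressed-cluster overwrite case mentioned above, would simply pass `D_NEVER`, and no discard would be counted regardless of the user's options.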
Paolo Bonzini - June 14, 2013, 3 p.m.
On 14/06/2013 10:31, Kevin Wolf wrote:
>> It looks like QCOW2_OPT_DISCARD_OTHER is a rare case, so I don't mind
>> leaving it as default to false.  It won't waste more than a few clusters.
> 
> Yes, it's generally relatively rare, like growing L1 or refcount table.
> There is one case where it should trigger a lot, though: Overwriting
> clusters of a compressed image.
> 
> Hm, though actually it doesn't make a lot of sense there. The freed
> cluster will immediately be used by the next write. Maybe COW should
> actually be QCOW2_DISCARD_NEVER...

Sounds reasonable.

>> In the end discard_snapshot and discard_other should rarely be needed in
>> practice, so I don't think having discard=... is a mistake.  Too many
>> knobs won't really be needed.
>>
>> In fact, perhaps we do not need discard_snapshot and discard_request,
>> only discard_other.  discard_snapshot can be replaced by
>> file.discard=ignore, discard_request by discard=unmap.
> 
> This is only true if you rule out some combination as useless. For
> example you would say that if you want to process guest requests, you
> always want to have snapshots discarded as well. You also assume that
> nobody wants the current behaviour (free clusters in qcow2 metadata, but
> don't send discards to raw-posix).
> 
> Isn't this assuming a bit too much?
> 
> To be clear, I don't expect these knobs to be used much either, but I
> have some feeling that some people (including us while debugging or
> asking questions) may be glad later to have such low-level options that
> control each layer separately.

Yeah, you're right.  They're definitely useful to have, even if it is
"just in case".

Paolo
Stefan Hajnoczi - June 17, 2013, 3:41 p.m.
On Thu, Jun 13, 2013 at 01:47:41PM +0200, Kevin Wolf wrote:
> @@ -532,6 +548,16 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags)
>      s->use_lazy_refcounts = qemu_opt_get_bool(opts, QCOW2_OPT_LAZY_REFCOUNTS,
>          (s->compatible_features & QCOW2_COMPAT_LAZY_REFCOUNTS));
>  
> +    s->discard_passthrough[QCOW2_DISCARD_NEVER] = false,

comma instead of semicolon?

> @@ -187,6 +190,8 @@ typedef struct BDRVQcowState {
>      int qcow_version;
>      bool use_lazy_refcounts;
>  
> +    bool discard_passthrough[QCOW2_DISCARD_MAX];
> +

Neat solution to specifying discard behavior.

Stefan
Kevin Wolf - June 17, 2013, 3:58 p.m.
On 17.06.2013 at 17:41, Stefan Hajnoczi wrote:
> On Thu, Jun 13, 2013 at 01:47:41PM +0200, Kevin Wolf wrote:
> > @@ -532,6 +548,16 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags)
> >      s->use_lazy_refcounts = qemu_opt_get_bool(opts, QCOW2_OPT_LAZY_REFCOUNTS,
> >          (s->compatible_features & QCOW2_COMPAT_LAZY_REFCOUNTS));
> >  
> > +    s->discard_passthrough[QCOW2_DISCARD_NEVER] = false,
> 
> comma instead of semicolon?

Oh. Isn't C a nice language? :-)

I'll fix that.

Kevin
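
For readers wondering why the comma typo compiled at all, the following standalone C sketch shows the mechanism. The comma operator chains the assignments, and the call that follows them, into one expression statement, so the behaviour happens to be the same as with semicolons. The function names here (`cleanup()`, `init_flags()`, `guarded()`) are invented for illustration, and the second function shows a hypothetical case, not taken from the patch, where the same typo would change the result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static int cleanup_calls;

/* Stand-in for the call that follows the assignments in the patch. */
static int cleanup(void)
{
    return ++cleanup_calls;
}

/*
 * Same shape as the patch: every assignment ends in a comma, so the
 * statement only ends at the semicolon after cleanup(). Operands of
 * the comma operator are evaluated left to right, so the effect is
 * the same as separate statements.
 */
static void init_flags(bool flags[3])
{
    assert(flags != NULL);

    flags[0] = false,
    flags[1] = true,
    flags[2] = true,

    cleanup();
}

/*
 * Hypothetical case where the typo is not harmless: the comma pulls
 * the second assignment under the 'if', so it is skipped when
 * 'enable' is false. With a semicolon it would always run.
 */
static int guarded(bool enable)
{
    int base = 0;
    int extra = 0;

    if (enable)
        base = 1,
        extra = 2;

    return base + extra;
}
```

Calling `init_flags()` sets all three entries and invokes `cleanup()` exactly once, while `guarded(false)` returns 0 rather than the 2 it would return if the comma were a semicolon.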

Patch

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 6d35e49..7488988 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -488,6 +488,11 @@  static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
             s->free_cluster_index = cluster_index;
         }
         refcount_block[block_index] = cpu_to_be16(refcount);
+        if (refcount == 0 && s->discard_passthrough[type]) {
+            /* Try discarding, ignore errors */
+            /* FIXME Doing this cluster by cluster will be painfully slow */
+            bdrv_discard(bs->file, cluster_offset, 1);
+        }
     }
 
     ret = 0;
diff --git a/block/qcow2.c b/block/qcow2.c
index e28ea47..62e6753 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -295,6 +295,22 @@  static QemuOptsList qcow2_runtime_opts = {
             .type = QEMU_OPT_BOOL,
             .help = "Postpone refcount updates",
         },
+        {
+            .name = QCOW2_OPT_DISCARD_REQUEST,
+            .type = QEMU_OPT_BOOL,
+            .help = "Pass guest discard requests to the layer below",
+        },
+        {
+            .name = QCOW2_OPT_DISCARD_SNAPSHOT,
+            .type = QEMU_OPT_BOOL,
+            .help = "Generate discard requests when snapshot related space "
+                    "is freed",
+        },
+        {
+            .name = QCOW2_OPT_DISCARD_OTHER,
+            .type = QEMU_OPT_BOOL,
+            .help = "Generate discard requests when other clusters are freed",
+        },
         { /* end of list */ }
     },
 };
@@ -532,6 +548,16 @@  static int qcow2_open(BlockDriverState *bs, QDict *options, int flags)
     s->use_lazy_refcounts = qemu_opt_get_bool(opts, QCOW2_OPT_LAZY_REFCOUNTS,
         (s->compatible_features & QCOW2_COMPAT_LAZY_REFCOUNTS));
 
+    s->discard_passthrough[QCOW2_DISCARD_NEVER] = false,
+    s->discard_passthrough[QCOW2_DISCARD_ALWAYS] = true,
+    s->discard_passthrough[QCOW2_DISCARD_REQUEST] =
+        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_REQUEST,
+                          flags & BDRV_O_UNMAP),
+    s->discard_passthrough[QCOW2_DISCARD_SNAPSHOT] =
+        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_SNAPSHOT, true),
+    s->discard_passthrough[QCOW2_DISCARD_OTHER] =
+        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_OTHER, false),
+
     qemu_opts_del(opts);
 
     if (s->use_lazy_refcounts && s->qcow_version < 3) {
diff --git a/block/qcow2.h b/block/qcow2.h
index 64a6479..6f91b9a 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -60,6 +60,9 @@ 
 
 
 #define QCOW2_OPT_LAZY_REFCOUNTS "lazy_refcounts"
+#define QCOW2_OPT_DISCARD_REQUEST "pass_discard_request"
+#define QCOW2_OPT_DISCARD_SNAPSHOT "pass_discard_snapshot"
+#define QCOW2_OPT_DISCARD_OTHER "pass_discard_other"
 
 typedef struct QCowHeader {
     uint32_t magic;
@@ -187,6 +190,8 @@  typedef struct BDRVQcowState {
     int qcow_version;
     bool use_lazy_refcounts;
 
+    bool discard_passthrough[QCOW2_DISCARD_MAX];
+
     uint64_t incompatible_features;
     uint64_t compatible_features;
     uint64_t autoclear_features;
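
The FIXME in the update_refcount() hunk notes that discarding cluster by cluster will be slow. One possible direction, shown here purely as a sketch and not as anything proposed in this thread, is to collect the indices of freed clusters and merge contiguous runs before issuing discards. `struct range` and `coalesce_clusters()` are hypothetical names, and the sketch assumes the input is already sorted in ascending order without duplicates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A run of contiguous clusters: 'count' clusters starting at 'first'. */
struct range {
    uint64_t first;
    uint64_t count;
};

/*
 * Merge sorted, duplicate-free cluster indices into contiguous ranges.
 * 'out' must have room for n entries. Returns the number of ranges
 * written, which is the number of discard requests that would be
 * needed instead of n.
 */
static size_t coalesce_clusters(const uint64_t *freed, size_t n,
                                struct range *out)
{
    size_t nranges = 0;

    assert(n == 0 || (freed != NULL && out != NULL));

    for (size_t i = 0; i < n; i++) {
        if (nranges > 0 &&
            out[nranges - 1].first + out[nranges - 1].count == freed[i]) {
            /* Directly follows the previous run: extend it. */
            out[nranges - 1].count++;
        } else {
            /* Gap (or first element): start a new run. */
            out[nranges].first = freed[i];
            out[nranges].count = 1;
            nranges++;
        }
    }
    return nranges;
}
```

Each resulting range could then be turned into a single discard covering `count` clusters. How the freed indices would be gathered and when the batch would be flushed is left open here.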