Message ID | 1360153926-9492-28-git-send-email-benoit@irqsave.net |
---|---|
State | New |
Headers | show |
On Wed, Feb 06, 2013 at 01:32:00PM +0100, Benoît Canet wrote: > Signed-off-by: Benoit Canet <benoit@irqsave.net> > --- > block/qcow2-refcount.c | 9 ++++++++- > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c > index d7d9339..ee5de6b 100644 > --- a/block/qcow2-refcount.c > +++ b/block/qcow2-refcount.c > @@ -999,7 +999,14 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res, > PRIx64 ": %s\n", l2_entry, strerror(-refcount)); > goto fail; > } > - if ((refcount == 1) != ((l2_entry & QCOW_OFLAG_COPIED) != 0)) { > + if (!s->has_dedup && > + (refcount == 1) != ((l2_entry & QCOW_OFLAG_COPIED) != 0)) { > + fprintf(stderr, "ERROR OFLAG_COPIED: offset=%" > + PRIx64 " refcount=%d\n", l2_entry, refcount); > + res->corruptions++; > + } Why is this warning suppressed when dedup is enabled? The meaning of QCOW_OFLAG_COPIED is that refcount == 1. If this invariant is violated then something is wrong. > + if (s->has_dedup && refcount > 1 && > + ((l2_entry & QCOW_OFLAG_COPIED) != 0)) { > fprintf(stderr, "ERROR OFLAG_COPIED: offset=%" > PRIx64 " refcount=%d\n", l2_entry, refcount); > res->corruptions++; > -- > 1.7.10.4 >
> > - if ((refcount == 1) != ((l2_entry & QCOW_OFLAG_COPIED) != 0)) { > > + if (!s->has_dedup && > > + (refcount == 1) != ((l2_entry & QCOW_OFLAG_COPIED) != 0)) { > > + fprintf(stderr, "ERROR OFLAG_COPIED: offset=%" > > + PRIx64 " refcount=%d\n", l2_entry, refcount); > > + res->corruptions++; > > + } > > Why is this warning suppressed when dedup is enabled? The meaning of > QCOW_OFLAG_COPIED is that refcount == 1. If this invariant is violated > then something is wrong. When deduplication is done refcount will be bigger than one and QCOW_OFLAG_COPIED will be cleared. Then if enough logical clustere pointing to the same physical cluster are rewritten with something else the refcount will goes down back to one. But this time QCOW_OFLAG_COPIED can be set back so this equality won't be true. Best regards Benoît
On Wed, Feb 27, 2013 at 04:00:28PM +0100, Benoît Canet wrote: > > > - if ((refcount == 1) != ((l2_entry & QCOW_OFLAG_COPIED) != 0)) { > > > + if (!s->has_dedup && > > > + (refcount == 1) != ((l2_entry & QCOW_OFLAG_COPIED) != 0)) { > > > + fprintf(stderr, "ERROR OFLAG_COPIED: offset=%" > > > + PRIx64 " refcount=%d\n", l2_entry, refcount); > > > + res->corruptions++; > > > + } > > > > Why is this warning suppressed when dedup is enabled? The meaning of > > QCOW_OFLAG_COPIED is that refcount == 1. If this invariant is violated > > then something is wrong. > > When deduplication is done refcount will be bigger than one and > QCOW_OFLAG_COPIED will be cleared. > > Then if enough logical clustere pointing to the same physical cluster are > rewritten with something else the refcount will goes down back to one. > > But this time QCOW_OFLAG_COPIED can be set back so this equality won't be true. When the refcount decreases to 1 again we need to set QCOW_OFLAG_COPIED again. qcow2-snapshot.c:qcow2_snapshot_delete() does this with: /* must update the copied flag on the current cluster offsets */ ret = qcow2_update_snapshot_refcount(bs, s->l1_table_offset, s->l1_size, 0); Is dedup not restoring QCOW_OFLAG_COPIED? Stefan
Am 28.02.2013 um 10:41 hat Stefan Hajnoczi geschrieben: > On Wed, Feb 27, 2013 at 04:00:28PM +0100, Benoît Canet wrote: > > > > - if ((refcount == 1) != ((l2_entry & QCOW_OFLAG_COPIED) != 0)) { > > > > + if (!s->has_dedup && > > > > + (refcount == 1) != ((l2_entry & QCOW_OFLAG_COPIED) != 0)) { > > > > + fprintf(stderr, "ERROR OFLAG_COPIED: offset=%" > > > > + PRIx64 " refcount=%d\n", l2_entry, refcount); > > > > + res->corruptions++; > > > > + } > > > > > > Why is this warning suppressed when dedup is enabled? The meaning of > > > QCOW_OFLAG_COPIED is that refcount == 1. If this invariant is violated > > > then something is wrong. > > > > When deduplication is done refcount will be bigger than one and > > QCOW_OFLAG_COPIED will be cleared. > > > > Then if enough logical clustere pointing to the same physical cluster are > > rewritten with something else the refcount will goes down back to one. > > > > But this time QCOW_OFLAG_COPIED can be set back so this equality won't be true. > > When the refcount decreases to 1 again we need to set QCOW_OFLAG_COPIED > again. qcow2-snapshot.c:qcow2_snapshot_delete() does this with: > > /* must update the copied flag on the current cluster offsets */ > ret = qcow2_update_snapshot_refcount(bs, s->l1_table_offset, s->l1_size, 0); > > Is dedup not restoring QCOW_OFLAG_COPIED? This is a very expensive operation. I don't think that you can do it for each deduplicated cluster that is overwritten. Not doing it comes with the cost of doing more COW than is actually needed. And we need to mention in the spec that QCOW_OFLAG_COPIED can be missing on clusters with deduplication enabled. Kevin
Le Thursday 28 Feb 2013 à 11:14:34 (+0100), Kevin Wolf a écrit : > Am 28.02.2013 um 10:41 hat Stefan Hajnoczi geschrieben: > > On Wed, Feb 27, 2013 at 04:00:28PM +0100, Benoît Canet wrote: > > > > > - if ((refcount == 1) != ((l2_entry & QCOW_OFLAG_COPIED) != 0)) { > > > > > + if (!s->has_dedup && > > > > > + (refcount == 1) != ((l2_entry & QCOW_OFLAG_COPIED) != 0)) { > > > > > + fprintf(stderr, "ERROR OFLAG_COPIED: offset=%" > > > > > + PRIx64 " refcount=%d\n", l2_entry, refcount); > > > > > + res->corruptions++; > > > > > + } > > > > > > > > Why is this warning suppressed when dedup is enabled? The meaning of > > > > QCOW_OFLAG_COPIED is that refcount == 1. If this invariant is violated > > > > then something is wrong. > > > > > > When deduplication is done refcount will be bigger than one and > > > QCOW_OFLAG_COPIED will be cleared. > > > > > > Then if enough logical clustere pointing to the same physical cluster are > > > rewritten with something else the refcount will goes down back to one. > > > > > > But this time QCOW_OFLAG_COPIED can be set back so this equality won't be true. > > > > When the refcount decreases to 1 again we need to set QCOW_OFLAG_COPIED > > again. qcow2-snapshot.c:qcow2_snapshot_delete() does this with: > > > > /* must update the copied flag on the current cluster offsets */ > > ret = qcow2_update_snapshot_refcount(bs, s->l1_table_offset, s->l1_size, 0); > > > > Is dedup not restoring QCOW_OFLAG_COPIED? > > This is a very expensive operation. I don't think that you can do it for > each deduplicated cluster that is overwritten. Not doing it comes with > the cost of doing more COW than is actually needed. And we need to > mention in the spec that QCOW_OFLAG_COPIED can be missing on clusters > with deduplication enabled. Also when two logical clusters point to the same physical cluster and one of the logical cluster get overwritten the deduplication code has no way to know the index of the last logical cluster entry. Best regards Benoît > > Kevin
Am 28.02.2013 um 17:14 hat Benoît Canet geschrieben: > Le Thursday 28 Feb 2013 à 11:14:34 (+0100), Kevin Wolf a écrit : > > Am 28.02.2013 um 10:41 hat Stefan Hajnoczi geschrieben: > > > On Wed, Feb 27, 2013 at 04:00:28PM +0100, Benoît Canet wrote: > > > > > > - if ((refcount == 1) != ((l2_entry & QCOW_OFLAG_COPIED) != 0)) { > > > > > > + if (!s->has_dedup && > > > > > > + (refcount == 1) != ((l2_entry & QCOW_OFLAG_COPIED) != 0)) { > > > > > > + fprintf(stderr, "ERROR OFLAG_COPIED: offset=%" > > > > > > + PRIx64 " refcount=%d\n", l2_entry, refcount); > > > > > > + res->corruptions++; > > > > > > + } > > > > > > > > > > Why is this warning suppressed when dedup is enabled? The meaning of > > > > > QCOW_OFLAG_COPIED is that refcount == 1. If this invariant is violated > > > > > then something is wrong. > > > > > > > > When deduplication is done refcount will be bigger than one and > > > > QCOW_OFLAG_COPIED will be cleared. > > > > > > > > Then if enough logical clustere pointing to the same physical cluster are > > > > rewritten with something else the refcount will goes down back to one. > > > > > > > > But this time QCOW_OFLAG_COPIED can be set back so this equality won't be true. > > > > > > When the refcount decreases to 1 again we need to set QCOW_OFLAG_COPIED > > > again. qcow2-snapshot.c:qcow2_snapshot_delete() does this with: > > > > > > /* must update the copied flag on the current cluster offsets */ > > > ret = qcow2_update_snapshot_refcount(bs, s->l1_table_offset, s->l1_size, 0); > > > > > > Is dedup not restoring QCOW_OFLAG_COPIED? > > > > This is a very expensive operation. I don't think that you can do it for > > each deduplicated cluster that is overwritten. Not doing it comes with > > the cost of doing more COW than is actually needed. And we need to > > mention in the spec that QCOW_OFLAG_COPIED can be missing on clusters > > with deduplication enabled. > > Also when two logical clusters point to the same physical cluster and one of the > logical cluster get overwritten the deduplication code has no way to know the > index of the last logical cluster entry. Well, strictly speaking you can. The qcow2_update_snapshot_refcount() call that Stefan mention does exactly that. It's just insanely expensive because it has to look at the refcounts for all clusters. Kevin
On Fri, Mar 01, 2013 at 09:59:33AM +0100, Kevin Wolf wrote: > Am 28.02.2013 um 17:14 hat Benoît Canet geschrieben: > > Le Thursday 28 Feb 2013 à 11:14:34 (+0100), Kevin Wolf a écrit : > > > Am 28.02.2013 um 10:41 hat Stefan Hajnoczi geschrieben: > > > > On Wed, Feb 27, 2013 at 04:00:28PM +0100, Benoît Canet wrote: > > > > > > > - if ((refcount == 1) != ((l2_entry & QCOW_OFLAG_COPIED) != 0)) { > > > > > > > + if (!s->has_dedup && > > > > > > > + (refcount == 1) != ((l2_entry & QCOW_OFLAG_COPIED) != 0)) { > > > > > > > + fprintf(stderr, "ERROR OFLAG_COPIED: offset=%" > > > > > > > + PRIx64 " refcount=%d\n", l2_entry, refcount); > > > > > > > + res->corruptions++; > > > > > > > + } > > > > > > > > > > > > Why is this warning suppressed when dedup is enabled? The meaning of > > > > > > QCOW_OFLAG_COPIED is that refcount == 1. If this invariant is violated > > > > > > then something is wrong. > > > > > > > > > > When deduplication is done refcount will be bigger than one and > > > > > QCOW_OFLAG_COPIED will be cleared. > > > > > > > > > > Then if enough logical clustere pointing to the same physical cluster are > > > > > rewritten with something else the refcount will goes down back to one. > > > > > > > > > > But this time QCOW_OFLAG_COPIED can be set back so this equality won't be true. > > > > > > > > When the refcount decreases to 1 again we need to set QCOW_OFLAG_COPIED > > > > again. qcow2-snapshot.c:qcow2_snapshot_delete() does this with: > > > > > > > > /* must update the copied flag on the current cluster offsets */ > > > > ret = qcow2_update_snapshot_refcount(bs, s->l1_table_offset, s->l1_size, 0); > > > > > > > > Is dedup not restoring QCOW_OFLAG_COPIED? > > > > > > This is a very expensive operation. I don't think that you can do it for > > > each deduplicated cluster that is overwritten. Not doing it comes with > > > the cost of doing more COW than is actually needed. And we need to > > > mention in the spec that QCOW_OFLAG_COPIED can be missing on clusters > > > with deduplication enabled. > > > > Also when two logical clusters point to the same physical cluster and one of the > > logical cluster get overwritten the deduplication code has no way to know the > > index of the last logical cluster entry. > > Well, strictly speaking you can. The qcow2_update_snapshot_refcount() > call that Stefan mention does exactly that. It's just insanely expensive > because it has to look at the refcounts for all clusters. Okay, I agree that qcow2_update_snapshot_refcount() is too expensive. Please add a comment explaining that QCOW_OFLAG_COPIED is not guaranteed when dedup is enabled since it would be too expensive to do this everything sharing breaks (refcount is decremented to 1). Stefan
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c index d7d9339..ee5de6b 100644 --- a/block/qcow2-refcount.c +++ b/block/qcow2-refcount.c @@ -999,7 +999,14 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res, PRIx64 ": %s\n", l2_entry, strerror(-refcount)); goto fail; } - if ((refcount == 1) != ((l2_entry & QCOW_OFLAG_COPIED) != 0)) { + if (!s->has_dedup && + (refcount == 1) != ((l2_entry & QCOW_OFLAG_COPIED) != 0)) { + fprintf(stderr, "ERROR OFLAG_COPIED: offset=%" + PRIx64 " refcount=%d\n", l2_entry, refcount); + res->corruptions++; + } + if (s->has_dedup && refcount > 1 && + ((l2_entry & QCOW_OFLAG_COPIED) != 0)) { fprintf(stderr, "ERROR OFLAG_COPIED: offset=%" PRIx64 " refcount=%d\n", l2_entry, refcount); res->corruptions++;
Signed-off-by: Benoit Canet <benoit@irqsave.net> --- block/qcow2-refcount.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)