diff mbox

[v2] qemu-img fails to delete last snapshot

Message ID 1399926438-32292-1-git-send-email-ncmike@ncultra.org
State New
Headers show

Commit Message

Mike D. Day May 12, 2014, 8:27 p.m. UTC
When deleting the last snapshot, copying the resulting snapshot table
currently fails, causing the delete operation to also fail. Fix the
failure by skipping the copy and just writing the snapshot header and
freeing the extra clusters.

There are two specific problems in the current code. First is a lack of
parenthesis in the calculation of the memmove size parameter:

s->nb_snapshots - snapshot_index - 1

When s->nb_snapshots is 0, snapshot_index is 1.

0 - 1 - 1 = 0xfffffffe

it should be:

0 - (1 - 1) = 0x00

The second problem is shifting the snapshot table to the left. After
removing the last snapshot there are no existing snapshots to be
shifted. All that needs to be done is to write the header and
unallocate the blocks.

Signed-off-by: Mike Day <ncmike@ncultra.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
v2: improved the git log entry
    added Eric Blake as a reviewer

 block/qcow2-snapshot.c | 25 +++++++++++++++----------
 1 file changed, 15 insertions(+), 10 deletions(-)

Comments

Kevin Wolf May 14, 2014, 3:15 p.m. UTC | #1
Am 12.05.2014 um 22:27 hat Mike Day geschrieben:
> When deleting the last snapshot, copying the resulting snapshot table
> currently fails, causing the delete operation to also fail. Fix the
> failure by skipping the copy and just writing the snapshot header and
> freeing the extra clusters.

Do you have an easy reproducer? Because I can't see the bug.

> There are two specific problems in the current code. First is a lack of
> parenthesis in the calculation of the memmove size parameter:
> 
> s->nb_snapshots - snapshot_index - 1
> 
> When s->nb_snapshots is 0, snapshot_index is 1.
> 
> 0 - 1 - 1 = 0xfffffffe
> 
> it should be:
> 
> 0 - (1 - 1) = 0x00

Not really. With s->nb_snapshots == 0, there is no snapshot to delete to
start with. Therefore find_snapshot_by_id_and_name() returns -1 and we
return immediately.

> The second problem is shifting the snapshot table to the left. After
> removing the last snapshot there are no existing snapshots to be
> shifted. All that needs to be done is to write the header and
> unallocate the blocks.

When removing the last snapshot, we have:

    nb_snapshots = 1
    snapshot_index = 0

    memmove(..., (1 - 0 - 1) * sizeof(sn));

So we're not moving anything, which is what you correctly said needs to
happen.

Kevin
Max Reitz May 14, 2014, 11:54 p.m. UTC | #2
On 12.05.2014 22:27, Mike Day wrote:
> When deleting the last snapshot, copying the resulting snapshot table
> currently fails, causing the delete operation to also fail. Fix the
> failure by skipping the copy and just writing the snapshot header and
> freeing the extra clusters.
>
> There are two specific problems in the current code. First is a lack of
> parenthesis in the calculation of the memmove size parameter:
>
> s->nb_snapshots - snapshot_index - 1
>
> When s->nb_snapshots is 0, snapshot_index is 1.

Before this patch is applied, s->nb_snapshots is only increased after 
the memmove(). Therefore, it can never be 0 – snapshot_index on the 
other hand needs to be 0, as find_snapshot_by_id_and_name() forces it to 
be less than s->nb_snapshots (to elaborate on Kevin's review).

> 0 - 1 - 1 = 0xfffffffe
>
> it should be:
>
> 0 - (1 - 1) = 0x00
>
> The second problem is shifting the snapshot table to the left. After
> removing the last snapshot there are no existing snapshots to be
> shifted. All that needs to be done is to write the header and
> unallocate the blocks.
>
> Signed-off-by: Mike Day <ncmike@ncultra.org>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> ---
> v2: improved the git log entry
>      added Eric Blake as a reviewer

I do agree that this code is rather ugly and I had problems with it on 
more than one occasion (which should be speaking for itself, considering 
that I have not worked that long on qemu). On the other hand it is 
always a nice test case whether one broke zero-size allocations, though 
(while I'm not sure whether these should be allowed in the first place, 
though…).

Considering that this code indeed does perform a zero-size allocation 
reproducably, I'm rather surprised that we actually do not have a test 
case yet for snapshot deletion, though (as far as I can see).

So, all in all, I am kind of in favor of making the deletion of the last 
snapshot a special case as this would probably greatly improve 
readability; but on the other hand, it actually is a good test as it is 
right now.

Max
Mike D. Day May 15, 2014, 1:07 p.m. UTC | #3
On Wed, May 14, 2014 at 11:15 AM, Kevin Wolf <kwolf@redhat.com> wrote:
>> freeing the extra clusters.
>
> Do you have an easy reproducer? Because I can't see the bug.

Thanks for the review! I was having a hard time reproducing this until
I did a bisect. This bug was fixed by 65f33bc which was merged at or
after  the time I submitted the patch:

 qcow2: Fix alloc_clusters_noref() overflow detection

I can reproduce the bug by checking out the immediate ancestor
43cbeffb1, creating a single snapshot in a qcow2 image, and then
attempting to delete that snapshot.

The error I get is:
qemu-img: Could not delete snapshot 'snapone': (Failed to remove
snapshot from snapshot list: File too large)

This is the error that is fixed by 65f33bc

Mike
Kevin Wolf May 15, 2014, 1:26 p.m. UTC | #4
Am 15.05.2014 um 15:07 hat Mike Day geschrieben:
> On Wed, May 14, 2014 at 11:15 AM, Kevin Wolf <kwolf@redhat.com> wrote:
> >> freeing the extra clusters.
> >
> > Do you have an easy reproducer? Because I can't see the bug.
> 
> Thanks for the review! I was having a hard time reproducing this until
> I did a bisect. This bug was fixed by 65f33bc which was merged at or
> after  the time I submitted the patch:
> 
>  qcow2: Fix alloc_clusters_noref() overflow detection
> 
> I can reproduce the bug by checking out the immediate ancestor
> 43cbeffb1, creating a single snapshot in a qcow2 image, and then
> attempting to delete that snapshot.
> 
> The error I get is:
> qemu-img: Could not delete snapshot 'snapone': (Failed to remove
> snapshot from snapshot list: File too large)
> 
> This is the error that is fixed by 65f33bc

Yes, that makes sense. Thanks Mike. I think we can disregard your patch
then and consider the problem fixed.

Kevin
diff mbox

Patch

diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index 0aa9def..c8b842c 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -165,9 +165,11 @@  static int qcow2_write_snapshots(BlockDriverState *bs)
 
     assert(offset <= INT_MAX);
     snapshots_size = offset;
-
     /* Allocate space for the new snapshot list */
-    snapshots_offset = qcow2_alloc_clusters(bs, snapshots_size);
+    snapshots_offset = 0;
+    if (snapshots_size) {
+        snapshots_offset = qcow2_alloc_clusters(bs, snapshots_size);
+    }
     offset = snapshots_offset;
     if (offset < 0) {
         ret = offset;
@@ -180,12 +182,13 @@  static int qcow2_write_snapshots(BlockDriverState *bs)
 
     /* The snapshot list position has not yet been updated, so these clusters
      * must indeed be completely free */
-    ret = qcow2_pre_write_overlap_check(bs, 0, offset, snapshots_size);
-    if (ret < 0) {
-        goto fail;
+    if (snapshots_size) {
+        ret = qcow2_pre_write_overlap_check(bs, 0, offset, snapshots_size);
+        if (ret < 0) {
+            goto fail;
+        }
     }
 
-
     /* Write all snapshots to the new list */
     for(i = 0; i < s->nb_snapshots; i++) {
         sn = s->snapshots + i;
@@ -590,12 +593,14 @@  int qcow2_snapshot_delete(BlockDriverState *bs,
         return -ENOENT;
     }
     sn = s->snapshots[snapshot_index];
-
     /* Remove it from the snapshot list */
-    memmove(s->snapshots + snapshot_index,
-            s->snapshots + snapshot_index + 1,
-            (s->nb_snapshots - snapshot_index - 1) * sizeof(sn));
     s->nb_snapshots--;
+    if (s->nb_snapshots) {
+        memmove(s->snapshots + snapshot_index,
+                s->snapshots + snapshot_index + 1,
+                (s->nb_snapshots - (snapshot_index - 1)) * sizeof(sn));
+    }
+
     ret = qcow2_write_snapshots(bs);
     if (ret < 0) {
         error_setg_errno(errp, -ret,