diff mbox series

[Bug,1846427] Re: 4.1.0: qcow2 corruption on savevm/quit/loadvm cycle

Message ID 157178444824.19048.3845514982635724694.malone@gac.canonical.com
State New
Headers show
Series [Bug,1846427] Re: 4.1.0: qcow2 corruption on savevm/quit/loadvm cycle | expand

Commit Message

Michael Weiser Oct. 22, 2019, 10:47 p.m. UTC
> I tried to reproduce the problem locally, on the same commit, with the
> steps you described, but I wasn't lucky. I tried keeping the image on my
> home directory (XFS), on tmpfs, and finally on a newly created ext4
> filesystem on a spare LVM volume, but the image just wouldn't break even
> after letting the loop run for a quite a while.

That's certainly an important data point. Is it possible that we're
talking about some kind of miscompilation here, maybe because gcc-9.2.0
is just that tiny bit too spanking current?

> So as the next step I would like to test my theory that the problem
> isn't bdrv_co_block_status() returning a different value after the
> commit, but that qcow2_detect_metadata_preallocation() even runs. I
> think the easiest way to do this would be modifying handle_alloc_space()
> so that it performs the checks, but skips its optimisation regardless of
> the is_zero_cow() return value:

>         if (!is_zero_cow(bs, m) || true) {
>             continue;
>         }

I made the change and the problem went away.

Then, extrapolating the jest of your methodology :), I went ahead and
disabled only bdrv_co_pwrite_zeroes() by placing a continue in front of
it but let qcow2_pre_write_overlap_check() execute and the problem
reappeared. I certainly did not expect that to happen because the
function name ends in _check(), suggesting read-only access. And it's
not even touched by the commit.

This had me so rattled that I revalidated that the problem does indeed
not occur with the commit before. And it does not. I left it running for
about half an hour without problems.

After some more tests I finally figured out that even with -g and no -O
gcc is smart enough to optimize out (!is_zero_cow() || true) and that
corruption only happens if is_zero_cow() is actually called. Corruption
also does not occur if I make is_zero_cow() or is_unallocated() return 0
always.

So my first guess was that is_unallocated() sometimes returns false
positives, making is_zero_cow() report false positives which is not
caught by qcow2_pre_write_overlap_check() and causes
bdrv_co_pwrite_zeroes() to zero out actual data. That seemed a bit
convoluted to me.

But then I realized that corruption still occurs if the rest of
handle_alloc_space() is disabled like so:


So it's much more likely that is_zero_cow() has a side-effect that
somehow causes corruption later on even without handle_alloc_space()
ever calling bdrv_co_pwrite_zeroes(). That would also explain why
qcow2_pre_write_overlap_check() does not catch those false positives
overwriting metadata because there simply are none.

Putting a breakpoint on handle_alloc_space() and single stepping into
is_zero_cow() I do indeed end up in bdrv_co_block_status():

gdb) bt
#0  0x0000555555d610fd in bdrv_co_block_status (bs=0x5555567c69e0, want_zero=false, offset=5242880, bytes=12288, pnum=0x7ffedffd7b28, map=0x0, file=0x0)
    at block/io.c:2048
#1  0x0000555555d6167e in bdrv_co_block_status_above
    (bs=0x5555567c69e0, base=0x0, want_zero=false, offset=5242880, bytes=12288, pnum=0x7ffedffd7b28, map=0x0, file=0x0) at block/io.c:2190
#2  0x0000555555d61764 in bdrv_block_status_above_co_entry (opaque=0x7ffedffd7a10) at block/io.c:2220
#3  0x0000555555d6188f in bdrv_common_block_status_above
    (bs=0x5555567c69e0, base=0x0, want_zero=false, offset=5242880, bytes=12288, pnum=0x7ffedffd7b28, map=0x0, file=0x0) at block/io.c:2255
#4  0x0000555555d61afa in bdrv_is_allocated (bs=0x5555567c69e0, offset=5242880, bytes=12288, pnum=0x7ffedffd7b28) at block/io.c:2285
#5  0x0000555555d61b8c in bdrv_is_allocated_above (top=0x5555567c69e0, base=0x0, offset=5242880, bytes=12288, pnum=0x7ffedffd7b80) at block/io.c:2323
#6  0x0000555555d12d48 in is_unallocated (bs=0x5555567c69e0, offset=5242880, bytes=12288) at block/qcow2.c:2151
#7  0x0000555555d12dbc in is_zero_cow (bs=0x5555567c69e0, m=0x5555569d35b0) at block/qcow2.c:2162
#8  0x0000555555d12e9c in handle_alloc_space (bs=0x5555567c69e0, l2meta=0x5555569d35b0) at block/qcow2.c:2188
#9  0x0000555555d13321 in qcow2_co_pwritev (bs=0x5555567c69e0, offset=5255168, bytes=4096, qiov=0x7fffe82ec310, flags=0) at block/qcow2.c:2302
#10 0x0000555555d5e6d5 in bdrv_driver_pwritev (bs=0x5555567c69e0, offset=5255168, bytes=4096, qiov=0x7fffe82ec310, flags=0) at block/io.c:1043
#11 0x0000555555d6014b in bdrv_aligned_pwritev (child=0x55555675cf80, req=0x7ffedffd7e50, offset=5255168, bytes=4096, align=1, qiov=0x7fffe82ec310, flags=0)
    at block/io.c:1670
#12 0x0000555555d60d77 in bdrv_co_pwritev (child=0x55555675cf80, offset=5255168, bytes=4096, qiov=0x7fffe82ec310, flags=0) at block/io.c:1897
#13 0x0000555555d47cc7 in blk_co_pwritev (blk=0x5555567c6730, offset=5255168, bytes=4096, qiov=0x7fffe82ec310, flags=0) at block/block-backend.c:1183
#14 0x0000555555d484aa in blk_aio_write_entry (opaque=0x7fffe823f920) at block/block-backend.c:1382
#15 0x0000555555e3ff91 in coroutine_trampoline (i0=-399759776, i1=32767) at util/coroutine-ucontext.c:116
#16 0x00007ffff5fc61a0 in  () at /lib64/libc.so.6
#17 0x00007ffff17c5920 in  ()
#18 0x0000000000000000 in  ()

At that point it had gotten too late to even attempt to wrap my brain
around the whole BDRV_BLOCK_RECURSE logic. But I think the above gives a
strong(er|ish) connection between the change and the corruption and how
handle_alloc_space() ties into it. Let me know what else I could check
to help track this down.
diff mbox series

Patch

--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2185,9 +2185,8 @@  static int handle_alloc_space(BlockDriverState *bs, QCowL2Meta *l2meta)
             continue;
         }
 
-        if (!is_zero_cow(bs, m)) {
-            continue;
-        }
+        is_zero_cow(bs, m);
+        continue;
 
         /*
          * instead of writing zero COW buffers,