diff mbox

qcow2: Fix refcount block allocation during qcow2_allocate_cluster_at()

Message ID 1334930161-21972-1-git-send-email-kwolf@redhat.com
State New
Headers show

Commit Message

Kevin Wolf April 20, 2012, 1:56 p.m. UTC
Refcount block allocation and refcount table growth rely on
s->free_cluster_index pointing to somewhere after the current
allocation. Change qcow2_allocate_cluster_at() to fulfill this
assumption.

Without this change it could happen that a newly allocated refcount
block and the allocated data block point to the same area in the image
file, causing data corruption in the long run.

This fixes a bug that became first visible after commit 250196f1.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2-refcount.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

Comments

Marcelo Tosatti April 22, 2012, 11:18 p.m. UTC | #1
On Fri, Apr 20, 2012 at 03:56:01PM +0200, Kevin Wolf wrote:
> Refcount block allocation and refcount table growth rely on
> s->free_cluster_index pointing to somewhere after the current
> allocation. Change qcow2_allocate_cluster_at() to fulfill this
> assumption.
> 
> Without this change it could happen that a newly allocated refcount
> block and the allocated data block point to the same area in the image
> file, causing data corruption in the long run.
> 
> This fixes a bug that became first visible after commit 250196f1.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Kevin,

This patch fixes explicit filesystem errors (qemu-img check also OK), 
but autotest is still failing, see attached screenshot. It is not
reproducible without

f081987ad20a8c8dc391deded55161ea8d38be5f
Marcelo Tosatti April 22, 2012, 11:35 p.m. UTC | #2
On Sun, Apr 22, 2012 at 08:18:49PM -0300, Marcelo Tosatti wrote:
> On Fri, Apr 20, 2012 at 03:56:01PM +0200, Kevin Wolf wrote:
> > Refcount block allocation and refcount table growth rely on
> > s->free_cluster_index pointing to somewhere after the current
> > allocation. Change qcow2_allocate_cluster_at() to fulfill this
> > assumption.
> > 
> > Without this change it could happen that a newly allocated refcount
> > block and the allocated data block point to the same area in the image
> > file, causing data corruption in the long run.
> > 
> > This fixes a bug that became first visible after commit 250196f1.
> > 
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> 
> Kevin,
> 
> This patch fixes explicit filesystem errors (qemu-img check also OK), 
> but autotest is still failing, see attached screenshot. It is not
> reproducible without
> 
> f081987ad20a8c8dc391deded55161ea8d38be5f

Sorry, i meant

commit 250196f19c6e7df12965d74a5073e10aba06c802
Author: Kevin Wolf <kwolf@redhat.com>
Date:   Fri Mar 2 14:10:54 2012 +0100

    qcow2: Reduce number of I/O requests
Kevin Wolf April 23, 2012, 12:33 p.m. UTC | #3
Am 23.04.2012 01:35, schrieb Marcelo Tosatti:
> On Sun, Apr 22, 2012 at 08:18:49PM -0300, Marcelo Tosatti wrote:
>> On Fri, Apr 20, 2012 at 03:56:01PM +0200, Kevin Wolf wrote:
>>> Refcount block allocation and refcount table growth rely on
>>> s->free_cluster_index pointing to somewhere after the current
>>> allocation. Change qcow2_allocate_cluster_at() to fulfill this
>>> assumption.
>>>
>>> Without this change it could happen that a newly allocated refcount
>>> block and the allocated data block point to the same area in the image
>>> file, causing data corruption in the long run.
>>>
>>> This fixes a bug that became first visible after commit 250196f1.
>>>
>>> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
>>
>> Kevin,
>>
>> This patch fixes explicit filesystem errors (qemu-img check also OK), 
>> but autotest is still failing, see attached screenshot. It is not
>> reproducible without
>>
>> f081987ad20a8c8dc391deded55161ea8d38be5f
> 
> Sorry, i meant
> 
> commit 250196f19c6e7df12965d74a5073e10aba06c802
> Author: Kevin Wolf <kwolf@redhat.com>
> Date:   Fri Mar 2 14:10:54 2012 +0100
> 
>     qcow2: Reduce number of I/O requests

The screenshot doesn't really give a lot of information, but let's
assume that _something_ must have been corrupted... Can you try finding
the corrupted file (e.g. using rpm -V) and see in which way it differs
from the real one?

Kevin
Marcelo Tosatti April 23, 2012, 6:30 p.m. UTC | #4
On Mon, Apr 23, 2012 at 02:33:49PM +0200, Kevin Wolf wrote:
> Am 23.04.2012 01:35, schrieb Marcelo Tosatti:
> > On Sun, Apr 22, 2012 at 08:18:49PM -0300, Marcelo Tosatti wrote:
> >> On Fri, Apr 20, 2012 at 03:56:01PM +0200, Kevin Wolf wrote:
> >>> Refcount block allocation and refcount table growth rely on
> >>> s->free_cluster_index pointing to somewhere after the current
> >>> allocation. Change qcow2_allocate_cluster_at() to fulfill this
> >>> assumption.
> >>>
> >>> Without this change it could happen that a newly allocated refcount
> >>> block and the allocated data block point to the same area in the image
> >>> file, causing data corruption in the long run.
> >>>
> >>> This fixes a bug that became first visible after commit 250196f1.
> >>>
> >>> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> >>
> >> Kevin,
> >>
> >> This patch fixes explicit filesystem errors (qemu-img check also OK), 
> >> but autotest is still failing, see attached screenshot. It is not
> >> reproducible without
> >>
> >> f081987ad20a8c8dc391deded55161ea8d38be5f
> > 
> > Sorry, i meant
> > 
> > commit 250196f19c6e7df12965d74a5073e10aba06c802
> > Author: Kevin Wolf <kwolf@redhat.com>
> > Date:   Fri Mar 2 14:10:54 2012 +0100
> > 
> >     qcow2: Reduce number of I/O requests
> 
> The screenshot doesn't really give a lot of information, but let's
> assume that _something_ must have been corrupted... Can you try finding
> the corrupted file (e.g. using rpm -V) and see in which way it differs
> from the real one?
> 
> Kevin

Unfortunately no, /boot/vmlinuz is gone on the last install
(screenshot).

It should be easy to reproduce, install Fedora.8.64, or i
can upload an image later tonight if that is helpful.
Kevin Wolf April 24, 2012, 12:21 p.m. UTC | #5
Am 23.04.2012 20:30, schrieb Marcelo Tosatti:
> On Mon, Apr 23, 2012 at 02:33:49PM +0200, Kevin Wolf wrote:
>> Am 23.04.2012 01:35, schrieb Marcelo Tosatti:
>>> On Sun, Apr 22, 2012 at 08:18:49PM -0300, Marcelo Tosatti wrote:
>>>> On Fri, Apr 20, 2012 at 03:56:01PM +0200, Kevin Wolf wrote:
>>>>> Refcount block allocation and refcount table growth rely on
>>>>> s->free_cluster_index pointing to somewhere after the current
>>>>> allocation. Change qcow2_allocate_cluster_at() to fulfill this
>>>>> assumption.
>>>>>
>>>>> Without this change it could happen that a newly allocated refcount
>>>>> block and the allocated data block point to the same area in the image
>>>>> file, causing data corruption in the long run.
>>>>>
>>>>> This fixes a bug that became first visible after commit 250196f1.
>>>>>
>>>>> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
>>>>
>>>> Kevin,
>>>>
>>>> This patch fixes explicit filesystem errors (qemu-img check also OK), 
>>>> but autotest is still failing, see attached screenshot. It is not
>>>> reproducible without
>>>>
>>>> f081987ad20a8c8dc391deded55161ea8d38be5f
>>>
>>> Sorry, i meant
>>>
>>> commit 250196f19c6e7df12965d74a5073e10aba06c802
>>> Author: Kevin Wolf <kwolf@redhat.com>
>>> Date:   Fri Mar 2 14:10:54 2012 +0100
>>>
>>>     qcow2: Reduce number of I/O requests
>>
>> The screenshot doesn't really give a lot of information, but let's
>> assume that _something_ must have been corrupted... Can you try finding
>> the corrupted file (e.g. using rpm -V) and see in which way it differs
>> from the real one?
>>
>> Kevin
> 
> Unfortunately no, /boot/vmlinuz is gone on the last install
> (screenshot).
> 
> It should be easy to reproduce, install Fedora.8.64, or i
> can upload an image later tonight if that is helpful.

Nope, can't reproduce. Fedora 8 installs and boots just fine (8 GB disk,
default qcow2 cluster size, just clicked through the installer with
default options).

Kevin
diff mbox

Patch

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 565bd54..6c38337 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -587,6 +587,7 @@  int qcow2_alloc_clusters_at(BlockDriverState *bs, uint64_t offset,
 {
     BDRVQcowState *s = bs->opaque;
     uint64_t cluster_index;
+    uint64_t old_free_cluster_index;
     int i, refcount, ret;
 
     /* Check how many clusters there are free */
@@ -602,11 +603,16 @@  int qcow2_alloc_clusters_at(BlockDriverState *bs, uint64_t offset,
     }
 
     /* And then allocate them */
+    old_free_cluster_index = s->free_cluster_index;
+    s->free_cluster_index = cluster_index + i;
+
     ret = update_refcount(bs, offset, i << s->cluster_bits, 1);
     if (ret < 0) {
         return ret;
     }
 
+    s->free_cluster_index = old_free_cluster_index;
+
     return i;
 }