Patchwork block.c, block/vmdk.c: Fixed major bug in VMDK WRITE and READ handling - FIXES DATA CORRUPTION

login
register
mail settings
Submitter Paolo Bonzini
Date Nov. 9, 2012, 8:26 a.m.
Message ID <509CBE3A.4040103@redhat.com>
Download mbox | patch
Permalink /patch/197991/
State New
Headers show

Comments

Paolo Bonzini - Nov. 9, 2012, 8:26 a.m.
Il 08/11/2012 20:05, Gerhard Wiesinger ha scritto:
> Fixed a MAJOR BUG in VMDK files on file boundaries on reads
> and ALSO ON WRITES WHICH MIGHT CORRUPT THE IMAGE AND DATA!!!!!!
> 
> Triggered for example with the following VMDK file (partly listed):
> # Extent description
> RW 4193792 FLAT "XP-W1-f001.vmdk" 0
> RW 2097664 FLAT "XP-W1-f002.vmdk" 0
> RW 4193792 FLAT "XP-W1-f003.vmdk" 0
> RW 512 FLAT "XP-W1-f004.vmdk" 0
> RW 4193792 FLAT "XP-W1-f005.vmdk" 0
> RW 2097664 FLAT "XP-W1-f006.vmdk" 0
> RW 4193792 FLAT "XP-W1-f007.vmdk" 0
> RW 512 FLAT "XP-W1-f008.vmdk" 0
> 
> Patch includes:
> 1.) Patch fixes wrong calculation on extent boundaries. Especially it
> fixes the relativeness of the sector number to the current extent.

Please just fix _this_ part.  Everything else is not necessary for example
for distributions to fix this.  It's an important bug, so we actually want
to make that as simple as this.

> 2.) Added debug code to block.c and to block/vmdk.c to verify correctness

Same here.  Also, please use the tracing infrastructure---a lot of the debug
messages you're adding, though not all, are in fact already available (not
saying the others aren't useful!)

> 3.) Also optimized code which avoids multiplication and uses shifts.

The compiler can do this for you.

Most importantly, making it more complex for reviewers to find only the
"interesting" part.

Please check that the attached patch still works.

Paolo

---------------- 8< ------------------
From: Gerhard Wiesinger <lists@wiesinger.com>
Subject: [PATCH] vmdk: fix extent mapping

This patch fixes wrong calculation on extent boundaries, and computing the sector
number relative to the current extent. This happens on reads and also on write,
which might corrupt the image and data.

Triggered for example with the following VMDK file (partly listed):
# Extent description
RW 4193792 FLAT "XP-W1-f001.vmdk" 0
RW 2097664 FLAT "XP-W1-f002.vmdk" 0
RW 4193792 FLAT "XP-W1-f003.vmdk" 0
RW 512 FLAT "XP-W1-f004.vmdk" 0
RW 4193792 FLAT "XP-W1-f005.vmdk" 0
RW 2097664 FLAT "XP-W1-f006.vmdk" 0
RW 4193792 FLAT "XP-W1-f007.vmdk" 0
RW 512 FLAT "XP-W1-f008.vmdk" 0

Signed-off-by: Gerhard Wiesinger <lists@wiesinger.com>
---
 block/vmdk.c    | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)
Gerhard Wiesinger - Nov. 10, 2012, 8:30 a.m.
On 09.11.2012 09:26, Paolo Bonzini wrote:
> Il 08/11/2012 20:05, Gerhard Wiesinger ha scritto:
>> Fixed a MAJOR BUG in VMDK files on file boundaries on reads
>> and ALSO ON WRITES WHICH MIGHT CORRUPT THE IMAGE AND DATA!!!!!!
>>
>> Triggered for example with the following VMDK file (partly listed):
>> # Extent description
>> RW 4193792 FLAT "XP-W1-f001.vmdk" 0
>> RW 2097664 FLAT "XP-W1-f002.vmdk" 0
>> RW 4193792 FLAT "XP-W1-f003.vmdk" 0
>> RW 512 FLAT "XP-W1-f004.vmdk" 0
>> RW 4193792 FLAT "XP-W1-f005.vmdk" 0
>> RW 2097664 FLAT "XP-W1-f006.vmdk" 0
>> RW 4193792 FLAT "XP-W1-f007.vmdk" 0
>> RW 512 FLAT "XP-W1-f008.vmdk" 0
>>
>> Patch includes:
>> 1.) Patch fixes wrong calculation on extent boundaries. Especially it
>> fixes the relativeness of the sector number to the current extent.
> Please just fix _this_ part.  Everything else is not necessary for example
> for distributions to fix this.  It's an important bug, so we actually want
> to make that as simple as this.

Sent.

>
>> 2.) Added debug code to block.c and to block/vmdk.c to verify correctness
> Same here.  Also, please use the tracing infrastructure---a lot of the debug
> messages you're adding, though not all, are in fact already available (not
> saying the others aren't useful!)

Any chance that the patch with debug code only (after some cleaning) 
would be accepted (other modules do debug logging, too)?
I  don't like to do useless work.
Tracing infrastructure is quite limited to function calls only (as far 
as I saw).


>
>> 3.) Also optimized code which avoids multiplication and uses shifts.
> The compiler can do this for you.
>
> Most importantly, making it more complex for reviewers to find only the
> "interesting" part.
>
> Please check that the attached patch still works.
>

Made some tests and gcc is really clever in optimizing. Even other 
multipliers (513, 514, ...) than 512 will be optimized to shifts and 
adds until a limit of 512+8 (I think it was that) was reached. :-)

Bugfix only resent.

Ciao,
Gerhard
Paolo Bonzini - Nov. 10, 2012, 8:55 a.m.
Il 10/11/2012 09:30, Gerhard Wiesinger ha scritto:
>>> 2.) Added debug code to block.c and to block/vmdk.c to verify
>>> correctness
>> Same here.  Also, please use the tracing infrastructure---a lot of the
>> debug
>> messages you're adding, though not all, are in fact already available
>> (not
>> saying the others aren't useful!)
> 
> Any chance that the patch with debug code only (after some cleaning)
> would be accepted (other modules do debug logging, too)?
> I  don't like to do useless work.
> Tracing infrastructure is quite limited to function calls only (as far
> as I saw).

No, tracing infrastructure uses function calls for tracing (messages go
into trace-events) but you can apply it to everything you want.  Use the
stderr backend to debug it.

Debug patches using traces are certainly welcome.

Paolo
Gerhard Wiesinger - Nov. 10, 2012, 9:59 a.m.
On 10.11.2012 09:55, Paolo Bonzini wrote:
> Il 10/11/2012 09:30, Gerhard Wiesinger ha scritto:
>>>> 2.) Added debug code to block.c and to block/vmdk.c to verify
>>>> correctness
>>> Same here.  Also, please use the tracing infrastructure---a lot of the
>>> debug
>>> messages you're adding, though not all, are in fact already available
>>> (not
>>> saying the others aren't useful!)
>> Any chance that the patch with debug code only (after some cleaning)
>> would be accepted (other modules do debug logging, too)?
>> I  don't like to do useless work.
>> Tracing infrastructure is quite limited to function calls only (as far
>> as I saw).
> No, tracing infrastructure uses function calls for tracing (messages go
> into trace-events) but you can apply it to everything you want.  Use the
> stderr backend to debug it.

Tracing is a good thing for normal behavior but the major limitation is 
that a function call must be involved. But for deep debugging one needs 
a lot of more messages than function calls are available.

Of course every DPRINTF line could be made in a function call but IHMO 
this introduces unnecessary overhead in performance.

So how to proceed further, some options:
1.) Add additional function calls for each DPRINTF statement?
2.) Add just plain DPRINTF statements?
3.) Or a mixture of both: on function call boundaries use Tracing, in 
function debug info use DPRINTF?
4.) Refactor code that always function calls are involved?

Example:
static void traceing_func(int mul)
{
     // Do nothing here
}

// Just some dummy useless function doing illustration
static int addandmultiply(int arg1, int arg2)
{
     int mul = 0;
     int sum = arg1 + arg2;
     DPRINTF("....", arg1, arg2); // this one can be handled by tracing 
infrastructure
     DPRINTF("....", sum); // this one can't be done with tracing
     if (sum == 0)
     {
             DPRINTF("sum=0"); // this one can't be done with tracing
         return;
     }
     mul = arg1 * arg2;
     DPRINTF(...., mul);  // this one can't be done with tracing (except 
with a introduced tracing function)
     traceing_func(mul); // Introduce tracing function, performance penalty
     return sum + mul;
}

So how to proceed further?

Ciao,
Gerhard
Kevin Wolf - Nov. 13, 2012, 12:27 p.m.
Am 10.11.2012 10:59, schrieb Gerhard Wiesinger:
> On 10.11.2012 09:55, Paolo Bonzini wrote:
>> Il 10/11/2012 09:30, Gerhard Wiesinger ha scritto:
>>>>> 2.) Added debug code to block.c and to block/vmdk.c to verify
>>>>> correctness
>>>> Same here.  Also, please use the tracing infrastructure---a lot of the
>>>> debug
>>>> messages you're adding, though not all, are in fact already available
>>>> (not
>>>> saying the others aren't useful!)
>>> Any chance that the patch with debug code only (after some cleaning)
>>> would be accepted (other modules do debug logging, too)?
>>> I  don't like to do useless work.
>>> Tracing infrastructure is quite limited to function calls only (as far
>>> as I saw).
>> No, tracing infrastructure uses function calls for tracing (messages go
>> into trace-events) but you can apply it to everything you want.  Use the
>> stderr backend to debug it.
> 
> Tracing is a good thing for normal behavior but the major limitation is 
> that a function call must be involved. But for deep debugging one needs 
> a lot of more messages than function calls are available.
> 
> Of course every DPRINTF line could be made in a function call but IHMO 
> this introduces unnecessary overhead in performance.
> 
> So how to proceed further, some options:
> 1.) Add additional function calls for each DPRINTF statement?
> 2.) Add just plain DPRINTF statements?
> 3.) Or a mixture of both: on function call boundaries use Tracing, in 
> function debug info use DPRINTF?
> 4.) Refactor code that always function calls are involved?
> 
> Example:
> static void traceing_func(int mul)
> {
>      // Do nothing here
> }
> 
> // Just some dummy useless function doing illustration
> static int addandmultiply(int arg1, int arg2)
> {
>      int mul = 0;
>      int sum = arg1 + arg2;
>      DPRINTF("....", arg1, arg2); // this one can be handled by tracing 
> infrastructure
>      DPRINTF("....", sum); // this one can't be done with tracing

What's the problem? It would usually turn into something like

  trace_addandmultiply_sum(sum);

where trace_addandmultiply_sum() is a generated static inline function
in trace.h, which is empty and has zero overhead with disabled tracing.

Kevin
Stefan Hajnoczi - Nov. 13, 2012, 1:36 p.m.
On Tue, Nov 13, 2012 at 01:27:57PM +0100, Kevin Wolf wrote:
> Am 10.11.2012 10:59, schrieb Gerhard Wiesinger:
> > On 10.11.2012 09:55, Paolo Bonzini wrote:
> >> Il 10/11/2012 09:30, Gerhard Wiesinger ha scritto:
> >>>>> 2.) Added debug code to block.c and to block/vmdk.c to verify
> >>>>> correctness
> >>>> Same here.  Also, please use the tracing infrastructure---a lot of the
> >>>> debug
> >>>> messages you're adding, though not all, are in fact already available
> >>>> (not
> >>>> saying the others aren't useful!)
> >>> Any chance that the patch with debug code only (after some cleaning)
> >>> would be accepted (other modules do debug logging, too)?
> >>> I  don't like to do useless work.
> >>> Tracing infrastructure is quite limited to function calls only (as far
> >>> as I saw).
> >> No, tracing infrastructure uses function calls for tracing (messages go
> >> into trace-events) but you can apply it to everything you want.  Use the
> >> stderr backend to debug it.
> > 
> > Tracing is a good thing for normal behavior but the major limitation is 
> > that a function call must be involved. But for deep debugging one needs 
> > a lot of more messages than function calls are available.
> > 
> > Of course every DPRINTF line could be made in a function call but IHMO 
> > this introduces unnecessary overhead in performance.
> > 
> > So how to proceed further, some options:

QEMU tracing has a solution for this, use the *_ENABLED macro to
determine at compile-time whether tracing code should be compiled in or
not:

if (TRACE_MY_EXPENSIVE_THING_ENABLED) {
    const char *pi = calculate_pi_to_10000000000_decimal_places();
    trace_my_expensive_thing(pi);
}

If the trace event is marked with "disable" in trace-events, then
TRACE_MY_EXPENSIVE_THING_ENABLED will be 0 and the compiler will drop
the dead code.  This way you can put expensive debug-only tracing into
QEMU without affecting production builds - you do lose the advantage of
being able to toggle "disabled" trace events at runtime, but it still
beats DPRINTF().

Stefan

Patch

diff --git a/block/vmdk.c b/block/vmdk.c
index 1a80e5a..51398c0 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1092,6 +1092,7 @@  static int vmdk_read(BlockDriverState *bs, int64_t sector_num,
     BDRVVmdkState *s = bs->opaque;
     int ret;
     uint64_t n, index_in_cluster;
+    uint64_t extent_begin_sector, extent_relative_sector_num;
     VmdkExtent *extent = NULL;
     uint64_t cluster_offset;
 
@@ -1103,7 +1104,9 @@  static int vmdk_read(BlockDriverState *bs, int64_t sector_num,
         ret = get_cluster_offset(
                             bs, extent, NULL,
                             sector_num << 9, 0, &cluster_offset);
-        index_in_cluster = sector_num % extent->cluster_sectors;
+        extent_begin_sector = extent->end_sector - extent->sectors;
+        extent_relative_sector_num = sector_num - extent_begin_sector;
+        index_in_cluster = extent_relative_sector_num % extent->cluster_sectors;
         n = extent->cluster_sectors - index_in_cluster;
         if (n > nb_sectors) {
             n = nb_sectors;
@@ -1154,6 +1157,7 @@  static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
     VmdkExtent *extent = NULL;
     int n, ret;
     int64_t index_in_cluster;
+    uint64_t extent_begin_sector, extent_relative_sector_num;
     uint64_t cluster_offset;
     VmdkMetaData m_data;
 
@@ -1196,7 +1200,9 @@  static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
         if (ret) {
             return -EINVAL;
         }
-        index_in_cluster = sector_num % extent->cluster_sectors;
+        extent_begin_sector = extent->end_sector - extent->sectors;
+        extent_relative_sector_num = sector_num - extent_begin_sector;
+        index_in_cluster = extent_relative_sector_num % extent->cluster_sectors;
         n = extent->cluster_sectors - index_in_cluster;
         if (n > nb_sectors) {
             n = nb_sectors;