[RFC] block: separate raw images from the file protocol

Message ID 20100407203024.GA30897@lst.de
State New

Commit Message

Christoph Hellwig April 7, 2010, 8:30 p.m. UTC
We're running into various problems because the "raw" file access, which
is used internally by the various image formats, is entangled with the
"raw" image format, which maps the VM view 1:1 to a file system.

This patch renames the raw file backends to the file protocol which
is treated like other protocols (e.g. nbd and http) and adds a new
"raw" image format which is just a wrapper around calls to the underlying
protocol.
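
To illustrate the intended layering, here is a minimal, self-contained
sketch. It is not code from this patch: ToyDriver, ToyState, mem_read and
toy_raw_read are hypothetical names, and an in-memory buffer stands in for
the file protocol. The only point is that the "raw" format keeps a pointer
to the underlying protocol state and forwards each request to it unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for BlockDriver and BlockDriverState. */
typedef struct ToyState ToyState;

typedef struct ToyDriver {
    const char *name;
    int (*read)(ToyState *s, size_t offset, unsigned char *buf, size_t len);
} ToyDriver;

struct ToyState {
    const ToyDriver *drv;
    ToyState *child;            /* underlying protocol, NULL for a protocol */
    const unsigned char *data;  /* backing bytes, used by the protocol only */
    size_t size;
};

/* "Protocol": knows how to fetch bytes from the backing store. */
static int mem_read(ToyState *s, size_t offset, unsigned char *buf, size_t len)
{
    if (offset > s->size || len > s->size - offset) {
        return -1;              /* request is out of bounds */
    }
    memcpy(buf, s->data + offset, len);
    return 0;
}

/* "raw" format: maps the guest view 1:1, so it only forwards. */
static int toy_raw_read(ToyState *s, size_t offset, unsigned char *buf,
                        size_t len)
{
    assert(s->child != NULL);
    return s->child->drv->read(s->child, offset, buf, len);
}

static const ToyDriver toy_mem_protocol = { "mem", mem_read };
static const ToyDriver toy_raw_format   = { "raw", toy_raw_read };
```

A caller would build one ToyState for the protocol, point a second
ToyState's child at it, and issue reads through the "raw" state only.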

The patch is surprisingly simple: besides changing the probing logic
in block.c to only look for image formats when using bdrv_open, and
renaming the old raw protocols to file, there's almost nothing in there.
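
The protocol half of that split can be sketched as follows. This is a
simplified illustration, not the find_protocol() from block.c:
toy_protocol_name is a hypothetical helper, and it deliberately ignores
host device probing and Windows drive letters. It assumes only the rule
that a filename without ':' is handled by "file" and that otherwise the
text before the first ':' names the protocol.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: write the protocol name for filename into out.
 * A name without ':' falls back to "file"; otherwise everything before
 * the first ':' is taken as the protocol name. The result is truncated
 * to fit out_size, including the terminating NUL.
 */
static void toy_protocol_name(const char *filename, char *out, size_t out_size)
{
    const char *colon = strchr(filename, ':');
    const char *name = colon ? filename : "file";
    size_t len = colon ? (size_t)(colon - filename) : strlen(name);

    assert(out_size > 0);
    if (len > out_size - 1) {
        len = out_size - 1;
    }
    memcpy(out, name, len);
    out[len] = '\0';
}
```

With this rule "nbd:localhost:1234" selects "nbd" and "/tmp/test.img"
selects "file"; image format probing then happens one layer above.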

One thing that looks suspicious in the patch is moving the actual
posix file creation from raw-posix into the new raw image.  This is
a layering violation, but exactly the same as done by all other image
formats implementing the create operations, and not easily fixable
without a major API change in this area.

The only issues still open are in the handling of the host devices.
First, in current qemu we can specify the host* format names on the
various command lines accepting images, but the new code can't do that
without adding some translation.  Second, the layering breaks the
no_zero_init flag in the BlockDriver used by qemu-img.  I'm not happy
with how this is done per-driver instead of per-state, so I'll
prepare a separate patch to clean this up.

There's some more cleanup opportunity after this patch, e.g. using
separate lists and registration functions for image formats vs
protocols and maybe even host drivers, but this can be done at a
later stage.

Also there's a check for protocol in bdrv_open for the BDRV_O_SNAPSHOT
case that I don't quite understand, but which I fear won't work as
expected - possibly even before this patch.

Note that this patch requires various recent block patches from Kevin
and me, which should all be in his block queue.

Signed-off-by: Christoph Hellwig <hch@lst.de>

Comments

Kevin Wolf April 8, 2010, 9:50 a.m. UTC | #1
On 07.04.2010 22:30, Christoph Hellwig wrote:
> We're running into various problems because the "raw" file access, which
> is used internally by the various image formats, is entangled with the
> "raw" image format, which maps the VM view 1:1 to a file system.
> 
> This patch renames the raw file backends to the file protocol which
> is treated like other protocols (e.g. nbd and http) and adds a new
> "raw" image format which is just a wrapper around calls to the underlying
> protocol.

As you know, and as I mentioned in previous discussions, this approach is
exactly what I think we need in the block layer.

You provided a nice long patch description that covers almost
everything, so I think I can put most of my comments there.

> The patch is surprisingly simple: besides changing the probing logic
> in block.c to only look for image formats when using bdrv_open, and
> renaming the old raw protocols to file, there's almost nothing in there.
> 
> One thing that looks suspicious in the patch is moving the actual
> posix file creation from raw-posix into the new raw image.  This is
> a layering violation, but exactly the same as done by all other image
> formats implementing the create operations, and not easily fixable
> without a major API change in this area.

This is not only a layering violation, but also buggy in this case.
raw-win32.c has a different implementation of raw_create which wouldn't
be called any more.

The two solutions that I see are making raw_create a wrapper that calls
the create function of the protocol, or taking the full step and using
bdrv_* in the create functions of the drivers. I think the former is what
could be done to keep this patch simple, but the latter is what we should
aim for in the longer term.
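
A rough sketch of the wrapper variant, under simplifying assumptions: the
ToyDriver type and all toy_* names are hypothetical, and the create
callback takes a plain size instead of QEMUOptionParameter. The idea is
only that the format's create forwards to whatever create the protocol
provides, so the platform-specific implementation is the one that runs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified driver type; the real one takes an option list. */
typedef struct ToyDriver {
    const char *name;
    int (*create)(const char *filename, int64_t size);
} ToyDriver;

/* Records what the stand-in protocol was asked to create. */
static int64_t toy_last_created_size = -1;

/* Stand-in for a platform-specific create (raw-posix.c or raw-win32.c). */
static int toy_file_create(const char *filename, int64_t size)
{
    (void)filename;
    toy_last_created_size = size;
    return 0;
}

static const ToyDriver toy_file_protocol     = { "file", toy_file_create };
static const ToyDriver toy_nocreate_protocol = { "nocreate", NULL };

/* The format's create only forwards; no open()/ftruncate() lives here. */
static int toy_raw_create(const ToyDriver *protocol, const char *filename,
                          int64_t size)
{
    assert(protocol != NULL);
    if (!protocol->create) {
        return -ENOTSUP;    /* protocol cannot create images */
    }
    return protocol->create(filename, size);
}
```

Returning -ENOTSUP for a protocol without a create callback is a choice
made for this sketch, not something the patch specifies.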

> The only issues still open are in the handling of the host devices.
> First, in current qemu we can specify the host* format names on the
> various command lines accepting images, but the new code can't do that
> without adding some translation.  Second, the layering breaks the
> no_zero_init flag in the BlockDriver used by qemu-img.  I'm not happy
> with how this is done per-driver instead of per-state, so I'll
> prepare a separate patch to clean this up.

Hm, I don't like that very much, but there's probably no sane way around
it. It's clearly a property of the protocol and not of a single device,
but protocols might be stacked and just checking the first one doesn't
give the right result.
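
To make the stacking problem concrete, a minimal sketch with hypothetical
names (ToyLayer, toy_stack_no_zero_init). It assumes that the flag should
count as set when any layer in the stack sets it; that policy is an
assumption of this example and not something decided in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-state view of a driver stack; child is the layer below. */
typedef struct ToyLayer {
    const char *name;
    bool no_zero_init;
    const struct ToyLayer *child;
} ToyLayer;

/* True if any layer in the stack cannot promise zero-initialized storage. */
static bool toy_stack_no_zero_init(const ToyLayer *top)
{
    const ToyLayer *l;

    assert(top != NULL);
    for (l = top; l != NULL; l = l->child) {
        if (l->no_zero_init) {
            return true;
        }
    }
    return false;
}
```

With "raw" stacked on "host_device", the top layer alone reports false
while the walk over the whole stack reports true.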

Anyway, before merging this patch we obviously need to fix this kind of
thing (is it caught by qemu-iotests, by the way?). I'm not sure if we
should add a compatibility translation of host_device => raw or if we
should just remove support for that completely. It would be helpful to
know if this is actually used.

> There's some more cleanup opportunity after this patch, e.g. using
> separate lists and registration functions for image formats vs
> protocols and maybe even host drivers, but this can be done at a
> later stage.
> 
> Also there's a check for protocol in bdrv_open for the BDRV_O_SNAPSHOT
> case that I don't quite understand, but which I fear won't work as
> expected - possibly even before this patch.

You mean that is_protocol thing? It comes into play when you do strange
things like qemu -hda fat:/tmp/testdir -snapshot and I think it actually
does work.

Hm, apropos vvfat... Should vvfat actually be implemented as raw backed
by vvfat now instead of using vvfat directly? We could then forbid
protocols to be used directly.

> Note that this patch requires various recent block patches from Kevin
> and me, which should all be in his block queue.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> 
> Index: qemu/Makefile.objs
> ===================================================================
> --- qemu.orig/Makefile.objs	2010-04-07 13:56:27.429254145 +0200
> +++ qemu/Makefile.objs	2010-04-07 22:01:24.974284455 +0200
> @@ -12,7 +12,7 @@ block-obj-y += nbd.o block.o aio.o aes.o
>  block-obj-$(CONFIG_POSIX) += posix-aio-compat.o
>  block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
>  
> -block-nested-y += cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o
> +block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o
>  block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o
>  block-nested-y += parallels.o nbd.o
>  block-nested-$(CONFIG_WIN32) += raw-win32.o

This hunk only applies with fuzz on my queue (caused by blkdebug.o). Can
you make sure to rebase the final version on the queue?

> @@ -1026,12 +985,12 @@ static int hdev_create(const char *filen
>  
>  static BlockDriver bdrv_host_device = {
>      .format_name        = "host_device",
> +    .protocol_name        = "host_device",
>      .instance_size      = sizeof(BDRVRawState),
>      .bdrv_probe_device  = hdev_probe_device,
>      .bdrv_open          = hdev_open,
>      .bdrv_close         = raw_close,
>      .bdrv_create        = hdev_create,
> -    .create_options     = raw_create_options,
>      .no_zero_init       = 1,
>      .bdrv_flush         = raw_flush,

A driver that has a bdrv_create needs to also have create_options.
Either retain both or remove both. qemu-img create -f host_device
segfaults with this change.
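
The invariant could be checked along these lines. This is only a sketch
with hypothetical names (ToyDriver, ToyOption,
toy_driver_create_consistent); nothing like it exists in the patch. It
assumes the rule stated above: a create callback and its option list must
be either both present or both absent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced versions of the option and driver structures. */
typedef struct ToyOption {
    const char *name;
} ToyOption;

typedef struct ToyDriver {
    const char *format_name;
    int (*create)(const char *filename, const ToyOption *options);
    const ToyOption *create_options;
} ToyDriver;

/* A create callback and its option list must come as a pair. */
static bool toy_driver_create_consistent(const ToyDriver *drv)
{
    assert(drv != NULL);
    return (drv->create != NULL) == (drv->create_options != NULL);
}

/* Dummy create callback used to build example drivers. */
static int toy_create(const char *filename, const ToyOption *options)
{
    (void)filename;
    (void)options;
    return 0;
}

static const ToyOption toy_size_option[] = { { "size" }, { NULL } };
```

A driver initialized with toy_create but a NULL option list fails the
check, which is the shape of the host_device entry in this hunk.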

Kevin
Stefan Weil May 4, 2010, 8:58 p.m. UTC | #2
This patch (commit 84a12e6648444f517055138a7d7f25a22d7e1029)
breaks QEMU for Win32:

QEMU can no longer access \\.\PhysicalDrive0 - a feature I use quite often.

Found by git bisect, tested like this: qemu \\.\PhysicalDrive0

Stefan

Patch

Index: qemu/Makefile.objs
===================================================================
--- qemu.orig/Makefile.objs	2010-04-07 13:56:27.429254145 +0200
+++ qemu/Makefile.objs	2010-04-07 22:01:24.974284455 +0200
@@ -12,7 +12,7 @@  block-obj-y += nbd.o block.o aio.o aes.o
 block-obj-$(CONFIG_POSIX) += posix-aio-compat.o
 block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
 
-block-nested-y += cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o
+block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o
 block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o
 block-nested-y += parallels.o nbd.o
 block-nested-$(CONFIG_WIN32) += raw-win32.o
Index: qemu/block.c
===================================================================
--- qemu.orig/block.c	2010-04-07 13:56:27.435254284 +0200
+++ qemu/block.c	2010-04-07 22:21:25.803256099 +0200
@@ -248,6 +248,28 @@  int is_windows_drive(const char *filenam
 }
 #endif
 
+/*
+ * Detect host devices. By convention, /dev/cdrom[N] is always
+ * recognized as a host CDROM.
+ */
+static BlockDriver *find_hdev_driver(const char *filename)
+{
+    int score_max = 0, score;
+    BlockDriver *drv = NULL, *d;
+
+    for (d = first_drv; d; d = d->next) {
+        if (d->bdrv_probe_device) {
+            score = d->bdrv_probe_device(filename);
+            if (score > score_max) {
+                score_max = score;
+                drv = d;
+            }
+        }
+    }
+
+    return drv;
+}
+
 static BlockDriver *find_protocol(const char *filename)
 {
     BlockDriver *drv1;
@@ -258,11 +280,16 @@  static BlockDriver *find_protocol(const
 #ifdef _WIN32
     if (is_windows_drive(filename) ||
         is_windows_drive_prefix(filename))
-        return bdrv_find_format("raw");
+        return bdrv_find_format("file");
 #endif
     p = strchr(filename, ':');
-    if (!p)
-        return bdrv_find_format("raw");
+    if (!p) {
+        drv1 = find_hdev_driver(filename);
+        if (!drv1) {
+            drv1 = bdrv_find_format("file");
+        }
+        return drv1;
+    }
     len = p - filename;
     if (len > sizeof(protocol) - 1)
         len = sizeof(protocol) - 1;
@@ -276,28 +303,6 @@  static BlockDriver *find_protocol(const
     return NULL;
 }
 
-/*
- * Detect host devices. By convention, /dev/cdrom[N] is always
- * recognized as a host CDROM.
- */
-static BlockDriver *find_hdev_driver(const char *filename)
-{
-    int score_max = 0, score;
-    BlockDriver *drv = NULL, *d;
-
-    for (d = first_drv; d; d = d->next) {
-        if (d->bdrv_probe_device) {
-            score = d->bdrv_probe_device(filename);
-            if (score > score_max) {
-                score_max = score;
-                drv = d;
-            }
-        }
-    }
-
-    return drv;
-}
-
 static BlockDriver *find_image_format(const char *filename)
 {
     int ret, score, score_max;
@@ -320,6 +325,7 @@  static BlockDriver *find_image_format(co
     }
 
     score_max = 0;
+    drv = NULL;
     for(drv1 = first_drv; drv1 != NULL; drv1 = drv1->next) {
         if (drv1->bdrv_probe) {
             score = drv1->bdrv_probe(buf, ret, filename);
@@ -424,10 +430,7 @@  int bdrv_open(BlockDriverState *bs, cons
     pstrcpy(bs->filename, sizeof(bs->filename), filename);
 
     if (!drv) {
-        drv = find_hdev_driver(filename);
-        if (!drv) {
-            drv = find_image_format(filename);
-        }
+        drv = find_image_format(filename);
     }
 
     if (!drv) {
Index: qemu/block/raw-posix.c
===================================================================
--- qemu.orig/block/raw-posix.c	2010-04-07 13:56:27.446254494 +0200
+++ qemu/block/raw-posix.c	2010-04-07 22:21:01.012006275 +0200
@@ -723,60 +723,21 @@  static int64_t raw_getlength(BlockDriver
 }
 #endif
 
-static int raw_create(const char *filename, QEMUOptionParameter *options)
-{
-    int fd;
-    int result = 0;
-    int64_t total_size = 0;
-
-    /* Read out options */
-    while (options && options->name) {
-        if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
-            total_size = options->value.n / 512;
-        }
-        options++;
-    }
-
-    fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
-              0644);
-    if (fd < 0) {
-        result = -errno;
-    } else {
-        if (ftruncate(fd, total_size * 512) != 0) {
-            result = -errno;
-        }
-        if (close(fd) != 0) {
-            result = -errno;
-        }
-    }
-    return result;
-}
-
 static void raw_flush(BlockDriverState *bs)
 {
     BDRVRawState *s = bs->opaque;
     qemu_fdatasync(s->fd);
 }
 
-
-static QEMUOptionParameter raw_create_options[] = {
-    {
-        .name = BLOCK_OPT_SIZE,
-        .type = OPT_SIZE,
-        .help = "Virtual disk size"
-    },
-    { NULL }
-};
-
-static BlockDriver bdrv_raw = {
-    .format_name = "raw",
+static BlockDriver bdrv_file = {
+    .format_name = "file",
+    .protocol_name = "file",
     .instance_size = sizeof(BDRVRawState),
     .bdrv_probe = NULL, /* no probe for protocols */
     .bdrv_open = raw_open,
     .bdrv_read = raw_read,
     .bdrv_write = raw_write,
     .bdrv_close = raw_close,
-    .bdrv_create = raw_create,
     .bdrv_flush = raw_flush,
 
     .bdrv_aio_readv = raw_aio_readv,
@@ -785,8 +746,6 @@  static BlockDriver bdrv_raw = {
 
     .bdrv_truncate = raw_truncate,
     .bdrv_getlength = raw_getlength,
-
-    .create_options = raw_create_options,
 };
 
 /***********************************************/
@@ -1026,12 +985,12 @@  static int hdev_create(const char *filen
 
 static BlockDriver bdrv_host_device = {
     .format_name        = "host_device",
+    .protocol_name        = "host_device",
     .instance_size      = sizeof(BDRVRawState),
     .bdrv_probe_device  = hdev_probe_device,
     .bdrv_open          = hdev_open,
     .bdrv_close         = raw_close,
     .bdrv_create        = hdev_create,
-    .create_options     = raw_create_options,
     .no_zero_init       = 1,
     .bdrv_flush         = raw_flush,
 
@@ -1140,12 +1099,12 @@  static int floppy_eject(BlockDriverState
 
 static BlockDriver bdrv_host_floppy = {
     .format_name        = "host_floppy",
+    .protocol_name      = "host_floppy",
     .instance_size      = sizeof(BDRVRawState),
     .bdrv_probe_device	= floppy_probe_device,
     .bdrv_open          = floppy_open,
     .bdrv_close         = raw_close,
     .bdrv_create        = hdev_create,
-    .create_options     = raw_create_options,
     .no_zero_init       = 1,
     .bdrv_flush         = raw_flush,
 
@@ -1239,12 +1198,12 @@  static int cdrom_set_locked(BlockDriverS
 
 static BlockDriver bdrv_host_cdrom = {
     .format_name        = "host_cdrom",
+    .protocol_name      = "host_cdrom",
     .instance_size      = sizeof(BDRVRawState),
     .bdrv_probe_device	= cdrom_probe_device,
     .bdrv_open          = cdrom_open,
     .bdrv_close         = raw_close,
     .bdrv_create        = hdev_create,
-    .create_options     = raw_create_options,
     .no_zero_init       = 1,
     .bdrv_flush         = raw_flush,
 
@@ -1361,12 +1320,12 @@  static int cdrom_set_locked(BlockDriverS
 
 static BlockDriver bdrv_host_cdrom = {
     .format_name        = "host_cdrom",
+    .protocol_name      = "host_cdrom",
     .instance_size      = sizeof(BDRVRawState),
     .bdrv_probe_device	= cdrom_probe_device,
     .bdrv_open          = cdrom_open,
     .bdrv_close         = raw_close,
     .bdrv_create        = hdev_create,
-    .create_options     = raw_create_options,
     .no_zero_init       = 1,
     .bdrv_flush         = raw_flush,
 
@@ -1385,13 +1344,13 @@  static BlockDriver bdrv_host_cdrom = {
 };
 #endif /* __FreeBSD__ */
 
-static void bdrv_raw_init(void)
+static void bdrv_file_init(void)
 {
     /*
      * Register all the drivers.  Note that order is important, the driver
      * registered last will get probed first.
      */
-    bdrv_register(&bdrv_raw);
+    bdrv_register(&bdrv_file);
     bdrv_register(&bdrv_host_device);
 #ifdef __linux__
     bdrv_register(&bdrv_host_floppy);
@@ -1402,4 +1361,4 @@  static void bdrv_raw_init(void)
 #endif
 }
 
-block_init(bdrv_raw_init);
+block_init(bdrv_file_init);
Index: qemu/block/raw-win32.c
===================================================================
--- qemu.orig/block/raw-win32.c	2010-04-07 13:56:27.458254284 +0200
+++ qemu/block/raw-win32.c	2010-04-07 22:20:15.295005645 +0200
@@ -238,8 +238,9 @@  static QEMUOptionParameter raw_create_op
     { NULL }
 };
 
-static BlockDriver bdrv_raw = {
-    .format_name	= "raw",
+static BlockDriver bdrv_file = {
+    .format_name	= "file",
+    .protocol_name	= "file",
     .instance_size	= sizeof(BDRVRawState),
     .bdrv_open		= raw_open,
     .bdrv_close		= raw_close,
@@ -395,6 +396,7 @@  static int raw_set_locked(BlockDriverSta
 
 static BlockDriver bdrv_host_device = {
     .format_name	= "host_device",
+    .protocol_name	= "host_device",
     .instance_size	= sizeof(BDRVRawState),
     .bdrv_probe_device	= hdev_probe_device,
     .bdrv_open		= hdev_open,
@@ -406,10 +408,10 @@  static BlockDriver bdrv_host_device = {
     .bdrv_getlength	= raw_getlength,
 };
 
-static void bdrv_raw_init(void)
+static void bdrv_file_init(void)
 {
-    bdrv_register(&bdrv_raw);
+    bdrv_register(&bdrv_file);
     bdrv_register(&bdrv_host_device);
 }
 
-block_init(bdrv_raw_init);
+block_init(bdrv_file_init);
Index: qemu/block/raw.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ qemu/block/raw.c	2010-04-07 22:07:49.054255891 +0200
@@ -0,0 +1,195 @@ 
+
+#include "qemu-common.h"
+#include "block_int.h"
+#include "module.h"
+
+typedef struct RAWState {
+    BlockDriverState *hd;
+} RAWState;
+
+static int raw_open(BlockDriverState *bs, const char *filename, int flags)
+{
+    RAWState *s = bs->opaque;
+    int ret;
+
+    ret = bdrv_file_open(&s->hd, filename, flags);
+    if (!ret) {
+        bs->sg = s->hd->sg;
+    }
+
+    return ret;
+}
+
+static int raw_read(BlockDriverState *bs, int64_t sector_num,
+                    uint8_t *buf, int nb_sectors)
+{
+    RAWState *s = bs->opaque;
+    return bdrv_read(s->hd, sector_num, buf, nb_sectors);
+}
+
+static int raw_write(BlockDriverState *bs, int64_t sector_num,
+                     const uint8_t *buf, int nb_sectors)
+{
+    RAWState *s = bs->opaque;
+    return bdrv_write(s->hd, sector_num, buf, nb_sectors);
+}
+
+static BlockDriverAIOCB *raw_aio_readv(BlockDriverState *bs,
+    int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
+    BlockDriverCompletionFunc *cb, void *opaque)
+{
+    RAWState *s = bs->opaque;
+
+    return bdrv_aio_readv(s->hd, sector_num, qiov, nb_sectors, cb, opaque);
+}
+
+static BlockDriverAIOCB *raw_aio_writev(BlockDriverState *bs,
+    int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
+    BlockDriverCompletionFunc *cb, void *opaque)
+{
+    RAWState *s = bs->opaque;
+
+    return bdrv_aio_writev(s->hd, sector_num, qiov, nb_sectors, cb, opaque);
+}
+
+static void raw_close(BlockDriverState *bs)
+{
+    RAWState *s = bs->opaque;
+    bdrv_delete(s->hd);
+}
+
+static void raw_flush(BlockDriverState *bs)
+{
+    RAWState *s = bs->opaque;
+    bdrv_flush(s->hd);
+}
+
+static BlockDriverAIOCB *raw_aio_flush(BlockDriverState *bs,
+    BlockDriverCompletionFunc *cb, void *opaque)
+{
+    RAWState *s = bs->opaque;
+    return bdrv_aio_flush(s->hd, cb, opaque);
+}
+
+static int64_t raw_getlength(BlockDriverState *bs)
+{
+    RAWState *s = bs->opaque;
+    return bdrv_getlength(s->hd);
+}
+
+static int raw_truncate(BlockDriverState *bs, int64_t offset)
+{
+    RAWState *s = bs->opaque;
+    return bdrv_truncate(s->hd, offset);
+}
+
+static int raw_probe(const uint8_t *buf, int buf_size, const char *filename)
+{
+   return 1; /* everything can be opened as raw image */
+}
+
+static int raw_is_inserted(BlockDriverState *bs)
+{
+    RAWState *s = bs->opaque;
+    return bdrv_is_inserted(s->hd);
+}
+
+static int raw_eject(BlockDriverState *bs, int eject_flag)
+{
+    RAWState *s = bs->opaque;
+    return bdrv_eject(s->hd, eject_flag);
+}
+
+static int raw_set_locked(BlockDriverState *bs, int locked)
+{
+    RAWState *s = bs->opaque;
+    bdrv_set_locked(s->hd, locked);
+    return 0;
+}
+
+static int raw_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
+{
+   RAWState *s = bs->opaque;
+   return bdrv_ioctl(s->hd, req, buf);
+}
+
+static BlockDriverAIOCB *raw_aio_ioctl(BlockDriverState *bs,
+        unsigned long int req, void *buf,
+        BlockDriverCompletionFunc *cb, void *opaque)
+{
+   RAWState *s = bs->opaque;
+   return bdrv_aio_ioctl(s->hd, req, buf, cb, opaque);
+}
+
+static int raw_create(const char *filename, QEMUOptionParameter *options)
+{
+    int fd;
+    int result = 0;
+    int64_t total_size = 0;
+
+    /* Read out options */
+    while (options && options->name) {
+        if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
+            total_size = options->value.n / 512;
+        }
+        options++;
+    }
+
+    fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
+              0644);
+    if (fd < 0) {
+        result = -errno;
+    } else {
+        if (ftruncate(fd, total_size * 512) != 0) {
+            result = -errno;
+        }
+        if (close(fd) != 0) {
+            result = -errno;
+        }
+    }
+    return result;
+}
+
+static QEMUOptionParameter raw_create_options[] = {
+    {
+        .name = BLOCK_OPT_SIZE,
+        .type = OPT_SIZE,
+        .help = "Virtual disk size"
+    },
+    { NULL }
+};
+
+static BlockDriver bdrv_raw = {
+    .format_name        = "raw",
+
+    .instance_size      = sizeof(RAWState),
+
+    .bdrv_open          = raw_open,
+    .bdrv_close         = raw_close,
+    .bdrv_read          = raw_read,
+    .bdrv_write         = raw_write,
+    .bdrv_flush         = raw_flush,
+    .bdrv_probe         = raw_probe,
+    .bdrv_getlength     = raw_getlength,
+    .bdrv_truncate      = raw_truncate,
+
+    .bdrv_aio_readv     = raw_aio_readv,
+    .bdrv_aio_writev    = raw_aio_writev,
+    .bdrv_aio_flush     = raw_aio_flush,
+
+    .bdrv_is_inserted   = raw_is_inserted,
+    .bdrv_eject         = raw_eject,
+    .bdrv_set_locked    = raw_set_locked,
+    .bdrv_ioctl         = raw_ioctl,
+    .bdrv_aio_ioctl     = raw_aio_ioctl,
+
+    .bdrv_create        = raw_create,
+    .create_options     = raw_create_options,
+};
+
+static void bdrv_raw_init(void)
+{
+    bdrv_register(&bdrv_raw);
+}
+
+block_init(bdrv_raw_init);