From patchwork Wed Apr 7 20:30:24 2010
X-Patchwork-Submitter: Christoph Hellwig
X-Patchwork-Id: 49641
Date: Wed, 7 Apr 2010 22:30:24 +0200
From: Christoph Hellwig
To: qemu-devel@nongnu.org
Message-ID: <20100407203024.GA30897@lst.de>
Subject: [Qemu-devel] [PATCH, RFC] block: separate raw images from the file protocol
List-Id: qemu-devel.nongnu.org

We're running into various problems because the "raw" file access, which
is used internally by the various image formats, is entangled with the
"raw" image format, which maps the VM view 1:1 to a file system.

This patch renames the raw file backends to the file protocol, which is
treated like other protocols (e.g. nbd and http), and adds a new "raw"
image format which is just a wrapper around calls to the underlying
protocol.

The patch is surprisingly simple: besides changing the probing logic in
block.c to only look for image formats when using bdrv_open, and renaming
the old raw protocols to file, there's almost nothing in there.

One thing that looks suspicious in the patch is moving the actual posix
file creation from raw-posix into the new raw image. This is a layering
violation, but exactly the same as done by all other image formats
implementing the create operations, and not easily fixable without a
major API change in this area.

The only issues still open are in the handling of the host devices.
Firstly, in current qemu we can specify the host* format names on the
various command lines accepting images, but the new code can't do that
without adding some translation. Second, the layering breaks the
no_zero_init flag in the BlockDriver used by qemu-img. I'm not happy with
how this is done per-driver instead of per-state, so I'll prepare a
separate patch to clean this up.

There's some more cleanup opportunity after this patch, e.g. using
separate lists and registration functions for image formats vs protocols
and maybe even host drivers, but this can be done at a later stage.

Also there's a check for protocol in bdrv_open for the BDRV_O_SNAPSHOT
case that I don't quite understand, but which I fear won't work as
expected - possibly even before this patch.

Note that this patch requires various recent block patches from Kevin and
me, which should all be in his block queue.

Signed-off-by: Christoph Hellwig

Index: qemu/Makefile.objs
===================================================================
--- qemu.orig/Makefile.objs 2010-04-07 13:56:27.429254145 +0200
+++ qemu/Makefile.objs 2010-04-07 22:01:24.974284455 +0200
@@ -12,7 +12,7 @@ block-obj-y += nbd.o block.o aio.o aes.o
 block-obj-$(CONFIG_POSIX) += posix-aio-compat.o
 block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
 
-block-nested-y += cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o
+block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o
 block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o
 block-nested-y += parallels.o nbd.o
 block-nested-$(CONFIG_WIN32) += raw-win32.o
Index: qemu/block.c
===================================================================
--- qemu.orig/block.c 2010-04-07 13:56:27.435254284 +0200
+++ qemu/block.c 2010-04-07 22:21:25.803256099 +0200
@@ -248,6 +248,28 @@ int is_windows_drive(const char *filenam
 }
 #endif
 
+/*
+ * Detect host devices. By convention, /dev/cdrom[N] is always
+ * recognized as a host CDROM.
+ */
+static BlockDriver *find_hdev_driver(const char *filename)
+{
+    int score_max = 0, score;
+    BlockDriver *drv = NULL, *d;
+
+    for (d = first_drv; d; d = d->next) {
+        if (d->bdrv_probe_device) {
+            score = d->bdrv_probe_device(filename);
+            if (score > score_max) {
+                score_max = score;
+                drv = d;
+            }
+        }
+    }
+
+    return drv;
+}
+
 static BlockDriver *find_protocol(const char *filename)
 {
     BlockDriver *drv1;
@@ -258,11 +280,16 @@ static BlockDriver *find_protocol(const
 #ifdef _WIN32
     if (is_windows_drive(filename) ||
         is_windows_drive_prefix(filename))
-        return bdrv_find_format("raw");
+        return bdrv_find_format("file");
 #endif
     p = strchr(filename, ':');
-    if (!p)
-        return bdrv_find_format("raw");
+    if (!p) {
+        drv1 = find_hdev_driver(filename);
+        if (!drv1) {
+            drv1 = bdrv_find_format("file");
+        }
+        return drv1;
+    }
     len = p - filename;
     if (len > sizeof(protocol) - 1)
         len = sizeof(protocol) - 1;
@@ -276,28 +303,6 @@ static BlockDriver *find_protocol(const
     return NULL;
 }
 
-/*
- * Detect host devices. By convention, /dev/cdrom[N] is always
- * recognized as a host CDROM.
- */
-static BlockDriver *find_hdev_driver(const char *filename)
-{
-    int score_max = 0, score;
-    BlockDriver *drv = NULL, *d;
-
-    for (d = first_drv; d; d = d->next) {
-        if (d->bdrv_probe_device) {
-            score = d->bdrv_probe_device(filename);
-            if (score > score_max) {
-                score_max = score;
-                drv = d;
-            }
-        }
-    }
-
-    return drv;
-}
-
 static BlockDriver *find_image_format(const char *filename)
 {
     int ret, score, score_max;
@@ -320,6 +325,7 @@ static BlockDriver *find_image_format(co
     }
 
     score_max = 0;
+    drv = NULL;
     for(drv1 = first_drv; drv1 != NULL; drv1 = drv1->next) {
         if (drv1->bdrv_probe) {
             score = drv1->bdrv_probe(buf, ret, filename);
@@ -424,10 +430,7 @@ int bdrv_open(BlockDriverState *bs, cons
     pstrcpy(bs->filename, sizeof(bs->filename), filename);
 
     if (!drv) {
-        drv = find_hdev_driver(filename);
-        if (!drv) {
-            drv = find_image_format(filename);
-        }
+        drv = find_image_format(filename);
     }
 
     if (!drv) {
Index: qemu/block/raw-posix.c
===================================================================
--- qemu.orig/block/raw-posix.c 2010-04-07 13:56:27.446254494 +0200
+++ qemu/block/raw-posix.c 2010-04-07 22:21:01.012006275 +0200
@@ -723,60 +723,21 @@ static int64_t raw_getlength(BlockDriver
 }
 #endif
 
-static int raw_create(const char *filename, QEMUOptionParameter *options)
-{
-    int fd;
-    int result = 0;
-    int64_t total_size = 0;
-
-    /* Read out options */
-    while (options && options->name) {
-        if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
-            total_size = options->value.n / 512;
-        }
-        options++;
-    }
-
-    fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
-              0644);
-    if (fd < 0) {
-        result = -errno;
-    } else {
-        if (ftruncate(fd, total_size * 512) != 0) {
-            result = -errno;
-        }
-        if (close(fd) != 0) {
-            result = -errno;
-        }
-    }
-    return result;
-}
-
 static void raw_flush(BlockDriverState *bs)
 {
     BDRVRawState *s = bs->opaque;
     qemu_fdatasync(s->fd);
 }
 
-
-static QEMUOptionParameter raw_create_options[] = {
-    {
-        .name = BLOCK_OPT_SIZE,
-        .type = OPT_SIZE,
-        .help = "Virtual disk size"
-    },
-    { NULL }
-};
-
-static BlockDriver bdrv_raw = {
-    .format_name = "raw",
+static BlockDriver bdrv_file = {
+    .format_name = "file",
+    .protocol_name = "file",
     .instance_size = sizeof(BDRVRawState),
     .bdrv_probe = NULL, /* no probe for protocols */
     .bdrv_open = raw_open,
     .bdrv_read = raw_read,
     .bdrv_write = raw_write,
     .bdrv_close = raw_close,
-    .bdrv_create = raw_create,
     .bdrv_flush = raw_flush,
 
     .bdrv_aio_readv = raw_aio_readv,
@@ -785,8 +746,6 @@ static BlockDriver bdrv_raw = {
 
     .bdrv_truncate = raw_truncate,
     .bdrv_getlength = raw_getlength,
-
-    .create_options = raw_create_options,
 };
 
 /***********************************************/
@@ -1026,12 +985,12 @@ static int hdev_create(const char *filen
 
 static BlockDriver bdrv_host_device = {
     .format_name = "host_device",
+    .protocol_name = "host_device",
     .instance_size = sizeof(BDRVRawState),
     .bdrv_probe_device = hdev_probe_device,
     .bdrv_open = hdev_open,
     .bdrv_close = raw_close,
     .bdrv_create = hdev_create,
-    .create_options = raw_create_options,
     .no_zero_init = 1,
     .bdrv_flush = raw_flush,
 
@@ -1140,12 +1099,12 @@ static int floppy_eject(BlockDriverState
 
 static BlockDriver bdrv_host_floppy = {
     .format_name = "host_floppy",
+    .protocol_name = "host_floppy",
     .instance_size = sizeof(BDRVRawState),
     .bdrv_probe_device = floppy_probe_device,
     .bdrv_open = floppy_open,
     .bdrv_close = raw_close,
     .bdrv_create = hdev_create,
-    .create_options = raw_create_options,
     .no_zero_init = 1,
     .bdrv_flush = raw_flush,
 
@@ -1239,12 +1198,12 @@ static int cdrom_set_locked(BlockDriverS
 
 static BlockDriver bdrv_host_cdrom = {
     .format_name = "host_cdrom",
+    .protocol_name = "host_cdrom",
     .instance_size = sizeof(BDRVRawState),
     .bdrv_probe_device = cdrom_probe_device,
     .bdrv_open = cdrom_open,
     .bdrv_close = raw_close,
     .bdrv_create = hdev_create,
-    .create_options = raw_create_options,
     .no_zero_init = 1,
     .bdrv_flush = raw_flush,
 
@@ -1361,12 +1320,12 @@ static int cdrom_set_locked(BlockDriverS
 
 static BlockDriver bdrv_host_cdrom = {
     .format_name = "host_cdrom",
+    .protocol_name = "host_cdrom",
     .instance_size = sizeof(BDRVRawState),
     .bdrv_probe_device = cdrom_probe_device,
     .bdrv_open = cdrom_open,
     .bdrv_close = raw_close,
     .bdrv_create = hdev_create,
-    .create_options = raw_create_options,
     .no_zero_init = 1,
     .bdrv_flush = raw_flush,
 
@@ -1385,13 +1344,13 @@ static BlockDriver bdrv_host_cdrom = {
 };
 #endif /* __FreeBSD__ */
 
-static void bdrv_raw_init(void)
+static void bdrv_file_init(void)
 {
     /*
      * Register all the drivers. Note that order is important, the driver
      * registered last will get probed first.
      */
-    bdrv_register(&bdrv_raw);
+    bdrv_register(&bdrv_file);
     bdrv_register(&bdrv_host_device);
 #ifdef __linux__
     bdrv_register(&bdrv_host_floppy);
@@ -1402,4 +1361,4 @@ static void bdrv_raw_init(void)
 #endif
 }
 
-block_init(bdrv_raw_init);
+block_init(bdrv_file_init);
Index: qemu/block/raw-win32.c
===================================================================
--- qemu.orig/block/raw-win32.c 2010-04-07 13:56:27.458254284 +0200
+++ qemu/block/raw-win32.c 2010-04-07 22:20:15.295005645 +0200
@@ -238,8 +238,9 @@ static QEMUOptionParameter raw_create_op
     { NULL }
 };
 
-static BlockDriver bdrv_raw = {
-    .format_name = "raw",
+static BlockDriver bdrv_file = {
+    .format_name = "file",
+    .protocol_name = "file",
     .instance_size = sizeof(BDRVRawState),
     .bdrv_open = raw_open,
     .bdrv_close = raw_close,
@@ -395,6 +396,7 @@ static int raw_set_locked(BlockDriverSta
 
 static BlockDriver bdrv_host_device = {
     .format_name = "host_device",
+    .protocol_name = "host_device",
     .instance_size = sizeof(BDRVRawState),
     .bdrv_probe_device = hdev_probe_device,
     .bdrv_open = hdev_open,
@@ -406,10 +408,10 @@ static BlockDriver bdrv_host_device = {
     .bdrv_getlength = raw_getlength,
 };
 
-static void bdrv_raw_init(void)
+static void bdrv_file_init(void)
 {
-    bdrv_register(&bdrv_raw);
+    bdrv_register(&bdrv_file);
     bdrv_register(&bdrv_host_device);
 }
 
-block_init(bdrv_raw_init);
+block_init(bdrv_file_init);
Index: qemu/block/raw.c
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ qemu/block/raw.c 2010-04-07 22:07:49.054255891 +0200
@@ -0,0 +1,195 @@
+
+#include "qemu-common.h"
+#include "block_int.h"
+#include "module.h"
+
+typedef struct RAWState {
+    BlockDriverState *hd;
+} RAWState;
+
+static int raw_open(BlockDriverState *bs, const char *filename, int flags)
+{
+    RAWState *s = bs->opaque;
+    int ret;
+
+    ret = bdrv_file_open(&s->hd, filename, flags);
+    if (!ret) {
+        bs->sg = s->hd->sg;
+    }
+
+    return ret;
+}
+
+static int raw_read(BlockDriverState *bs, int64_t sector_num,
+                    uint8_t *buf, int nb_sectors)
+{
+    RAWState *s = bs->opaque;
+    return bdrv_read(s->hd, sector_num, buf, nb_sectors);
+}
+
+static int raw_write(BlockDriverState *bs, int64_t sector_num,
+                     const uint8_t *buf, int nb_sectors)
+{
+    RAWState *s = bs->opaque;
+    return bdrv_write(s->hd, sector_num, buf, nb_sectors);
+}
+
+static BlockDriverAIOCB *raw_aio_readv(BlockDriverState *bs,
+    int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
+    BlockDriverCompletionFunc *cb, void *opaque)
+{
+    RAWState *s = bs->opaque;
+
+    return bdrv_aio_readv(s->hd, sector_num, qiov, nb_sectors, cb, opaque);
+}
+
+static BlockDriverAIOCB *raw_aio_writev(BlockDriverState *bs,
+    int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
+    BlockDriverCompletionFunc *cb, void *opaque)
+{
+    RAWState *s = bs->opaque;
+
+    return bdrv_aio_writev(s->hd, sector_num, qiov, nb_sectors, cb, opaque);
+}
+
+static void raw_close(BlockDriverState *bs)
+{
+    RAWState *s = bs->opaque;
+    bdrv_delete(s->hd);
+}
+
+static void raw_flush(BlockDriverState *bs)
+{
+    RAWState *s = bs->opaque;
+    bdrv_flush(s->hd);
+}
+
+static BlockDriverAIOCB *raw_aio_flush(BlockDriverState *bs,
+    BlockDriverCompletionFunc *cb, void *opaque)
+{
+    RAWState *s = bs->opaque;
+    return bdrv_aio_flush(s->hd, cb, opaque);
+}
+
+static int64_t raw_getlength(BlockDriverState *bs)
+{
+    RAWState *s = bs->opaque;
+    return bdrv_getlength(s->hd);
+}
+
+static int raw_truncate(BlockDriverState *bs, int64_t offset)
+{
+    RAWState *s = bs->opaque;
+    return bdrv_truncate(s->hd, offset);
+}
+
+static int raw_probe(const uint8_t *buf, int buf_size, const char *filename)
+{
+    return 1; /* everything can be opened as raw image */
+}
+
+static int raw_is_inserted(BlockDriverState *bs)
+{
+    RAWState *s = bs->opaque;
+    return bdrv_is_inserted(s->hd);
+}
+
+static int raw_eject(BlockDriverState *bs, int eject_flag)
+{
+    RAWState *s = bs->opaque;
+    return bdrv_eject(s->hd, eject_flag);
+}
+
+static int raw_set_locked(BlockDriverState *bs, int locked)
+{
+    RAWState *s = bs->opaque;
+    bdrv_set_locked(s->hd, locked);
+    return 0;
+}
+
+static int raw_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
+{
+    RAWState *s = bs->opaque;
+    return bdrv_ioctl(s->hd, req, buf);
+}
+
+static BlockDriverAIOCB *raw_aio_ioctl(BlockDriverState *bs,
+    unsigned long int req, void *buf,
+    BlockDriverCompletionFunc *cb, void *opaque)
+{
+    RAWState *s = bs->opaque;
+    return bdrv_aio_ioctl(s->hd, req, buf, cb, opaque);
+}
+
+static int raw_create(const char *filename, QEMUOptionParameter *options)
+{
+    int fd;
+    int result = 0;
+    int64_t total_size = 0;
+
+    /* Read out options */
+    while (options && options->name) {
+        if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
+            total_size = options->value.n / 512;
+        }
+        options++;
+    }
+
+    fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
+              0644);
+    if (fd < 0) {
+        result = -errno;
+    } else {
+        if (ftruncate(fd, total_size * 512) != 0) {
+            result = -errno;
+        }
+        if (close(fd) != 0) {
+            result = -errno;
+        }
+    }
+    return result;
+}
+
+static QEMUOptionParameter raw_create_options[] = {
+    {
+        .name = BLOCK_OPT_SIZE,
+        .type = OPT_SIZE,
+        .help = "Virtual disk size"
+    },
+    { NULL }
+};
+
+static BlockDriver bdrv_raw = {
+    .format_name = "raw",
+
+    .instance_size = sizeof(RAWState),
+
+    .bdrv_open = raw_open,
+    .bdrv_close = raw_close,
+    .bdrv_read = raw_read,
+    .bdrv_write = raw_write,
+    .bdrv_flush = raw_flush,
+    .bdrv_probe = raw_probe,
+    .bdrv_getlength = raw_getlength,
+    .bdrv_truncate = raw_truncate,
+
+    .bdrv_aio_readv = raw_aio_readv,
+    .bdrv_aio_writev = raw_aio_writev,
+    .bdrv_aio_flush = raw_aio_flush,
+
+    .bdrv_is_inserted = raw_is_inserted,
+    .bdrv_eject = raw_eject,
+    .bdrv_set_locked = raw_set_locked,
+    .bdrv_ioctl = raw_ioctl,
+    .bdrv_aio_ioctl = raw_aio_ioctl,
+
+    .bdrv_create = raw_create,
+    .create_options = raw_create_options,
+};
+
+static void bdrv_raw_init(void)
+{
+    bdrv_register(&bdrv_raw);
+}
+
+block_init(bdrv_raw_init);
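
For readers following the thread, the lookup order that the patched find_protocol() uses can be sketched in isolation. This is only a minimal illustration and not code from the patch: the Driver struct, the drivers table, hdev_probe() with its /dev/ rule and score of 50, find_by_name() and pick_protocol() are all hypothetical stand-ins, and the Windows drive checks are left out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for QEMU's BlockDriver. */
typedef struct Driver {
    const char *protocol_name;                 /* e.g. "file", "nbd" */
    int (*probe_device)(const char *filename); /* NULL unless a host device driver */
} Driver;

/* Illustrative probe (an assumption): claims anything under /dev/. */
static int hdev_probe(const char *filename)
{
    return strncmp(filename, "/dev/", 5) == 0 ? 50 : 0;
}

/* Hypothetical driver table standing in for the first_drv list. */
static const Driver drivers[] = {
    { "file", NULL },
    { "host_device", hdev_probe },
    { "nbd", NULL },
};
#define N_DRIVERS (sizeof(drivers) / sizeof(drivers[0]))

/* Look up a driver whose protocol name is exactly name[0..len). */
static const Driver *find_by_name(const char *name, size_t len)
{
    for (size_t i = 0; i < N_DRIVERS; i++) {
        if (strlen(drivers[i].protocol_name) == len &&
            memcmp(drivers[i].protocol_name, name, len) == 0) {
            return &drivers[i];
        }
    }
    return NULL;
}

/*
 * Same order as the patched find_protocol(): without a "proto:" prefix,
 * try the host device probes first and fall back to "file"; with a
 * prefix, look the protocol up by name and return NULL if it is unknown.
 */
static const Driver *pick_protocol(const char *filename)
{
    const char *colon = strchr(filename, ':');

    if (colon == NULL) {
        const Driver *best = NULL;
        int score_max = 0;

        for (size_t i = 0; i < N_DRIVERS; i++) {
            if (drivers[i].probe_device) {
                int score = drivers[i].probe_device(filename);
                if (score > score_max) {
                    score_max = score;
                    best = &drivers[i];
                }
            }
        }
        return best ? best : find_by_name("file", 4);
    }
    return find_by_name(filename, (size_t)(colon - filename));
}
```

With this table, pick_protocol("disk.img") falls back to the "file" entry, "/dev/sdb" is claimed by the host device probe, "nbd:localhost:10809" resolves through its prefix, and an unknown prefix such as "bogus:thing" yields NULL. Image format probing is deliberately absent here, since the patch keeps that in find_image_format() on the bdrv_open path.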