From patchwork Thu Oct 8 13:02:08 2009 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kevin Wolf X-Patchwork-Id: 35437 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from lists.gnu.org (lists.gnu.org [199.232.76.165]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by ozlabs.org (Postfix) with ESMTPS id 09D7EB7B6F for ; Fri, 9 Oct 2009 00:05:59 +1100 (EST) Received: from localhost ([127.0.0.1]:51642 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1Mvsgt-0006bP-F7 for incoming@patchwork.ozlabs.org; Thu, 08 Oct 2009 09:05:55 -0400 Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1MvseS-0005ds-PA for qemu-devel@nongnu.org; Thu, 08 Oct 2009 09:03:25 -0400 Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43) id 1MvseN-0005cW-AN for qemu-devel@nongnu.org; Thu, 08 Oct 2009 09:03:23 -0400 Received: from [199.232.76.173] (port=35089 helo=monty-python.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1MvseL-0005cA-52 for qemu-devel@nongnu.org; Thu, 08 Oct 2009 09:03:18 -0400 Received: from mx1.redhat.com ([209.132.183.28]:38868) by monty-python.gnu.org with esmtp (Exim 4.60) (envelope-from ) id 1MvseK-00083z-F6 for qemu-devel@nongnu.org; Thu, 08 Oct 2009 09:03:16 -0400 Received: from int-mx03.intmail.prod.int.phx2.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.16]) by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id n98D3FOC023603 for ; Thu, 8 Oct 2009 09:03:15 -0400 Received: from localhost.localdomain (dhcp-5-175.str.redhat.com [10.32.5.175]) by int-mx03.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id n98D3EV9012188; Thu, 8 Oct 2009 09:03:14 -0400 From: Kevin Wolf To: qemu-devel@nongnu.org Date: Thu, 8 Oct 2009 15:02:08 +0200 Message-Id: <1255006928-7600-1-git-send-email-kwolf@redhat.com> X-Scanned-By: MIMEDefang 2.67 on 10.5.11.16 X-detected-operating-system: by monty-python.gnu.org: Genre and OS details not recognized. Cc: Kevin Wolf Subject: [Qemu-devel] [PATCH] qcow2: Bring synchronous read/write back to life X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org When the synchronous read and write functions were dropped, they were replaced by generic emulation functions. Unfortunately, these emulation functions don't provide the same semantics as the original functions did. The original bdrv_read would mean that we read some data synchronously and that we won't be interrupted during this read. The latter assumption is no longer true with the emulation function which needs to use qemu_aio_poll and therefore allows the callback of any other concurrent AIO request to be run during the read. Which in turn means that (meta)data read earlier could have changed and be invalid now. qcow2 is not prepared to work in this way and it's just scary how many places there are where other requests could run. I'm not sure yet where exactly it breaks, but you'll see breakage with virtio on qcow2 with a backing file. Providing synchronous functions again fixes the problem for me. Signed-off-by: Kevin Wolf --- block/qcow2-cluster.c | 6 ++-- block/qcow2.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++- block/qcow2.h | 3 ++ 3 files changed, 55 insertions(+), 5 deletions(-) diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c index e444e53..a7de820 100644 --- a/block/qcow2-cluster.c +++ b/block/qcow2-cluster.c @@ -306,8 +306,8 @@ void qcow2_encrypt_sectors(BDRVQcowState *s, int64_t sector_num, } -static int qcow_read(BlockDriverState *bs, int64_t sector_num, - uint8_t *buf, int nb_sectors) +int qcow2_read(BlockDriverState *bs, int64_t sector_num, uint8_t *buf, + int nb_sectors) { BDRVQcowState *s = bs->opaque; int ret, index_in_cluster, n, n1; @@ -358,7 +358,7 @@ static int copy_sectors(BlockDriverState *bs, uint64_t start_sect, n = n_end - n_start; if (n <= 0) return 0; - ret = qcow_read(bs, start_sect + n_start, s->cluster_data, n); + ret = qcow2_read(bs, start_sect + n_start, s->cluster_data, n); if (ret < 0) return ret; if (s->crypt_method) { diff --git a/block/qcow2.c b/block/qcow2.c index a9e7682..52584ed 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -934,6 +934,51 @@ static int qcow_make_empty(BlockDriverState *bs) return 0; } +static int qcow2_write(BlockDriverState *bs, int64_t sector_num, + const uint8_t *buf, int nb_sectors) +{ + BDRVQcowState *s = bs->opaque; + int ret, index_in_cluster, n; + uint64_t cluster_offset; + int n_end; + QCowL2Meta l2meta; + + while (nb_sectors > 0) { + memset(&l2meta, 0, sizeof(l2meta)); + + index_in_cluster = sector_num & (s->cluster_sectors - 1); + n_end = index_in_cluster + nb_sectors; + if (s->crypt_method && + n_end > QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors) + n_end = QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors; + cluster_offset = qcow2_alloc_cluster_offset(bs, sector_num << 9, + index_in_cluster, + n_end, &n, &l2meta); + if (!cluster_offset) + return -1; + if (s->crypt_method) { + qcow2_encrypt_sectors(s, sector_num, s->cluster_data, buf, n, 1, + &s->aes_encrypt_key); + ret = bdrv_pwrite(s->hd, cluster_offset + index_in_cluster * 512, + s->cluster_data, n * 512); + } else { + ret = bdrv_pwrite(s->hd, cluster_offset + index_in_cluster * 512, buf, n * 512); + } + if (ret != n * 512 || qcow2_alloc_cluster_link_l2(bs, cluster_offset, &l2meta) < 0) { + qcow2_free_any_clusters(bs, cluster_offset, l2meta.nb_clusters); + return -1; + } + nb_sectors -= n; + sector_num += n; + buf += n * 512; + if (l2meta.nb_clusters != 0) { + QLIST_REMOVE(&l2meta, next_in_flight); + } + } + s->cluster_cache_offset = -1; /* disable compressed cache */ + return 0; +} + /* XXX: put compressed sectors first, then all the cluster aligned tables to avoid losing bytes in alignment */ static int qcow_write_compressed(BlockDriverState *bs, int64_t sector_num, @@ -1121,8 +1166,10 @@ static BlockDriver bdrv_qcow2 = { .bdrv_set_key = qcow_set_key, .bdrv_make_empty = qcow_make_empty, - .bdrv_aio_readv = qcow_aio_readv, - .bdrv_aio_writev = qcow_aio_writev, + .bdrv_read = qcow2_read, + .bdrv_write = qcow2_write, + .bdrv_aio_readv = qcow_aio_readv, + .bdrv_aio_writev = qcow_aio_writev, .bdrv_write_compressed = qcow_write_compressed, .bdrv_snapshot_create = qcow2_snapshot_create, diff --git a/block/qcow2.h b/block/qcow2.h index 26ab5d9..d3f690a 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -202,6 +202,9 @@ uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs, int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, uint64_t cluster_offset, QCowL2Meta *m); +int qcow2_read(BlockDriverState *bs, int64_t sector_num, uint8_t *buf, + int nb_sectors); + /* qcow2-snapshot.c functions */ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info); int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id);