From patchwork Sat Jan 22 09:29:27 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stefan Hajnoczi X-Patchwork-Id: 79988 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from lists.gnu.org (lists.gnu.org [199.232.76.165]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by ozlabs.org (Postfix) with ESMTPS id 4D29DB7105 for ; Sat, 22 Jan 2011 20:50:50 +1100 (EST) Received: from localhost ([127.0.0.1]:39724 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1Pga7K-0002sv-VR for incoming@patchwork.ozlabs.org; Sat, 22 Jan 2011 04:50:47 -0500 Received: from [140.186.70.92] (port=56972 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1PgZnF-00036h-8s for qemu-devel@nongnu.org; Sat, 22 Jan 2011 04:30:03 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1PgZnC-0001nO-20 for qemu-devel@nongnu.org; Sat, 22 Jan 2011 04:30:01 -0500 Received: from mtagate5.uk.ibm.com ([194.196.100.165]:57050) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1PgZnB-0001n8-Ja for qemu-devel@nongnu.org; Sat, 22 Jan 2011 04:29:57 -0500 Received: from d06nrmr1806.portsmouth.uk.ibm.com (d06nrmr1806.portsmouth.uk.ibm.com [9.149.39.193]) by mtagate5.uk.ibm.com (8.13.1/8.13.1) with ESMTP id p0M9TuLl031661 for ; Sat, 22 Jan 2011 09:29:56 GMT Received: from d06av07.portsmouth.uk.ibm.com (d06av07.portsmouth.uk.ibm.com [9.149.37.248]) by d06nrmr1806.portsmouth.uk.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id p0M9TxrY1441858 for ; Sat, 22 Jan 2011 09:29:59 GMT Received: from d06av07.portsmouth.uk.ibm.com (loopback [127.0.0.1]) by d06av07.portsmouth.uk.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id p0M9Tupc028268 for ; Sat, 22 Jan 2011 02:29:56 -0700 Received: from stefanha-thinkpad.ibm.com (gbv82324-009146172195.uk.ibm.com [9.146.172.195]) by d06av07.portsmouth.uk.ibm.com (8.14.4/8.13.1/NCO v10.0 AVin) with ESMTP id p0M9TlNg028082; Sat, 22 Jan 2011 02:29:55 -0700 From: Stefan Hajnoczi To: Date: Sat, 22 Jan 2011 09:29:27 +0000 Message-Id: <1295688567-25496-13-git-send-email-stefanha@linux.vnet.ibm.com> X-Mailer: git-send-email 1.7.2.3 In-Reply-To: <1295688567-25496-1-git-send-email-stefanha@linux.vnet.ibm.com> References: <1295688567-25496-1-git-send-email-stefanha@linux.vnet.ibm.com> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6, seldom 2.4 (older, 4) Cc: Kevin Wolf , Anthony Liguori , Stefan Hajnoczi Subject: [Qemu-devel] [RFC][PATCH 12/12] qcow2: Serialize all requests X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org QCOW2 with coroutines is not safe because synchronous code paths are no longer guaranteed to execute without interference from pending requests. A blocking call like bdrv_pread() causes the coroutine to yield and another request can be processed during that time, causing to race conditions or interference between pending requests. The simple solution is to serialize all requests. This is bad for performance and a fine-grained solution needs to be implemented in future patches. Using this patch, QCOW2 with coroutines can reliably install a RHEL6 VM with a virtio-blk disk. Signed-off-by: Stefan Hajnoczi --- block/qcow2.c | 21 ++++++++++++++++++++- block/qcow2.h | 5 +++-- 2 files changed, 23 insertions(+), 3 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index 4b33ef3..0cea0e8 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -231,6 +231,7 @@ static int qcow2_open(BlockDriverState *bs, int flags) } QLIST_INIT(&s->cluster_allocs); + QTAILQ_INIT(&s->request_list); /* read qcow2 extensions */ if (header.backing_file_offset) { @@ -365,6 +366,7 @@ typedef struct QCowAIOCB { QEMUBH *bh; QCowL2Meta l2meta; QLIST_ENTRY(QCowAIOCB) next_depend; + QTAILQ_ENTRY(QCowAIOCB) next_request; Coroutine *coroutine; int ret; /* return code for user callback */ } QCowAIOCB; @@ -385,11 +387,20 @@ static AIOPool qcow2_aio_pool = { static void qcow2_aio_bh(void *opaque) { QCowAIOCB *acb = opaque; + BDRVQcowState *s = acb->common.bs->opaque; + qemu_bh_delete(acb->bh); acb->bh = NULL; acb->common.cb(acb->common.opaque, acb->ret); qemu_iovec_destroy(&acb->hd_qiov); + QTAILQ_REMOVE(&s->request_list, acb, next_request); qemu_aio_release(acb); + + /* Start next request */ + if (!QTAILQ_EMPTY(&s->request_list)) { + acb = QTAILQ_FIRST(&s->request_list); + qemu_coroutine_enter(acb->coroutine, acb); + } } static int qcow2_schedule_bh(QEMUBHFunc *cb, QCowAIOCB *acb) @@ -548,8 +559,10 @@ static BlockDriverAIOCB *qcow2_aio_setup(BlockDriverState *bs, int64_t sector_num, QEMUIOVector *qiov, int nb_sectors, BlockDriverCompletionFunc *cb, void *opaque, int is_write) { + BDRVQcowState *s = bs->opaque; QCowAIOCB *acb; Coroutine *coroutine; + int start_request; acb = qemu_aio_get(&qcow2_aio_pool, bs, cb, opaque); if (!acb) @@ -569,7 +582,13 @@ static BlockDriverAIOCB *qcow2_aio_setup(BlockDriverState *bs, coroutine = qemu_coroutine_create(is_write ? qcow2_co_write : qcow2_co_read); acb->coroutine = coroutine; - qemu_coroutine_enter(coroutine, acb); + + /* Kick off the request if no others are currently executing */ + start_request = QTAILQ_EMPTY(&s->request_list); + QTAILQ_INSERT_TAIL(&s->request_list, acb, next_request); + if (start_request) { + qemu_coroutine_enter(coroutine, acb); + } return &acb->common; } diff --git a/block/qcow2.h b/block/qcow2.h index 5217bea..159f86b 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -78,6 +78,8 @@ typedef struct QCowSnapshot { uint64_t vm_clock_nsec; } QCowSnapshot; +struct QCowAIOCB; + typedef struct BDRVQcowState { int cluster_bits; int cluster_size; @@ -98,6 +100,7 @@ typedef struct BDRVQcowState { uint8_t *cluster_data; uint64_t cluster_cache_offset; QLIST_HEAD(QCowClusterAlloc, QCowL2Meta) cluster_allocs; + QTAILQ_HEAD(, QCowAIOCB) request_list; uint64_t *refcount_table; uint64_t refcount_table_offset; @@ -128,8 +131,6 @@ typedef struct QCowCreateState { int64_t refcount_block_offset; } QCowCreateState; -struct QCowAIOCB; - /* XXX This could be private for qcow2-cluster.c */ typedef struct QCowL2Meta {