From patchwork Tue Jun 14 18:18:24 2011
X-Patchwork-Submitter: Stefan Hajnoczi
X-Patchwork-Id: 100404
From: Stefan Hajnoczi
Date: Tue, 14 Jun 2011 19:18:24 +0100
Message-Id: <1308075511-4745-7-git-send-email-stefanha@linux.vnet.ibm.com>
In-Reply-To: <1308075511-4745-1-git-send-email-stefanha@linux.vnet.ibm.com>
References: <1308075511-4745-1-git-send-email-stefanha@linux.vnet.ibm.com>
Cc: Kevin Wolf, Anthony Liguori, Stefan Hajnoczi, Adam Litke
Subject: [Qemu-devel] [PATCH 06/13] qed: add support for copy-on-read

From: Anthony Liguori

This patch implements copy-on-read in QED. Once a read request reaches the
copy-on-read state, it adds itself to the allocating write queue in order to
avoid race conditions with write requests. If an allocating write request
manages to sneak in before the copy-on-read request, the copy-on-read request
will notice that the cluster has been allocated when qed_find_cluster() is
re-run. This works because only one allocating request is active at any time,
and when the next request is activated it re-runs qed_find_cluster().

[Originally by Anthony. Stefan added allocating write queuing and factored
out the QED_CF_COPY_ON_READ header flag.]
Signed-off-by: Anthony Liguori
Signed-off-by: Stefan Hajnoczi
---
 block/qed.c  | 35 +++++++++++++++++++++++++++++++++--
 block/qed.h  |  3 ++-
 trace-events |  1 +
 3 files changed, 36 insertions(+), 3 deletions(-)

diff --git a/block/qed.c b/block/qed.c
index 4f535aa..6ca57f2 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -1189,6 +1189,25 @@ static void qed_aio_write_data(void *opaque, int ret,
 }
 
 /**
+ * Copy on read callback
+ *
+ * Write data from backing file to QED that's been read if CoR is enabled.
+ */
+static void qed_copy_on_read_cb(void *opaque, int ret)
+{
+    QEDAIOCB *acb = opaque;
+
+    trace_qed_copy_on_read_cb(acb, ret);
+
+    if (ret < 0) {
+        qed_aio_complete(acb, ret);
+        return;
+    }
+
+    qed_aio_write_alloc(acb);
+}
+
+/**
  * Read data cluster
  *
  * @opaque: Read request
@@ -1216,6 +1235,7 @@ static void qed_aio_read_data(void *opaque, int ret,
         goto err;
     }
 
+    acb->find_cluster_ret = ret;
     qemu_iovec_copy(&acb->cur_qiov, acb->qiov, acb->qiov_offset, len);
 
     /* Handle zero cluster and backing file reads */
@@ -1224,8 +1244,17 @@ static void qed_aio_read_data(void *opaque, int ret,
         qed_aio_next_io(acb, 0);
         return;
     } else if (ret != QED_CLUSTER_FOUND) {
+        BlockDriverCompletionFunc *cb = qed_aio_next_io;
+
+        if (bs->backing_hd && (acb->flags & QED_AIOCB_COPY_ON_READ)) {
+            if (!qed_start_allocating_write(acb)) {
+                qemu_iovec_reset(&acb->cur_qiov);
+                return; /* wait for current allocating write to complete */
+            }
+            cb = qed_copy_on_read_cb;
+        }
         qed_read_backing_file(s, acb->cur_pos, &acb->cur_qiov,
-                              qed_aio_next_io, acb);
+                              cb, acb);
         return;
     }
 
@@ -1309,7 +1338,9 @@ static BlockDriverAIOCB *bdrv_qed_aio_readv(BlockDriverState *bs,
                                             BlockDriverCompletionFunc *cb,
                                             void *opaque)
 {
-    return qed_aio_setup(bs, sector_num, qiov, nb_sectors, cb, opaque, 0);
+    int flags = bs->copy_on_read ? QED_AIOCB_COPY_ON_READ : 0;
+
+    return qed_aio_setup(bs, sector_num, qiov, nb_sectors, cb, opaque, flags);
 }
 
 static BlockDriverAIOCB *bdrv_qed_aio_writev(BlockDriverState *bs,
diff --git a/block/qed.h b/block/qed.h
index dbc00be..16f4bd9 100644
--- a/block/qed.h
+++ b/block/qed.h
@@ -124,7 +124,8 @@ typedef struct QEDRequest {
 } QEDRequest;
 
 enum {
-    QED_AIOCB_WRITE = 0x0001,           /* read or write? */
+    QED_AIOCB_WRITE = 0x0001,           /* read or write? */
+    QED_AIOCB_COPY_ON_READ = 0x0002,
 };
 
 typedef struct QEDAIOCB {
diff --git a/trace-events b/trace-events
index 6e1a19f..10faa07 100644
--- a/trace-events
+++ b/trace-events
@@ -236,6 +236,7 @@ disable qed_aio_complete(void *s, void *acb, int ret) "s %p acb %p ret %d"
 disable qed_aio_setup(void *s, void *acb, int64_t sector_num, int nb_sectors, void *opaque, int flags) "s %p acb %p sector_num %"PRId64" nb_sectors %d opaque %p flags %#x"
 disable qed_aio_next_io(void *s, void *acb, int ret, uint64_t cur_pos) "s %p acb %p ret %d cur_pos %"PRIu64""
 disable qed_aio_read_data(void *s, void *acb, int ret, uint64_t offset, size_t len) "s %p acb %p ret %d offset %"PRIu64" len %zu"
+disable qed_copy_on_read_cb(void *acb, int ret) "acb %p ret %d"
 disable qed_aio_write_data(void *s, void *acb, int ret, uint64_t offset, size_t len) "s %p acb %p ret %d offset %"PRIu64" len %zu"
 disable qed_aio_write_prefill(void *s, void *acb, uint64_t start, size_t len, uint64_t offset) "s %p acb %p start %"PRIu64" len %zu offset %"PRIu64""
 disable qed_aio_write_postfill(void *s, void *acb, uint64_t start, size_t len, uint64_t offset) "s %p acb %p start %"PRIu64" len %zu offset %"PRIu64""