From patchwork Wed Dec 11 21:08:24 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kevin Wolf X-Patchwork-Id: 300422 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (Client did not present a certificate) by ozlabs.org (Postfix) with ESMTPS id 70EC22C00AE for ; Thu, 12 Dec 2013 08:15:21 +1100 (EST) Received: from localhost ([::1]:60297 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Vqr7m-0006hN-Qp for incoming@patchwork.ozlabs.org; Wed, 11 Dec 2013 16:15:18 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:44060) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Vqr24-0007sE-Sy for qemu-devel@nongnu.org; Wed, 11 Dec 2013 16:09:30 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Vqr1y-0000n2-QB for qemu-devel@nongnu.org; Wed, 11 Dec 2013 16:09:24 -0500 Received: from mx1.redhat.com ([209.132.183.28]:46032) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Vqr1y-0000my-Hi for qemu-devel@nongnu.org; Wed, 11 Dec 2013 16:09:18 -0500 Received: from int-mx12.intmail.prod.int.phx2.redhat.com (int-mx12.intmail.prod.int.phx2.redhat.com [10.5.11.25]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id rBBL9Ekd027659 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Wed, 11 Dec 2013 16:09:14 -0500 Received: from dhcp-200-207.str.redhat.com (ovpn-116-104.ams2.redhat.com [10.36.116.104]) by int-mx12.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id rBBL8d92028102; Wed, 11 Dec 2013 16:09:12 -0500 From: Kevin Wolf To: qemu-devel@nongnu.org Date: Wed, 11 Dec 2013 22:08:24 +0100 Message-Id: <1386796109-15264-18-git-send-email-kwolf@redhat.com> In-Reply-To: <1386796109-15264-1-git-send-email-kwolf@redhat.com> References: <1386796109-15264-1-git-send-email-kwolf@redhat.com> X-Scanned-By: MIMEDefang 2.68 on 10.5.11.25 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x X-Received-From: 209.132.183.28 Cc: kwolf@redhat.com, pbonzini@redhat.com, pl@kamp.de, stefanha@redhat.com Subject: [Qemu-devel] [PATCH 17/22] block: Generalise and optimise COR serialisation X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Change the API so that specific requests can be marked serialising. Only these requests are checked for overlaps then. This means that during a Copy on Read operation, not all requests overlapping other requests are serialised any more, but only those that actually overlap with the specific COR request. Also remove COR from function and variable names because this functionality can be useful in other contexts. Signed-off-by: Kevin Wolf --- block.c | 40 +++++++++++++++++++++++++--------------- include/block/block_int.h | 5 +++-- 2 files changed, 28 insertions(+), 17 deletions(-) diff --git a/block.c b/block.c index 6dddb7c..40be327 100644 --- a/block.c +++ b/block.c @@ -2028,6 +2028,10 @@ int bdrv_commit_all(void) */ static void tracked_request_end(BdrvTrackedRequest *req) { + if (req->serialising) { + req->bs->serialising_in_flight--; + } + QLIST_REMOVE(req, list); qemu_co_queue_restart_all(&req->wait_queue); } @@ -2046,6 +2050,7 @@ static void tracked_request_begin(BdrvTrackedRequest *req, .bytes = bytes, .is_write = is_write, .co = qemu_coroutine_self(), + .serialising = false, }; qemu_co_queue_init(&req->wait_queue); @@ -2053,6 +2058,14 @@ static void tracked_request_begin(BdrvTrackedRequest *req, QLIST_INSERT_HEAD(&bs->tracked_requests, req, list); } +static void mark_request_serialising(BdrvTrackedRequest *req) +{ + if (!req->serialising) { + req->bs->serialising_in_flight++; + req->serialising = true; + } +} + /** * Round a region to cluster boundaries */ @@ -2105,26 +2118,31 @@ static bool tracked_request_overlaps(BdrvTrackedRequest *req, return true; } -static void coroutine_fn wait_for_overlapping_requests(BlockDriverState *bs, - BdrvTrackedRequest *self, int64_t offset, unsigned int bytes) +static void coroutine_fn wait_serialising_requests(BdrvTrackedRequest *self) { + BlockDriverState *bs = self->bs; BdrvTrackedRequest *req; int64_t cluster_offset; unsigned int cluster_bytes; bool retry; + if (!bs->serialising_in_flight) { + return; + } + /* If we touch the same cluster it counts as an overlap. This guarantees * that allocating writes will be serialized and not race with each other * for the same cluster. For example, in copy-on-read it ensures that the * CoR read and write operations are atomic and guest writes cannot * interleave between them. */ - round_bytes_to_clusters(bs, offset, bytes, &cluster_offset, &cluster_bytes); + round_bytes_to_clusters(bs, self->offset, self->bytes, + &cluster_offset, &cluster_bytes); do { retry = false; QLIST_FOREACH(req, &bs->tracked_requests, list) { - if (req == self) { + if (req == self || (!req->serialising && !self->serialising)) { continue; } if (tracked_request_overlaps(req, cluster_offset, cluster_bytes)) { @@ -2738,12 +2756,10 @@ static int coroutine_fn bdrv_aligned_preadv(BlockDriverState *bs, /* Handle Copy on Read and associated serialisation */ if (flags & BDRV_REQ_COPY_ON_READ) { - bs->copy_on_read_in_flight++; + mark_request_serialising(req); } - if (bs->copy_on_read_in_flight) { - wait_for_overlapping_requests(bs, req, offset, bytes); - } + wait_serialising_requests(req); if (flags & BDRV_REQ_COPY_ON_READ) { int pnum; @@ -2791,10 +2807,6 @@ static int coroutine_fn bdrv_aligned_preadv(BlockDriverState *bs, } out: - if (flags & BDRV_REQ_COPY_ON_READ) { - bs->copy_on_read_in_flight--; - } - return ret; } @@ -2992,9 +3004,7 @@ static int coroutine_fn bdrv_aligned_pwritev(BlockDriverState *bs, assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0); assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0); - if (bs->copy_on_read_in_flight) { - wait_for_overlapping_requests(bs, req, offset, bytes); - } + wait_serialising_requests(req); ret = notifier_with_return_list_notify(&bs->before_write_notifiers, req); diff --git a/include/block/block_int.h b/include/block/block_int.h index a11e5c9..b00402b 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -60,6 +60,7 @@ typedef struct BdrvTrackedRequest { int64_t offset; unsigned int bytes; bool is_write; + bool serialising; QLIST_ENTRY(BdrvTrackedRequest) list; Coroutine *co; /* owner, used for deadlock detection */ CoQueue wait_queue; /* coroutines blocked on this request */ @@ -296,8 +297,8 @@ struct BlockDriverState { /* Callback before write request is processed */ NotifierWithReturnList before_write_notifiers; - /* number of in-flight copy-on-read requests */ - unsigned int copy_on_read_in_flight; + /* number of in-flight serialising requests */ + unsigned int serialising_in_flight; /* I/O throttling */ ThrottleState throttle_state;