From patchwork Fri Jan 24 17:22:05 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kevin Wolf X-Patchwork-Id: 314000 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id D8E0E2C009C for ; Sat, 25 Jan 2014 05:01:46 +1100 (EST) Received: from localhost ([::1]:48322 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1W6l4a-0006Go-Pp for incoming@patchwork.ozlabs.org; Fri, 24 Jan 2014 13:01:44 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:53760) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1W6kW1-0004uS-56 for qemu-devel@nongnu.org; Fri, 24 Jan 2014 12:26:06 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1W6kVt-0000Pt-Rd for qemu-devel@nongnu.org; Fri, 24 Jan 2014 12:26:01 -0500 Received: from mx1.redhat.com ([209.132.183.28]:54584) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1W6kVt-0000Pc-IX for qemu-devel@nongnu.org; Fri, 24 Jan 2014 12:25:53 -0500 Received: from int-mx01.intmail.prod.int.phx2.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id s0OHPqG8009991 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Fri, 24 Jan 2014 12:25:52 -0500 Received: from dhcp-200-207.str.redhat.com (ovpn-116-95.ams2.redhat.com [10.36.116.95]) by int-mx01.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id s0OHMRRM012974; Fri, 24 Jan 2014 12:25:51 -0500 From: Kevin Wolf To: anthony@codemonkey.ws Date: Fri, 24 Jan 2014 18:22:05 +0100 Message-Id: <1390584136-24703-83-git-send-email-kwolf@redhat.com> In-Reply-To: <1390584136-24703-1-git-send-email-kwolf@redhat.com> References: <1390584136-24703-1-git-send-email-kwolf@redhat.com> X-Scanned-By: MIMEDefang 2.67 on 10.5.11.11 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x X-Received-From: 209.132.183.28 Cc: kwolf@redhat.com, qemu-devel@nongnu.org Subject: [Qemu-devel] [PULL 82/93] block: Make overlap range for serialisation dynamic X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Copy on Read wants to serialise with all requests touching the same cluster, so wait_serialising_requests() rounded to cluster boundaries. Other users like alignment RMW will have different requirements, though (requests touching the same sector), so make it dynamic. Signed-off-by: Kevin Wolf Reviewed-by: Max Reitz Reviewed-by: Benoit Canet --- block.c | 53 ++++++++++++++++++++++++----------------------- include/block/block_int.h | 4 ++++ 2 files changed, 31 insertions(+), 26 deletions(-) diff --git a/block.c b/block.c index 493c75f..784ff07 100644 --- a/block.c +++ b/block.c @@ -2231,6 +2231,8 @@ static void tracked_request_begin(BdrvTrackedRequest *req, .is_write = is_write, .co = qemu_coroutine_self(), .serialising = false, + .overlap_offset = offset, + .overlap_bytes = bytes, }; qemu_co_queue_init(&req->wait_queue); @@ -2238,12 +2240,19 @@ static void tracked_request_begin(BdrvTrackedRequest *req, QLIST_INSERT_HEAD(&bs->tracked_requests, req, list); } -static void mark_request_serialising(BdrvTrackedRequest *req) +static void mark_request_serialising(BdrvTrackedRequest *req, size_t align) { + int64_t overlap_offset = req->offset & ~(align - 1); + int overlap_bytes = ROUND_UP(req->offset + req->bytes, align) + - overlap_offset; + if (!req->serialising) { req->bs->serialising_in_flight++; req->serialising = true; } + + req->overlap_offset = MIN(req->overlap_offset, overlap_offset); + req->overlap_bytes = MAX(req->overlap_bytes, overlap_bytes); } /** @@ -2267,20 +2276,16 @@ void bdrv_round_to_clusters(BlockDriverState *bs, } } -static void round_bytes_to_clusters(BlockDriverState *bs, - int64_t offset, unsigned int bytes, - int64_t *cluster_offset, - unsigned int *cluster_bytes) +static int bdrv_get_cluster_size(BlockDriverState *bs) { BlockDriverInfo bdi; + int ret; - if (bdrv_get_info(bs, &bdi) < 0 || bdi.cluster_size == 0) { - *cluster_offset = offset; - *cluster_bytes = bytes; + ret = bdrv_get_info(bs, &bdi); + if (ret < 0 || bdi.cluster_size == 0) { + return bs->request_alignment; } else { - *cluster_offset = QEMU_ALIGN_DOWN(offset, bdi.cluster_size); - *cluster_bytes = QEMU_ALIGN_UP(offset - *cluster_offset + bytes, - bdi.cluster_size); + return bdi.cluster_size; } } @@ -2288,11 +2293,11 @@ static bool tracked_request_overlaps(BdrvTrackedRequest *req, int64_t offset, unsigned int bytes) { /* aaaa bbbb */ - if (offset >= req->offset + req->bytes) { + if (offset >= req->overlap_offset + req->overlap_bytes) { return false; } /* bbbb aaaa */ - if (req->offset >= offset + bytes) { + if (req->overlap_offset >= offset + bytes) { return false; } return true; @@ -2302,30 +2307,21 @@ static void coroutine_fn wait_serialising_requests(BdrvTrackedRequest *self) { BlockDriverState *bs = self->bs; BdrvTrackedRequest *req; - int64_t cluster_offset; - unsigned int cluster_bytes; bool retry; if (!bs->serialising_in_flight) { return; } - /* If we touch the same cluster it counts as an overlap. This guarantees - * that allocating writes will be serialized and not race with each other - * for the same cluster. For example, in copy-on-read it ensures that the - * CoR read and write operations are atomic and guest writes cannot - * interleave between them. - */ - round_bytes_to_clusters(bs, self->offset, self->bytes, - &cluster_offset, &cluster_bytes); - do { retry = false; QLIST_FOREACH(req, &bs->tracked_requests, list) { if (req == self || (!req->serialising && !self->serialising)) { continue; } - if (tracked_request_overlaps(req, cluster_offset, cluster_bytes)) { + if (tracked_request_overlaps(req, self->overlap_offset, + self->overlap_bytes)) + { /* Hitting this means there was a reentrant request, for * example, a block driver issuing nested requests. This must * never happen since it means deadlock. @@ -2941,7 +2937,12 @@ static int coroutine_fn bdrv_aligned_preadv(BlockDriverState *bs, /* Handle Copy on Read and associated serialisation */ if (flags & BDRV_REQ_COPY_ON_READ) { - mark_request_serialising(req); + /* If we touch the same cluster it counts as an overlap. This + * guarantees that allocating writes will be serialized and not race + * with each other for the same cluster. For example, in copy-on-read + * it ensures that the CoR read and write operations are atomic and + * guest writes cannot interleave between them. */ + mark_request_serialising(req, bdrv_get_cluster_size(bs)); } wait_serialising_requests(req); diff --git a/include/block/block_int.h b/include/block/block_int.h index c1153cb..0ee955c 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -60,7 +60,11 @@ typedef struct BdrvTrackedRequest { int64_t offset; unsigned int bytes; bool is_write; + bool serialising; + int64_t overlap_offset; + unsigned int overlap_bytes; + QLIST_ENTRY(BdrvTrackedRequest) list; Coroutine *co; /* owner, used for deadlock detection */ CoQueue wait_queue; /* coroutines blocked on this request */