From patchwork Wed Dec 11 21:08:25 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kevin Wolf X-Patchwork-Id: 300462 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (Client did not present a certificate) by ozlabs.org (Postfix) with ESMTPS id D7C282C019B for ; Thu, 12 Dec 2013 09:13:29 +1100 (EST) Received: from localhost ([::1]:60255 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Vqr5t-0004Dw-Cc for incoming@patchwork.ozlabs.org; Wed, 11 Dec 2013 16:13:21 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:44076) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Vqr25-0007tu-WD for qemu-devel@nongnu.org; Wed, 11 Dec 2013 16:09:32 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Vqr1z-0000nH-Ic for qemu-devel@nongnu.org; Wed, 11 Dec 2013 16:09:25 -0500 Received: from mx1.redhat.com ([209.132.183.28]:33505) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Vqr1z-0000n8-BE for qemu-devel@nongnu.org; Wed, 11 Dec 2013 16:09:19 -0500 Received: from int-mx12.intmail.prod.int.phx2.redhat.com (int-mx12.intmail.prod.int.phx2.redhat.com [10.5.11.25]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id rBBL9GBf029526 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Wed, 11 Dec 2013 16:09:16 -0500 Received: from dhcp-200-207.str.redhat.com (ovpn-116-104.ams2.redhat.com [10.36.116.104]) by int-mx12.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id rBBL8d93028102; Wed, 11 Dec 2013 16:09:14 -0500 From: Kevin Wolf To: qemu-devel@nongnu.org Date: Wed, 11 Dec 2013 22:08:25 +0100 Message-Id: <1386796109-15264-19-git-send-email-kwolf@redhat.com> In-Reply-To: <1386796109-15264-1-git-send-email-kwolf@redhat.com> References: <1386796109-15264-1-git-send-email-kwolf@redhat.com> X-Scanned-By: MIMEDefang 2.68 on 10.5.11.25 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x X-Received-From: 209.132.183.28 Cc: kwolf@redhat.com, pbonzini@redhat.com, pl@kamp.de, stefanha@redhat.com Subject: [Qemu-devel] [PATCH 18/22] block: Make overlap range for serialisation dynamic X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Copy on Read wants to serialise with all requests touching the same cluster, so wait_serialising_requests() rounded to cluster boundaries. Other users like alignment RMW will have different requirements, though (requests touching the same sector), so make it dynamic. Signed-off-by: Kevin Wolf --- block.c | 51 +++++++++++++++++++++++------------------------ include/block/block_int.h | 4 ++++ 2 files changed, 29 insertions(+), 26 deletions(-) diff --git a/block.c b/block.c index 40be327..6630b41 100644 --- a/block.c +++ b/block.c @@ -2058,12 +2058,19 @@ static void tracked_request_begin(BdrvTrackedRequest *req, QLIST_INSERT_HEAD(&bs->tracked_requests, req, list); } -static void mark_request_serialising(BdrvTrackedRequest *req) +static void mark_request_serialising(BdrvTrackedRequest *req, size_t align) { + int64_t overlap_offset = req->offset & ~(align - 1); + int overlap_bytes = ROUND_UP(req->offset + req->bytes, align) + - req->overlap_offset; + if (!req->serialising) { req->bs->serialising_in_flight++; req->serialising = true; } + + req->overlap_offset = MIN(req->overlap_offset, overlap_offset); + req->overlap_bytes = MAX(req->overlap_bytes, overlap_bytes); } /** @@ -2087,20 +2094,16 @@ void bdrv_round_to_clusters(BlockDriverState *bs, } } -static void round_bytes_to_clusters(BlockDriverState *bs, - int64_t offset, unsigned int bytes, - int64_t *cluster_offset, - unsigned int *cluster_bytes) +static int bdrv_get_cluster_size(BlockDriverState *bs) { BlockDriverInfo bdi; + int ret; - if (bdrv_get_info(bs, &bdi) < 0 || bdi.cluster_size == 0) { - *cluster_offset = offset; - *cluster_bytes = bytes; + ret = bdrv_get_info(bs, &bdi); + if (ret < 0 || bdi.cluster_size == 0) { + return bs->request_alignment; } else { - *cluster_offset = QEMU_ALIGN_DOWN(offset, bdi.cluster_size); - *cluster_bytes = QEMU_ALIGN_UP(offset - *cluster_offset + bytes, - bdi.cluster_size); + return bdi.cluster_size; } } @@ -2108,11 +2111,11 @@ static bool tracked_request_overlaps(BdrvTrackedRequest *req, int64_t offset, int bytes) { /* aaaa bbbb */ - if (offset >= req->offset + req->bytes) { + if (offset >= req->overlap_offset + req->overlap_bytes) { return false; } /* bbbb aaaa */ - if (req->offset >= offset + bytes) { + if (req->overlap_offset >= offset + bytes) { return false; } return true; @@ -2122,30 +2125,21 @@ static void coroutine_fn wait_serialising_requests(BdrvTrackedRequest *self) { BlockDriverState *bs = self->bs; BdrvTrackedRequest *req; - int64_t cluster_offset; - unsigned int cluster_bytes; bool retry; if (!bs->serialising_in_flight) { return; } - /* If we touch the same cluster it counts as an overlap. This guarantees - * that allocating writes will be serialized and not race with each other - * for the same cluster. For example, in copy-on-read it ensures that the - * CoR read and write operations are atomic and guest writes cannot - * interleave between them. - */ - round_bytes_to_clusters(bs, self->offset, self->bytes, - &cluster_offset, &cluster_bytes); - do { retry = false; QLIST_FOREACH(req, &bs->tracked_requests, list) { if (req == self || (!req->serialising && !self->serialising)) { continue; } - if (tracked_request_overlaps(req, cluster_offset, cluster_bytes)) { + if (tracked_request_overlaps(req, self->overlap_offset, + self->overlap_bytes)) + { /* Hitting this means there was a reentrant request, for * example, a block driver issuing nested requests. This must * never happen since it means deadlock. @@ -2756,7 +2750,12 @@ static int coroutine_fn bdrv_aligned_preadv(BlockDriverState *bs, /* Handle Copy on Read and associated serialisation */ if (flags & BDRV_REQ_COPY_ON_READ) { - mark_request_serialising(req); + /* If we touch the same cluster it counts as an overlap. This + * guarantees that allocating writes will be serialized and not race + * with each other for the same cluster. For example, in copy-on-read + * it ensures that the CoR read and write operations are atomic and + * guest writes cannot interleave between them. */ + mark_request_serialising(req, bdrv_get_cluster_size(bs)); } wait_serialising_requests(req); diff --git a/include/block/block_int.h b/include/block/block_int.h index b00402b..1aee02b 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -60,7 +60,11 @@ typedef struct BdrvTrackedRequest { int64_t offset; unsigned int bytes; bool is_write; + bool serialising; + int64_t overlap_offset; + unsigned int overlap_bytes; + QLIST_ENTRY(BdrvTrackedRequest) list; Coroutine *co; /* owner, used for deadlock detection */ CoQueue wait_queue; /* coroutines blocked on this request */