From patchwork Thu Oct 12 18:58:57 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Eric Blake
X-Patchwork-Id: 825034
From: Eric Blake
To: qemu-devel@nongnu.org
Cc: kwolf@redhat.com, famz@redhat.com, qemu-block@nongnu.org, Max Reitz,
 Stefan Hajnoczi, jsnow@redhat.com
Date: Thu, 12 Oct 2017 13:58:57 -0500
Message-Id: <20171012185916.22776-2-eblake@redhat.com>
In-Reply-To: <20171012185916.22776-1-eblake@redhat.com>
References: <20171012185916.22776-1-eblake@redhat.com>
Subject: [Qemu-devel] [PATCH v4 01/20] block: Add .bdrv_co_block_status() callback

We are gradually moving away from sector-based interfaces, towards
byte-based.  Now that the block layer exposes byte-based allocation,
it's time to tackle the drivers.  Add a new callback that can operate
on boundaries as small as a single byte.  Subsequent patches will then
update individual drivers, then finally remove
.bdrv_co_get_block_status().  The old code now uses a goto in order to
minimize churn at that later removal.  Update documentation in this
patch so that the later removal can be a straight delete.

The new code also passes through the 'want_zero' hint, which will
allow subsequent patches to further optimize callers that only care
about how much of the image is allocated (want_zero is false), rather
than full details about runs of zeroes and which offsets the
allocation actually maps to (want_zero is true).

Note that most drivers give sector-aligned answers, except at
end-of-file, even when request_alignment is smaller than a sector.
However, bdrv_getlength() is sector-aligned (even though it gives a
byte answer), often by exceeding the actual file size.  If we were to
give back strict results, at least file-posix.c would report a
transition from DATA to HOLE at the end of a file even in the middle
of a sector, which can throw off callers; so we intentionally lie and
state that any partial sector at the end of a file has the same
status for the entire sector.  Maybe at some future day we can
report actual file size instead of rounding up, but not for this
series.

Signed-off-by: Eric Blake
Reviewed-by: Vladimir Sementsov-Ogievskiy

---
v4: rebase to master
v3: no change
v2: improve alignment handling, ensure all iotests still pass
---
 include/block/block.h     |  9 ++++-----
 include/block/block_int.h | 12 +++++++++---
 block/io.c                | 30 +++++++++++++++++++++++++-----
 3 files changed, 38 insertions(+), 13 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index fbc21daf62..c5d6b2c933 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -136,11 +136,10 @@ typedef struct HDGeometry {
  * that the block layer recompute the answer from the returned
  * BDS; must be accompanied by just BDRV_BLOCK_OFFSET_VALID.
  *
- * If BDRV_BLOCK_OFFSET_VALID is set, bits 9-62 (BDRV_BLOCK_OFFSET_MASK) of
- * the return value (old interface) or the entire map parameter (new
- * interface) represent the offset in the returned BDS that is allocated for
- * the corresponding raw data.  However, whether that offset actually
- * contains data also depends on BDRV_BLOCK_DATA, as follows:
+ * If BDRV_BLOCK_OFFSET_VALID is set, the map parameter represents the
+ * host offset within the returned BDS that is allocated for the
+ * corresponding raw guest data.  However, whether that offset
+ * actually contains data also depends on BDRV_BLOCK_DATA, as follows:
  *
  * DATA ZERO OFFSET_VALID
  *  t    t        t       sectors read as zero, returned file is zero at offset
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 4b9b23a08d..4153cd646d 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -206,13 +206,19 @@ struct BlockDriver {
      * bdrv_is_allocated[_above].  The driver should answer only
      * according to the current layer, and should not set
      * BDRV_BLOCK_ALLOCATED, but may set BDRV_BLOCK_RAW.  See block.h
-     * for the meaning of _DATA, _ZERO, and _OFFSET_VALID.  The block
-     * layer guarantees input aligned to request_alignment, as well as
-     * non-NULL pnum and file.
+     * for the meaning of _DATA, _ZERO, and _OFFSET_VALID.  As a hint,
+     * the flag want_zero is true if the caller cares more about
+     * precise mappings (favor _OFFSET_VALID/_ZERO) or false for
+     * overall allocation (favor larger *pnum).  The block layer
+     * guarantees input aligned to request_alignment, as well as
+     * non-NULL pnum, map, and file.
      */
     int64_t coroutine_fn (*bdrv_co_get_block_status)(BlockDriverState *bs,
         int64_t sector_num, int nb_sectors, int *pnum,
         BlockDriverState **file);
+    int coroutine_fn (*bdrv_co_block_status)(BlockDriverState *bs,
+        bool want_zero, int64_t offset, int64_t bytes, int64_t *pnum,
+        int64_t *map, BlockDriverState **file);
 
     /*
      * Invalidate any cached meta-data.
diff --git a/block/io.c b/block/io.c
index e4caa4acf1..ef9ea44667 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1843,7 +1843,7 @@ static int coroutine_fn bdrv_co_block_status(BlockDriverState *bs,
         bytes = n;
     }
 
-    if (!bs->drv->bdrv_co_get_block_status) {
+    if (!bs->drv->bdrv_co_get_block_status && !bs->drv->bdrv_co_block_status) {
         *pnum = bytes;
         ret = BDRV_BLOCK_DATA | BDRV_BLOCK_ALLOCATED;
         if (offset + bytes == total_size) {
@@ -1860,13 +1860,14 @@ static int coroutine_fn bdrv_co_block_status(BlockDriverState *bs,
     bdrv_inc_in_flight(bs);
 
     /* Round out to request_alignment boundaries */
-    /* TODO: until we have a byte-based driver callback, we also have to
-     * round out to sectors, even if that is bigger than request_alignment */
-    align = MAX(bs->bl.request_alignment, BDRV_SECTOR_SIZE);
+    align = bs->bl.request_alignment;
+    if (bs->drv->bdrv_co_get_block_status && align < BDRV_SECTOR_SIZE) {
+        align = BDRV_SECTOR_SIZE;
+    }
     aligned_offset = QEMU_ALIGN_DOWN(offset, align);
     aligned_bytes = ROUND_UP(offset + bytes, align) - aligned_offset;
 
-    {
+    if (bs->drv->bdrv_co_get_block_status) {
         int count; /* sectors */
         int64_t longret;
 
@@ -1891,8 +1892,27 @@ static int coroutine_fn bdrv_co_block_status(BlockDriverState *bs,
         }
         ret = longret & ~BDRV_BLOCK_OFFSET_MASK;
         *pnum = count * BDRV_SECTOR_SIZE;
+        goto refine;
     }
 
+    ret = bs->drv->bdrv_co_block_status(bs, want_zero, aligned_offset,
+                                        aligned_bytes, pnum, &local_map,
+                                        &local_file);
+    if (ret < 0) {
+        *pnum = 0;
+        goto out;
+    }
+
+    /*
+     * total_size is always sector-aligned, by sometimes exceeding actual
+     * file size. Expand pnum if it lands mid-sector due to end-of-file.
+     */
+    if (QEMU_ALIGN_UP(*pnum + aligned_offset,
+                      BDRV_SECTOR_SIZE) == total_size) {
+        *pnum = total_size - aligned_offset;
+    }
+
+refine:
     /*
      * The driver's result must be a multiple of request_alignment.
      * Clamp pnum and adjust map to original request.