From patchwork Mon Apr 1 14:09:00 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 1072875 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44Xvvt2Qvvz9sPv for ; Tue, 2 Apr 2019 01:33:14 +1100 (AEDT) Received: from localhost ([127.0.0.1]:39350 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hAxzs-0007hA-8k for incoming@patchwork.ozlabs.org; Mon, 01 Apr 2019 10:33:12 -0400 Received: from eggs.gnu.org ([209.51.188.92]:45978) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hAxcp-00027E-RX for qemu-devel@nongnu.org; Mon, 01 Apr 2019 10:09:25 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hAxco-0004vW-BY for qemu-devel@nongnu.org; Mon, 01 Apr 2019 10:09:23 -0400 Received: from mx1.redhat.com ([209.132.183.28]:48538) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1hAxck-0004pb-59; Mon, 01 Apr 2019 10:09:18 -0400 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 6EBC630BA356; Mon, 1 Apr 2019 14:09:17 +0000 (UTC) Received: from blue.redhat.com (ovpn-116-75.phx2.redhat.com [10.3.116.75]) by smtp.corp.redhat.com (Postfix) with ESMTP id CCE6327BC7; Mon, 1 Apr 2019 14:09:16 +0000 (UTC) From: Eric Blake To: qemu-devel@nongnu.org Date: Mon, 1 Apr 2019 09:09:00 -0500 Message-Id: <20190401140903.19186-12-eblake@redhat.com> In-Reply-To: <20190401140903.19186-1-eblake@redhat.com> References: <20190401140903.19186-1-eblake@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.49]); Mon, 01 Apr 2019 14:09:17 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PULL 11/14] nbd/client: Support qemu-img convert from unaligned size X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , "Richard W . M . Jones" , "open list:Network Block Dev..." , Max Reitz Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" If an NBD server advertises a size that is not a multiple of a sector, the block layer rounds up that size, even though we set info.size to the exact byte value sent by the server. The block layer then proceeds to let us read or query block status on the hole that it added past EOF, which the NBD server is unlikely to be happy with. Fortunately, qemu as a server never advertizes an unaligned size, so we generally don't run into this problem; but the nbdkit server makes it easy to test: $ printf %1000d 1 > f1 $ ~/nbdkit/nbdkit -fv file f1 & pid=$! $ qemu-img convert -f raw nbd://localhost:10809 f2 $ kill $pid $ qemu-img compare f1 f2 Pre-patch, the server attempts a 1024-byte read, which nbdkit rightfully rejects as going beyond its advertised 1000 byte size; the conversion fails and the output files differ (not even the first sector is copied, because qemu-img does not follow ddrescue's habit of trying smaller reads to get as much information as possible in spite of errors). Post-patch, the client's attempts to read (and query block status, for new enough nbdkit) are properly truncated to the server's length, with sane handling of the hole the block layer forced on us. Although f2 ends up as a larger file (1024 bytes instead of 1000), qemu-img compare shows the two images to have identical contents for display to the guest. I didn't add iotests coverage since I didn't want to add a dependency on nbdkit in iotests. I also did NOT patch write, trim, or write zeroes - these commands continue to fail (usually with ENOSPC, but whatever the server chose), because we really can't write to the end of the file, and because 'qemu-img convert' is the most common case where we care about being tolerant (which is read-only). Perhaps we could truncate the request if the client is writing zeros to the tail, but that seems like more work, especially if the block layer is fixed in 4.1 to track byte-accurate sizing (in which case this patch would be reverted as unnecessary). Signed-off-by: Eric Blake Message-Id: <20190329042750.14704-5-eblake@redhat.com> Tested-by: Richard W.M. Jones --- block/nbd-client.c | 39 ++++++++++++++++++++++++++++++++++++++- 1 file changed, 38 insertions(+), 1 deletion(-) diff --git a/block/nbd-client.c b/block/nbd-client.c index 3edb508f668..409c2171bc3 100644 --- a/block/nbd-client.c +++ b/block/nbd-client.c @@ -848,6 +848,25 @@ int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset, if (!bytes) { return 0; } + /* + * Work around the fact that the block layer doesn't do + * byte-accurate sizing yet - if the read exceeds the server's + * advertised size because the block layer rounded size up, then + * truncate the request to the server and tail-pad with zero. + */ + if (offset >= client->info.size) { + assert(bytes < BDRV_SECTOR_SIZE); + qemu_iovec_memset(qiov, 0, 0, bytes); + return 0; + } + if (offset + bytes > client->info.size) { + uint64_t slop = offset + bytes - client->info.size; + + assert(slop < BDRV_SECTOR_SIZE); + qemu_iovec_memset(qiov, bytes - slop, 0, slop); + request.len -= slop; + } + ret = nbd_co_send_request(bs, &request, NULL); if (ret < 0) { return ret; @@ -966,7 +985,8 @@ int coroutine_fn nbd_client_co_block_status(BlockDriverState *bs, .from = offset, .len = MIN(MIN_NON_ZERO(QEMU_ALIGN_DOWN(INT_MAX, bs->bl.request_alignment), - client->info.max_block), bytes), + client->info.max_block), + MIN(bytes, client->info.size - offset)), .flags = NBD_CMD_FLAG_REQ_ONE, }; @@ -977,6 +997,23 @@ int coroutine_fn nbd_client_co_block_status(BlockDriverState *bs, return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID; } + /* + * Work around the fact that the block layer doesn't do + * byte-accurate sizing yet - if the status request exceeds the + * server's advertised size because the block layer rounded size + * up, we truncated the request to the server (above), or are + * called on just the hole. + */ + if (offset >= client->info.size) { + *pnum = bytes; + assert(bytes < BDRV_SECTOR_SIZE); + /* Intentionally don't report offset_valid for the hole */ + return BDRV_BLOCK_ZERO; + } + + if (client->info.min_block) { + assert(QEMU_IS_ALIGNED(request.len, client->info.min_block)); + } ret = nbd_co_send_request(bs, &request, NULL); if (ret < 0) { return ret;