From patchwork Wed Apr 3 03:05:20 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 1075279 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44YrfN5frvz9sS3 for ; Wed, 3 Apr 2019 14:09:48 +1100 (AEDT) Received: from localhost ([127.0.0.1]:41087 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hBWHa-0003JM-MA for incoming@patchwork.ozlabs.org; Tue, 02 Apr 2019 23:09:46 -0400 Received: from eggs.gnu.org ([209.51.188.92]:41682) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hBWDZ-0000FJ-Cb for qemu-devel@nongnu.org; Tue, 02 Apr 2019 23:05:38 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hBWDY-0004yS-Cu for qemu-devel@nongnu.org; Tue, 02 Apr 2019 23:05:37 -0400 Received: from mx1.redhat.com ([209.132.183.28]:49768) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1hBWDW-0004pM-2v; Tue, 02 Apr 2019 23:05:34 -0400 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 52C49308FC23; Wed, 3 Apr 2019 03:05:33 +0000 (UTC) Received: from blue.redhat.com (ovpn-116-168.phx2.redhat.com [10.3.116.168]) by smtp.corp.redhat.com (Postfix) with ESMTP id CA3D419C57; Wed, 3 Apr 2019 03:05:32 +0000 (UTC) From: Eric Blake To: qemu-devel@nongnu.org Date: Tue, 2 Apr 2019 22:05:20 -0500 Message-Id: <20190403030526.12258-2-eblake@redhat.com> In-Reply-To: <20190403030526.12258-1-eblake@redhat.com> References: <20190403030526.12258-1-eblake@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.43]); Wed, 03 Apr 2019 03:05:33 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PATCH 1/7] nbd/server: Fix blockstatus trace X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: vsementsov@virtuozzo.com, jsnow@redhat.com, qemu-block@nongnu.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Don't increment remaining_bytes until we know that we will actually be including the current block status extent in the reply; otherwise, the value traced will include a bytes value that is oversized by the length of the next block status extent which did not get sent because it instead ended the loop. Fixes: fb7afc79 Signed-off-by: Eric Blake Reviewed-by: Vladimir Sementsov-Ogievskiy --- nbd/server.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/nbd/server.c b/nbd/server.c index 218a2aa5e65..1b8c8619896 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -1880,17 +1880,12 @@ static int blockstatus_to_extents(BlockDriverState *bs, uint64_t offset, flags = (ret & BDRV_BLOCK_ALLOCATED ? 0 : NBD_STATE_HOLE) | (ret & BDRV_BLOCK_ZERO ? NBD_STATE_ZERO : 0); - offset += num; - remaining_bytes -= num; if (first_extent) { extent->flags = flags; extent->length = num; first_extent = false; - continue; - } - - if (flags == extent->flags) { + } else if (flags == extent->flags) { /* extend current extent */ extent->length += num; } else { @@ -1903,6 +1898,8 @@ static int blockstatus_to_extents(BlockDriverState *bs, uint64_t offset, extent->flags = flags; extent->length = num; } + offset += num; + remaining_bytes -= num; } extents_end = extent + 1; From patchwork Wed Apr 3 03:05:21 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 1075271 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44YrZt03Wgz9sS3 for ; Wed, 3 Apr 2019 14:06:42 +1100 (AEDT) Received: from localhost ([127.0.0.1]:40188 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hBWEZ-0000Y0-0S for incoming@patchwork.ozlabs.org; Tue, 02 Apr 2019 23:06:39 -0400 Received: from eggs.gnu.org ([209.51.188.92]:41716) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hBWDb-0000H5-55 for qemu-devel@nongnu.org; Tue, 02 Apr 2019 23:05:40 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hBWDZ-000562-Np for qemu-devel@nongnu.org; Tue, 02 Apr 2019 23:05:39 -0400 Received: from mx1.redhat.com ([209.132.183.28]:53288) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1hBWDW-0004rN-MQ; Tue, 02 Apr 2019 23:05:34 -0400 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 026C73091747; Wed, 3 Apr 2019 03:05:34 +0000 (UTC) Received: from blue.redhat.com (ovpn-116-168.phx2.redhat.com [10.3.116.168]) by smtp.corp.redhat.com (Postfix) with ESMTP id 7E27519C68; Wed, 3 Apr 2019 03:05:33 +0000 (UTC) From: Eric Blake To: qemu-devel@nongnu.org Date: Tue, 2 Apr 2019 22:05:21 -0500 Message-Id: <20190403030526.12258-3-eblake@redhat.com> In-Reply-To: <20190403030526.12258-1-eblake@redhat.com> References: <20190403030526.12258-1-eblake@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.41]); Wed, 03 Apr 2019 03:05:34 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PATCH 2/7] nbd/server: Trace server noncompliance on unaligned requests X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: vsementsov@virtuozzo.com, jsnow@redhat.com, qemu-block@nongnu.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" We've recently added traces for clients to flag server non-compliance; let's do the same for servers to flag client non-compliance. According to the spec, if the client requests NBD_INFO_BLOCK_SIZE, it is promising to send all requests aligned to those boundaries. Of course, if the client does not request NBD_INFO_BLOCK_SIZE, then it made no promises so we shouldn't flag anything; and because we are willing to handle clients that made no promises (the spec allows us to use NBD_REP_ERR_BLOCK_SIZE_REQD if we had been unwilling), we already have to handle unaligned requests (which the block layer already does on our behalf). So even though the spec allows us to return EINVAL for clients that promised to behave, it's easier to always answer unaligned requests. Still, flagging non-compliance can be useful in debugging a client that is trying to be maximally portable. Qemu as client used to have one spot where it sent non-compliant requests: if the server sends an unaligned reply to NBD_CMD_BLOCK_STATUS, and the client was iterating over the entire disk, the next request would start at that unaligned point; this was fixed in commit a39286dd when the client was taught to work around server non-compliance; but is equally fixed if the server is patched to not send unaligned replies in the first place (the next few patches will address that). Fortunately, I did not find any more spots where qemu as client was non-compliant. I was able to test the patch by using the following hack to convince qemu-io to run various unaligned commands, coupled with serving 512-byte alignment by intentionally omitting '-f raw' on the server while viewing server traces. | diff --git i/nbd/client.c w/nbd/client.c | index 427980bdd22..1858b2aac35 100644 | --- i/nbd/client.c | +++ w/nbd/client.c | @@ -449,6 +449,7 @@ static int nbd_opt_info_or_go(QIOChannel *ioc, uint32_t opt, | nbd_send_opt_abort(ioc); | return -1; | } | + info->min_block = 1;//hack | if (!is_power_of_2(info->min_block)) { | error_setg(errp, "server minimum block size %" PRIu32 | " is not a power of two", info->min_block); Signed-off-by: Eric Blake Reviewed-by: Vladimir Sementsov-Ogievskiy --- nbd/server.c | 12 ++++++++++++ nbd/trace-events | 1 + 2 files changed, 13 insertions(+) diff --git a/nbd/server.c b/nbd/server.c index 1b8c8619896..3040ceb5606 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -124,6 +124,8 @@ struct NBDClient { int nb_requests; bool closing; + uint32_t check_align; /* If non-zero, check for aligned client requests */ + bool structured_reply; NBDExportMetaContexts export_meta; @@ -660,6 +662,8 @@ static int nbd_negotiate_handle_info(NBDClient *client, uint16_t myflags, if (client->opt == NBD_OPT_GO) { client->exp = exp; + client->check_align = blocksize ? + blk_get_request_alignment(exp->blk) : 0; QTAILQ_INSERT_TAIL(&client->exp->clients, client, next); nbd_export_get(client->exp); nbd_check_meta_export(client); @@ -2126,6 +2130,14 @@ static int nbd_co_receive_request(NBDRequestData *req, NBDRequest *request, return (request->type == NBD_CMD_WRITE || request->type == NBD_CMD_WRITE_ZEROES) ? -ENOSPC : -EINVAL; } + if (client->check_align && !QEMU_IS_ALIGNED(request->from | request->len, + client->check_align)) { + /* + * The block layer gracefully handles unaligned requests, but + * it's still worth tracing client non-compliance + */ + trace_nbd_co_receive_align_compliance(nbd_cmd_lookup(request->type)); + } valid_flags = NBD_CMD_FLAG_FUA; if (request->type == NBD_CMD_READ && client->structured_reply) { valid_flags |= NBD_CMD_FLAG_DF; diff --git a/nbd/trace-events b/nbd/trace-events index a6cca8fdf83..87a2b896c35 100644 --- a/nbd/trace-events +++ b/nbd/trace-events @@ -71,4 +71,5 @@ nbd_co_send_extents(uint64_t handle, unsigned int extents, uint32_t id, uint64_t nbd_co_send_structured_error(uint64_t handle, int err, const char *errname, const char *msg) "Send structured error reply: handle = %" PRIu64 ", error = %d (%s), msg = '%s'" nbd_co_receive_request_decode_type(uint64_t handle, uint16_t type, const char *name) "Decoding type: handle = %" PRIu64 ", type = %" PRIu16 " (%s)" nbd_co_receive_request_payload_received(uint64_t handle, uint32_t len) "Payload received: handle = %" PRIu64 ", len = %" PRIu32 +nbd_co_receive_align_compliance(const char *op) "client sent non-compliant unaligned %s request" nbd_trip(void) "Reading request" From patchwork Wed Apr 3 03:05:22 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 1075291 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44Yrhx0y1dz9sS3 for ; Wed, 3 Apr 2019 14:11:59 +1100 (AEDT) Received: from localhost ([127.0.0.1]:41792 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hBWJf-0005TR-UY for incoming@patchwork.ozlabs.org; Tue, 02 Apr 2019 23:11:55 -0400 Received: from eggs.gnu.org ([209.51.188.92]:41718) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hBWDb-0000H6-5g for qemu-devel@nongnu.org; Tue, 02 Apr 2019 23:05:40 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hBWDa-000588-4j for qemu-devel@nongnu.org; Tue, 02 Apr 2019 23:05:39 -0400 Received: from mx1.redhat.com ([209.132.183.28]:47130) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1hBWDX-0004tX-Ch; Tue, 02 Apr 2019 23:05:35 -0400 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id A740385A07; Wed, 3 Apr 2019 03:05:34 +0000 (UTC) Received: from blue.redhat.com (ovpn-116-168.phx2.redhat.com [10.3.116.168]) by smtp.corp.redhat.com (Postfix) with ESMTP id 2BF4419C68; Wed, 3 Apr 2019 03:05:34 +0000 (UTC) From: Eric Blake To: qemu-devel@nongnu.org Date: Tue, 2 Apr 2019 22:05:22 -0500 Message-Id: <20190403030526.12258-4-eblake@redhat.com> In-Reply-To: <20190403030526.12258-1-eblake@redhat.com> References: <20190403030526.12258-1-eblake@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Wed, 03 Apr 2019 03:05:34 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PATCH 3/7] nbd/server: Don't fail NBD_OPT_INFO for byte-aligned sources X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: vsementsov@virtuozzo.com, jsnow@redhat.com, qemu-block@nongnu.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" In commit 0c1d50bd, I added a couple of TODO comments about whether we consult bl.request_alignment when responding to NBD_OPT_INFO. At the time, qemu as server was hard-coding an advertised alignment of 512 to clients that promised to obey constraints, and there was no function for getting at a device's preferred alignment. But in hindsight, advertising 512 when the block device prefers 1 caused other compliance problems, and commit b0245d64 changed one of the two TODO comments to advertise a more accurate alignment. Time to fix the other TODO. Doesn't really impact qemu as client (our normal client doesn't use NBD_OPT_INFO, and qemu-nbd --list promises to obey block sizes), but it might prove useful to other clients. Fixes: b0245d64 Signed-off-by: Eric Blake Reviewed-by: Vladimir Sementsov-Ogievskiy --- nbd/server.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/nbd/server.c b/nbd/server.c index 3040ceb5606..faa3c7879fd 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -642,11 +642,14 @@ static int nbd_negotiate_handle_info(NBDClient *client, uint16_t myflags, return rc; } - /* If the client is just asking for NBD_OPT_INFO, but forgot to - * request block sizes, return an error. - * TODO: consult blk_bs(blk)->request_align, and only error if it - * is not 1? */ - if (client->opt == NBD_OPT_INFO && !blocksize) { + /* + * If the client is just asking for NBD_OPT_INFO, but forgot to + * request block sizes in a situation that would impact + * performance, then return an error. But for NBD_OPT_GO, we + * tolerate all clients, regardless of alignments. + */ + if (client->opt == NBD_OPT_INFO && !blocksize && + blk_get_request_alignment(exp->blk) > 1) { return nbd_negotiate_send_rep_err(client, NBD_REP_ERR_BLOCK_SIZE_REQD, errp, From patchwork Wed Apr 3 03:05:24 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 1075281 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44YrfV2mjpz9sS3 for ; Wed, 3 Apr 2019 14:09:54 +1100 (AEDT) Received: from localhost ([127.0.0.1]:41109 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hBWHg-0003N7-8Q for incoming@patchwork.ozlabs.org; Tue, 02 Apr 2019 23:09:52 -0400 Received: from eggs.gnu.org ([209.51.188.92]:41835) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hBWDk-0000eG-Ks for qemu-devel@nongnu.org; Tue, 02 Apr 2019 23:05:50 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hBWDi-0005Vz-H2 for qemu-devel@nongnu.org; Tue, 02 Apr 2019 23:05:48 -0400 Received: from mx1.redhat.com ([209.132.183.28]:37774) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1hBWDb-0005A9-Az; Tue, 02 Apr 2019 23:05:39 -0400 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 8683985541; Wed, 3 Apr 2019 03:05:38 +0000 (UTC) Received: from blue.redhat.com (ovpn-116-168.phx2.redhat.com [10.3.116.168]) by smtp.corp.redhat.com (Postfix) with ESMTP id ADEAA19C68; Wed, 3 Apr 2019 03:05:35 +0000 (UTC) From: Eric Blake To: qemu-devel@nongnu.org Date: Tue, 2 Apr 2019 22:05:24 -0500 Message-Id: <20190403030526.12258-6-eblake@redhat.com> In-Reply-To: <20190403030526.12258-1-eblake@redhat.com> References: <20190403030526.12258-1-eblake@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.28]); Wed, 03 Apr 2019 03:05:38 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PATCH 5/7] block: Fix BDRV_BLOCK_RAW status to honor alignment X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Fam Zheng , Kevin Wolf , vsementsov@virtuozzo.com, qemu-block@nongnu.org, Max Reitz , Stefan Hajnoczi , jsnow@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Previous patches mentioned how the blkdebug filter driver demonstrates a bug present in our NBD server; the bug is also present with the raw format driver when probing occurs. Basically, if we specify a particular alignment > 1, but defer the actual block status to the underlying file, and the underlying file has a smaller alignment, then the use of BDRV_BLOCK_RAW to defer to the underlying file can end up with status split at an alignment unacceptable to our layer. Many callers don't care, but NBD does - it is a violation of the NBD protocol to send status or read results split on an unaligned boundary (we've taught our client to be tolerant of such violations because of qemu 3.1; but we still need to fix our server for the sake of other stricter clients). This patch lays the groundwork - it adjusts bdrv_block_status to ensure that any time one layer defers to another via BDRV_BLOCK_RAW, the deferral is either truncated down to an aligned boundary, or multiple sub-aligned requests are coalesced into a single representative answer (using an implicit hole beyond EOF as needed). Iotest 241 exposes the effect (when format probing occurred, we don't want block status to subdivide the initial sector, and thus not any other sector either). Similarly, iotest 221 is a good candidate to amend to specifically track the effects; a change to a hole at EOF is not visible if an upper layer does not want alignment smaller than 512. However, this patch alone is not a complete fix - it is still possible to have an active layer with large alignment constraints backed by another layer with smaller constraints; so the next patch will complete the task. Signed-off-by: Eric Blake --- block/io.c | 108 +++++++++++++++++++++++++++++++++++-- tests/qemu-iotests/221 | 10 ++++ tests/qemu-iotests/221.out | 6 +++ tests/qemu-iotests/241.out | 3 +- 4 files changed, 122 insertions(+), 5 deletions(-) diff --git a/block/io.c b/block/io.c index dfc153b8d82..936877d3745 100644 --- a/block/io.c +++ b/block/io.c @@ -2021,6 +2021,97 @@ int coroutine_fn bdrv_co_block_status_from_backing(BlockDriverState *bs, return BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID; } +static int coroutine_fn bdrv_co_block_status(BlockDriverState *bs, + bool want_zero, + int64_t offset, int64_t bytes, + int64_t *pnum, int64_t *map, + BlockDriverState **file); + +/* + * Returns an aligned allocation status of the specified sectors. + * Wrapper around bdrv_co_block_status() which requires the initial + * offset and count to be aligned, and guarantees the result will also + * be aligned (even if it requires piecing together multiple + * sub-aligned queries into an appropriate coalesced answer). + */ +static int coroutine_fn bdrv_co_block_status_aligned(BlockDriverState *bs, + uint32_t align, + bool want_zero, + int64_t offset, + int64_t bytes, + int64_t *pnum, + int64_t *map, + BlockDriverState **file) +{ + int ret; + + assert(is_power_of_2(align) && QEMU_IS_ALIGNED(offset | bytes, align)); + ret = bdrv_co_block_status(bs, want_zero, offset, bytes, pnum, map, file); + if (ret < 0) { + return ret; + } + if (!*pnum) { + assert(!bytes || ret & BDRV_BLOCK_EOF); + return ret; + } + if (*pnum >= align) { + if (!QEMU_IS_ALIGNED(*pnum, align)) { + ret &= ~BDRV_BLOCK_EOF; + *pnum = QEMU_ALIGN_DOWN(*pnum, align); + } + return ret; + } + /* Merge in status for the rest of align */ + while (*pnum < align) { + int ret2; + int64_t pnum2; + int64_t map2; + BlockDriverState *file2; + + ret2 = bdrv_co_block_status(bs, want_zero, offset + *pnum, + align - *pnum, &pnum2, + map ? &map2 : NULL, file ? &file2 : NULL); + if (ret2 < 0) { + return ret2; + } + if (ret2 & BDRV_BLOCK_EOF) { + ret |= BDRV_BLOCK_EOF; + if (!pnum2) { + pnum2 = align - *pnum; + ret2 |= BDRV_BLOCK_ZERO; + } + } else { + assert(pnum2); + } + if (ret2 & BDRV_BLOCK_DATA) { + ret |= BDRV_BLOCK_DATA; + } + if (!(ret2 & BDRV_BLOCK_ZERO)) { + ret &= ~BDRV_BLOCK_ZERO; + } + if ((ret & BDRV_BLOCK_OFFSET_VALID) && + (!(ret2 & BDRV_BLOCK_OFFSET_VALID) || + (map && *map + *pnum != map2) || (file && *file != file2))) { + ret &= ~BDRV_BLOCK_OFFSET_VALID; + if (map) { + *map = 0; + } + if (file) { + *file = NULL; + } + } + if (ret2 & BDRV_BLOCK_ALLOCATED) { + ret |= BDRV_BLOCK_ALLOCATED; + } + if (ret2 & BDRV_BLOCK_RAW) { + assert(ret & BDRV_BLOCK_RAW); + assert(ret & BDRV_BLOCK_OFFSET_VALID); + } + *pnum += pnum2; + } + return ret; +} + /* * Returns the allocation status of the specified sectors. * Drivers not implementing the functionality are assumed to not support @@ -2121,6 +2212,19 @@ static int coroutine_fn bdrv_co_block_status(BlockDriverState *bs, */ assert(*pnum && QEMU_IS_ALIGNED(*pnum, align) && align > offset - aligned_offset); + + if (ret & BDRV_BLOCK_RAW) { + assert(ret & BDRV_BLOCK_OFFSET_VALID && local_file); + ret = bdrv_co_block_status_aligned(local_file, align, want_zero, + local_map, *pnum, pnum, &local_map, + &local_file); + if (ret < 0) { + goto out; + } + assert(!(ret & BDRV_BLOCK_RAW)); + ret |= BDRV_BLOCK_RAW; + } + *pnum -= offset - aligned_offset; if (*pnum > bytes) { *pnum = bytes; @@ -2130,9 +2234,7 @@ static int coroutine_fn bdrv_co_block_status(BlockDriverState *bs, } if (ret & BDRV_BLOCK_RAW) { - assert(ret & BDRV_BLOCK_OFFSET_VALID && local_file); - ret = bdrv_co_block_status(local_file, want_zero, local_map, - *pnum, pnum, &local_map, &local_file); + ret &= ~BDRV_BLOCK_RAW; goto out; } diff --git a/tests/qemu-iotests/221 b/tests/qemu-iotests/221 index 808cd9a289c..3f60fd0fdc8 100755 --- a/tests/qemu-iotests/221 +++ b/tests/qemu-iotests/221 @@ -41,17 +41,27 @@ echo echo "=== Check mapping of unaligned raw image ===" echo +# Note that when we enable format probing by omitting -f, the raw +# layer forces 512-byte alignment and the bytes past EOF take on the +# same status as the rest of the sector; otherwise, we can see the +# implicit hole visible past EOF thanks to the block layer rounding +# sizes up. + _make_test_img 43009 # qemu-img create rounds size up $QEMU_IMG map --output=json "$TEST_IMG" | _filter_qemu_img_map +$QEMU_IMG map -f $IMGFMT --output=json "$TEST_IMG" | _filter_qemu_img_map truncate --size=43009 "$TEST_IMG" # so we resize it and check again $QEMU_IMG map --output=json "$TEST_IMG" | _filter_qemu_img_map +$QEMU_IMG map -f $IMGFMT --output=json "$TEST_IMG" | _filter_qemu_img_map $QEMU_IO -c 'w 43008 1' "$TEST_IMG" | _filter_qemu_io # writing also rounds up $QEMU_IMG map --output=json "$TEST_IMG" | _filter_qemu_img_map +$QEMU_IMG map -f $IMGFMT --output=json "$TEST_IMG" | _filter_qemu_img_map truncate --size=43009 "$TEST_IMG" # so we resize it and check again $QEMU_IMG map --output=json "$TEST_IMG" | _filter_qemu_img_map +$QEMU_IMG map -f $IMGFMT --output=json "$TEST_IMG" | _filter_qemu_img_map # success, all done echo '*** done' diff --git a/tests/qemu-iotests/221.out b/tests/qemu-iotests/221.out index a9c0190aadc..723e94037fe 100644 --- a/tests/qemu-iotests/221.out +++ b/tests/qemu-iotests/221.out @@ -5,12 +5,18 @@ QA output created by 221 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=43009 [{ "start": 0, "length": 43520, "depth": 0, "zero": true, "data": false, "offset": OFFSET}] [{ "start": 0, "length": 43520, "depth": 0, "zero": true, "data": false, "offset": OFFSET}] +[{ "start": 0, "length": 43520, "depth": 0, "zero": true, "data": false, "offset": OFFSET}] +[{ "start": 0, "length": 43520, "depth": 0, "zero": true, "data": false, "offset": OFFSET}] wrote 1/1 bytes at offset 43008 1 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) [{ "start": 0, "length": 40960, "depth": 0, "zero": true, "data": false, "offset": OFFSET}, +{ "start": 40960, "length": 2560, "depth": 0, "zero": false, "data": true, "offset": OFFSET}] +[{ "start": 0, "length": 40960, "depth": 0, "zero": true, "data": false, "offset": OFFSET}, { "start": 40960, "length": 2049, "depth": 0, "zero": false, "data": true, "offset": OFFSET}, { "start": 43009, "length": 511, "depth": 0, "zero": true, "data": false, "offset": OFFSET}] [{ "start": 0, "length": 40960, "depth": 0, "zero": true, "data": false, "offset": OFFSET}, +{ "start": 40960, "length": 2560, "depth": 0, "zero": false, "data": true, "offset": OFFSET}] +[{ "start": 0, "length": 40960, "depth": 0, "zero": true, "data": false, "offset": OFFSET}, { "start": 40960, "length": 2049, "depth": 0, "zero": false, "data": true, "offset": OFFSET}, { "start": 43009, "length": 511, "depth": 0, "zero": true, "data": false, "offset": OFFSET}] *** done diff --git a/tests/qemu-iotests/241.out b/tests/qemu-iotests/241.out index ef7de1205d2..fd856bb0c8d 100644 --- a/tests/qemu-iotests/241.out +++ b/tests/qemu-iotests/241.out @@ -22,8 +22,7 @@ WARNING: Image format was not specified for '/home/eblake/qemu/tests/qemu-iotest size: 1024 min block: 1 -[{ "start": 0, "length": 1000, "depth": 0, "zero": false, "data": true, "offset": OFFSET}, -{ "start": 1000, "length": 24, "depth": 0, "zero": true, "data": true, "offset": OFFSET}] +[{ "start": 0, "length": 1024, "depth": 0, "zero": false, "data": true, "offset": OFFSET}] 1 KiB (0x400) bytes allocated at offset 0 bytes (0x0) === Encrypted qcow2 file backed by unaligned raw image === From patchwork Wed Apr 3 03:05:25 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 1075278 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44YrfL1rNYz9sS3 for ; Wed, 3 Apr 2019 14:09:46 +1100 (AEDT) Received: from localhost ([127.0.0.1]:41062 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hBWHY-0003E1-56 for incoming@patchwork.ozlabs.org; Tue, 02 Apr 2019 23:09:44 -0400 Received: from eggs.gnu.org ([209.51.188.92]:41834) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hBWDk-0000eF-Kt for qemu-devel@nongnu.org; Tue, 02 Apr 2019 23:05:50 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hBWDi-0005Vv-HK for qemu-devel@nongnu.org; Tue, 02 Apr 2019 23:05:48 -0400 Received: from mx1.redhat.com ([209.132.183.28]:49792) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1hBWDc-0005Ct-AX; Tue, 02 Apr 2019 23:05:42 -0400 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 85FBE308FC23; Wed, 3 Apr 2019 03:05:39 +0000 (UTC) Received: from blue.redhat.com (ovpn-116-168.phx2.redhat.com [10.3.116.168]) by smtp.corp.redhat.com (Postfix) with ESMTP id B4D1219C68; Wed, 3 Apr 2019 03:05:38 +0000 (UTC) From: Eric Blake To: qemu-devel@nongnu.org Date: Tue, 2 Apr 2019 22:05:25 -0500 Message-Id: <20190403030526.12258-7-eblake@redhat.com> In-Reply-To: <20190403030526.12258-1-eblake@redhat.com> References: <20190403030526.12258-1-eblake@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.43]); Wed, 03 Apr 2019 03:05:39 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PATCH 6/7] nbd/server: Avoid unaligned read/block_status from backing X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Fam Zheng , Kevin Wolf , vsementsov@virtuozzo.com, qemu-block@nongnu.org, Max Reitz , Stefan Hajnoczi , jsnow@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" The NBD server code used bdrv_block_status_above() to determine where to fragment structured read and block status replies. However, the protocol can only advertise the active layer's minimum block size; if the active layer is backed by another file with smaller alignment, then we can end up leaking unaligned results back through to the client, in violation of the spec. Fix this by exposing a new bdrv_block_status_aligned() function around the recently-added internal bdrv_co_block_status_aligned, to guarantee that all block status answers from backing layers are rounded up to the alignment of the current layer. Note that the underlying function still requires aligned boundaries, but the public function copes with unaligned inputs. iotest 241 does not change in output, but running it manually with traces shows the improved behavior; furthermore, reverting 737d3f5244 but leaving this patch in place lets the test pass (whereas before the test would fail because the client had to work around the server's non-compliance). Note that while this fixes NBD_CMD_READ and NBD_CMD_BLOCK_STATUS for base:allocation (because those rely on bdrv_block_status*), we still have a theoretical compliance problem with NBD_CMD_BLOCK_STATUS for qemu:dirty-bitmap:NN when visiting a bitmap created at a smaller granularity than what we advertised. On the other hand, dirty bitmap granularity cannot go below 512, and without the use of blkdebug (which in turn needs patches before it can properly find dirty bitmaps or track status from qcow2 backing chains), I can't provoke an NBD server to advertise an alignment larger than 512. Signed-off-by: Eric Blake --- include/block/block.h | 2 ++ block/io.c | 47 ++++++++++++++++++++++++++++++++++++++----- nbd/server.c | 14 ++++++------- 3 files changed, 50 insertions(+), 13 deletions(-) diff --git a/include/block/block.h b/include/block/block.h index c7a26199aa0..3f38fa57f2d 100644 --- a/include/block/block.h +++ b/include/block/block.h @@ -444,6 +444,8 @@ int bdrv_block_status(BlockDriverState *bs, int64_t offset, int bdrv_block_status_above(BlockDriverState *bs, BlockDriverState *base, int64_t offset, int64_t bytes, int64_t *pnum, int64_t *map, BlockDriverState **file); +int bdrv_block_status_aligned(BlockDriverState *bs, int64_t offset, + int64_t bytes, int64_t *pnum); int bdrv_is_allocated(BlockDriverState *bs, int64_t offset, int64_t bytes, int64_t *pnum); int bdrv_is_allocated_above(BlockDriverState *top, BlockDriverState *base, diff --git a/block/io.c b/block/io.c index 936877d3745..5e6c0d06018 100644 --- a/block/io.c +++ b/block/io.c @@ -1981,6 +1981,7 @@ int bdrv_flush_all(void) typedef struct BdrvCoBlockStatusData { BlockDriverState *bs; BlockDriverState *base; + uint32_t align; bool want_zero; int64_t offset; int64_t bytes; @@ -2298,6 +2299,7 @@ early_out: static int coroutine_fn bdrv_co_block_status_above(BlockDriverState *bs, BlockDriverState *base, + uint32_t align, bool want_zero, int64_t offset, int64_t bytes, @@ -2311,8 +2313,8 @@ static int coroutine_fn bdrv_co_block_status_above(BlockDriverState *bs, assert(bs != base); for (p = bs; p != base; p = backing_bs(p)) { - ret = bdrv_co_block_status(p, want_zero, offset, bytes, pnum, map, - file); + ret = bdrv_co_block_status_aligned(p, align, want_zero, offset, bytes, + pnum, map, file); if (ret < 0) { break; } @@ -2342,7 +2344,7 @@ static void coroutine_fn bdrv_block_status_above_co_entry(void *opaque) BdrvCoBlockStatusData *data = opaque; data->ret = bdrv_co_block_status_above(data->bs, data->base, - data->want_zero, + data->align, data->want_zero, data->offset, data->bytes, data->pnum, data->map, data->file); data->done = true; @@ -2356,6 +2358,7 @@ static void coroutine_fn bdrv_block_status_above_co_entry(void *opaque) */ static int bdrv_common_block_status_above(BlockDriverState *bs, BlockDriverState *base, + uint32_t align, bool want_zero, int64_t offset, int64_t bytes, int64_t *pnum, int64_t *map, @@ -2365,6 +2368,7 @@ static int bdrv_common_block_status_above(BlockDriverState *bs, BdrvCoBlockStatusData data = { .bs = bs, .base = base, + .align = align, .want_zero = want_zero, .offset = offset, .bytes = bytes, @@ -2389,7 +2393,7 @@ int bdrv_block_status_above(BlockDriverState *bs, BlockDriverState *base, int64_t offset, int64_t bytes, int64_t *pnum, int64_t *map, BlockDriverState **file) { - return bdrv_common_block_status_above(bs, base, true, offset, bytes, + return bdrv_common_block_status_above(bs, base, 1, true, offset, bytes, pnum, map, file); } @@ -2400,13 +2404,46 @@ int bdrv_block_status(BlockDriverState *bs, int64_t offset, int64_t bytes, offset, bytes, pnum, map, file); } +/* + * Similar to bdrv_block_status_above(bs, NULL, ...), but ensures that + * the answer matches the minimum alignment of bs (smaller alignments + * in layers above will not leak through to the active layer). It is + * assumed that callers do not care about the resulting mapping of + * offsets to an underlying BDS. + */ +int bdrv_block_status_aligned(BlockDriverState *bs, int64_t offset, + int64_t bytes, int64_t *pnum) +{ + /* Widen the request to aligned boundaries */ + int64_t aligned_offset, aligned_bytes; + uint32_t align = bs->bl.request_alignment; + int ret; + + assert(pnum); + aligned_offset = QEMU_ALIGN_DOWN(offset, align); + aligned_bytes = ROUND_UP(offset + bytes, align) - aligned_offset; + ret = bdrv_common_block_status_above(bs, NULL, align, true, aligned_offset, + aligned_bytes, pnum, NULL, NULL); + if (ret < 0) { + *pnum = 0; + return ret; + } + assert(*pnum && QEMU_IS_ALIGNED(*pnum, align) && + align > offset - aligned_offset); + *pnum -= offset - aligned_offset; + if (*pnum > bytes) { + *pnum = bytes; + } + return ret; +} + int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, int64_t offset, int64_t bytes, int64_t *pnum) { int ret; int64_t dummy; - ret = bdrv_common_block_status_above(bs, backing_bs(bs), false, offset, + ret = bdrv_common_block_status_above(bs, backing_bs(bs), 1, false, offset, bytes, pnum ? pnum : &dummy, NULL, NULL); if (ret < 0) { diff --git a/nbd/server.c b/nbd/server.c index faa3c7879fd..576deb931c8 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -1792,7 +1792,7 @@ static int coroutine_fn nbd_co_send_structured_error(NBDClient *client, } /* Do a sparse read and send the structured reply to the client. - * Returns -errno if sending fails. bdrv_block_status_above() failure is + * Returns -errno if sending fails. bdrv_block_status_aligned() failure is * reported to the client, at which point this function succeeds. */ static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client, @@ -1808,10 +1808,9 @@ static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client, while (progress < size) { int64_t pnum; - int status = bdrv_block_status_above(blk_bs(exp->blk), NULL, - offset + progress, - size - progress, &pnum, NULL, - NULL); + int status = bdrv_block_status_aligned(blk_bs(exp->blk), + offset + progress, + size - progress, &pnum); bool final; if (status < 0) { @@ -1864,7 +1863,7 @@ static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client, * length encoded (which may be smaller than the original), and update * @nb_extents to the number of extents used. * - * Returns zero on success and -errno on bdrv_block_status_above failure. + * Returns zero on success and -errno on bdrv_block_status_aligned failure. */ static int blockstatus_to_extents(BlockDriverState *bs, uint64_t offset, uint64_t *bytes, NBDExtent *extents, @@ -1878,8 +1877,7 @@ static int blockstatus_to_extents(BlockDriverState *bs, uint64_t offset, while (remaining_bytes) { uint32_t flags; int64_t num; - int ret = bdrv_block_status_above(bs, NULL, offset, remaining_bytes, - &num, NULL, NULL); + int ret = bdrv_block_status_aligned(bs, offset, remaining_bytes, &num); if (ret < 0) { return ret; From patchwork Wed Apr 3 03:05:26 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 1075274 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44Yrb92mQ1z9sRW for ; Wed, 3 Apr 2019 14:07:01 +1100 (AEDT) Received: from localhost ([127.0.0.1]:40285 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hBWEt-000127-7X for incoming@patchwork.ozlabs.org; Tue, 02 Apr 2019 23:06:59 -0400 Received: from eggs.gnu.org ([209.51.188.92]:41833) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hBWDk-0000eE-Ko for qemu-devel@nongnu.org; Tue, 02 Apr 2019 23:05:50 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hBWDi-0005Vu-Gy for qemu-devel@nongnu.org; Tue, 02 Apr 2019 23:05:48 -0400 Received: from mx1.redhat.com ([209.132.183.28]:39018) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1hBWDe-0005F7-7B; Tue, 02 Apr 2019 23:05:42 -0400 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 345413082B40; Wed, 3 Apr 2019 03:05:40 +0000 (UTC) Received: from blue.redhat.com (ovpn-116-168.phx2.redhat.com [10.3.116.168]) by smtp.corp.redhat.com (Postfix) with ESMTP id AF3FB19C68; Wed, 3 Apr 2019 03:05:39 +0000 (UTC) From: Eric Blake To: qemu-devel@nongnu.org Date: Tue, 2 Apr 2019 22:05:26 -0500 Message-Id: <20190403030526.12258-8-eblake@redhat.com> In-Reply-To: <20190403030526.12258-1-eblake@redhat.com> References: <20190403030526.12258-1-eblake@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.45]); Wed, 03 Apr 2019 03:05:40 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PATCH for-4.1 7/7] nbd/server: Avoid unaligned dirty-bitmap status X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: vsementsov@virtuozzo.com, jsnow@redhat.com, qemu-block@nongnu.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" The NBD spec requires that responses to NBD_CMD_BLOCK_STATUS be aligned to the server's advertised min_block alignment, if the client agreed to abide by alignments. In general, since dirty bitmap granularity cannot go below 512, and it is hard to provoke qcow2 to go above an alignment of 512, this is not an issue. And until the block layer can see through filters, the moment you use blkdebug, you can no longer export dirty bitmaps over NBD. But once we fix the traversal through filters, then blkdebug can create a scenario where the server is provoked into supplying a non-compliant reply. Thus, it is time to fix the dirty bitmap code to round requests to aligned boundaries. This hack patch lets blkdebug be used; although note that Max has proposed a more complete solution https://lists.gnu.org/archive/html/qemu-devel/2019-02/msg03534.html | diff --git a/nbd/server.c b/nbd/server.c | index 576deb931c8..1d370b5b2c7 100644 | --- a/nbd/server.c | +++ b/nbd/server.c | @@ -1507,6 +1507,9 @@ NBDExport *nbd_export_new(BlockDriverState *bs, uint64_t dev_offset, | BdrvDirtyBitmap *bm = NULL; | | while (true) { | + while (bs->drv && bs->drv->is_filter && bs->file) { | + bs = bs->file->bs; | + } | bm = bdrv_find_dirty_bitmap(bs, bitmap); | if (bm != NULL || bs->backing == NULL) { | break; Then the following sequence creates a dirty bitmap with granularity 512, exposed through a blkdebug interface with minimum block size 4k; correct behavior is to see a map showing the first 4k as "data":false (corresponding to the dirty bit in the 2nd of 8 sectors, rounded up). Without this patch, the code instead showed the entire image as "data":true (due to the client working around the server non-compliance by coalescing regions, but assuming that data:true takes precedence, which is the opposite of the behavior we want during x-dirty-bitmap). $ printf %01000d 0 > img $ qemu-img create -f qcow2 -F raw -b img img.wrap 64k $ qemu-system-x86_64 -nodefaults -nographic -qmp stdio {'execute':'qmp_capabilities'} {'execute':'blockdev-add','arguments':{'node-name':'n','driver':'qcow2','file':{'driver':'file','filename':'img.wrap'}}} {'execute':'nbd-server-start','arguments':{'addr':{'type':'inet','data':{'host':'localhost','port':'10809'}}}} {'execute':'block-dirty-bitmap-add','arguments':{'node':'n','name':'b','persistent':true,'granularity':512}} {'execute':'nbd-server-add','arguments':{'device':'n','bitmap':'b','writable':true}} ^z $ bg $ qemu-io -f raw -c 'w 513 1' nbd://localhost:10809 $ fg {'execute':'nbd-server-remove','arguments':{'name':'n'}} {'execute':'block-dirty-bitmap-disable','arguments':{'node':'n','name':'b'}} {'execute':'blockdev-add','arguments':{'driver':'blkdebug','align':4096,'image':'n','node-name':'wrap'}} {'execute':'nbd-server-add','arguments':{'device':'wrap','bitmap':'b'}} ^z $ bg $ qemu-img map --output=json --image-opts driver=nbd,x-dirty-bitmap=qemu:dirty-bitmap:b,server.type=inet,server.host=localhost,server.port=10809,export=wrap $ fg {'execute':'quit'} Signed-off-by: Eric Blake --- Not for 4.0; once Max's patches land to see through filters, then this commit message will need to be rewritten, and the steps outlined above instead turned into an addition for iotest 241. --- nbd/server.c | 40 ++++++++++++++++++++++++++++++++++------ 1 file changed, 34 insertions(+), 6 deletions(-) diff --git a/nbd/server.c b/nbd/server.c index 576deb931c8..82da0bae9ba 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -1982,7 +1982,8 @@ static int nbd_co_send_block_status(NBDClient *client, uint64_t handle, * byte length encoded (which may be smaller or larger than the * original), and return the number of extents used. */ -static unsigned int bitmap_to_extents(BdrvDirtyBitmap *bitmap, uint64_t offset, +static unsigned int bitmap_to_extents(uint32_t align, + BdrvDirtyBitmap *bitmap, uint64_t offset, uint64_t *length, NBDExtent *extents, unsigned int nb_extents, bool dont_fragment) @@ -2008,13 +2009,30 @@ static unsigned int bitmap_to_extents(BdrvDirtyBitmap *bitmap, uint64_t offset, bdrv_set_dirty_iter(it, begin); end = bdrv_dirty_iter_next(it); } - if (end == -1 || end - begin > UINT32_MAX) { + if (end == -1 || end - begin > UINT32_MAX - align) { /* Cap to an aligned value < 4G beyond begin. */ end = MIN(bdrv_dirty_bitmap_size(bitmap), begin + UINT32_MAX + 1 - - bdrv_dirty_bitmap_granularity(bitmap)); + MAX(align, bdrv_dirty_bitmap_granularity(bitmap))); next_dirty = dirty; } + /* + * Round unaligned bits: any transition mid-alignment makes + * that entire aligned region appear dirty + */ + if (!QEMU_IS_ALIGNED(end, align)) { + if (dirty) { + end = QEMU_ALIGN_UP(end, align); + } else if (end - begin < align) { + end = begin + align; + dirty = true; + } else { + end = QEMU_ALIGN_DOWN(end, align); + } + if (end < bdrv_dirty_bitmap_size(bitmap)) { + next_dirty = bdrv_get_dirty_locked(NULL, bitmap, end); + } + } if (dont_fragment && end > overall_end) { end = overall_end; } @@ -2042,11 +2060,21 @@ static int nbd_co_send_bitmap(NBDClient *client, uint64_t handle, { int ret; unsigned int nb_extents = dont_fragment ? 1 : NBD_MAX_BITMAP_EXTENTS; - NBDExtent *extents = g_new(NBDExtent, nb_extents); + NBDExtent *extents; uint64_t final_length = length; - nb_extents = bitmap_to_extents(bitmap, offset, &final_length, extents, - nb_extents, dont_fragment); + /* Easiest to just refuse to answer an unaligned query */ + if (client->check_align && + !QEMU_IS_ALIGNED(offset | length, client->check_align)) { + return nbd_co_send_structured_error(client, handle, -EINVAL, + "unaligned dirty bitmap request", + errp); + } + + extents = g_new(NBDExtent, nb_extents); + nb_extents = bitmap_to_extents(client->check_align ?: 1, bitmap, offset, + &final_length, extents, nb_extents, + dont_fragment); ret = nbd_co_send_extents(client, handle, extents, nb_extents, final_length, last, context_id, errp); From patchwork Thu Apr 4 14:52:26 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 1077395 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44ZmCR5DK6z9sNw for ; Fri, 5 Apr 2019 01:53:07 +1100 (AEDT) Received: from localhost ([127.0.0.1]:55962 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hC3jl-0006OC-H7 for incoming@patchwork.ozlabs.org; Thu, 04 Apr 2019 10:53:05 -0400 Received: from eggs.gnu.org ([209.51.188.92]:49139) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hC3jL-0006KN-Ql for qemu-devel@nongnu.org; Thu, 04 Apr 2019 10:52:40 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hC3jJ-00073m-SQ for qemu-devel@nongnu.org; Thu, 04 Apr 2019 10:52:39 -0400 Received: from mx1.redhat.com ([209.132.183.28]:44538) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1hC3jF-0006qJ-As; Thu, 04 Apr 2019 10:52:34 -0400 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id C51E1CAA67; Thu, 4 Apr 2019 14:52:30 +0000 (UTC) Received: from blue.redhat.com (ovpn-116-67.phx2.redhat.com [10.3.116.67]) by smtp.corp.redhat.com (Postfix) with ESMTP id BFACE5DD97; Thu, 4 Apr 2019 14:52:27 +0000 (UTC) From: Eric Blake To: qemu-devel@nongnu.org Date: Thu, 4 Apr 2019 09:52:26 -0500 Message-Id: <20190404145226.32649-1-eblake@redhat.com> In-Reply-To: <20190403030526.12258-1-eblake@redhat.com> References: <20190403030526.12258-1-eblake@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Thu, 04 Apr 2019 14:52:30 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PATCH for-4.0? 8/7] nbd/client: Fix error message for server with unusable sizing X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: vsementsov@virtuozzo.com, jsnow@redhat.com, qemu-block@nongnu.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Add a missing space to the error message used when giving up on a server that insists on an alignment which renders the last few bytes of the export unreadable. Fixes: 3add3ab78 Signed-off-by: Eric Blake Reviewed-by: Kevin Wolf --- We've already concluded that 5-7 are too risky for 4.0 (so qemu 4.0 NBD servers will have non-compliance issues in alignment corner cases, although in fewer places than where 3.1 had similar bugs; and the 4.0 client is able to work around those non-compliance cases). But I'd still like a review on 1-3 and this patch 8, as trivial improvements worth having in 4.0. I'm borderline on whether having patch 4 in 4.0 adds any value. nbd/client.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/nbd/client.c b/nbd/client.c index 427980bdd22..4de30630c73 100644 --- a/nbd/client.c +++ b/nbd/client.c @@ -428,7 +428,7 @@ static int nbd_opt_info_or_go(QIOChannel *ioc, uint32_t opt, } if (info->min_block && !QEMU_IS_ALIGNED(info->size, info->min_block)) { - error_setg(errp, "export size %" PRIu64 "is not multiple of " + error_setg(errp, "export size %" PRIu64 " is not multiple of " "minimum block size %" PRIu32, info->size, info->min_block); nbd_send_opt_abort(ioc);