From patchwork Thu Oct 27 15:52:12 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sam Li X-Patchwork-Id: 1695500 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=Lhwofxes; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Myr0C3gKNz1ygr for ; Fri, 28 Oct 2022 02:56:59 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oo5BR-00083H-I3; Thu, 27 Oct 2022 11:52:41 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oo5BQ-0007nK-4u; Thu, 27 Oct 2022 11:52:40 -0400 Received: from mail-pj1-x1029.google.com ([2607:f8b0:4864:20::1029]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1oo5BN-0002Iz-Se; Thu, 27 Oct 2022 11:52:39 -0400 Received: by mail-pj1-x1029.google.com with SMTP id ez6so1962224pjb.1; Thu, 27 Oct 2022 08:52:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=kijn3u/8+VZyPJ8m0dg566lEKL9c/ISZTbMARLxwhF4=; 
b=Lhwofxespsfb896TvxwDDFp30gkzFZPY6E5EOgkUwL4kPbw86A6b6wmAqsmtfnwui5 NTFUZM8Evei0TWg5wx4G/VSn1IW7t2NcuaBHLI0WC1H0753/DisWaa3pPY9eHAgeaRsA uplS6luI0Jx+I0P7Tl0GAPRKRPIuJf08Ax2H9sSW9pmblkTe5SteaVQMj06ZT2i2tUu5 Dd98+CrFkvX6Afv+OpBVv3U9NgVhwmBzlP+KzYxt8FgYOLSuDZvXfAG3YgGk42a5nlZa 1iCVx1Zv0CKSIcOvvlxKqegag9CVR4we5ZVMhs+wyfwONVJYQJPLjhmcZhkPxdtajKCN WX7Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=kijn3u/8+VZyPJ8m0dg566lEKL9c/ISZTbMARLxwhF4=; b=vkf7uhchD4ChAZomS+lJZ4X0+9L9rhL6rGNKvuC9ieevOkGSKpyXSgAR6OoezdtUYm FGbNgQrv58eFnCymL6tX4ZNelPDX6oRHAVf8j119oE5hBNF9uzRBuT7lVNRk40jellIu EiH4jJhBktu9JXPRVhfnKWwOE5v9P+KRYxNNUMlBxqRYHKl4HN/f152hGrBcN7EtpagD nDyjizpgeYcaR+SF/QdXrm7oR5BJh39LwkT/89K36N2dLIlO+oxB9JHxgeM+GC3zxohE TW2lpW+Ul2pzwTwyW2dJ9kw4HZfIgzZtcLqWEWVXxqwMiSKeRlVrGwhTwgZ8Hw1O/Fi4 XRiA== X-Gm-Message-State: ACrzQf2yaoqSv0KcF/tYm3B0HMmCXfGBbY9HS63rOTVjqsONMD+XxAgB zmsmiCWs/zlVlU57R5stKE3jXOIq7eQ3AAmq X-Google-Smtp-Source: AMsMyM4XsUNFJHXWuL8i7WKBPRXyxstIC+W4h0f1Ren0RuLRprPSVzaD+YYGTyKlh0VZ7CuNv0gqVQ== X-Received: by 2002:a17:90b:38c:b0:212:fe14:4ba0 with SMTP id ga12-20020a17090b038c00b00212fe144ba0mr11035627pjb.138.1666885955339; Thu, 27 Oct 2022 08:52:35 -0700 (PDT) Received: from roots.. 
([112.44.202.248]) by smtp.gmail.com with ESMTPSA id f21-20020a623815000000b0056c058ab000sm1327744pfa.155.2022.10.27.08.52.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 27 Oct 2022 08:52:35 -0700 (PDT) From: Sam Li To: qemu-devel@nongnu.org Cc: damien.lemoal@opensource.wdc.com, Stefano Garzarella , Stefan Hajnoczi , dmitry.fomichev@wdc.com, qemu-block@nongnu.org, Julia Suvorova , hare@suse.de, Kevin Wolf , Hanna Reitz , Fam Zheng , Aarushi Mehta , Sam Li Subject: [PATCH v5 1/4] file-posix: add tracking of the zone write pointers Date: Thu, 27 Oct 2022 23:52:12 +0800 Message-Id: <20221027155215.21374-2-faithilikerun@gmail.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20221027155215.21374-1-faithilikerun@gmail.com> References: <20221027155215.21374-1-faithilikerun@gmail.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1029; envelope-from=faithilikerun@gmail.com; helo=mail-pj1-x1029.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "Qemu-devel" Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org

Linux has no user-space API for issuing zone append operations to zoned
devices, so the file-posix driver is modified to emulate zone append using
regular writes. To do this, the driver tracks the write pointer (wp)
location of every zone of the device in an array of uint64_t. The most
significant bit of each entry indicates whether the zone is a conventional
zone.
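For illustration only (not part of the patch): a minimal sketch of the entry encoding described above, assuming byte offsets always fit in 63 bits. The names WP_CONV_FLAG, wp_encode, wp_is_conv and wp_offset are hypothetical and do not appear in the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name: bit 63 of an entry marks a conventional zone. */
#define WP_CONV_FLAG (1ULL << 63)

/* Build one array entry; a conventional zone carries only the flag. */
static inline uint64_t wp_encode(bool conventional, uint64_t wp_bytes)
{
    /* Assumption: a real byte offset never uses bit 63. */
    assert(!(wp_bytes & WP_CONV_FLAG));
    return conventional ? WP_CONV_FLAG : wp_bytes;
}

/* True if the entry describes a conventional zone (no wp to track). */
static inline bool wp_is_conv(uint64_t entry)
{
    return (entry & WP_CONV_FLAG) != 0;
}

/* Byte offset of the write pointer; meaningful for sequential zones only. */
static inline uint64_t wp_offset(uint64_t entry)
{
    return entry & ~WP_CONV_FLAG;
}
```

With this layout a single uint64_t per zone is enough, and a lookup such as wp_is_conv(wps[index]) decides whether a write needs to advance the pointer at all.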

A zone's wp changes as a result of the following operations:
- zone reset: set the wp to the start offset of that zone
- zone finish: set the wp to the end location of that zone
- write to a zone
- zone append

Signed-off-by: Sam Li
---
 block/file-posix.c               | 153 ++++++++++++++++++++++++++++++-
 include/block/block-common.h     |  14 +++
 include/block/block_int-common.h |   3 +
 3 files changed, 166 insertions(+), 4 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index fe52e91da4..fbab23f450 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1323,6 +1323,77 @@ static int hdev_get_max_segments(int fd, struct stat *st)
 #endif
 }
 
+#if defined(CONFIG_BLKZONED)
+static int get_zones_wp(int fd, BlockZoneWps *wps, int64_t offset,
+                        unsigned int nrz) {
+    struct blk_zone *blkz;
+    size_t rep_size;
+    uint64_t sector = offset >> BDRV_SECTOR_BITS;
+    int ret, n = 0, i = 0;
+    rep_size = sizeof(struct blk_zone_report) + nrz * sizeof(struct blk_zone);
+    g_autofree struct blk_zone_report *rep = NULL;
+
+    rep = g_malloc(rep_size);
+    blkz = (struct blk_zone *)(rep + 1);
+    while (n < nrz) {
+        memset(rep, 0, rep_size);
+        rep->sector = sector;
+        rep->nr_zones = nrz - n;
+
+        do {
+            ret = ioctl(fd, BLKREPORTZONE, rep);
+        } while (ret != 0 && errno == EINTR);
+        if (ret != 0) {
+            error_report("%d: ioctl BLKREPORTZONE at %" PRId64 " failed %d",
+                         fd, offset, errno);
+            return -errno;
+        }
+
+        if (!rep->nr_zones) {
+            break;
+        }
+
+        for (i = 0; i < rep->nr_zones; i++, n++) {
+            /*
+             * The wp tracking cares only about sequential writes required and
+             * sequential write preferred zones so that the wp can advance to
+             * the right location.
+             * Use the most significant bit of the wp location to indicate the
+             * zone type: 0 for SWR/SWP zones and 1 for conventional zones.
+             */
+            if (blkz[i].type == BLK_ZONE_TYPE_CONVENTIONAL) {
+                wps->wp[i] = 1ULL << 63;
+            } else {
+                switch (blkz[i].cond) {
+                case BLK_ZONE_COND_FULL:
+                case BLK_ZONE_COND_READONLY:
+                    /* Zone not writable */
+                    wps->wp[i] = (blkz[i].start + blkz[i].len) << BDRV_SECTOR_BITS;
+                    break;
+                case BLK_ZONE_COND_OFFLINE:
+                    /* Zone not writable nor readable */
+                    wps->wp[i] = (blkz[i].start) << BDRV_SECTOR_BITS;
+                    break;
+                default:
+                    wps->wp[i] = blkz[i].wp << BDRV_SECTOR_BITS;
+                    break;
+                }
+            }
+        }
+        sector = blkz[i - 1].start + blkz[i - 1].len;
+    }
+
+    return 0;
+}
+
+static void update_zones_wp(int fd, BlockZoneWps *wps, int64_t offset,
+                            unsigned int nrz) {
+    if (get_zones_wp(fd, wps, offset, nrz) < 0) {
+        error_report("update zone wp failed");
+    }
+}
+#endif
+
 static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
 {
     BDRVRawState *s = bs->opaque;
@@ -1412,6 +1483,15 @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
         if (ret >= 0) {
             bs->bl.max_active_zones = ret;
         }
+
+        bs->bl.wps = g_malloc(sizeof(BlockZoneWps) + sizeof(int64_t) * ret);
+        ret = get_zones_wp(s->fd, bs->bl.wps, 0, bs->bl.nr_zones);
+        if (ret < 0) {
+            error_setg_errno(errp, -ret, "report wps failed");
+            g_free(bs->bl.wps);
+            return;
+        }
+        qemu_co_mutex_init(&bs->bl.wps->colock);
         return;
     }
 out:
@@ -2339,9 +2419,15 @@ static int coroutine_fn raw_co_prw(BlockDriverState *bs, uint64_t offset,
 {
     BDRVRawState *s = bs->opaque;
     RawPosixAIOData acb;
+    int ret;
 
     if (fd_open(bs) < 0)
         return -EIO;
+#if defined(CONFIG_BLKZONED)
+    if (bs->bl.wps) {
+        qemu_co_mutex_lock(&bs->bl.wps->colock);
+    }
+#endif
 
     /*
      * When using O_DIRECT, the request must be aligned to be able to use
@@ -2355,14 +2441,16 @@ static int coroutine_fn raw_co_prw(BlockDriverState *bs, uint64_t offset,
     } else if (s->use_linux_io_uring) {
         LuringState *aio = aio_get_linux_io_uring(bdrv_get_aio_context(bs));
         assert(qiov->size == bytes);
-        return luring_co_submit(bs, aio, s->fd, offset, qiov, type);
+        ret = luring_co_submit(bs, aio, s->fd, offset, qiov, type);
+        goto out;
 #endif
 #ifdef CONFIG_LINUX_AIO
     } else if (s->use_linux_aio) {
         LinuxAioState *aio = aio_get_linux_aio(bdrv_get_aio_context(bs));
         assert(qiov->size == bytes);
-        return laio_co_submit(bs, aio, s->fd, offset, qiov, type,
+        ret = laio_co_submit(bs, aio, s->fd, offset, qiov, type,
                               s->aio_max_batch);
+        goto out;
 #endif
     }
 
@@ -2379,7 +2467,32 @@ static int coroutine_fn raw_co_prw(BlockDriverState *bs, uint64_t offset,
     };
 
     assert(qiov->size == bytes);
-    return raw_thread_pool_submit(bs, handle_aiocb_rw, &acb);
+    ret = raw_thread_pool_submit(bs, handle_aiocb_rw, &acb);
+
+out:
+#if defined(CONFIG_BLKZONED)
+    BlockZoneWps *wps = bs->bl.wps;
+    if (ret == 0) {
+        if (type & QEMU_AIO_WRITE && wps && bs->bl.zone_size) {
+            int index = offset / bs->bl.zone_size;
+            if (!BDRV_ZT_IS_CONV(wps->wp[index])) {
+                /* Advance the wp if needed */
+                if (offset + bytes > wps->wp[index]) {
+                    wps->wp[index] = offset + bytes;
+                }
+            }
+        }
+    } else {
+        if (type & QEMU_AIO_WRITE) {
+            update_zones_wp(s->fd, bs->bl.wps, 0, 1);
+        }
+    }
+
+    if (wps) {
+        qemu_co_mutex_unlock(&wps->colock);
+    }
+#endif
+    return ret;
 }
 
 static int coroutine_fn raw_co_preadv(BlockDriverState *bs, int64_t offset,
@@ -2488,6 +2601,11 @@ static void raw_close(BlockDriverState *bs)
     BDRVRawState *s = bs->opaque;
 
     if (s->fd >= 0) {
+#if defined(CONFIG_BLKZONED)
+        if (bs->bl.wps) {
+            g_free(bs->bl.wps);
+        }
+#endif
         qemu_close(s->fd);
         s->fd = -1;
     }
@@ -3288,6 +3406,7 @@ static int coroutine_fn raw_co_zone_mgmt(BlockDriverState *bs, BlockZoneOp op,
     const char *op_name;
     unsigned long zo;
     int ret;
+    BlockZoneWps *wps = bs->bl.wps;
     int64_t capacity = bs->total_sectors << BDRV_SECTOR_BITS;
 
     zone_size = bs->bl.zone_size;
@@ -3305,6 +3424,14 @@ static int coroutine_fn raw_co_zone_mgmt(BlockDriverState *bs, BlockZoneOp op,
         return -EINVAL;
     }
 
+    qemu_co_mutex_lock(&wps->colock);
+    uint32_t index = offset / bs->bl.zone_size;
+    if (BDRV_ZT_IS_CONV(wps->wp[index]) && len != capacity) {
+        error_report("zone mgmt operations are not allowed for conventional zones");
+        ret = -EIO;
+        goto out;
+    }
+
     switch (op) {
     case BLK_ZO_OPEN:
         op_name = "BLKOPENZONE";
@@ -3324,7 +3451,8 @@ static int coroutine_fn raw_co_zone_mgmt(BlockDriverState *bs, BlockZoneOp op,
         break;
     default:
         error_report("Unsupported zone op: 0x%x", op);
-        return -ENOTSUP;
+        ret = -ENOTSUP;
+        goto out;
     }
 
     acb = (RawPosixAIOData) {
@@ -3342,10 +3470,27 @@ static int coroutine_fn raw_co_zone_mgmt(BlockDriverState *bs, BlockZoneOp op,
                         len >> BDRV_SECTOR_BITS);
     ret = raw_thread_pool_submit(bs, handle_aiocb_zone_mgmt, &acb);
     if (ret != 0) {
+        update_zones_wp(s->fd, wps, offset, index);
         ret = -errno;
         error_report("ioctl %s failed %d", op_name, ret);
+        goto out;
     }
 
+    if (zo == BLKRESETZONE && len == capacity) {
+        for (int i = 0; i < bs->bl.nr_zones; ++i) {
+            if (!BDRV_ZT_IS_CONV(wps->wp[i])) {
+                wps->wp[i] = i * bs->bl.zone_size;
+            }
+        }
+    } else if (zo == BLKRESETZONE) {
+        wps->wp[index] = offset;
+    } else if (zo == BLKFINISHZONE) {
+        /* The zoned device allows the last zone smaller than the zone size. */
+        wps->wp[index] = offset + len;
+    }
+
+out:
+    qemu_co_mutex_unlock(&wps->colock);
     return ret;
 }
 #endif
diff --git a/include/block/block-common.h b/include/block/block-common.h
index 4025df380e..1abd0d5b65 100644
--- a/include/block/block-common.h
+++ b/include/block/block-common.h
@@ -92,6 +92,14 @@ typedef struct BlockZoneDescriptor {
     BlockZoneState state;
 } BlockZoneDescriptor;
 
+/*
+ * Track write pointers of a zone in bytes.
+ */
+typedef struct BlockZoneWps {
+    CoMutex colock;
+    uint64_t wp[];
+} BlockZoneWps;
+
 typedef struct BlockDriverInfo {
     /* in bytes, 0 if irrelevant */
     int cluster_size;
@@ -205,6 +213,12 @@ typedef enum {
 #define BDRV_SECTOR_BITS 9
 #define BDRV_SECTOR_SIZE (1ULL << BDRV_SECTOR_BITS)
 
+/*
+ * Test the most significant bit of wp. If it is zero, then
+ * the zone type is SWR.
+ */
+#define BDRV_ZT_IS_CONV(wp) ((wp) & (1ULL << 63))
+
 #define BDRV_REQUEST_MAX_SECTORS MIN_CONST(SIZE_MAX >> BDRV_SECTOR_BITS, \
                                            INT_MAX >> BDRV_SECTOR_BITS)
 #define BDRV_REQUEST_MAX_BYTES (BDRV_REQUEST_MAX_SECTORS << BDRV_SECTOR_BITS)
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index 2c057a9980..4effff3aa1 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -854,6 +854,9 @@ typedef struct BlockLimits {
 
     /* maximum number of active zones */
     int64_t max_active_zones;
+
+    /* array of write pointers' location of each zone in the zoned device. */
+    BlockZoneWps *wps;
 } BlockLimits;
 
 typedef struct BdrvOpBlocker BdrvOpBlocker;

From patchwork Thu Oct 27 15:52:13 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sam Li X-Patchwork-Id: 1695494 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=YrzKbVIv; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Myqw76t1sz20S2 for ; Fri, 28 Oct 2022 02:53:27 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oo5Bb-0000rY-59; Thu, 27 Oct 2022 11:52:51 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim
4.90_1) (envelope-from ) id 1oo5BY-0000ci-H9; Thu, 27 Oct 2022 11:52:48 -0400 Received: from mail-pg1-x52d.google.com ([2607:f8b0:4864:20::52d]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1oo5BU-0002Jg-Jp; Thu, 27 Oct 2022 11:52:46 -0400 Received: by mail-pg1-x52d.google.com with SMTP id s196so1875289pgs.3; Thu, 27 Oct 2022 08:52:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=DqMfDLVQb6KW3GUhA5mdqOBzr6JRgJt4lHM65uuaYlk=; b=YrzKbVIvGWaalx0+ihHkW+7VqSXElLiYOirzU748QD4QcVCRJQrncJwFjmfJb1OWU1 KVfai3H3e1ipkfXTPPAhFpfOZ/jmzTtoExn7ZDrQPa8BXLlPY7IZ8HX7dW/5Z8lyRKb7 h+hRUTDQqwryQQwMVYTMWLbUlXiLns5WggAuecUT7LfTSFHoKBhAsbt9jC6uEKptf2Yy E48S12r+vmdZwwF4vhlLS9zEAboENs3cLG653E4TF7Il44l3ysS/otpf3y9dockt6cQQ 1ZAR4d/iF0cYSmtU1JHJxf1G9V63KosMCGuKCq5dadQrUdd1a3g+3PFQ2XNCjOCjmIH2 AIYQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=DqMfDLVQb6KW3GUhA5mdqOBzr6JRgJt4lHM65uuaYlk=; b=Vf7Kkso//DqOEbTwPPTvnkOCi6Y+SDQcewa0YuV/iptRF9uZ2NiW0ns7RCnUMcHPug +aa0F0GdhWa3rGkdWvtNTKibYZGSxipb2Yke2YEP5WvMu26buPi87YvEjw81978Z/lIc SOVZF1gWkMwplzYwS9cYTaP1AYGkQy1DiB1qIDEODkUuMbefzRbluNxbzI32qMGC/VCm 1gioQPm2nlDsBw/rR+xBIGHJ6zfypJknX0lsiSvosk2LoN5VdhIeJM7Z5LJjQ7EtulfU Oj1inloGEx+QL/UE1r8+D1Fl26DUWvgYrTnlCqJFa5LCpYvpzhZs4N2QKBt2VIDWh2IR eaBg== X-Gm-Message-State: ACrzQf29d4mqn5g7+ShWTgDKuXE24S6R99GD2XPARmt3xXpGRDtllj2F 0E1GXpa+xRn2KE+Dbpw8nJBCIOQ3GJNvxUaq X-Google-Smtp-Source: AMsMyM59qM9S9NoyeT0tybl7uM7iOa4tWyDPtmz2eApmfbOz1y+EvSNuBrBCfxpKZe1R+IWBpqtA9A== X-Received: by 2002:a63:2bd4:0:b0:451:5df1:4b15 with SMTP id 
r203-20020a632bd4000000b004515df14b15mr43488411pgr.518.1666885962027; Thu, 27 Oct 2022 08:52:42 -0700 (PDT) Received: from roots.. ([112.44.202.248]) by smtp.gmail.com with ESMTPSA id f21-20020a623815000000b0056c058ab000sm1327744pfa.155.2022.10.27.08.52.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 27 Oct 2022 08:52:41 -0700 (PDT) From: Sam Li To: qemu-devel@nongnu.org Cc: damien.lemoal@opensource.wdc.com, Stefano Garzarella , Stefan Hajnoczi , dmitry.fomichev@wdc.com, qemu-block@nongnu.org, Julia Suvorova , hare@suse.de, Kevin Wolf , Hanna Reitz , Fam Zheng , Aarushi Mehta , Sam Li Subject: [PATCH v5 2/4] block: introduce zone append write for zoned devices Date: Thu, 27 Oct 2022 23:52:13 +0800 Message-Id: <20221027155215.21374-3-faithilikerun@gmail.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20221027155215.21374-1-faithilikerun@gmail.com> References: <20221027155215.21374-1-faithilikerun@gmail.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::52d; envelope-from=faithilikerun@gmail.com; helo=mail-pg1-x52d.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "Qemu-devel" Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org

A zone append command is a write operation that specifies the first
logical block of a zone as the write position. When writing to a zoned
block device using zone append, the data is written at the current write
pointer of that zone. Upon completion, the device responds with the
position at which the data was written in the zone.
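
For illustration only (not part of the patch): a minimal sketch of these append semantics on a toy in-memory device. The caller passes the zone start, the data lands at the tracked write pointer, and the landing position is reported back through the same variable. ToyZonedDev, toy_dev_init, toy_zone_append, ZONE_SIZE and NR_ZONES are hypothetical names and values chosen for the example, not QEMU code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ZONE_SIZE 4096u   /* illustrative zone size in bytes */
#define NR_ZONES  4u      /* illustrative number of zones */

/* Toy in-memory "device": backing data plus one write pointer per zone. */
typedef struct {
    uint8_t  data[ZONE_SIZE * NR_ZONES];
    uint64_t wp[NR_ZONES];          /* absolute byte offsets */
} ToyZonedDev;

static void toy_dev_init(ToyZonedDev *d)
{
    memset(d->data, 0, sizeof(d->data));
    for (unsigned i = 0; i < NR_ZONES; i++) {
        d->wp[i] = (uint64_t)i * ZONE_SIZE;  /* empty zone: wp at zone start */
    }
}

/*
 * Emulated append: on entry *offset is the start of the target zone; on
 * success it holds the position where the data landed.
 * Returns 0 on success, -1 on an invalid request.
 */
static int toy_zone_append(ToyZonedDev *d, uint64_t *offset,
                           const void *buf, uint64_t len)
{
    if (*offset % ZONE_SIZE || *offset / ZONE_SIZE >= NR_ZONES) {
        return -1;                   /* not the start of a valid zone */
    }
    unsigned idx = (unsigned)(*offset / ZONE_SIZE);
    uint64_t zone_end = *offset + ZONE_SIZE;
    if (len > zone_end - d->wp[idx]) {
        return -1;                   /* would cross the zone boundary */
    }
    memcpy(d->data + d->wp[idx], buf, len);  /* regular write at the wp */
    *offset = d->wp[idx];            /* report where the data landed */
    d->wp[idx] += len;               /* advance the tracked wp */
    return 0;
}
```

Two appends of 4 bytes each to the zone starting at ZONE_SIZE would report positions ZONE_SIZE and ZONE_SIZE + 4, even though both callers passed the same zone start.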

Signed-off-by: Sam Li
---
 block/block-backend.c             | 65 +++++++++++++++++++++++++++++++
 block/file-posix.c                | 60 +++++++++++++++++++++++++---
 block/io.c                        | 21 ++++++++++
 block/io_uring.c                  |  4 ++
 block/linux-aio.c                 |  3 ++
 block/raw-format.c                |  8 ++++
 include/block/block-io.h          |  3 ++
 include/block/block_int-common.h  |  5 +++
 include/block/raw-aio.h           |  4 +-
 include/sysemu/block-backend-io.h |  9 +++++
 10 files changed, 176 insertions(+), 6 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index 731f23e816..26cc8f9722 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1439,6 +1439,9 @@ typedef struct BlkRwCo {
         struct {
             unsigned long op;
         } zone_mgmt;
+        struct {
+            int64_t *offset;
+        } zone_append;
     };
 } BlkRwCo;
 
@@ -1871,6 +1874,47 @@ BlockAIOCB *blk_aio_zone_mgmt(BlockBackend *blk, BlockZoneOp op,
     return &acb->common;
 }
 
+static void coroutine_fn blk_aio_zone_append_entry(void *opaque)
+{
+    BlkAioEmAIOCB *acb = opaque;
+    BlkRwCo *rwco = &acb->rwco;
+
+    rwco->ret = blk_co_zone_append(rwco->blk, rwco->zone_append.offset,
+                                   rwco->iobuf, rwco->flags);
+    blk_aio_complete(acb);
+}
+
+BlockAIOCB *blk_aio_zone_append(BlockBackend *blk, int64_t *offset,
+                                QEMUIOVector *qiov, BdrvRequestFlags flags,
+                                BlockCompletionFunc *cb, void *opaque) {
+    BlkAioEmAIOCB *acb;
+    Coroutine *co;
+    IO_CODE();
+
+    blk_inc_in_flight(blk);
+    acb = blk_aio_get(&blk_aio_em_aiocb_info, blk, cb, opaque);
+    acb->rwco = (BlkRwCo) {
+        .blk = blk,
+        .ret = NOT_DONE,
+        .flags = flags,
+        .iobuf = qiov,
+        .zone_append = {
+            .offset = offset,
+        },
+    };
+    acb->has_returned = false;
+
+    co = qemu_coroutine_create(blk_aio_zone_append_entry, acb);
+    bdrv_coroutine_enter(blk_bs(blk), co);
+    acb->has_returned = true;
+    if (acb->rwco.ret != NOT_DONE) {
+        replay_bh_schedule_oneshot_event(blk_get_aio_context(blk),
+                                         blk_aio_complete_bh, acb);
+    }
+
+    return &acb->common;
+}
+
 /*
  * Send a zone_report command.
  * offset is a byte offset from the start of the device. No alignment
@@ -1922,6 +1966,27 @@ int coroutine_fn blk_co_zone_mgmt(BlockBackend *blk, BlockZoneOp op,
     return ret;
 }
 
+/*
+ * Send a zone_append command.
+ */
+int coroutine_fn blk_co_zone_append(BlockBackend *blk, int64_t *offset,
+        QEMUIOVector *qiov, BdrvRequestFlags flags)
+{
+    int ret;
+    IO_CODE();
+
+    blk_inc_in_flight(blk);
+    blk_wait_while_drained(blk);
+    if (!blk_is_available(blk)) {
+        blk_dec_in_flight(blk);
+        return -ENOMEDIUM;
+    }
+
+    ret = bdrv_co_zone_append(blk_bs(blk), offset, qiov, flags);
+    blk_dec_in_flight(blk);
+    return ret;
+}
+
 void blk_drain(BlockBackend *blk)
 {
     BlockDriverState *bs = blk_bs(blk);
diff --git a/block/file-posix.c b/block/file-posix.c
index fbab23f450..9c1afb7749 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -159,6 +159,7 @@ typedef struct BDRVRawState {
     bool has_write_zeroes:1;
     bool use_linux_aio:1;
     bool use_linux_io_uring:1;
+    int64_t *offset; /* offset of zone append operation */
     int page_cache_inconsistent; /* errno from fdatasync failure */
     bool has_fallocate;
     bool needs_alignment;
@@ -1484,6 +1485,11 @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
             bs->bl.max_active_zones = ret;
         }
 
+        ret = get_sysfs_long_val(&st, "physical_block_size");
+        if (ret >= 0) {
+            bs->bl.write_granularity = ret;
+        }
+
         bs->bl.wps = g_malloc(sizeof(BlockZoneWps) + sizeof(int64_t) * ret);
         ret = get_zones_wp(s->fd, bs->bl.wps, 0, bs->bl.nr_zones);
         if (ret < 0) {
@@ -1664,7 +1670,7 @@ static ssize_t handle_aiocb_rw_vector(RawPosixAIOData *aiocb)
     ssize_t len;
 
     do {
-        if (aiocb->aio_type & QEMU_AIO_WRITE)
+        if (aiocb->aio_type & (QEMU_AIO_WRITE | QEMU_AIO_ZONE_APPEND))
             len = qemu_pwritev(aiocb->aio_fildes,
                                aiocb->io.iov,
                                aiocb->io.niov,
@@ -1694,7 +1700,7 @@ static ssize_t handle_aiocb_rw_linear(RawPosixAIOData *aiocb, char *buf)
     ssize_t len;
 
     while (offset < aiocb->aio_nbytes) {
-        if (aiocb->aio_type & QEMU_AIO_WRITE) {
+        if (aiocb->aio_type & (QEMU_AIO_WRITE | QEMU_AIO_ZONE_APPEND)) {
             len = pwrite(aiocb->aio_fildes,
                          (const char *)buf + offset,
                          aiocb->aio_nbytes - offset,
@@ -1787,7 +1793,7 @@ static int handle_aiocb_rw(void *opaque)
     }
 
     nbytes = handle_aiocb_rw_linear(aiocb, buf);
-    if (!(aiocb->aio_type & QEMU_AIO_WRITE)) {
+    if (!(aiocb->aio_type & (QEMU_AIO_WRITE | QEMU_AIO_ZONE_APPEND))) {
         char *p = buf;
         size_t count = aiocb->aio_nbytes, copy;
         int i;
@@ -2426,6 +2432,10 @@ static int coroutine_fn raw_co_prw(BlockDriverState *bs, uint64_t offset,
 #if defined(CONFIG_BLKZONED)
     if (bs->bl.wps) {
         qemu_co_mutex_lock(&bs->bl.wps->colock);
+        if (type & QEMU_AIO_ZONE_APPEND && bs->bl.zone_size) {
+            int index = offset / bs->bl.zone_size;
+            offset = bs->bl.wps->wp[index];
+        }
     }
 #endif
 
@@ -2473,9 +2483,13 @@ out:
 #if defined(CONFIG_BLKZONED)
     BlockZoneWps *wps = bs->bl.wps;
     if (ret == 0) {
-        if (type & QEMU_AIO_WRITE && wps && bs->bl.zone_size) {
+        if ((type & (QEMU_AIO_WRITE | QEMU_AIO_ZONE_APPEND))
+            && wps && bs->bl.zone_size) {
             int index = offset / bs->bl.zone_size;
             if (!BDRV_ZT_IS_CONV(wps->wp[index])) {
+                if (type & QEMU_AIO_ZONE_APPEND) {
+                    *s->offset = wps->wp[index];
+                }
                 /* Advance the wp if needed */
                 if (offset + bytes > wps->wp[index]) {
                     wps->wp[index] = offset + bytes;
@@ -2483,7 +2497,7 @@ out:
             }
         }
     } else {
-        if (type & QEMU_AIO_WRITE) {
+        if (type & (QEMU_AIO_WRITE | QEMU_AIO_ZONE_APPEND)) {
             update_zones_wp(s->fd, bs->bl.wps, 0, 1);
         }
     }
@@ -3495,6 +3509,41 @@ out:
 }
 #endif
 
+#if defined(CONFIG_BLKZONED)
+static int coroutine_fn raw_co_zone_append(BlockDriverState *bs,
+                                           int64_t *offset,
+                                           QEMUIOVector *qiov,
+                                           BdrvRequestFlags flags) {
+    assert(flags == 0);
+    int64_t zone_size_mask = bs->bl.zone_size - 1;
+    int64_t iov_len = 0;
+    int64_t len = 0;
+    BDRVRawState *s = bs->opaque;
+    s->offset = offset;
+
+
+    if (*offset & zone_size_mask) {
+        error_report("sector offset %" PRId64 " is not aligned to zone size "
+                     "%" PRId32 "", *offset / 512, bs->bl.zone_size / 512);
+        return -EINVAL;
+    }
+
+    int64_t wg = bs->bl.write_granularity;
+    int64_t wg_mask = wg - 1;
+    for (int i = 0; i < qiov->niov; i++) {
+        iov_len = qiov->iov[i].iov_len;
+        if (iov_len & wg_mask) {
+            error_report("len of IOVector[%d] %" PRId64 " is not aligned to "
+                         "block size %" PRId64 "", i, iov_len, wg);
+            return -EINVAL;
+        }
+        len += iov_len;
+    }
+
+    return raw_co_prw(bs, *offset, len, qiov, QEMU_AIO_ZONE_APPEND);
+}
+#endif
+
 static coroutine_fn int
 raw_do_pdiscard(BlockDriverState *bs, int64_t offset, int64_t bytes,
                 bool blkdev)
@@ -4270,6 +4319,7 @@ static BlockDriver bdrv_zoned_host_device = {
     /* zone management operations */
     .bdrv_co_zone_report = raw_co_zone_report,
     .bdrv_co_zone_mgmt = raw_co_zone_mgmt,
+    .bdrv_co_zone_append = raw_co_zone_append,
 };
 #endif
 
diff --git a/block/io.c b/block/io.c
index 88f707ea4d..03e1109056 100644
--- a/block/io.c
+++ b/block/io.c
@@ -3230,6 +3230,27 @@ out:
     return co.ret;
 }
 
+int coroutine_fn bdrv_co_zone_append(BlockDriverState *bs, int64_t *offset,
+                        QEMUIOVector *qiov,
+                        BdrvRequestFlags flags)
+{
+    BlockDriver *drv = bs->drv;
+    CoroutineIOCompletion co = {
+            .coroutine = qemu_coroutine_self(),
+    };
+    IO_CODE();
+
+    bdrv_inc_in_flight(bs);
+    if (!drv || !drv->bdrv_co_zone_append) {
+        co.ret = -ENOTSUP;
+        goto out;
+    }
+    co.ret = drv->bdrv_co_zone_append(bs, offset, qiov, flags);
+out:
+    bdrv_dec_in_flight(bs);
+    return co.ret;
+}
+
 void *qemu_blockalign(BlockDriverState *bs, size_t size)
 {
     IO_CODE();
diff --git a/block/io_uring.c b/block/io_uring.c
index 973e15d876..f7488c241a 100644
--- a/block/io_uring.c
+++ b/block/io_uring.c
@@ -345,6 +345,10 @@ static int luring_do_submit(int fd, LuringAIOCB *luringcb, LuringState *s,
         io_uring_prep_writev(sqes, fd, luringcb->qiov->iov,
                              luringcb->qiov->niov, offset);
         break;
+    case QEMU_AIO_ZONE_APPEND:
+        io_uring_prep_writev(sqes, fd, luringcb->qiov->iov,
+                             luringcb->qiov->niov, offset);
+        break;
     case QEMU_AIO_READ:
         io_uring_prep_readv(sqes, fd, luringcb->qiov->iov,
                             luringcb->qiov->niov, offset);
diff --git a/block/linux-aio.c b/block/linux-aio.c
index d2cfb7f523..1959834156 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -389,6 +389,9 @@ static int laio_do_submit(int fd, struct qemu_laiocb *laiocb, off_t offset,
     case QEMU_AIO_WRITE:
         io_prep_pwritev(iocbs, fd, qiov->iov, qiov->niov, offset);
         break;
+    case QEMU_AIO_ZONE_APPEND:
+        io_prep_pwritev(iocbs, fd, qiov->iov, qiov->niov, offset);
+        break;
     case QEMU_AIO_READ:
         io_prep_preadv(iocbs, fd, qiov->iov, qiov->niov, offset);
         break;
diff --git a/block/raw-format.c b/block/raw-format.c
index 18dc52a150..33bff8516e 100644
--- a/block/raw-format.c
+++ b/block/raw-format.c
@@ -325,6 +325,13 @@ static int coroutine_fn raw_co_zone_mgmt(BlockDriverState *bs, BlockZoneOp op,
     return bdrv_co_zone_mgmt(bs->file->bs, op, offset, len);
 }
 
+static int coroutine_fn raw_co_zone_append(BlockDriverState *bs,
+                                           int64_t *offset,
+                                           QEMUIOVector *qiov,
+                                           BdrvRequestFlags flags) {
+    return bdrv_co_zone_append(bs->file->bs, offset, qiov, flags);
+}
+
 static int64_t raw_getlength(BlockDriverState *bs)
 {
     int64_t len;
@@ -629,6 +636,7 @@ BlockDriver bdrv_raw = {
     .bdrv_co_pdiscard = &raw_co_pdiscard,
     .bdrv_co_zone_report = &raw_co_zone_report,
     .bdrv_co_zone_mgmt = &raw_co_zone_mgmt,
+    .bdrv_co_zone_append = &raw_co_zone_append,
     .bdrv_co_block_status = &raw_co_block_status,
     .bdrv_co_copy_range_from = &raw_co_copy_range_from,
     .bdrv_co_copy_range_to = &raw_co_copy_range_to,
diff --git a/include/block/block-io.h b/include/block/block-io.h
index f0cdf67d33..6a54453578 100644
--- a/include/block/block-io.h
+++ b/include/block/block-io.h
@@ -94,6 +94,9 @@ int coroutine_fn bdrv_co_zone_report(BlockDriverState *bs, int64_t offset,
                                      BlockZoneDescriptor *zones);
 int coroutine_fn bdrv_co_zone_mgmt(BlockDriverState *bs, BlockZoneOp op,
                                    int64_t offset, int64_t len);
+int coroutine_fn bdrv_co_zone_append(BlockDriverState *bs, int64_t *offset,
+                                     QEMUIOVector *qiov,
+                                     BdrvRequestFlags flags);
 
 int bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int64_t bytes);
 bool bdrv_can_write_zeroes_with_unmap(BlockDriverState *bs);
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index 4effff3aa1..43ae78171e 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -701,6 +701,9 @@ struct BlockDriver {
             BlockZoneDescriptor *zones);
     int coroutine_fn (*bdrv_co_zone_mgmt)(BlockDriverState *bs, BlockZoneOp op,
             int64_t offset, int64_t len);
+    int coroutine_fn (*bdrv_co_zone_append)(BlockDriverState *bs,
+            int64_t *offset, QEMUIOVector *qiov,
+            BdrvRequestFlags flags);
 
     /* removable device specific */
     bool (*bdrv_is_inserted)(BlockDriverState *bs);
@@ -857,6 +860,8 @@ typedef struct BlockLimits {
 
     /* array of write pointers' location of each zone in the zoned device. */
     BlockZoneWps *wps;
+
+    int64_t write_granularity;
 } BlockLimits;
 
 typedef struct BdrvOpBlocker BdrvOpBlocker;
diff --git a/include/block/raw-aio.h b/include/block/raw-aio.h
index 877b2240b3..53033a5ca7 100644
--- a/include/block/raw-aio.h
+++ b/include/block/raw-aio.h
@@ -31,6 +31,7 @@
 #define QEMU_AIO_TRUNCATE 0x0080
 #define QEMU_AIO_ZONE_REPORT 0x0100
 #define QEMU_AIO_ZONE_MGMT 0x0200
+#define QEMU_AIO_ZONE_APPEND 0x0400
 #define QEMU_AIO_TYPE_MASK \
         (QEMU_AIO_READ | \
          QEMU_AIO_WRITE | \
@@ -41,7 +42,8 @@
          QEMU_AIO_COPY_RANGE | \
          QEMU_AIO_TRUNCATE | \
          QEMU_AIO_ZONE_REPORT | \
-         QEMU_AIO_ZONE_MGMT)
+         QEMU_AIO_ZONE_MGMT | \
+         QEMU_AIO_ZONE_APPEND)
 
 /* AIO flags */
 #define QEMU_AIO_MISALIGNED 0x1000
diff --git a/include/sysemu/block-backend-io.h b/include/sysemu/block-backend-io.h
index 1b5fc7db6b..ff9f777f52 100644
--- a/include/sysemu/block-backend-io.h
+++ b/include/sysemu/block-backend-io.h
@@ -52,6 +52,9 @@ BlockAIOCB *blk_aio_zone_report(BlockBackend *blk, int64_t offset,
 BlockAIOCB *blk_aio_zone_mgmt(BlockBackend *blk, BlockZoneOp op,
                               int64_t offset, int64_t len,
                               BlockCompletionFunc *cb, void *opaque);
+BlockAIOCB *blk_aio_zone_append(BlockBackend *blk, int64_t *offset,
+                                QEMUIOVector *qiov, BdrvRequestFlags flags,
+                                BlockCompletionFunc *cb, void *opaque);
 BlockAIOCB *blk_aio_pdiscard(BlockBackend *blk, int64_t offset, int64_t bytes,
                              BlockCompletionFunc *cb, void *opaque);
 void blk_aio_cancel_async(BlockAIOCB *acb);
@@ -173,6 +176,12 @@ int coroutine_fn blk_co_zone_mgmt(BlockBackend *blk, BlockZoneOp op,
                                   int64_t offset, int64_t len);
 int generated_co_wrapper blk_zone_mgmt(BlockBackend *blk, BlockZoneOp op,
                                        int64_t offset, int64_t len);
+int coroutine_fn blk_co_zone_append(BlockBackend *blk, int64_t *offset,
+                                    QEMUIOVector *qiov,
+                                    BdrvRequestFlags flags);
+int generated_co_wrapper blk_zone_append(BlockBackend *blk, int64_t *offset,
+                                         QEMUIOVector *qiov,
+                                         BdrvRequestFlags flags);
 
 int generated_co_wrapper blk_pdiscard(BlockBackend *blk, int64_t offset,
                                       int64_t bytes);

From patchwork Thu Oct 27 15:52:14 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sam Li X-Patchwork-Id: 1695496 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=b6hZAED4; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Myqwf0Jfqz20S2 for ; Fri, 28 Oct 2022 02:53:54 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oo5Bk-0001rK-P9; Thu, 27 Oct 2022 11:53:00 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps
From: Sam Li
To: qemu-devel@nongnu.org
Cc: damien.lemoal@opensource.wdc.com, Stefano Garzarella, Stefan Hajnoczi,
 dmitry.fomichev@wdc.com, qemu-block@nongnu.org, Julia Suvorova, hare@suse.de,
 Kevin Wolf, Hanna Reitz, Fam Zheng, Aarushi Mehta, Sam Li
Subject: [PATCH v5 3/4] qemu-iotests: test zone append operation
Date: Thu, 27 Oct 2022 23:52:14 +0800
Message-Id: <20221027155215.21374-4-faithilikerun@gmail.com>
X-Mailer: git-send-email 2.38.1
In-Reply-To: <20221027155215.21374-1-faithilikerun@gmail.com>
References: <20221027155215.21374-1-faithilikerun@gmail.com>

This test is mainly a helper to indicate that append writes in the block
layer behave as expected.

Signed-off-by: Sam Li
---
 qemu-io-cmds.c                     | 63 ++++++++++++++++++++++++++++++
 tests/qemu-iotests/tests/zoned.out |  7 ++++
 tests/qemu-iotests/tests/zoned.sh  |  9 +++++
 3 files changed, 79 insertions(+)

diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c
index 3a3bad77c3..abf433f0ad 100644
--- a/qemu-io-cmds.c
+++ b/qemu-io-cmds.c
@@ -1856,6 +1856,68 @@ static const cmdinfo_t zone_reset_cmd = {
     .oneline = "reset a zone write pointer in zone block device",
 };
 
+static int do_aio_zone_append(BlockBackend *blk, QEMUIOVector *qiov,
+                              int64_t *offset, int flags, int *total)
+{
+    int async_ret = NOT_DONE;
+
+    blk_aio_zone_append(blk, offset, qiov, flags, aio_rw_done, &async_ret);
+    while (async_ret == NOT_DONE) {
+        main_loop_wait(false);
+    }
+
+    *total = qiov->size;
+    return async_ret < 0 ? async_ret : 1;
+}
+
+static int zone_append_f(BlockBackend *blk, int argc, char **argv)
+{
+    int ret;
+    int flags = 0;
+    int total = 0;
+    int64_t offset;
+    char *buf;
+    int nr_iov;
+    int pattern = 0xcd;
+    QEMUIOVector qiov;
+
+    if (optind > argc - 2) {
+        return -EINVAL;
+    }
+    optind++;
+    offset = cvtnum(argv[optind]);
+    if (offset < 0) {
+        print_cvtnum_err(offset, argv[optind]);
+        return offset;
+    }
+    optind++;
+    nr_iov = argc - optind;
+    buf = create_iovec(blk, &qiov, &argv[optind], nr_iov, pattern);
+    if (buf == NULL) {
+        return -EINVAL;
+    }
+    ret = do_aio_zone_append(blk, &qiov, &offset, flags, &total);
+    if (ret < 0) {
+        printf("zone append failed: %s\n", strerror(-ret));
+        goto out;
+    }
+
+ out:
+    qemu_iovec_destroy(&qiov);
+    qemu_io_free(buf);
+    return ret;
+}
+
+static const cmdinfo_t zone_append_cmd = {
+    .name = "zone_append",
+    .altname = "zap",
+    .cfunc = zone_append_f,
+    .argmin = 3,
+    .argmax = 3,
+    .args = "offset len [len..]",
+    .oneline = "append write a number of bytes at a specified offset",
+};
+
 static int truncate_f(BlockBackend *blk, int argc, char **argv);
 static const cmdinfo_t truncate_cmd = {
     .name = "truncate",
@@ -2653,6 +2715,7 @@ static void
__attribute((constructor)) init_qemuio_commands(void)
     qemuio_add_command(&zone_close_cmd);
     qemuio_add_command(&zone_finish_cmd);
     qemuio_add_command(&zone_reset_cmd);
+    qemuio_add_command(&zone_append_cmd);
     qemuio_add_command(&truncate_cmd);
     qemuio_add_command(&length_cmd);
     qemuio_add_command(&info_cmd);
diff --git a/tests/qemu-iotests/tests/zoned.out b/tests/qemu-iotests/tests/zoned.out
index 0c8f96deb9..b3b139b4ec 100644
--- a/tests/qemu-iotests/tests/zoned.out
+++ b/tests/qemu-iotests/tests/zoned.out
@@ -50,4 +50,11 @@ start: 0x80000, len 0x80000, cap 0x80000, wptr 0x100000, zcond:14, [type: 2]
 (5) resetting the second zone
 After resetting a zone:
 start: 0x80000, len 0x80000, cap 0x80000, wptr 0x80000, zcond:1, [type: 2]
+
+
+(6) append write
+After appending the first zone:
+start: 0x0, len 0x80000, cap 0x80000, wptr 0x18, zcond:2, [type: 2]
+After appending the second zone:
+start: 0x80000, len 0x80000, cap 0x80000, wptr 0x80018, zcond:2, [type: 2]
 *** done
diff --git a/tests/qemu-iotests/tests/zoned.sh b/tests/qemu-iotests/tests/zoned.sh
index fced0194c5..888711eef2 100755
--- a/tests/qemu-iotests/tests/zoned.sh
+++ b/tests/qemu-iotests/tests/zoned.sh
@@ -79,6 +79,15 @@ echo "(5) resetting the second zone"
 sudo $QEMU_IO $IMG -c "zrs 268435456 268435456"
 echo "After resetting a zone:"
 sudo $QEMU_IO $IMG -c "zrp 268435456 1"
+echo
+echo
+echo "(6) append write" # physical block size of the device is 4096
+sudo $QEMU_IO $IMG -c "zap 0 0x1000 0x2000"
+echo "After appending the first zone:"
+sudo $QEMU_IO $IMG -c "zrp 0 1"
+sudo $QEMU_IO $IMG -c "zap 268435456 0x1000 0x2000"
+echo "After appending the second zone:"
+sudo $QEMU_IO $IMG -c "zrp 268435456 1"
 
 # success, all done
 echo "*** done"

From patchwork Thu Oct 27 15:52:15 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sam Li
X-Patchwork-Id: 1695498
From: Sam Li
To: qemu-devel@nongnu.org
Cc: damien.lemoal@opensource.wdc.com, Stefano Garzarella, Stefan Hajnoczi,
 dmitry.fomichev@wdc.com, qemu-block@nongnu.org, Julia Suvorova, hare@suse.de,
 Kevin Wolf, Hanna Reitz, Fam Zheng, Aarushi Mehta, Sam Li
Subject: [PATCH v5 4/4] block: add some trace events for zone append
Date: Thu, 27 Oct 2022 23:52:15 +0800
Message-Id: <20221027155215.21374-5-faithilikerun@gmail.com>
X-Mailer: git-send-email 2.38.1
In-Reply-To: <20221027155215.21374-1-faithilikerun@gmail.com>
References: <20221027155215.21374-1-faithilikerun@gmail.com>

Signed-off-by: Sam Li
---
 block/file-posix.c | 3 +++
 block/trace-events | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/block/file-posix.c b/block/file-posix.c
index 9c1afb7749..b23cfb02e3 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -2489,6 +2489,8 @@ out:
         if (!BDRV_ZT_IS_CONV(wps->wp[index])) {
             if (type & QEMU_AIO_ZONE_APPEND) {
                 *s->offset = wps->wp[index];
+                trace_zbd_zone_append_complete(bs, *s->offset
+                    >> BDRV_SECTOR_BITS);
             }
             /* Advance the wp if
needed */
             if (offset + bytes > wps->wp[index]) {
@@ -3540,6 +3542,7 @@ static int coroutine_fn raw_co_zone_append(BlockDriverState *bs,
         len += iov_len;
     }
 
+    trace_zbd_zone_append(bs, *offset >> BDRV_SECTOR_BITS);
     return raw_co_prw(bs, *offset, len, qiov, QEMU_AIO_ZONE_APPEND);
 }
 #endif
diff --git a/block/trace-events b/block/trace-events
index 3f4e1d088a..32665158d6 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -211,6 +211,8 @@ file_hdev_is_sg(int type, int version) "SG device found: type=%d, version=%d"
 file_flush_fdatasync_failed(int err) "errno %d"
 zbd_zone_report(void *bs, unsigned int nr_zones, int64_t sector) "bs %p report %d zones starting at sector offset 0x%" PRIx64 ""
 zbd_zone_mgmt(void *bs, const char *op_name, int64_t sector, int64_t len) "bs %p %s starts at sector offset 0x%" PRIx64 " over a range of 0x%" PRIx64 " sectors"
+zbd_zone_append(void *bs, int64_t sector) "bs %p append at sector offset 0x%" PRIx64 ""
+zbd_zone_append_complete(void *bs, int64_t sector) "bs %p returns append sector 0x%" PRIx64 ""
 
 # ssh.c
 sftp_error(const char *op, const char *ssh_err, int ssh_err_code, int sftp_err_code) "%s failed: %s (libssh error code: %d, sftp error code: %d)"
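
A note on the sector arithmetic used by the trace events and by the expected values in zoned.out (wptr 0x18 and wptr 0x80018): the sketch below is not part of the series. It assumes 512-byte sectors (a shift of 9, which is what BDRV_SECTOR_BITS is in QEMU) and that each zone is empty before the append, as in the test. The helper name wp_sector_after_append is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 512-byte sectors, mirroring BDRV_SECTOR_BITS == 9. */
#define SECTOR_BITS 9

/*
 * Hypothetical helper, for illustration only: sector address of the write
 * pointer after appending n buffers of the given byte lengths to an empty
 * zone that starts at byte offset zone_start.
 */
static int64_t wp_sector_after_append(int64_t zone_start,
                                      const int64_t *lens, size_t n)
{
    int64_t wp = zone_start;

    for (size_t i = 0; i < n; i++) {
        assert(lens[i] >= 0);   /* a negative length makes no sense */
        wp += lens[i];          /* the append lands at the old wp */
    }
    return wp >> SECTOR_BITS;   /* bytes -> 512-byte sectors */
}
```

With the test's buffers of 0x1000 and 0x2000 bytes, 0x3000 bytes are appended, which is 0x18 sectors. For the second zone, byte offset 268435456 is sector 0x80000, so the write pointer ends up at 0x80018.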