{"id":2225797,"url":"http://patchwork.ozlabs.org/api/1.1/patches/2225797/?format=json","web_url":"http://patchwork.ozlabs.org/project/qemu-devel/patch/20260421155628.3600671-3-den@openvz.org/","project":{"id":14,"url":"http://patchwork.ozlabs.org/api/1.1/projects/14/?format=json","name":"QEMU Development","link_name":"qemu-devel","list_id":"qemu-devel.nongnu.org","list_email":"qemu-devel@nongnu.org","web_url":"","scm_url":"","webscm_url":""},"msgid":"<20260421155628.3600671-3-den@openvz.org>","date":"2026-04-21T15:56:28","name":"[2/2] iotests: regression test for discard/write-zeroes vs in-flight write race","commit_ref":null,"pull_url":null,"state":"new","archived":false,"hash":"da478c68bc1640b21735ef4dacacfca1e6aecce4","submitter":{"id":71296,"url":"http://patchwork.ozlabs.org/api/1.1/people/71296/?format=json","name":"Denis V. Lunev\" via qemu development","email":"qemu-devel@nongnu.org"},"delegate":null,"mbox":"http://patchwork.ozlabs.org/project/qemu-devel/patch/20260421155628.3600671-3-den@openvz.org/mbox/","series":[{"id":500841,"url":"http://patchwork.ozlabs.org/api/1.1/series/500841/?format=json","web_url":"http://patchwork.ozlabs.org/project/qemu-devel/list/?series=500841","date":"2026-04-21T15:56:27","name":"block/io: fix reproducible silent data corruption in write-vs-discard race","version":1,"mbox":"http://patchwork.ozlabs.org/series/500841/mbox/"}],"comments":"http://patchwork.ozlabs.org/api/patches/2225797/comments/","check":"pending","checks":"http://patchwork.ozlabs.org/api/patches/2225797/checks/","tags":{},"headers":{"Return-Path":"<qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org>","X-Original-To":"incoming@patchwork.ozlabs.org","Delivered-To":"patchwork-incoming@legolas.ozlabs.org","Authentication-Results":["legolas.ozlabs.org;\n\tdkim=fail reason=\"signature verification failed\" (2048-bit key;\n secure) header.d=virtuozzo.com header.i=@virtuozzo.com header.a=rsa-sha256\n header.s=relay header.b=x0O+XMH3;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org\n (client-ip=209.51.188.17; helo=lists1p.gnu.org;\n envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org;\n receiver=patchwork.ozlabs.org)"],"Received":["from lists1p.gnu.org (lists1p.gnu.org [209.51.188.17])\n\t(using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4g0Rns0kNvz1yCv\n\tfor <incoming@patchwork.ozlabs.org>; Wed, 22 Apr 2026 01:57:35 +1000 (AEST)","from localhost ([::1] helo=lists1p.gnu.org)\n\tby lists1p.gnu.org with esmtp (Exim 4.90_1)\n\t(envelope-from <qemu-devel-bounces@nongnu.org>)\n\tid 1wFDSo-0002ft-HC; Tue, 21 Apr 2026 11:56:38 -0400","from eggs.gnu.org ([2001:470:142:3::10])\n by lists1p.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)\n (Exim 4.90_1) (envelope-from <den@openvz.org>)\n id 1wFDSl-0002ei-GP; Tue, 21 Apr 2026 11:56:35 -0400","from relay.virtuozzo.com ([130.117.225.111])\n by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)\n (Exim 4.90_1) (envelope-from <den@openvz.org>)\n id 1wFDSj-0003JA-Fl; Tue, 21 Apr 2026 11:56:34 -0400","from ch-demo-asa.virtuozzo.com ([130.117.225.8] helo=iris.sw.ru)\n by relay.virtuozzo.com with esmtp (Exim 4.96)\n (envelope-from <den@openvz.org>) id 1wFDQ2-001k3k-35;\n Tue, 21 Apr 2026 17:56:20 +0200"],"DKIM-Signature":"v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed;\n d=virtuozzo.com; s=relay; h=MIME-Version:Message-ID:Date:Subject:From:\n Content-Type; bh=5oZ5ryl95dTV+hIbE37CXIJ9NkwjnJnZE/utqtG/ovM=; b=x0O+XMH38Aiy\n Y0ODmDeTiif5+9CCJfkkbSqDHaHQVPszRK4ydyzJEgplNtg2U0w4jWklEcWCJv8L5lP9iW/slclhz\n X0T0rlRRSFxXo35cI9Elwaz2pUZbKPFNZxIQVMtJuiBC4QpzBuw2Oka3u1+Q98zveCqa+ixLpIw0a\n Voewzd+eBwL0UHOxAHbVl7ahBjMgYMsK6sgCqRdKF2RpRoH/xW69bIg7xTzDw5UORI74CGrmsnLEK\n GHhiAioX9CFEhEdsyGZEN+LRt8bDmd0ZOyCdKjnppYwQ8Xbmg5FwG+vlXJ7uTvnMrDq5+jvzqNBFF\n Sf+WuSvdHXweXDi6/Zs7eQ==;","To":"qemu-devel@nongnu.org,\n\tqemu-block@nongnu.org,\n\tqemu-stable@nongnu.org","Cc":"den@openvz.org, Stefan Hajnoczi <stefanha@redhat.com>,\n Kevin Wolf <kwolf@redhat.com>, Hanna Reitz <hreitz@redhat.com>","Subject":"[PATCH 2/2] iotests: regression test for discard/write-zeroes vs\n in-flight write race","Date":"Tue, 21 Apr 2026 17:56:28 +0200","Message-ID":"<20260421155628.3600671-3-den@openvz.org>","X-Mailer":"git-send-email 2.51.0","In-Reply-To":"<20260421155628.3600671-1-den@openvz.org>","References":"<20260421155628.3600671-1-den@openvz.org>","MIME-Version":"1.0","Content-Transfer-Encoding":"8bit","Received-SPF":"softfail client-ip=130.117.225.111;\n envelope-from=den@openvz.org;\n helo=relay.virtuozzo.com","X-Spam_score_int":"-34","X-Spam_score":"-3.5","X-Spam_bar":"---","X-Spam_report":"(-3.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1,\n DKIM_VALID=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001,\n SPF_SOFTFAIL=0.665 autolearn=ham autolearn_force=no","X-Spam_action":"no action","X-BeenThere":"qemu-devel@nongnu.org","X-Mailman-Version":"2.1.29","Precedence":"list","List-Id":"qemu development <qemu-devel.nongnu.org>","List-Unsubscribe":"<https://lists.nongnu.org/mailman/options/qemu-devel>,\n <mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>","List-Archive":"<https://lists.nongnu.org/archive/html/qemu-devel>","List-Post":"<mailto:qemu-devel@nongnu.org>","List-Help":"<mailto:qemu-devel-request@nongnu.org?subject=help>","List-Subscribe":"<https://lists.nongnu.org/mailman/listinfo/qemu-devel>,\n <mailto:qemu-devel-request@nongnu.org?subject=subscribe>","Reply-to":"\"Denis V. Lunev\" <den@openvz.org>","From":"\"Denis V. Lunev\" via qemu development <qemu-devel@nongnu.org>","Errors-To":"qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org","Sender":"qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org"},"content":"Add tests/qemu-iotests/tests/discard-write-serialisation, a deterministic\nregression test for the race fixed in the previous commit.\n\nDrive a single qemu-io process with a fixed-seed interleaved sequence of\nasync aio_write and aio_write -z -u commands at random cluster-aligned\noffsets in a small contention region, then run qemu-img check and assert\nzero corruptions.  On an unpatched tree the same workload reproduces the\nrefcount-aliasing fingerprint deterministically; on the fixed tree the\nimage comes back clean.\n\nThe test is scoped to qcow2 because qcow2 is the format whose qemu-img\ncheck validates refcount/reference consistency and therefore actually\ndetects the fingerprint.  The underlying race is in the generic block\nlayer and not format-specific, but a test that asserts \"qemu-img check\nreturns zero corruptions\" only has signal on formats that run such a\ncheck.\n\nSigned-off-by: Denis V. Lunev <den@openvz.org>\nCc: Stefan Hajnoczi <stefanha@redhat.com>\nCc: Kevin Wolf <kwolf@redhat.com>\nCc: Hanna Reitz <hreitz@redhat.com>\n---\n .../tests/discard-write-serialisation         | 97 +++++++++++++++++++\n .../tests/discard-write-serialisation.out     |  1 +\n 2 files changed, 98 insertions(+)\n create mode 100755 tests/qemu-iotests/tests/discard-write-serialisation\n create mode 100644 tests/qemu-iotests/tests/discard-write-serialisation.out","diff":"diff --git a/tests/qemu-iotests/tests/discard-write-serialisation b/tests/qemu-iotests/tests/discard-write-serialisation\nnew file mode 100755\nindex 0000000000..45a3f7f043\n--- /dev/null\n+++ b/tests/qemu-iotests/tests/discard-write-serialisation\n@@ -0,0 +1,97 @@\n+#!/usr/bin/env python3\n+# group: rw quick auto\n+#\n+# Regression test for the block-layer race fixed in\n+# block/io: serialise discard and write-zeroes against in-flight writes.\n+#\n+# A format driver's write path may drop its internal mutex around the\n+# data I/O of an allocating write (qcow2 does so between\n+# qcow2_alloc_host_offset and qcow2_alloc_cluster_link_l2).  A\n+# concurrent discard or MAY_UNMAP write-zeroes on the same guest range,\n+# running in that window, can clear the L2 entry and drop the cluster's\n+# refcount to zero; the writer's subsequent link then binds the L2\n+# entry to a freed cluster.  qemu-img check reports this as refcount=0\n+# with a live OFLAG_COPIED reference, or refcount < reference when the\n+# allocator re-hands the cluster out.\n+#\n+# The bug is in the generic block layer, not format-specific; qcow2 is\n+# the detection vehicle because its refcount validation in qemu-img\n+# check catches the fingerprint.  The test drives a single qemu-io\n+# process with interleaved async aio_write and aio_write -z -u commands\n+# at random cluster-aligned offsets in a small contention region, then\n+# runs qemu-img check and asserts zero corruptions.  On an unpatched\n+# tree the same workload reproduces the fingerprint deterministically\n+# (seed is fixed).\n+#\n+# SPDX-License-Identifier: GPL-2.0-or-later\n+\n+import random\n+import subprocess\n+\n+import iotests\n+from iotests import qemu_img_create, qemu_img_check, qemu_io_wrap_args\n+\n+\n+iotests.script_initialize(supported_fmts=['qcow2'],\n+                          supported_platforms=['linux'])\n+\n+IMG_SIZE = 256 * 1024 * 1024          # 256 MiB\n+REGION = 64 * 1024 * 1024             # contention region: 64 MiB\n+CLUSTER = 1024 * 1024                 # 1 MiB\n+SUBCLUSTER = 32 * 1024                # 32 KiB\n+OPS = 5000\n+SEED = 7\n+\n+def build_commands() -> bytes:\n+    rng = random.Random(SEED)\n+    max_cluster = REGION // CLUSTER - 1\n+    lines = []\n+    for _ in range(OPS):\n+        cl = rng.randint(0, max_cluster)\n+        off = cl * CLUSTER\n+        if rng.random() < 0.5:\n+            # Small sub-cluster write at an unaligned position inside\n+            # the cluster -- exercises the handle_copied path and the\n+            # s->lock drop around the data I/O.\n+            sub = rng.randrange(0, CLUSTER, SUBCLUSTER)\n+            lines.append(f'aio_write -q {off + sub} 32k')\n+        else:\n+            # MAY_UNMAP write-zeroes aligned to the cluster -- frees\n+            # clusters at the format driver level and is the concurrent\n+            # cluster-free source that races with the in-flight writes.\n+            lines.append(f'aio_write -q -z -u {off} 1M')\n+    lines.append('aio_flush')\n+    return ('\\n'.join(lines) + '\\n').encode()\n+\n+\n+def main() -> None:\n+    with iotests.FilePath('disk.img') as img:\n+        qemu_img_create('-f', 'qcow2',\n+                        '-o', 'cluster_size=1M,extended_l2=on,'\n+                              'lazy_refcounts=on,refcount_bits=16',\n+                        img, str(IMG_SIZE))\n+\n+        # Run qemu-io with async AIO.  --cache=none and --aio=native ensure\n+        # the writer coroutine actually yields around its data I/O (which\n+        # is what opens the race window).  Swallow stdout/stderr: the\n+        # result we care about is the on-disk state, checked below.\n+        args = qemu_io_wrap_args(['-f', 'qcow2', '-n',\n+                                  '--cache=none', '--aio=native', img])\n+        subprocess.run(args, input=build_commands(),\n+                       stdout=subprocess.DEVNULL,\n+                       stderr=subprocess.DEVNULL,\n+                       check=True)\n+\n+        result = qemu_img_check(img)\n+        corruptions = result.get('corruptions', 0)\n+        check_errors = result.get('check-errors', 0)\n+        if corruptions or check_errors:\n+            iotests.log(f'FAIL: qemu-img check reports '\n+                        f'corruptions={corruptions} '\n+                        f'check-errors={check_errors}')\n+        else:\n+            iotests.log('OK')\n+\n+\n+if __name__ == '__main__':\n+    main()\ndiff --git a/tests/qemu-iotests/tests/discard-write-serialisation.out b/tests/qemu-iotests/tests/discard-write-serialisation.out\nnew file mode 100644\nindex 0000000000..d86bac9de5\n--- /dev/null\n+++ b/tests/qemu-iotests/tests/discard-write-serialisation.out\n@@ -0,0 +1 @@\n+OK\n","prefixes":["2/2"]}