Show a cover letter.

GET /api/1.2/covers/2225799/?format=api
HTTP 200 OK
Allow: GET, HEAD, OPTIONS
Content-Type: application/json
Vary: Accept

{
    "id": 2225799,
    "url": "http://patchwork.ozlabs.org/api/1.2/covers/2225799/?format=api",
    "web_url": "http://patchwork.ozlabs.org/project/qemu-devel/cover/20260421155628.3600671-1-den@openvz.org/",
    "project": {
        "id": 14,
        "url": "http://patchwork.ozlabs.org/api/1.2/projects/14/?format=api",
        "name": "QEMU Development",
        "link_name": "qemu-devel",
        "list_id": "qemu-devel.nongnu.org",
        "list_email": "qemu-devel@nongnu.org",
        "web_url": "",
        "scm_url": "",
        "webscm_url": "",
        "list_archive_url": "",
        "list_archive_url_format": "",
        "commit_url_format": ""
    },
    "msgid": "<20260421155628.3600671-1-den@openvz.org>",
    "list_archive_url": null,
    "date": "2026-04-21T15:56:26",
    "name": "[0/2] block/io: fix reproducible silent data corruption in write-vs-discard race",
    "submitter": {
        "id": 71296,
        "url": "http://patchwork.ozlabs.org/api/1.2/people/71296/?format=api",
        "name": "Denis V. Lunev\" via qemu development",
        "email": "qemu-devel@nongnu.org"
    },
    "mbox": "http://patchwork.ozlabs.org/project/qemu-devel/cover/20260421155628.3600671-1-den@openvz.org/mbox/",
    "series": [
        {
            "id": 500841,
            "url": "http://patchwork.ozlabs.org/api/1.2/series/500841/?format=api",
            "web_url": "http://patchwork.ozlabs.org/project/qemu-devel/list/?series=500841",
            "date": "2026-04-21T15:56:27",
            "name": "block/io: fix reproducible silent data corruption in write-vs-discard race",
            "version": 1,
            "mbox": "http://patchwork.ozlabs.org/series/500841/mbox/"
        }
    ],
    "comments": "http://patchwork.ozlabs.org/api/covers/2225799/comments/",
    "headers": {
        "Return-Path": "<qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org>",
        "X-Original-To": "incoming@patchwork.ozlabs.org",
        "Delivered-To": "patchwork-incoming@legolas.ozlabs.org",
        "Authentication-Results": [
            "legolas.ozlabs.org;\n\tdkim=fail reason=\"signature verification failed\" (2048-bit key;\n secure) header.d=virtuozzo.com header.i=@virtuozzo.com header.a=rsa-sha256\n header.s=relay header.b=oEKj58XM;\n\tdkim-atps=neutral",
            "legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org\n (client-ip=209.51.188.17; helo=lists1p.gnu.org;\n envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org;\n receiver=patchwork.ozlabs.org)"
        ],
        "Received": [
            "from lists1p.gnu.org (lists1p.gnu.org [209.51.188.17])\n\t(using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4g0RpN00dxz1yCv\n\tfor <incoming@patchwork.ozlabs.org>; Wed, 22 Apr 2026 01:58:03 +1000 (AEST)",
            "from localhost ([::1] helo=lists1p.gnu.org)\n\tby lists1p.gnu.org with esmtp (Exim 4.90_1)\n\t(envelope-from <qemu-devel-bounces@nongnu.org>)\n\tid 1wFDSp-0002fw-EU; Tue, 21 Apr 2026 11:56:39 -0400",
            "from eggs.gnu.org ([2001:470:142:3::10])\n by lists1p.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)\n (Exim 4.90_1) (envelope-from <den@openvz.org>)\n id 1wFDSl-0002em-Kj; Tue, 21 Apr 2026 11:56:35 -0400",
            "from relay.virtuozzo.com ([130.117.225.111])\n by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)\n (Exim 4.90_1) (envelope-from <den@openvz.org>)\n id 1wFDSj-0003J9-FG; Tue, 21 Apr 2026 11:56:35 -0400",
            "from ch-demo-asa.virtuozzo.com ([130.117.225.8] helo=iris.sw.ru)\n by relay.virtuozzo.com with esmtp (Exim 4.96)\n (envelope-from <den@openvz.org>) id 1wFDQ2-001k3k-0t;\n Tue, 21 Apr 2026 17:56:19 +0200"
        ],
        "DKIM-Signature": "v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed;\n d=virtuozzo.com; s=relay; h=MIME-Version:Message-ID:Date:Subject:From:\n Content-Type; bh=FgmINqfieKlh+6fCJ81osPdRRu4ImsoTkTh6aOy7+6c=; b=oEKj58XMIc5M\n 4hJ7ACojDmm55g12Tl07OZSqkKFe+0XVMntr8pHW1ccoxCJF57G69YdE0jfF4sH77xXmqsKNWe2eJ\n z6YGFUnp/ZfnrbHv6bo/zCre2vv8jYHHHFiRAAgZX3ZpPAQpXIS5yt39bXfEW+BvaV/b6Oknv6OEM\n pkHtZUIxMt1fT+BmtL7joEHdw+Ro6JO3v//tJEvcryM+rkzzDbIyPe6y2R2un1hKpYdpopTkRKQgQ\n BHDaijx0pXtdrh2DRaGIhy87ZMy00U6Pj6Kg0otJPmd/bmaTVgX+KvZ/En7V7kBXf2mJ8Mop40H6U\n jSrE1ndGaJWufZvbltRP3Q==;",
        "To": "qemu-devel@nongnu.org,\n\tqemu-block@nongnu.org,\n\tqemu-stable@nongnu.org",
        "Cc": "den@openvz.org, Stefan Hajnoczi <stefanha@redhat.com>,\n Kevin Wolf <kwolf@redhat.com>, Hanna Reitz <hreitz@redhat.com>",
        "Subject": "[PATCH 0/2] block/io: fix reproducible silent data corruption in\n write-vs-discard race",
        "Date": "Tue, 21 Apr 2026 17:56:26 +0200",
        "Message-ID": "<20260421155628.3600671-1-den@openvz.org>",
        "X-Mailer": "git-send-email 2.51.0",
        "MIME-Version": "1.0",
        "Content-Transfer-Encoding": "8bit",
        "Received-SPF": "softfail client-ip=130.117.225.111;\n envelope-from=den@openvz.org;\n helo=relay.virtuozzo.com",
        "X-Spam_score_int": "-34",
        "X-Spam_score": "-3.5",
        "X-Spam_bar": "---",
        "X-Spam_report": "(-3.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1,\n DKIM_VALID=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001,\n SPF_SOFTFAIL=0.665 autolearn=ham autolearn_force=no",
        "X-Spam_action": "no action",
        "X-BeenThere": "qemu-devel@nongnu.org",
        "X-Mailman-Version": "2.1.29",
        "Precedence": "list",
        "List-Id": "qemu development <qemu-devel.nongnu.org>",
        "List-Unsubscribe": "<https://lists.nongnu.org/mailman/options/qemu-devel>,\n <mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>",
        "List-Archive": "<https://lists.nongnu.org/archive/html/qemu-devel>",
        "List-Post": "<mailto:qemu-devel@nongnu.org>",
        "List-Help": "<mailto:qemu-devel-request@nongnu.org?subject=help>",
        "List-Subscribe": "<https://lists.nongnu.org/mailman/listinfo/qemu-devel>,\n <mailto:qemu-devel-request@nongnu.org?subject=subscribe>",
        "Reply-to": "\"Denis V. Lunev\" <den@openvz.org>",
        "From": "\"Denis V. Lunev\" via qemu development <qemu-devel@nongnu.org>",
        "Errors-To": "qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org",
        "Sender": "qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org"
    },
    "content": "Cc: Stefan Hajnoczi <stefanha@redhat.com>\nCc: Kevin Wolf <kwolf@redhat.com>\nCc: Hanna Reitz <hreitz@redhat.com>\n\nThis series fixes a qemu block-layer race between an in-flight\nformat-driver write and a concurrent discard or MAY_UNMAP\nwrite-zeroes on the same guest range. The race has been latent\nin upstream since v1.0 (commit 68d100e905 \"qcow2: Use coroutines\",\n2011-06-30) and has been producing silent qcow2 metadata\ncorruption in production.\n\nMechanism\n---------\n\nqcow2's write path drops s->lock around the data I/O of an\nallocating write.  If a discard / pwrite_zeroes(MAY_UNMAP) on the\nsame guest offset lands in that window, it clears the L2 entry\nand decrements the cluster's refcount to zero; the writer then\nreacquires the lock and unconditionally writes L2[G] =\nalloc_offset | OFLAG_COPIED onto the now-freed cluster.  The next\nallocation re-hands the cluster out and we end up with two L2\nentries aliasing one host cluster.  Patch 1/2 carries the\nper-frame diagram of the interleaving.\n\nOn-disk signature: qemu-img check reports refcount=0 with a live\nOFLAG_COPIED reference, or refcount < reference.  Runtime\nsignature: \"qcow2_free_clusters failed: Invalid argument\" on\nstderr with no guest-visible error.\n\nProduction consequences\n-----------------------\n\n  * Silent data drift.  Once two guest offsets share one host\n    cluster, writes through either alias overwrite bytes the\n    other alias owns.  The guest reads back bytes it never\n    wrote, with no I/O error hit anywhere in the stack.\n\n  * Guest-filesystem corruption.  ext4 discovers the resulting\n    inconsistency and remounts read-only.  Because the backing\n    qemu returned success for every request, nothing in the\n    guest's own block layer logs anything; kernels have been\n    observed stopping all FS writes silently for hours until\n    userspace tries to write.\n\n  * Latent poisoning with multi-day incubation.  
A block-job\n    (commit, stream, active mirror, legacy block-migrate +\n    commit on a destination) running concurrently with guest\n    discard traffic plants aliased clusters that may not\n    produce a symptom until a later guest discard walks one\n    of them.  Cases in the wild have surfaced 8 to 17 days\n    after the originating block-job window.\n\n  * Recovery requires both fsck inside the guest AND\n    qemu-img check -r all on the host -- the former repairs\n    the ext4 level, the latter repairs the qcow2 refcount/L2\n    aliasing; fsck alone leaves the image to re-corrupt the\n    moment writes to an aliased cluster resume.\n\nWhy it surfaces only under block-jobs\n-------------------------------------\n\nGuest-only I/O rarely opens the race window: the guest's own\nblock layer serialises DISCARD and WRITE to the same LBA range,\nso at any moment a cluster is either \"being written\" or \"being\ndiscarded\" from the guest, not both.  The race requires a second\nI/O producer on the same BDS that does not observe guest-side\nordering -- i.e. a block job.  Every migration / commit / backup\n/ mirror cycle is an exposure window; steady-state VMs are\nessentially immune until the next image-management operation\nruns.\n\nFix\n---\n\nPatch 1/2 marks both pdiscard and all pwrite_zeroes (with or\nwithout MAY_UNMAP) as BDRV_REQ_SERIALISING in the generic block\nlayer.  
Their tracked_request then waits for overlapping\nin-flight writes -- including non-serialising ones -- to finish\ntheir format-driver commit before any L2/refcount mutation\nhappens.\n\nThe gate lives in block/io.c rather than in qcow2 so that:\n\n  * every format driver that drops an internal mutex during\n    the data I/O of an allocating write is covered, not just\n    qcow2;\n\n  * the NBD WRITE_ZEROES path (blk_co_pwrite_zeroes ->\n    blk_co_pwritev -> bdrv_co_pwritev_part ->\n    bdrv_aligned_pwritev, which bypasses the\n    bdrv_co_pwrite_zeroes wrapper entirely) is still caught\n    -- the gate is placed where BDRV_REQ_ZERO_WRITE is\n    observed on the way down to the driver.\n\nPerf impact is limited to the overlap window. The serialising\nrequest only waits when a conflict actually exists, which is\nexactly the corruption surface.  Steady-state non-overlapping\ntraffic pays nothing.\n\nTest\n----\n\nPatch 2/2 adds a deterministic iotest\n(tests/qemu-iotests/tests/discard-write-serialisation) that\ndrives a single qemu-io process with a fixed-seed 5000-command\nsequence of interleaved aio_write and aio_write -z -u at random\ncluster-aligned offsets in a small contention region, then runs\nqemu-img check and asserts zero corruptions. Results across 8\nruns each:\n\n  fixed tree:    8/8 clean\n  unfixed tree:  8/8 detect (2-4 corruptions per run)\n\n100% detection on the unfixed tree, zero false positives on the\nfixed tree, under 30 seconds per run.  The test is scoped to\nqcow2 because qcow2 is the format whose qemu-img check validates\nthe fingerprint; the underlying race is format-agnostic.\n\nDenis V. 
Lunev (2):\n  block/io: serialise discard and write-zeroes against in-flight writes\n  iotests: regression test for discard/write-zeroes vs in-flight write\n    race\n\n block/io.c                                    | 25 ++++-\n .../tests/discard-write-serialisation         | 97 +++++++++++++++++++\n .../tests/discard-write-serialisation.out     |  1 +\n 3 files changed, 122 insertions(+), 1 deletion(-)\n create mode 100755 tests/qemu-iotests/tests/discard-write-serialisation\n create mode 100644 tests/qemu-iotests/tests/discard-write-serialisation.out"
}
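
A minimal client-side sketch of consuming this response, in Python. It is not part of Patchwork. The helper names fetch_cover and summarize_cover are hypothetical, and the use of ?format=json to request plain JSON instead of the browsable page is an assumption about this server's configuration. Because the "submitter" entry above carries the list-rewritten From header, the sketch reads the author from the Reply-to header when one is present.

```python
import json
import urllib.request
from email.utils import parseaddr


def fetch_cover(cover_id, base="http://patchwork.ozlabs.org/api/1.2"):
    """Fetch one cover letter as a dict (hypothetical helper; does network I/O).

    Assumes the server honours ?format=json for a plain JSON body.
    """
    url = f"{base}/covers/{cover_id}/?format=json"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def summarize_cover(cover):
    """Reduce a cover-letter dict to the fields a script usually needs."""
    headers = cover.get("headers", {})
    # Prefer Reply-to: the mailing list may have rewritten From.
    reply_to = headers.get("Reply-to") or headers.get("From", "")
    author_name, author_email = parseaddr(reply_to)
    return {
        "id": cover["id"],
        "project": cover["project"]["link_name"],
        "subject": cover["name"],
        "author": author_name,
        "author_email": author_email,
        "series_mboxes": [s["mbox"] for s in cover.get("series", [])],
    }


# Abbreviated copy of the response shown above, so no network access is needed.
sample = {
    "id": 2225799,
    "project": {"link_name": "qemu-devel"},
    "name": "[0/2] block/io: fix reproducible silent data corruption "
            "in write-vs-discard race",
    "series": [{"mbox": "http://patchwork.ozlabs.org/series/500841/mbox/"}],
    "headers": {
        "Reply-to": "\"Denis V. Lunev\" <den@openvz.org>",
        "From": "\"Denis V. Lunev\" via qemu development <qemu-devel@nongnu.org>",
    },
}

print(summarize_cover(sample))
```

Against a live server, summarize_cover(fetch_cover(2225799)) would apply the same reduction to the full response.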
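
The "Fix" section of the cover letter in "content" describes a serialising request that waits only for in-flight requests overlapping its range. The Python sketch below is a toy model of that waiting rule, not QEMU code. The Request class and the overlaps and must_wait functions are hypothetical names, request alignment is ignored, and the rule that two non-serialising requests never wait for each other is an assumption of the model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    offset: int        # byte offset of the request
    length: int        # length in bytes
    serialising: bool  # True models discard / write-zeroes after the fix


def overlaps(a, b):
    """True when the half-open byte ranges of a and b intersect."""
    return a.offset < b.offset + b.length and b.offset < a.offset + a.length


def must_wait(new, in_flight):
    """Return the in-flight requests that `new` has to wait for.

    Toy rule: a wait happens only when the ranges overlap and at least
    one of the two requests is serialising.
    """
    return [r for r in in_flight
            if overlaps(new, r) and (new.serialising or r.serialising)]


CLUSTER = 65536  # illustrative cluster size, chosen for the example only

write = Request(offset=2 * CLUSTER, length=CLUSTER, serialising=False)
other = Request(offset=5 * CLUSTER, length=CLUSTER, serialising=False)
discard = Request(offset=2 * CLUSTER, length=CLUSTER, serialising=True)

# The discard waits for the overlapping write but not for the distant one.
print(must_wait(discard, [write, other]))
```

In this model, non-overlapping traffic never waits, which matches the cover letter's claim that the cost is confined to the overlap window.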