Show a cover letter.

GET /api/covers/2227840/?format=api
HTTP 200 OK
Allow: GET, HEAD, OPTIONS
Content-Type: application/json
Vary: Accept

{
    "id": 2227840,
    "url": "http://patchwork.ozlabs.org/api/covers/2227840/?format=api",
    "web_url": "http://patchwork.ozlabs.org/project/qemu-devel/cover/20260424103917.248668-1-den@openvz.org/",
    "project": {
        "id": 14,
        "url": "http://patchwork.ozlabs.org/api/projects/14/?format=api",
        "name": "QEMU Development",
        "link_name": "qemu-devel",
        "list_id": "qemu-devel.nongnu.org",
        "list_email": "qemu-devel@nongnu.org",
        "web_url": "",
        "scm_url": "",
        "webscm_url": "",
        "list_archive_url": "",
        "list_archive_url_format": "",
        "commit_url_format": ""
    },
    "msgid": "<20260424103917.248668-1-den@openvz.org>",
    "list_archive_url": null,
    "date": "2026-04-24T10:39:15",
    "name": "[0/2] block: fix two missed-wakeup hangs on shutdown path",
    "submitter": {
        "id": 71296,
        "url": "http://patchwork.ozlabs.org/api/people/71296/?format=api",
        "name": "Denis V. Lunev\" via qemu development",
        "email": "qemu-devel@nongnu.org"
    },
    "mbox": "http://patchwork.ozlabs.org/project/qemu-devel/cover/20260424103917.248668-1-den@openvz.org/mbox/",
    "series": [
        {
            "id": 501343,
            "url": "http://patchwork.ozlabs.org/api/series/501343/?format=api",
            "web_url": "http://patchwork.ozlabs.org/project/qemu-devel/list/?series=501343",
            "date": "2026-04-24T10:39:16",
            "name": "block: fix two missed-wakeup hangs on shutdown path",
            "version": 1,
            "mbox": "http://patchwork.ozlabs.org/series/501343/mbox/"
        }
    ],
    "comments": "http://patchwork.ozlabs.org/api/covers/2227840/comments/",
    "headers": {
        "Return-Path": "<qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org>",
        "X-Original-To": "incoming@patchwork.ozlabs.org",
        "Delivered-To": "patchwork-incoming@legolas.ozlabs.org",
        "Authentication-Results": [
            "legolas.ozlabs.org;\n\tdkim=fail reason=\"signature verification failed\" (2048-bit key;\n secure) header.d=virtuozzo.com header.i=@virtuozzo.com header.a=rsa-sha256\n header.s=relay header.b=wbwwoxqL;\n\tdkim-atps=neutral",
            "legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org\n (client-ip=209.51.188.17; helo=lists1p.gnu.org;\n envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org;\n receiver=patchwork.ozlabs.org)"
        ],
        "Received": [
            "from lists1p.gnu.org (lists1p.gnu.org [209.51.188.17])\n\t(using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4g28c6650cz1yDD\n\tfor <incoming@patchwork.ozlabs.org>; Fri, 24 Apr 2026 20:40:05 +1000 (AEST)",
            "from localhost ([::1] helo=lists1p.gnu.org)\n\tby lists1p.gnu.org with esmtp (Exim 4.90_1)\n\t(envelope-from <qemu-devel-bounces@nongnu.org>)\n\tid 1wGDwU-0002b6-9p; Fri, 24 Apr 2026 06:39:26 -0400",
            "from eggs.gnu.org ([2001:470:142:3::10])\n by lists1p.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)\n (Exim 4.90_1) (envelope-from <den@openvz.org>)\n id 1wGDwT-0002aZ-0o; Fri, 24 Apr 2026 06:39:25 -0400",
            "from relay.virtuozzo.com ([130.117.225.111])\n by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)\n (Exim 4.90_1) (envelope-from <den@openvz.org>)\n id 1wGDwR-000200-7h; Fri, 24 Apr 2026 06:39:24 -0400",
            "from ch-demo-asa.virtuozzo.com ([130.117.225.8] helo=iris.sw.ru)\n by relay.virtuozzo.com with esmtp (Exim 4.96)\n (envelope-from <den@openvz.org>) id 1wGDtf-00F3Ps-05;\n Fri, 24 Apr 2026 12:39:14 +0200"
        ],
        "DKIM-Signature": "v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed;\n d=virtuozzo.com; s=relay; h=Content-Type:MIME-Version:Message-ID:Date:Subject\n :From; bh=EcnsQFbnncWE7Prrulk8hdwft53Btw+MTadW5OXdZ6Q=; b=wbwwoxqLyd8iuh9vy7o\n jOxUsAJMjWigQXUHNgzLzjAvRSrVXUeaaVjUw2HMvw2wQLHaiRVuNWoyNCLLh+1KG7yIIefqDQEK5\n s2sCAIQHeKJlsoerIbQ5TrAYQVtKJubLEqZQFHd0Y68q0S9JaH2GQg3jiYCDN/EvncOhHxXlIIm5Z\n EPwdt7VAwnbBBqRrV2epDGnc+zUVz/Zk1uUYUT+Y6s89XefhfWirADzGE1YLOKlobiGl/qBz8fqQL\n goO8R/hC9JS/Gy4t2c4QCoeFqrptoLCuaf9IfP2+ExpBdrJf5d7XZrslaSNqnNqt1Lc+49iQBoMiM\n pcfS8gZ6OrFeqSw==;",
        "To": "qemu-devel@nongnu.org",
        "Cc": "qemu-block@nongnu.org, qemu-stable@nongnu.org,\n \"Denis V. Lunev\" <den@openvz.org>, Kevin Wolf <kwolf@redhat.com>,\n Hanna Reitz <hreitz@redhat.com>, Stefan Hajnoczi <stefanha@redhat.com>,\n Fiona Ebner <f.ebner@proxmox.com>",
        "Subject": "[PATCH 0/2] block: fix two missed-wakeup hangs on shutdown path",
        "Date": "Fri, 24 Apr 2026 12:39:15 +0200",
        "Message-ID": "<20260424103917.248668-1-den@openvz.org>",
        "X-Mailer": "git-send-email 2.51.0",
        "MIME-Version": "1.0",
        "Content-Type": "text/plain; charset=UTF-8",
        "Content-Transfer-Encoding": "8bit",
        "Received-SPF": "softfail client-ip=130.117.225.111;\n envelope-from=den@openvz.org;\n helo=relay.virtuozzo.com",
        "X-Spam_score_int": "-34",
        "X-Spam_score": "-3.5",
        "X-Spam_bar": "---",
        "X-Spam_report": "(-3.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1,\n DKIM_VALID=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001,\n SPF_SOFTFAIL=0.665 autolearn=ham autolearn_force=no",
        "X-Spam_action": "no action",
        "X-BeenThere": "qemu-devel@nongnu.org",
        "X-Mailman-Version": "2.1.29",
        "Precedence": "list",
        "List-Id": "qemu development <qemu-devel.nongnu.org>",
        "List-Unsubscribe": "<https://lists.nongnu.org/mailman/options/qemu-devel>,\n <mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>",
        "List-Archive": "<https://lists.nongnu.org/archive/html/qemu-devel>",
        "List-Post": "<mailto:qemu-devel@nongnu.org>",
        "List-Help": "<mailto:qemu-devel-request@nongnu.org?subject=help>",
        "List-Subscribe": "<https://lists.nongnu.org/mailman/listinfo/qemu-devel>,\n <mailto:qemu-devel-request@nongnu.org?subject=subscribe>",
        "Reply-to": "\"Denis V. Lunev\" <den@openvz.org>",
        "From": "\"Denis V. Lunev\" via qemu development <qemu-devel@nongnu.org>",
        "Errors-To": "qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org",
        "Sender": "qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org"
    },
    "content": "Problem\n-------\n\nThe qemu shutdown / blockdev-close path can deadlock permanently on\nupstream master.  The main thread enters ppoll(timeout=-1) holding\nBQL, no other thread has a wake source that points back at it, and\nqemu has to be SIGKILLed.  The hang has no timeout -- it is a hard\ndeadlock, not a slow operation; behind BQL, RCU, VCPUs and every\niothread path that needs BQL stall with it.\n\nTwo independent missed-wakeup races in the block layer contribute.\nBoth share the same shape: a waiter arms on one side, the waker\nreads stale state on its fast path and silently skips the kick, and\nnothing else on the AioContext will fire to recover.  They are\ndifferent bugs in different subsystems and each patch stands on its\nown; they are posted together because they surface through the same\ntest and the same symptom and are easiest to diagnose side by side.\n\nDepending on which race fires, the main thread backtrace at the\nmoment of hang is one of:\n\n  ppoll -> aio_poll -> bdrv_graph_wrlock -> blk_remove_bs\n      (patch 1 -- block/graph-lock)\n\n  ppoll -> aio_poll -> cache_clean_timer_del_and_wait -> qcow2_close\n      (patch 2 -- block/qcow2 cache_clean_timer)\n\nRace diagrams and the exact stale-state read are in each patch's\ncommit message.\n\nReproducer\n----------\n\nEnvironment used for the numbers below: 4-vCPU VM guest,\nkernel 6.12.x, upstream master at bb230769b4.  On modern bare-metal\nthe window is narrow enough that the hangs rarely reproduce without\na VM -- a VM guest under full CPU saturation is what makes the\ntiming reliable.  
Downstream trees that still use plain\nbdrv_graph_wrlock() in blk_remove_bs() hit the graph-lock race on\nthe first iteration without any stress at all.\n\n    # reproducer\n    stress-ng --cpu \"$(nproc)\" --timeout 0 &\n    for r in $(seq 20); do\n        timeout 120 ./build/tests/qemu-iotests/check -qcow2 iothreads-create\n    done\n    kill %1\n\nWith `stress-ng --cpu $(nproc)` both races surface.  With\n`stress-ng --cpu $(($(nproc) - 1))` or without a stressor neither\nreproduces reliably across 20 iterations.\n\nWhen a race fires, the Python QMP client times out on vm.run_job()\nafter 5 s, the qemu process keeps running but never makes forward\nprogress, and the outer `timeout 120` eventually kills it.  attach\ngdb before the timeout kills qemu to capture the stack and\ndistinguish which of the two races fired.\n\nResults\n-------\n\nSame guest, 20 iterations of the loop above:\n\n  upstream master:            10/20 FAIL (first fail at iter #2)\n  master + both patches:      20/20 PASS\n\nSigned-off-by: Denis V. Lunev <den@openvz.org>\nCc: Kevin Wolf <kwolf@redhat.com>\nCc: Hanna Reitz <hreitz@redhat.com>\nCc: Stefan Hajnoczi <stefanha@redhat.com>\nCc: Fiona Ebner <f.ebner@proxmox.com>\nCc: Hanna Czenczek <hreitz@redhat.com>\n\nDenis V. Lunev (2):\n  block/graph-lock: fix missed wakeup in bdrv_graph_co_rdunlock()\n  block/qcow2: fix hangup in cache_clean_timer cancellation\n\n block/graph-lock.c | 12 +++++-------\n block/qcow2.c      | 28 +++++++++++++++++-----------\n 2 files changed, 22 insertions(+), 18 deletions(-)\n\n--\n2.51.0"
}
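
In the payload above, the "submitter" object carries the list address (qemu-devel@nongnu.org) because the mailing list rewrote the From header, while the "Reply-to" header still names the original sender. The following is a minimal sketch, not part of Patchwork itself, of how a client might recover the sender and the series IDs from an already-decoded payload. The function name `summarize_cover` and the trimmed `sample` dict are hypothetical; only the field names are taken from the response shown.

```python
from email.utils import parseaddr


def summarize_cover(cover: dict) -> dict:
    """Extract a few fields from a cover-letter payload shaped like the one above."""
    headers = cover.get("headers", {})
    # Assumption: when the list has rewritten From, Reply-to holds the real sender.
    reply_to = headers.get("Reply-to") or headers.get("Reply-To")
    if reply_to:
        author_name, author_email = parseaddr(reply_to)
    else:
        # Fall back to the submitter object recorded by the server.
        submitter = cover.get("submitter", {})
        author_name = submitter.get("name", "")
        author_email = submitter.get("email", "")
    return {
        "id": cover["id"],
        "name": cover["name"],
        "author_name": author_name,
        "author_email": author_email,
        "series_ids": [s["id"] for s in cover.get("series", [])],
    }


# Trimmed, hand-copied subset of the response above (hypothetical test input).
sample = {
    "id": 2227840,
    "name": "[0/2] block: fix two missed-wakeup hangs on shutdown path",
    "submitter": {
        "name": 'Denis V. Lunev" via qemu development',
        "email": "qemu-devel@nongnu.org",
    },
    "series": [{"id": 501343}],
    "headers": {"Reply-to": '"Denis V. Lunev" <den@openvz.org>'},
}

summary = summarize_cover(sample)
print(summary["author_name"], summary["author_email"], summary["series_ids"])
```

Preferring Reply-to is a heuristic that fits this sample; a message whose Reply-to points somewhere other than the author would need different handling.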