From patchwork Wed Dec 6 14:45:41 2017
X-Patchwork-Submitter: Stefan Hajnoczi
X-Patchwork-Id: 845209
From: Stefan Hajnoczi
Date: Wed, 6 Dec 2017 14:45:41 +0000
Message-Id: <20171206144550.22295-1-stefanha@redhat.com>
Subject: [Qemu-devel] [PATCH v2 0/9] blockdev: fix QMP 'transaction' with IOThreads
Cc: Kevin Wolf, John Snow, Stefan Hajnoczi, qemu-block@nongnu.org

v2:
 * Use StrOrNull for x-blockdev-set-iothread iothread argument [eblake]

(This is for QEMU 2.12 because this bug is not -rc4 critical)

Previously AioContext was held across QMP 'transaction' in an attempt to
achieve bdrv_drained_begin/end() semantics. Nowadays we have
bdrv_drained_begin/end() and the AioContext lock just protects state.
Therefore there is no reason to hold AioContext across
.prepare/.commit/.abort/.clean() anymore.

Besides being cleanup-worthy, holding AioContext also exposes a bug:
BDRV_POLL_WHILE() doesn't support recursive AioContext locking and will hang
if depth > 1. This is easy to trigger by submitting a transaction with 2
actions that touch two drives assigned to an IOThread. The IOThread's
AioContext will be locked twice and BDRV_POLL_WHILE() will hang.

BDRV_POLL_WHILE() is best fixed by eliminating the AioContext lock entirely
in favor of fine-grained locking. I discussed some fixes for
BDRV_POLL_WHILE() with Paolo but we came to the conclusion that it will just
add complexity when we really want to stop using AioContext locking.
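
The hang can be illustrated with a toy model. This is only a sketch and not
QEMU code: it assumes the poll loop drops the context lock exactly once before
waiting for the other thread, and it uses a Python threading.RLock as a
stand-in for the recursive AioContext lock. The function name
iothread_can_progress() and the timeout are hypothetical; in the real hang
there is no timeout and the wait never ends.

```python
import threading


def iothread_can_progress(lock_depth, timeout=0.2):
    """Toy model: can a worker thread take the lock while the caller polls?

    lock_depth is how many times the calling thread has acquired the
    recursive lock before it starts polling.
    """
    ctx_lock = threading.RLock()  # stand-in for the recursive AioContext lock

    # The caller acquires the lock lock_depth times (nested acquisition).
    for _ in range(lock_depth):
        ctx_lock.acquire()

    # Assumed behaviour of the poll loop: release the lock once so that the
    # other thread can run, regardless of how deeply it is nested.
    ctx_lock.release()

    result = []

    def worker():
        # The worker stands in for the IOThread, which needs the lock to
        # make progress. The timeout only exists so the model terminates.
        ok = ctx_lock.acquire(timeout=timeout)
        result.append(ok)
        if ok:
            ctx_lock.release()

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()

    # Re-acquire once (as the poll loop would) and unwind all levels.
    ctx_lock.acquire()
    for _ in range(lock_depth):
        ctx_lock.release()

    return result[0]


if __name__ == "__main__":
    print("depth 1, worker progresses:", iothread_can_progress(1))
    print("depth 2, worker progresses:", iothread_can_progress(2))
```

With a depth of 1 the single release fully unlocks the context and the worker
runs. With a depth of 2 the caller still owns the lock after one release, so
the worker never gets it, which is why shrinking the locked regions avoids the
problem without changing the poll loop itself.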
Summary:
 * Patch 1 fixes missing AioContext lock protection
 * Patches 2-6 clean up excessive AioContext locked regions in QMP
   'transaction' to solve the hang
 * Patches 7-9 add a qemu-iotests test case and the necessary infrastructure

Stefan Hajnoczi (9):
  blockdev: hold AioContext for bdrv_unref() in
    external_snapshot_clean()
  block: don't keep AioContext acquired after
    external_snapshot_prepare()
  block: don't keep AioContext acquired after drive_backup_prepare()
  block: don't keep AioContext acquired after blockdev_backup_prepare()
  block: don't keep AioContext acquired after
    internal_snapshot_prepare()
  block: drop unused BlockDirtyBitmapState->aio_context field
  iothread: add iothread_by_id() API
  blockdev: add x-blockdev-set-iothread testing command
  qemu-iotests: add 202 external snapshots IOThread test

 qapi/block-core.json       |  36 +++++++
 include/sysemu/iothread.h  |   1 +
 blockdev.c                 | 258 +++++++++++++++++++++++++++++++++------------
 iothread.c                 |   7 ++
 tests/qemu-iotests/202     |  95 +++++++++++++++++
 tests/qemu-iotests/202.out |  11 ++
 tests/qemu-iotests/group   |   1 +
 7 files changed, 339 insertions(+), 70 deletions(-)
 create mode 100755 tests/qemu-iotests/202
 create mode 100644 tests/qemu-iotests/202.out

Reviewed-by: Kevin Wolf
Reviewed-by: Paolo Bonzini
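
For readers who want to see the shape of the triggering command, here is a
minimal sketch of a QMP 'transaction' with two external snapshot actions. It is
not the content of test 202. The helper snapshot_transaction(), the device
names drive0/drive1 and the overlay paths are hypothetical, and the sketch
assumes both drives have already been assigned to the same IOThread by some
other means. It only builds and prints the JSON message; it does not talk to a
QEMU instance.

```python
import json


def snapshot_transaction(drives, fmt="qcow2"):
    """Build a QMP 'transaction' command with one snapshot action per drive.

    drives is a list of (device, overlay_path) pairs. All names passed in
    are illustrative placeholders chosen by the caller.
    """
    actions = [
        {
            "type": "blockdev-snapshot-sync",
            "data": {
                "device": device,
                "snapshot-file": overlay,
                "format": fmt,
            },
        }
        for device, overlay in drives
    ]
    return {"execute": "transaction", "arguments": {"actions": actions}}


# Two actions touching two drives: the case described as triggering the hang.
cmd = snapshot_transaction(
    [
        ("drive0", "/tmp/overlay0.qcow2"),  # hypothetical device and path
        ("drive1", "/tmp/overlay1.qcow2"),  # hypothetical device and path
    ]
)

if __name__ == "__main__":
    print(json.dumps(cmd, indent=2))
```

Each action in the list goes through the .prepare/.commit/.abort/.clean()
callbacks mentioned in the cover letter, so two actions on drives in one
IOThread are what led to the nested lock acquisition before this series.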