From patchwork Mon Sep 9 10:13:57 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vladimir Sementsov-Ogievskiy X-Patchwork-Id: 1159658 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=virtuozzo.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=virtuozzo.com header.i=@virtuozzo.com header.b="KbYTIE2g"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 46RkY61y0rz9sNf for ; Mon, 9 Sep 2019 20:14:32 +1000 (AEST) Received: from localhost ([::1]:54200 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i7Ggn-0002h6-UR for incoming@patchwork.ozlabs.org; Mon, 09 Sep 2019 06:14:29 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50080) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i7GgP-0002gE-5e for qemu-devel@nongnu.org; Mon, 09 Sep 2019 06:14:06 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1i7GgM-0004Su-VA for qemu-devel@nongnu.org; Mon, 09 Sep 2019 06:14:04 -0400 Received: from mail-eopbgr130118.outbound.protection.outlook.com ([40.107.13.118]:21569 helo=EUR01-HE1-obe.outbound.protection.outlook.com) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1i7GgM-0004PQ-8G; Mon, 09 Sep 2019 06:14:02 -0400 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=hk7nh+9RXL9D6CHW6K/tt6tUrZC7nFWthh5hg1yQW8Jl5Kmpz/b+xiyAQkVXTT1JfPFpV3kb5gxq5SpmJuogcFj9Q8KBGocwiBnugMimSAhJ67NO/MS1s4UI2JFQuTDzTWv2zcH31cSWEIQoZZJigG79AsAVh0XMf1LLVTpCVL0vlb8S/doam/TBKOFk0Z9GXJHI1kLt4cRLPDB2pNI7kdPVKF37VIAfpEqBYK9QIiAdBPfxb+YNkjzMtFqTzgUHXP6IuQ8dc53SJ7pRHp32Zbs7GEy8XMRDXTMt6kSI617w7/kX0izkGpOzmRB3Z7xNp3IxKVSAwCGdERxOtleOsg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=V5LZGuhTrqymgTnHknxTkwGIhVshpRHq2hhbDJBfLYY=; b=fkraupib1W467scd9BAP75EJWy/PJxBqAm01/PphNpM13zr+pqURnpsobUkrVNN3C4dwkUv+qGG4vgEld4umgVSJsQKCCGLmrxkZ8y//COMJ0AXExMb657AKwJQcS0mA63iSj/5X/N3MVI1ETxNcTHU9NHFeLe4zuATb6DN8CUiQssUwKNmf3X5P8WyexeSCNsja3lR/yj2skPxtZnnghlZ48XgpapEYnr3X58tL9ZKhBuRcjnFySEL2ZYxyli5CN9jNdAxqczO4UguOkW9VtvN0c2mBj9jN/60Fi0DyJncsz9SoGkkoCJZDFHEPluHn3NlahUNexF//gXMZnVvAgg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=virtuozzo.com; dmarc=pass action=none header.from=virtuozzo.com; dkim=pass header.d=virtuozzo.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=virtuozzo.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=V5LZGuhTrqymgTnHknxTkwGIhVshpRHq2hhbDJBfLYY=; b=KbYTIE2gubFwIw2/DnsJnli+Ma/0Esc+e2hxviNRI37RV3eIltpTjqn4IyUkVSl5xpdxmnqxW1Wo2yFyKioP3C3AWRPw0lqa/2Z6XCVyXURgGA7hQlGM7KKjjSFiVsrnrxaUGDcry45gFqb8e9AxZNU4Lif2imwEb5q0tqSTQMQ= Received: from DB8PR08MB5498.eurprd08.prod.outlook.com (52.133.242.216) by DB8PR08MB5130.eurprd08.prod.outlook.com (10.255.18.89) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.2241.18; Mon, 9 Sep 2019 10:13:57 +0000 Received: from DB8PR08MB5498.eurprd08.prod.outlook.com ([fe80::b5c0:6b97:438d:77ed]) by DB8PR08MB5498.eurprd08.prod.outlook.com ([fe80::b5c0:6b97:438d:77ed%2]) with mapi id 15.20.2241.018; Mon, 9 Sep 2019 10:13:57 +0000 From: Vladimir Sementsov-Ogievskiy To: qemu block Thread-Topic: qcow2 lock Thread-Index: AQHVZvdKzivEreajZ0eODxMruLGs3w== Date: Mon, 9 Sep 2019 10:13:57 +0000 Message-ID: <135df452-397a-30bb-7518-2184fa5971aa@virtuozzo.com> Accept-Language: ru-RU, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-clientproxiedby: HE1PR05CA0271.eurprd05.prod.outlook.com (2603:10a6:3:fc::23) To DB8PR08MB5498.eurprd08.prod.outlook.com (2603:10a6:10:11c::24) authentication-results: spf=none (sender IP is ) smtp.mailfrom=vsementsov@virtuozzo.com; x-ms-exchange-messagesentrepresentingtype: 1 x-originating-ip: [185.231.240.5] x-ms-publictraffictype: Email x-ms-office365-filtering-correlation-id: b095f1f8-5242-40dd-7dd1-08d7350e6d16 x-microsoft-antispam: BCL:0; PCL:0; RULEID:(2390118)(7020095)(4652040)(8989299)(4534185)(4627221)(201703031133081)(201702281549075)(8990200)(5600166)(711020)(4605104)(1401327)(2017052603328)(7193020); SRVR:DB8PR08MB5130; x-ms-traffictypediagnostic: DB8PR08MB5130: x-microsoft-antispam-prvs: x-ms-oob-tlc-oobclassifiers: OLM:416; x-forefront-prvs: 01559F388D x-forefront-antispam-report: SFV:NSPM; SFS:(10019020)(346002)(39840400004)(366004)(376002)(136003)(396003)(199004)(189003)(71190400001)(186003)(81166006)(14454004)(8936002)(386003)(6506007)(478600001)(8676002)(36756003)(53936002)(6512007)(6436002)(26005)(7736002)(305945005)(14444005)(256004)(86362001)(19627235002)(6486002)(7116003)(66066001)(81156014)(31696002)(6116002)(3846002)(4326008)(66946007)(2906002)(66476007)(66556008)(64756008)(25786009)(66446008)(486006)(476003)(2616005)(71200400001)(99286004)(6916009)(102836004)(31686004)(5660300002)(316002)(52116002)(54906003); DIR:OUT; SFP:1102; SCL:1; SRVR:DB8PR08MB5130; H:DB8PR08MB5498.eurprd08.prod.outlook.com; FPR:; SPF:None; LANG:en; PTR:InfoNoRecords; A:1; MX:1; received-spf: None (protection.outlook.com: virtuozzo.com does not designate permitted sender hosts) x-ms-exchange-senderadcheck: 1 x-microsoft-antispam-message-info: KPF3OWxDJp9SPEm3dH1NyiVxPZ3IT3T/o4Xt5MdYXm652kHezsSzhi5V/0qu522D3mU4xIQBx8fK4vUwZDc144ZFbz2MrLhsyWkUH+uNaysEUnW5Zlns9v+LXONt9UanBiA+R86xjyE3t4wRb3sZ0muqosjgN+rXCial8qo2d2ebeyezGeWF7FsXEk9D5T+RDZweimG+Z6yGQtQnUwSmtdjfMApa7MhOq6C6EOnyMw23hrJkNWDmdYjKg9nZS6RNHM5/cO1PXirFhL5PBv4oVAlA+X5Cs9ifzjVW3yucqnyoKHtAi4kxePmTClk1sJNAvxP7JpQ3d4SpK4STDCINFo2Cjl0pRoDu+aqYLqIDIZ0KKz0xribGpDljF9pF7q00AGNmM2j6oM2uLvb4xJ0IR8zYJ2/ix6V5i8YNjOIrLQU= x-ms-exchange-transport-forked: True Content-ID: MIME-Version: 1.0 X-OriginatorOrg: virtuozzo.com X-MS-Exchange-CrossTenant-Network-Message-Id: b095f1f8-5242-40dd-7dd1-08d7350e6d16 X-MS-Exchange-CrossTenant-originalarrivaltime: 09 Sep 2019 10:13:57.4561 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: 0bc7f26d-0264-416e-a6fc-8352af79c58f X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: ENvRPHKykFVbRRSZDljM8SflbVAOkl7HWEq/OfRJ32YeAwAn+Fpyfa4Gm2v0rCACyVF9IIUEVeLlU6mNGLZNJkDrmnZWm9lhCj21WC8rrT8= X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB8PR08MB5130 X-detected-operating-system: by eggs.gnu.org: Windows 7 or 8 [fuzzy] X-Received-From: 40.107.13.118 Subject: [Qemu-devel] qcow2 lock X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , qemu-devel , Max Reitz Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Hi! I have a (may be stupid) question: what is BDRVQcow2State.lock for and when should it be locked? I faced SIGSEGV here: #0 qcow2_process_discards (bs=bs@entry=0x564b93bc8000, ret=ret@entry=0) at block/qcow2-refcount.c:737 #1 0x0000564b90e9f15f in qcow2_cluster_discard (bs=bs@entry=0x564b93bc8000, offset=0, offset@entry=3062890496, bytes=bytes@entry=134217728, type=type@entry=QCOW2_DISCARD_REQUEST, full_discard=full_discard@entry=false) at block/qcow2-cluster.c:1853 #2 0x0000564b90e8f720 in qcow2_co_pdiscard (bs=0x564b93bc8000, offset=3062890496, bytes=134217728) at block/qcow2.c:3749 #3 0x0000564b90ec565d in bdrv_co_pdiscard (bs=0x564b93bc8000, offset=3062890496, bytes=134217728) at block/io.c:2939 #4 0x0000564b90eb5c04 in blk_aio_pdiscard_entry (opaque=0x564b94f968c0) at block/block-backend.c:1527 #5 0x0000564b90f681ea in coroutine_trampoline (i0=, i1=) at util/coroutine-ucontext.c:116 SIGSEGV is on QTAILQ_REMOVE in qcow2_process_discards: (gdb) list 732 { 733 BDRVQcow2State *s = bs->opaque; 734 Qcow2DiscardRegion *d, *next; 735 736 QTAILQ_FOREACH_SAFE(d, &s->discards, next, next) { 737 QTAILQ_REMOVE(&s->discards, d, next); 738 739 /* Discard is optional, ignore the return value */ 740 if (ret >= 0) { 741 bdrv_pdiscard(bs->file->bs, d->offset, d->bytes); (you see bs->file->bs, yes it's old code based on 2.12, but still, I need some help on the following) and problem is that d is already deleted: (gdb) p d->next $50 = {tqe_next = 0x564b94b0b140, tqe_prev = 0x0} Such problems may occur when there is an interleaving of such removing loops with other usage of the queue. And this is possible, as we call bdrv_pdiscard inside the loop which may yield. go to frame #5, and print co->caller stack: #0 0x0000564b90f68180 in qemu_coroutine_switch () #1 0x0000564b90f66c84 in qemu_aio_coroutine_enter () #2 0x0000564b90f50764 in aio_co_enter () #3 0x0000564b90f50ea9 in thread_pool_completion_bh () #4 0x0000564b90f500d1 in aio_bh_poll () #5 0x0000564b90f5360b in aio_poll () #6 0x0000564b90ec59cd in bdrv_pdiscard () #7 0x0000564b90e96a36 in qcow2_process_discards () #8 0x0000564b90e97785 in update_refcount () #9 0x0000564b90e96bdd in qcow2_free_clusters () #10 0x0000564b90ea29c7 in update_ext_header_and_dir () #11 0x0000564b90ea3a14 in qcow2_remove_persistent_dirty_bitmap () #12 0x0000564b90ce7bce in qmp_block_dirty_bitmap_remove () #13 0x0000564b90cf5390 in qmp_marshal_block_dirty_bitmap_remove () #14 0x0000564b90f46080 in qmp_dispatch () #15 0x0000564b90bedc74 in monitor_qmp_dispatch_one () #16 0x0000564b90bee04a in monitor_qmp_bh_dispatcher () #17 0x0000564b90f500d1 in aio_bh_poll () #18 0x0000564b90f53430 in aio_dispatch () #19 0x0000564b90f4ffae in aio_ctx_dispatch () #20 0x00007f0a8e3e9049 in g_main_context_dispatch () from /lib64/libglib-2.0.so.0 #21 0x0000564b90f52727 in main_loop_wait () #22 0x0000564b90ba0c07 in main () And this (at least partly) confirms my guess. So, my actual question is, what should be fixed here: 1. yielding in qcow2_process_discards, like this: And in this case, I'm afraid that locking is missed in some other bitmap related qcow2 codes :( or 3. Just backport from upstream John's fix 0a6c86d024c52b, which acquires aio context around bdrv_remove_persistent_dirty_bitmap call? Is it enough, or we still need locking inside qcow2? --- a/block/qcow2-refcount.c +++ b/block/qcow2-refcount.c @@ -732,9 +732,13 @@ void qcow2_process_discards(BlockDriverState *bs, int ret) { BDRVQcow2State *s = bs->opaque; Qcow2DiscardRegion *d, *next; + QTAILQ_HEAD (, Qcow2DiscardRegion) discards; - QTAILQ_FOREACH_SAFE(d, &s->discards, next, next) { - QTAILQ_REMOVE(&s->discards, d, next); + discards = s->discards; + QTAILQ_INIT(&s->discards); + + QTAILQ_FOREACH_SAFE(d, &discards, next, next) { + QTAILQ_REMOVE(&discards, d, next); /* Discard is optional, ignore the return value */ if (ret >= 0) { or 2. calling qcow2_remove_persistent_dirty_bitmap without taking lock, like this: --- a/block/qcow2-bitmap.c +++ b/block/qcow2-bitmap.c @@ -1359,8 +1359,8 @@ void qcow2_remove_persistent_dirty_bitmap(BlockDriverState *bs, { int ret; BDRVQcow2State *s = bs->opaque; - Qcow2Bitmap *bm; - Qcow2BitmapList *bm_list; + Qcow2Bitmap *bm = NULL; + Qcow2BitmapList *bm_list = NULL; if (s->nb_bitmaps == 0) { /* Absence of the bitmap is not an error: see explanation above @@ -1368,15 +1368,17 @@ void qcow2_remove_persistent_dirty_bitmap(BlockDriverState *bs, return; } + qemu_co_mutex_lock(&s->lock); + bm_list = bitmap_list_load(bs, s->bitmap_directory_offset, s->bitmap_directory_size, errp); if (bm_list == NULL) { - return; + goto out; } bm = find_bitmap_by_name(bm_list, name); if (bm == NULL) { - goto fail; + goto out; } QSIMPLEQ_REMOVE(bm_list, bm, Qcow2Bitmap, entry); @@ -1384,12 +1386,14 @@ void qcow2_remove_persistent_dirty_bitmap(BlockDriverState *bs, ret = update_ext_header_and_dir(bs, bm_list); if (ret < 0) { error_setg_errno(errp, -ret, "Failed to update bitmap extension"); - goto fail; + goto out; } free_bitmap_clusters(bs, &bm->table); -fail: +out: + qemu_co_mutex_unlock(&s->lock); + bitmap_free(bm); bitmap_list_free(bm_list); }