From patchwork Mon Mar 23 22:40:13 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 453659 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from bombadil.infradead.org (bombadil.infradead.org [IPv6:2001:1868:205::9]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id B17CE140119 for ; Tue, 24 Mar 2015 09:42:11 +1100 (AEDT) Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.80.1 #2 (Red Hat Linux)) id 1YaB1S-0005X3-LJ; Mon, 23 Mar 2015 22:40:38 +0000 Received: from mx1.redhat.com ([209.132.183.28]) by bombadil.infradead.org with esmtps (Exim 4.80.1 #2 (Red Hat Linux)) id 1YaB1Q-0005UA-22 for linux-mtd@lists.infradead.org; Mon, 23 Mar 2015 22:40:37 +0000 Received: from int-mx11.intmail.prod.int.phx2.redhat.com (int-mx11.intmail.prod.int.phx2.redhat.com [10.5.11.24]) by mx1.redhat.com (Postfix) with ESMTPS id 6291CC1F04; Mon, 23 Mar 2015 22:40:14 +0000 (UTC) Received: from localhost (dhcp-25-149.bos.redhat.com [10.18.25.149]) by int-mx11.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id t2NMeDOv017698 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO); Mon, 23 Mar 2015 18:40:13 -0400 Date: Mon, 23 Mar 2015 18:40:13 -0400 From: Mike Snitzer To: Christoph Hellwig Subject: Re: [PATCH 11/12] fs: don't reassign dirty inodes to default_backing_dev_info Message-ID: <20150323224012.GA29505@redhat.com> References: <1421228561-16857-1-git-send-email-hch@lst.de> <1421228561-16857-12-git-send-email-hch@lst.de> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-Scanned-By: MIMEDefang 2.68 on 10.5.11.24 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20150323_154036_161794_E9D96EB2 X-CRM114-Status: GOOD ( 26.84 ) X-Spam-Score: -5.0 (-----) X-Spam-Report: SpamAssassin version 3.4.0 on bombadil.infradead.org summary: Content analysis details: (-5.0 points) pts rule name description ---- ---------------------- -------------------------------------------------- -5.0 RCVD_IN_DNSWL_HI RBL: Sender listed at http://www.dnswl.org/, high trust [209.132.183.28 listed in list.dnswl.org] -0.0 SPF_PASS SPF: sender matches SPF record -0.0 T_RP_MATCHES_RCVD Envelope sender domain matches handover relay domain -0.0 SPF_HELO_PASS SPF: HELO matches SPF record -0.0 RCVD_IN_MSPIKE_H3 RBL: Good reputation (+3) [209.132.183.28 listed in wl.mailspike.net] -0.0 RCVD_IN_MSPIKE_WL Mailspike good senders Cc: Jens Axboe , linux-nfs@vger.kernel.org, Jeff Moyer , David Howells , linux-mm@kvack.org, device-mapper development , linux-mtd@lists.infradead.org, linux-fsdevel , Tejun Heo , ceph-devel@vger.kernel.org X-BeenThere: linux-mtd@lists.infradead.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Linux MTD discussion mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-mtd" Errors-To: linux-mtd-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org On Sat, Mar 21 2015 at 11:11am -0400, Mike Snitzer wrote: > On Wed, Jan 14, 2015 at 4:42 AM, Christoph Hellwig wrote: > > If we have dirty inodes we need to call the filesystem for it, even if the > > device has been removed and the filesystem will error out early. The > > current code does that by reassining all dirty inodes to the default > > backing_dev_info when a bdi is unlinked, but that's pretty pointless given > > that the bdi must always outlive the super block. > > > > Instead of stopping writeback at unregister time and moving inodes to the > > default bdi just keep the current bdi alive until it is destroyed. The > > containing objects of the bdi ensure this doesn't happen until all > > writeback has finished by erroring out. > > > > Signed-off-by: Christoph Hellwig > > Reviewed-by: Tejun Heo > > --- > > mm/backing-dev.c | 91 +++++++++++++++----------------------------------------- > > 1 file changed, 24 insertions(+), 67 deletions(-) > > Hey Christoph, > > Just a heads up: your commit c4db59d31e39ea067c32163ac961e9c80198fd37 > is suspected as the first bad commit in a bisect performed to track > down the cause of DM crashes reported in this BZ: > https://bugzilla.redhat.com/show_bug.cgi?id=1202449 > > I've yet to look closely at _why_ this commit but figured I'd share > since this appears to be a 4.0-rcX regression. FYI, here is the DM fix I've staged for 4.0-rc6. I'll continue testing the various DM targets before requesting Linus to pull. From 63a4f065ece613b6d575b538234375b0e9c23bbc Mon Sep 17 00:00:00 2001 From: Mike Snitzer Date: Mon, 23 Mar 2015 17:01:43 -0400 Subject: [PATCH] dm: fix add_disk() NULL pointer due to race with free_dev() Commit c4db59d31e39 ("fs: don't reassign dirty inodes to default_backing_dev_info") exposed DM to a latent race in free_dev() vs add_disk() in relation to management of the device's minor number. Fix this by refactoring free_dev() to match cleanup order of the alloc_dev() error path. Move cleanup of the gendisk, queue, and bdev to _before_ the cleanup of the idr managed minor number. Also, purely due to cleanup that fell out during the free_dev() audit: - adjust dm_blk_close() to access the gendisk's private_data under the _minor_lock spinlock. - move __dm_destroy()'s dm_get_live_table() call out from under the _minor_lock spinlock. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1202449 Reported-by: Zdenek Kabelac Reported-by: Jeff Moyer Signed-off-by: Mike Snitzer --- drivers/md/dm.c | 26 ++++++++++++++++---------- 1 files changed, 16 insertions(+), 10 deletions(-) diff --git a/drivers/md/dm.c b/drivers/md/dm.c index 9b641b3..8001fe9 100644 --- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -433,7 +433,6 @@ static int dm_blk_open(struct block_device *bdev, fmode_t mode) dm_get(md); atomic_inc(&md->open_count); - out: spin_unlock(&_minor_lock); @@ -442,16 +441,20 @@ out: static void dm_blk_close(struct gendisk *disk, fmode_t mode) { - struct mapped_device *md = disk->private_data; + struct mapped_device *md; spin_lock(&_minor_lock); + md = disk->private_data; + if (WARN_ON(!md)) + goto out; + if (atomic_dec_and_test(&md->open_count) && (test_bit(DMF_DEFERRED_REMOVE, &md->flags))) queue_work(deferred_remove_workqueue, &deferred_remove_work); dm_put(md); - +out: spin_unlock(&_minor_lock); } @@ -2241,7 +2244,6 @@ static void free_dev(struct mapped_device *md) int minor = MINOR(disk_devt(md->disk)); unlock_fs(md); - bdput(md->bdev); destroy_workqueue(md->wq); if (md->kworker_task) @@ -2252,19 +2254,22 @@ static void free_dev(struct mapped_device *md) mempool_destroy(md->rq_pool); if (md->bs) bioset_free(md->bs); - blk_integrity_unregister(md->disk); - del_gendisk(md->disk); + cleanup_srcu_struct(&md->io_barrier); free_table_devices(&md->table_devices); - free_minor(minor); + dm_stats_cleanup(&md->stats); spin_lock(&_minor_lock); md->disk->private_data = NULL; spin_unlock(&_minor_lock); - + if (blk_get_integrity(md->disk)) + blk_integrity_unregister(md->disk); + del_gendisk(md->disk); put_disk(md->disk); blk_cleanup_queue(md->queue); - dm_stats_cleanup(&md->stats); + bdput(md->bdev); + free_minor(minor); + module_put(THIS_MODULE); kfree(md); } @@ -2642,8 +2647,9 @@ static void __dm_destroy(struct mapped_device *md, bool wait) might_sleep(); - spin_lock(&_minor_lock); map = dm_get_live_table(md, &srcu_idx); + + spin_lock(&_minor_lock); idr_replace(&_minor_idr, MINOR_ALLOCED, MINOR(disk_devt(dm_disk(md)))); set_bit(DMF_FREEING, &md->flags); spin_unlock(&_minor_lock);