[2/2] ubi: block: Fix deadlock on remove

Message ID 20230523-ubiblock-remove-v1-2-240bed75849b@axis.com
State New
Delegated to: Richard Weinberger
Series ubi: block: fix use-after-free and deadlock

Commit Message

Vincent Whitchurch May 23, 2023, 1:12 p.m. UTC
Lockdep warns about possible circular locking when the following
commands are run:

 ubiblock --create /dev/ubi0_0
 head -c1 /dev/ubiblock0_0 > /dev/null
 ubiblock --remove /dev/ubi0_0

 ======================================================
 WARNING: possible circular locking dependency detected

 ubiblock/364 is trying to acquire lock:
 (&disk->open_mutex){+.+.}-{3:3}, at: del_gendisk (block/genhd.c:616)

 but task is already holding lock:
 (&dev->dev_mutex){+.+.}-{3:3}, at: ubiblock_remove (drivers/mtd/ubi/block.c:476)

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #1 (&dev->dev_mutex){+.+.}-{3:3}:
   ubiblock_open (drivers/mtd/ubi/block.c:236)
   blkdev_get_whole (block/bdev.c:607)
   blkdev_get_by_dev (block/bdev.c:756)
   blkdev_open (block/fops.c:493)
   ...
   do_sys_openat2 (fs/open.c:1356)

 -> #0 (&disk->open_mutex){+.+.}-{3:3}:
   del_gendisk (block/genhd.c:616)
   ubiblock_remove (drivers/mtd/ubi/block.c:456 drivers/mtd/ubi/block.c:483)
   vol_cdev_ioctl
   ...

 other info that might help us debug this:

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&dev->dev_mutex);
                                lock(&disk->open_mutex);
                                lock(&dev->dev_mutex);
   lock(&disk->open_mutex);

 *** DEADLOCK ***

 Call Trace:
 del_gendisk (block/genhd.c:616)
 ubiblock_remove (drivers/mtd/ubi/block.c:456 drivers/mtd/ubi/block.c:483)
 vol_cdev_ioctl
 ...

The actual deadlock is also easily reproducible by running the above
commands in parallel in a loop.

Fix this by marking the device as going away and releasing dev_mutex
before calling del_gendisk(), so that dev_mutex is never held while
del_gendisk() takes disk->open_mutex.  This is similar to other drivers
such as drivers/block/zram/zram_drv.c.

Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
---
 drivers/mtd/ubi/block.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

Comments

Christoph Hellwig May 24, 2023, 6:04 a.m. UTC | #1
If you implement ->free_disk, the list_del and kfree can move into
that, and we don't really care if a new opener raced with the delete.
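
For reference, ->free_disk is the block_device_operations callback that the
block layer invokes once the last reference to the gendisk is dropped.  A
minimal sketch of that suggestion could look like the following (hypothetical
illustration only, not the posted follow-up; the ubiblock_release and
ubiblock_ops names and the rest of the teardown are assumed):

static void ubiblock_free_disk(struct gendisk *gd)
{
        struct ubiblock *dev = gd->private_data;

        /*
         * The block layer calls this only after the last gendisk reference
         * is gone, so no opener can still reach this ubiblock instance and
         * the final free cannot race with open.
         */
        kfree(dev);
}

static const struct block_device_operations ubiblock_ops = {
        .owner     = THIS_MODULE,
        .open      = ubiblock_open,
        .release   = ubiblock_release,
        .free_disk = ubiblock_free_disk,
};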
Vincent Whitchurch May 24, 2023, 1:36 p.m. UTC | #2
On Tue, 2023-05-23 at 23:04 -0700, Christoph Hellwig wrote:
> If you implement ->free_disk, the list_del and kfree can move into
> that, and we don't really care if a new opener raced with the delete.

Moving the kfree() to ->free_disk() works, but the list_del() still
needs to be in ubiblock_remove() since otherwise ubiblock_remove() could
attempt to remove the same device twice.

I assumed the current code really wanted to prevent new openers racing
with delete, but if that is not needed, yes, we don't need to add a
->removing flag if we move the kfree() to ->free_disk().  I'll re-spin
this based on your suggestions.  Thanks.
Christoph Hellwig May 25, 2023, 9:50 a.m. UTC | #3
On Wed, May 24, 2023 at 01:36:39PM +0000, Vincent Whitchurch wrote:
> On Tue, 2023-05-23 at 23:04 -0700, Christoph Hellwig wrote:
> > If you implement ->free_disk, the list_del and kfree can move into
> > that, and we don't really care if a new opener raced with the delete.
> 
> Moving the kfree() to ->free_disk() works, but the list_del() still
> needs to be in ubiblock_remove() since otherwise ubiblock_remove() could
> attempt to remove the same device twice.

Or we'd still need your ->removing flag.

> I assumed the current code really wanted to prevent new openers racing
> with delete, but if that is not needed, yes, we don't need to add a
> ->removing flag if we move the kfree() to ->free_disk().  I'll re-spin
> this based on your suggestions.  Thanks.

I think in the past we always had to protect against removals of live
devices because handling of hot removes sucked so bad, both in drivers
and in the block layer itself.  With some newer infrastructure including
the ->free_disk method this can now be handled sanely.
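
With the final free moved into ->free_disk(), the tail of ubiblock_remove()
could then be reduced to something like the sketch below (assumed shape, not
the respun patch; field names such as dev->gd and the -EBUSY handling are
taken on trust): list_del() stays under devices_mutex so a second remove
cannot find the device again, and dev_mutex is dropped before del_gendisk(),
which removes the inverted ordering against disk->open_mutex.

        mutex_lock(&dev->dev_mutex);
        if (dev->refcnt > 0) {
                /* Volume is still open; keep the existing -EBUSY behaviour. */
                mutex_unlock(&dev->dev_mutex);
                mutex_unlock(&devices_mutex);
                return -EBUSY;
        }
        list_del(&dev->list);   /* still serialized by devices_mutex */
        mutex_unlock(&dev->dev_mutex);
        mutex_unlock(&devices_mutex);

        del_gendisk(dev->gd);   /* no dev_mutex held here */
        put_disk(dev->gd);      /* last reference ends up in ubiblock_free_disk() */
        return 0;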

Patch

diff --git a/drivers/mtd/ubi/block.c b/drivers/mtd/ubi/block.c
index 70caec4606cd..fcfea7cfdb6b 100644
--- a/drivers/mtd/ubi/block.c
+++ b/drivers/mtd/ubi/block.c
@@ -83,6 +83,8 @@  struct ubiblock {
 	struct mutex dev_mutex;
 	struct list_head list;
 	struct blk_mq_tag_set tag_set;
+
+	bool removing;
 };
 
 /* Linked list of all ubiblock instances */
@@ -233,6 +235,11 @@  static int ubiblock_open(struct block_device *bdev, fmode_t mode)
 	int ret;
 
 	mutex_lock(&dev->dev_mutex);
+	if (dev->removing) {
+		ret = -ENODEV;
+		goto out_unlock;
+	}
+
 	if (dev->refcnt > 0) {
 		/*
 		 * The volume is already open, just increase the reference
@@ -480,8 +487,15 @@  int ubiblock_remove(struct ubi_volume_info *vi)
 
 	/* Remove from device list */
 	list_del(&dev->list);
-	ubiblock_cleanup(dev);
+
+	/*
+	 * Prevent further opens.  del_gendisk() will ensure that there are no
+	 * parallel openers.
+	 */
+	dev->removing = true;
 	mutex_unlock(&dev->dev_mutex);
+
+	ubiblock_cleanup(dev);
 	mutex_unlock(&devices_mutex);
 
 	kfree(dev);