Message ID | 20170426174609.29433-2-ben.dooks@codethink.co.uk |
---|---|
State | Superseded |
Headers | show |
On 26/04/17 19:18, Ben Hutchings wrote: > On Wed, 2017-04-26 at 18:46 +0100, Ben Dooks wrote: >> The mtd_erase() call can hit code that will trigger warnings >> from __might_sleep(), such as the do_erase_oneblock() function >> on the cfi_cmdset_0002.c file. >> >> This is due to some of the erase functions doing the work in the >> thread they are called in, which means that the erase_write() >> should only go into TASK_INTERRUPTIBLE once the mtd_erase call >> has returned. > [...] >> diff --git a/drivers/mtd/mtdblock.c b/drivers/mtd/mtdblock.c >> index bb4c14f83c75..4b1cd464f919 100644 >> --- a/drivers/mtd/mtdblock.c >> +++ b/drivers/mtd/mtdblock.c >> @@ -68,6 +68,7 @@ static int erase_write (struct mtd_info *mtd, unsigned long pos, >> DECLARE_WAITQUEUE(wait, current); >> wait_queue_head_t wait_q; >> size_t retlen; >> + long timeout = 1; >> int ret; >> >> /* >> @@ -81,12 +82,10 @@ static int erase_write (struct mtd_info *mtd, unsigned long pos, >> erase.len = len; >> erase.priv = (u_long)&wait_q; >> >> - set_current_state(TASK_INTERRUPTIBLE); >> add_wait_queue(&wait_q, &wait); >> >> ret = mtd_erase(mtd, &erase); >> if (ret) { >> - set_current_state(TASK_RUNNING); >> remove_wait_queue(&wait_q, &wait); >> printk (KERN_WARNING "mtdblock: erase of region [0x%lx, 0x%x] " >> "on \"%s\" failed\n", >> @@ -94,8 +93,18 @@ static int erase_write (struct mtd_info *mtd, unsigned long pos, >> return ret; >> } >> >> - schedule(); /* Wait for erase to finish. */ >> + if (erase->state != MTD_ERASE_DONE && >> + erase->state != MTD_ERASE_FAILED) >> + timeout = wait_woken(&wait, TASK_INTERRUPTIBLE, >> + MAX_SCHEDULE_TIMEOUT); > > If mtd_erase() returns 0 then the wait queue either has been woken or > will be woken. Since we're already on the wait queue, it's safe to wait > unconditionally. > > I think that making the wait conditional results in a race condition > that could result in returning too early. > > Also there seems to be another existing problem here: if this is > interrupted and we return early then the driver can use-after-free the > wait queue and erase structure. mtdchar waits uninterruptibly for > exactly this reason. > > We really ought to have an always-synchronous wrapper for mtd_erase(), > because this seems to be hard to get right... > > Ben. Ok, so add something like mtd_erase_sync() which uses mtd_erase and then waits for the completion and then use it for both mtdblock and mtdchar?
On Thu, 2017-04-27 at 09:27 +0100, Ben Dooks wrote: > On 26/04/17 19:18, Ben Hutchings wrote: > > On Wed, 2017-04-26 at 18:46 +0100, Ben Dooks wrote: > >> The mtd_erase() call can hit code that will trigger warnings > >> from __might_sleep(), such as the do_erase_oneblock() function > >> on the cfi_cmdset_0002.c file. > >> > >> This is due to some of the erase functions doing the work in the > >> thread they are called in, which means that the erase_write() > >> should only go into TASK_INTERRUPTIBLE once the mtd_erase call > >> has returned. > > [...] > >> diff --git a/drivers/mtd/mtdblock.c b/drivers/mtd/mtdblock.c > >> index bb4c14f83c75..4b1cd464f919 100644 > >> --- a/drivers/mtd/mtdblock.c > >> +++ b/drivers/mtd/mtdblock.c > >> @@ -68,6 +68,7 @@ static int erase_write (struct mtd_info *mtd, unsigned long pos, > >> DECLARE_WAITQUEUE(wait, current); > >> wait_queue_head_t wait_q; > >> size_t retlen; > >> + long timeout = 1; > >> int ret; > >> > >> /* > >> @@ -81,12 +82,10 @@ static int erase_write (struct mtd_info *mtd, unsigned long pos, > >> erase.len = len; > >> erase.priv = (u_long)&wait_q; > >> > >> - set_current_state(TASK_INTERRUPTIBLE); > >> add_wait_queue(&wait_q, &wait); > >> > >> ret = mtd_erase(mtd, &erase); > >> if (ret) { > >> - set_current_state(TASK_RUNNING); > >> remove_wait_queue(&wait_q, &wait); > >> printk (KERN_WARNING "mtdblock: erase of region [0x%lx, 0x%x] " > >> "on \"%s\" failed\n", > >> @@ -94,8 +93,18 @@ static int erase_write (struct mtd_info *mtd, unsigned long pos, > >> return ret; > >> } > >> > >> - schedule(); /* Wait for erase to finish. */ > >> + if (erase->state != MTD_ERASE_DONE && > >> + erase->state != MTD_ERASE_FAILED) > >> + timeout = wait_woken(&wait, TASK_INTERRUPTIBLE, > >> + MAX_SCHEDULE_TIMEOUT); > > > > If mtd_erase() returns 0 then the wait queue either has been woken or > > will be woken. Since we're already on the wait queue, it's safe to wait > > unconditionally. > > > > I think that making the wait conditional results in a race condition > > that could result in returning too early. > > > > Also there seems to be another existing problem here: if this is > > interrupted and we return early then the driver can use-after-free the > > wait queue and erase structure. mtdchar waits uninterruptibly for > > exactly this reason. > > > > We really ought to have an always-synchronous wrapper for mtd_erase(), > > because this seems to be hard to get right... > > > > Ben. > > Ok, so add something like mtd_erase_sync() which uses mtd_erase and then > waits for the completion and then use it for both mtdblock and mtdchar? ...and any other callers that want synchronous behaviour, especially if they're doing it wrong now. Ben.
diff --git a/drivers/mtd/mtdblock.c b/drivers/mtd/mtdblock.c index bb4c14f83c75..4b1cd464f919 100644 --- a/drivers/mtd/mtdblock.c +++ b/drivers/mtd/mtdblock.c @@ -68,6 +68,7 @@ static int erase_write (struct mtd_info *mtd, unsigned long pos, DECLARE_WAITQUEUE(wait, current); wait_queue_head_t wait_q; size_t retlen; + long timeout = 1; int ret; /* @@ -81,12 +82,10 @@ static int erase_write (struct mtd_info *mtd, unsigned long pos, erase.len = len; erase.priv = (u_long)&wait_q; - set_current_state(TASK_INTERRUPTIBLE); add_wait_queue(&wait_q, &wait); ret = mtd_erase(mtd, &erase); if (ret) { - set_current_state(TASK_RUNNING); remove_wait_queue(&wait_q, &wait); printk (KERN_WARNING "mtdblock: erase of region [0x%lx, 0x%x] " "on \"%s\" failed\n", @@ -94,8 +93,18 @@ static int erase_write (struct mtd_info *mtd, unsigned long pos, return ret; } - schedule(); /* Wait for erase to finish. */ + if (erase->state != MTD_ERASE_DONE && + erase->state != MTD_ERASE_FAILED) + timeout = wait_woken(&wait, TASK_INTERRUPTIBLE, + MAX_SCHEDULE_TIMEOUT); + remove_wait_queue(&wait_q, &wait); + if (timeout == 0) { + printk (KERN_WARNING "mtdblock: erase of region [0x%lx, 0x%x] " + "on \"%s\" failed\n", + pos, len, mtd->name); + return -ETIMEDOUT; + } /* * Next, write the data to flash.
The mtd_erase() call can hit code that will trigger warnings from __might_sleep(), such as the do_erase_oneblock() function on the cfi_cmdset_0002.c file. This is due to some of the erase functions doing the work in the thread they are called in, which means that the erase_write() should only go into TASK_INTERRUPTIBLE once the mtd_erase call has returned. Note, this still does not check if the erase->state is not set to an error. This should stop the following warning being produced: ] WARNING: CPU: 0 PID: 23 at kernel/sched/core.c:7588 __might_sleep+0x84/0x9c() ] do not call blocking ops when !TASK_RUNNING; state=1 set at [<8029c724>] erase_write+0x0/0x178 ] Modules linked in: ] CPU: 0 PID: 23 Comm: kworker/0:1 Tainted: G W 4.4.62-dev-ic-9Feb2017-47164-g95b1e7c09ff3 #855 ] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree) ] Workqueue: mtdblock9 mtd_blktrans_work ] [<8001eddc>] (unwind_backtrace) from [<8001b2fc>] (show_stack+0x10/0x14) ] [<8001b2fc>] (show_stack) from [<80213948>] (dump_stack+0x80/0x9c) ] [<80213948>] (dump_stack) from [<800287d8>] (warn_slowpath_common+0x80/0xac) ] [<800287d8>] (warn_slowpath_common) from [<80028830>] (warn_slowpath_fmt+0x2c/0x3c) ] [<80028830>] (warn_slowpath_fmt) from [<800448ec>] (__might_sleep+0x84/0x9c) ] [<800448ec>] (__might_sleep) from [<8047a714>] (mutex_lock+0x18/0x3c) ] [<8047a714>] (mutex_lock) from [<802a22dc>] (do_erase_oneblock+0x64/0x3b8) ] [<802a22dc>] (do_erase_oneblock) from [<8029d628>] (cfi_varsize_frob+0x144/0x1dc) ] [<8029d628>] (cfi_varsize_frob) from [<8029e930>] (cfi_amdstd_erase_varsize+0x28/0x50) ] [<8029e930>] (cfi_amdstd_erase_varsize) from [<80299370>] (part_erase+0x2c/0x74) ] [<80299370>] (part_erase) from [<8029c7e0>] (erase_write+0xbc/0x178) ] [<8029c7e0>] (erase_write) from [<8029c8c4>] (write_cached_data+0x28/0x3c) ] [<8029c8c4>] (write_cached_data) from [<8029cc70>] (mtdblock_writesect+0x178/0x1bc) ] [<8029cc70>] (mtdblock_writesect) from [<8029bf68>] (mtd_blktrans_work+0x2ec/0x360) ] [<8029bf68>] (mtd_blktrans_work) from [<8003b314>] (process_one_work+0x19c/0x310) ] [<8003b314>] (process_one_work) from [<8003c174>] (worker_thread+0x2e8/0x3d4) ] [<8003c174>] (worker_thread) from [<800403d0>] (kthread+0xe4/0xf8) ] [<800403d0>] (kthread) from [<80017478>] (ret_from_fork+0x14/0x3c) ] ---[ end trace ca8c94fad04126d2 ]--- [ben.hutchings@codethink.co.uk: suggest use of wait_woken()] Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> --- drivers/mtd/mtdblock.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-)