Message ID | 20211008143825.3717116-1-sean@geanix.com |
---|---|
State | Changes Requested |
Headers | show |
Series | [RFC,1/2] mtd: core: protect access to mtd devices while in suspend | expand |
Hi Sean, Can you please submit that as a separate thread, ideally with an incremented version number, a changelog and a reference to all your previous attempts. On Fri, 8 Oct 2021 16:38:24 +0200 Sean Nyekjaer <sean@geanix.com> wrote: > This will prevent reading/writing/erasing to a suspended mtd device. > It will force mtd_write()/mtd_read()/mtd_erase() to wait for > mtd_resume() to unlock access to mtd devices. I think this has to be done for all the hooks except ->_reboot(), ->_get_device() and ->_put_device(). > > Exec_op[0] speed things up, so we see this race when rawnand devices going Mention the commit directly: Commit ef347c0cfd61 ("mtd: rawnand: gpmi: Implement exec_op") speed things up, so we see this race when rawnand devices going ... > into suspend. But it's actually "mtd: rawnand: Simplify the locking" that But it's actually commit 013e6292aaf5 ("mtd: rawnand: Simplify the locking") that ... > allows it to return errors rather than locking, before that commit it would > have waited for the rawnand device to resume. > > Tested on a iMX6ULL. > > [0]: > ef347c0cfd61 ("mtd: rawnand: gpmi: Implement exec_op") > > Fixes: 013e6292aaf5 ("mtd: rawnand: Simplify the locking") > Signed-off-by: Sean Nyekjaer <sean@geanix.com> You flagged yourself as the author even though you didn't really write that code. I guess I'm fine with that, but I'd appreciate a Suggested-by: Boris Brezillon <boris.brezillon@collabora.com> here, at least. > --- > > Hope I got it all :) > > drivers/mtd/mtdcore.c | 57 ++++++++++++++++++++++++++++++++++++++++- > include/linux/mtd/mtd.h | 36 ++++++++++++++++++-------- > 2 files changed, 81 insertions(+), 12 deletions(-) > > diff --git a/drivers/mtd/mtdcore.c b/drivers/mtd/mtdcore.c > index c8fd7f758938..3c93202e6cbb 100644 > --- a/drivers/mtd/mtdcore.c > +++ b/drivers/mtd/mtdcore.c > @@ -36,6 +36,44 @@ > > struct backing_dev_info *mtd_bdi; > > +static void mtd_start_access(struct mtd_info *mtd) > +{ > + struct mtd_info *master = mtd_get_master(mtd); > + > + /* > + * Don't take the suspend_lock on devices that don't > + * implement the suspend hook. Otherwise, lockdep will > + * complain about nested locks when trying to suspend MTD > + * partitions or MTD devices created by gluebi which are > + * backed by real devices. > + */ > + if (!master->_suspend) > + return; > + You need to remove the ->_suspend()/->_resume() implementation in mtd_concat.c, otherwise you'll hit the case described in the comment. BTW, did you test this series with lockdep enabled to make sure we don't introduce a deadlock? > + /* > + * Wait until the device is resumed. Should we have a > + * non-blocking mode here? > + */ > + while (1) { > + down_read(&master->master.suspend_lock); > + if (!master->master.suspended) > + return; > + > + up_read(&master->master.suspend_lock); > + wait_event(master->master.resume_wq, master->master.suspended == 0); > + } > +} > + > +static void mtd_end_access(struct mtd_info *mtd) > +{ > + struct mtd_info *master = mtd_get_master(mtd); > + > + if (!master->_suspend) > + return; > + > + up_read(&master->master.suspend_lock); > +} > +
Hi Sean, sean@geanix.com wrote on Fri, 8 Oct 2021 16:38:24 +0200: > This will prevent reading/writing/erasing to a suspended mtd device. > It will force mtd_write()/mtd_read()/mtd_erase() to wait for > mtd_resume() to unlock access to mtd devices. Maybe you can use the present tense, as in: Prevent reading/... s/mtd/MTD/ Force mtd_write/... I would suggest something like: Prevent accessing the devices while in a suspended state. Also prevent suspending a device which is still currently in use. > Exec_op[0] speed things up, so we see this race when rawnand devices going I believe you can quote the commit inline, like below (please use the 12-digit hash below as well). I am not sure ->exec_op() is to be blamed here, maybe this change revealed the issue but I doubt it is because of its efficiency. The problem was just laying silently IMHO. > into suspend. But it's actually "mtd: rawnand: Simplify the locking" that > allows it to return errors rather than locking, before that commit it would > have waited for the rawnand device to resume. I don't think so, I believe it was broken in the same way but was just not returning errors. > > Tested on a iMX6ULL. > > [0]: > ef347c0cfd61 ("mtd: rawnand: gpmi: Implement exec_op") > Suggested-by: Boris... would be nice. > Fixes: 013e6292aaf5 ("mtd: rawnand: Simplify the locking") > Signed-off-by: Sean Nyekjaer <sean@geanix.com> > --- > > Hope I got it all :) > > drivers/mtd/mtdcore.c | 57 ++++++++++++++++++++++++++++++++++++++++- > include/linux/mtd/mtd.h | 36 ++++++++++++++++++-------- > 2 files changed, 81 insertions(+), 12 deletions(-) > > diff --git a/drivers/mtd/mtdcore.c b/drivers/mtd/mtdcore.c > index c8fd7f758938..3c93202e6cbb 100644 > --- a/drivers/mtd/mtdcore.c > +++ b/drivers/mtd/mtdcore.c > @@ -36,6 +36,44 @@ > > struct backing_dev_info *mtd_bdi; > > +static void mtd_start_access(struct mtd_info *mtd) > +{ > + struct mtd_info *master = mtd_get_master(mtd); > + > + /* > + * Don't take the suspend_lock on devices that don't > + * implement the suspend hook. Otherwise, lockdep will > + * complain about nested locks when trying to suspend MTD > + * partitions or MTD devices created by gluebi which are > + * backed by real devices. > + */ > + if (!master->_suspend) > + return; > + > + /* > + * Wait until the device is resumed. Should we have a > + * non-blocking mode here? > + */ > + while (1) { > + down_read(&master->master.suspend_lock); > + if (!master->master.suspended) > + return; > + > + up_read(&master->master.suspend_lock); > + wait_event(master->master.resume_wq, master->master.suspended == 0); "var == 0" translates well to "!var" > + } > +} > + > +static void mtd_end_access(struct mtd_info *mtd) > +{ > + struct mtd_info *master = mtd_get_master(mtd); > + > + if (!master->_suspend) > + return; > + > + up_read(&master->master.suspend_lock); > +} > + > #ifdef CONFIG_PM_SLEEP > > static int mtd_cls_suspend(struct device *dev) > @@ -1000,6 +1038,9 @@ int mtd_device_parse_register(struct mtd_info *mtd, const char * const *types, > > ret = mtd_otp_nvmem_add(mtd); > > + init_waitqueue_head(&mtd->master.resume_wq); > + init_rwsem(&mtd->master.suspend_lock); what about setting this in mtd_set_dev_defaults()? > + > out: > if (ret && device_is_registered(&mtd->dev)) > del_mtd_device(mtd); > @@ -1241,6 +1282,8 @@ int mtd_erase(struct mtd_info *mtd, struct erase_info *instr) > struct erase_info adjinstr; > int ret; > > + mtd_start_access(mtd); I believe we should cover all the ioctls, even if they are not accessing the device. I don't think it's a problem to stop interacting when the devices are suspended? > + > instr->fail_addr = MTD_FAIL_ADDR_UNKNOWN; > adjinstr = *instr; > > @@ -1278,6 +1321,8 @@ int mtd_erase(struct mtd_info *mtd, struct erase_info *instr) > } > } > > + mtd_end_access(mtd); > + > return ret; > } > EXPORT_SYMBOL_GPL(mtd_erase); > @@ -1558,6 +1603,8 @@ int mtd_read_oob(struct mtd_info *mtd, loff_t from, struct mtd_oob_ops *ops) > struct mtd_ecc_stats old_stats = master->ecc_stats; > int ret_code; > > + mtd_start_access(mtd); > + > ops->retlen = ops->oobretlen = 0; > > ret_code = mtd_check_oob_ops(mtd, from, ops); > @@ -1577,6 +1624,8 @@ int mtd_read_oob(struct mtd_info *mtd, loff_t from, struct mtd_oob_ops *ops) > > mtd_update_ecc_stats(mtd, master, &old_stats); > > + mtd_end_access(mtd); > + > /* > * In cases where ops->datbuf != NULL, mtd->_read_oob() has semantics > * similar to mtd->_read(), returning a non-negative integer > @@ -1597,6 +1646,8 @@ int mtd_write_oob(struct mtd_info *mtd, loff_t to, > struct mtd_info *master = mtd_get_master(mtd); > int ret; > > + mtd_start_access(mtd); > + > ops->retlen = ops->oobretlen = 0; > > if (!(mtd->flags & MTD_WRITEABLE)) > @@ -1615,7 +1666,11 @@ int mtd_write_oob(struct mtd_info *mtd, loff_t to, > if (mtd->flags & MTD_SLC_ON_MLC_EMULATION) > return mtd_io_emulated_slc(mtd, to, false, ops); > > - return mtd_write_oob_std(mtd, to, ops); > + ret = mtd_write_oob_std(mtd, to, ops); > + > + mtd_end_access(mtd); > + > + return ret; > } > EXPORT_SYMBOL_GPL(mtd_write_oob); > > diff --git a/include/linux/mtd/mtd.h b/include/linux/mtd/mtd.h > index 88227044fc86..cfab07b02dc9 100644 > --- a/include/linux/mtd/mtd.h > +++ b/include/linux/mtd/mtd.h > @@ -231,6 +231,8 @@ struct mtd_master { > struct mutex partitions_lock; > struct mutex chrdev_lock; > unsigned int suspended : 1; > + wait_queue_head_t resume_wq; > + struct rw_semaphore suspend_lock; > }; > > struct mtd_info { > @@ -546,30 +548,42 @@ int mtd_block_markbad(struct mtd_info *mtd, loff_t ofs); > static inline int mtd_suspend(struct mtd_info *mtd) > { > struct mtd_info *master = mtd_get_master(mtd); > - int ret; > + int ret = 0; > > - if (master->master.suspended) > - return 0; > > - ret = master->_suspend ? master->_suspend(master) : 0; > - if (ret) > + if (!master->_suspend) > return ret; > > - master->master.suspended = 1; > - return 0; > + down_write(&master->master.suspend_lock); > + if (!master->master.suspended) { > + ret = master->_suspend(master); > + if (!ret) > + master->master.suspended = 1; > + } > + up_write(&master->master.suspend_lock); > + > + return ret; > } > > static inline void mtd_resume(struct mtd_info *mtd) > { > struct mtd_info *master = mtd_get_master(mtd); > > - if (!master->master.suspended) > + if (!master->_suspend) > return; > > - if (master->_resume) > - master->_resume(master); > > - master->master.suspended = 0; > + down_write(&master->master.suspend_lock); > + if (master->master.suspended) { > + if (master->_resume) > + master->_resume(master); > + > + master->master.suspended = 0; > + > + /* The MTD dev has been resumed, wake up all waiters. */ > + wake_up_all(&master->master.resume_wq); > + } > + up_write(&master->master.suspend_lock); > } > > static inline uint32_t mtd_div_by_eb(uint64_t sz, struct mtd_info *mtd) The other patch lgtm. Thanks, Miquèl
On Fri, 8 Oct 2021 17:35:26 +0200 Miquel Raynal <miquel.raynal@bootlin.com> wrote: > > into suspend. But it's actually "mtd: rawnand: Simplify the locking" that > > allows it to return errors rather than locking, before that commit it would > > have waited for the rawnand device to resume. > > I don't think so, I believe it was broken in the same way but was just > not returning errors. Actually I was wrong, 013e6292aaf5 ("mtd: rawnand: Simplify the locking") removed the blocking wait (returning -EBUSY when the device is suspended instead of putting the thread on a waitqueue). At that time, I assumed all threads would be paused when the device is suspended, which appeared to be incorrect. So I guess the Fixes tag should remain, and we might want to consider backporting a less invasive patch to stable releases (one touching only the raw NAND layer).
On Fri, Oct 08, 2021 at 05:30:43PM +0200, Boris Brezillon wrote: > Hi Sean, > > Can you please submit that as a separate thread, ideally with an > incremented version number, a changelog and a reference to all your > previous attempts. Yes, I'll do that with the next one. > > On Fri, 8 Oct 2021 16:38:24 +0200 > Sean Nyekjaer <sean@geanix.com> wrote: > > > This will prevent reading/writing/erasing to a suspended mtd device. > > It will force mtd_write()/mtd_read()/mtd_erase() to wait for > > mtd_resume() to unlock access to mtd devices. > > I think this has to be done for all the hooks except ->_reboot(), > ->_get_device() and ->_put_device(). > > > > > Exec_op[0] speed things up, so we see this race when rawnand devices going > > Mention the commit directly: > > Commit ef347c0cfd61 ("mtd: rawnand: gpmi: Implement exec_op") speed > things up, so we see this race when rawnand devices going ... > > > into suspend. But it's actually "mtd: rawnand: Simplify the locking" that > > But it's actually commit 013e6292aaf5 ("mtd: rawnand: Simplify the > locking") that ... > > > allows it to return errors rather than locking, before that commit it would > > have waited for the rawnand device to resume. > > > > Tested on a iMX6ULL. > > > > [0]: > > ef347c0cfd61 ("mtd: rawnand: gpmi: Implement exec_op") > > > > Fixes: 013e6292aaf5 ("mtd: rawnand: Simplify the locking") > > Signed-off-by: Sean Nyekjaer <sean@geanix.com> > > You flagged yourself as the author even though you didn't really write > that code. I guess I'm fine with that, but I'd appreciate a > > Suggested-by: Boris Brezillon <boris.brezillon@collabora.com> > > here, at least. > Of course, of course I forgot it... Still an RFC after all :) > > --- > > > > Hope I got it all :) > > > > drivers/mtd/mtdcore.c | 57 ++++++++++++++++++++++++++++++++++++++++- > > include/linux/mtd/mtd.h | 36 ++++++++++++++++++-------- > > 2 files changed, 81 insertions(+), 12 deletions(-) > > > > diff --git a/drivers/mtd/mtdcore.c b/drivers/mtd/mtdcore.c > > index c8fd7f758938..3c93202e6cbb 100644 > > --- a/drivers/mtd/mtdcore.c > > +++ b/drivers/mtd/mtdcore.c > > @@ -36,6 +36,44 @@ > > > > struct backing_dev_info *mtd_bdi; > > > > +static void mtd_start_access(struct mtd_info *mtd) > > +{ > > + struct mtd_info *master = mtd_get_master(mtd); > > + > > + /* > > + * Don't take the suspend_lock on devices that don't > > + * implement the suspend hook. Otherwise, lockdep will > > + * complain about nested locks when trying to suspend MTD > > + * partitions or MTD devices created by gluebi which are > > + * backed by real devices. > > + */ > > + if (!master->_suspend) > > + return; > > + > > You need to remove the ->_suspend()/->_resume() implementation in > mtd_concat.c, otherwise you'll hit the case described in the comment. Do you mean to remove concat_suspend() and concat_resume() together with the references to them? > > BTW, did you test this series with lockdep enabled to make sure we > don't introduce a deadlock? > Good you mentioned it... I thought the kernel had LOCKDEP enabled, but I guess it at some point got removed. It reveals that mtd_read_oob() -> mtd_start_access() is using the suspend_lock rw_semaphore before it's initialized... But it's not complaining when going suspend and resuming, will continue to test with LOCKDEP enabled. /Sean > > + /* > > + * Wait until the device is resumed. Should we have a > > + * non-blocking mode here? > > + */ > > + while (1) { > > + down_read(&master->master.suspend_lock); > > + if (!master->master.suspended) > > + return; > > + > > + up_read(&master->master.suspend_lock); > > + wait_event(master->master.resume_wq, master->master.suspended == 0); > > + } > > +} > > + > > +static void mtd_end_access(struct mtd_info *mtd) > > +{ > > + struct mtd_info *master = mtd_get_master(mtd); > > + > > + if (!master->_suspend) > > + return; > > + > > + up_read(&master->master.suspend_lock); > > +} > > +
On Fri, Oct 08, 2021 at 06:08:11PM +0200, Boris Brezillon wrote: > On Fri, 8 Oct 2021 17:35:26 +0200 > Miquel Raynal <miquel.raynal@bootlin.com> wrote: > > > > into suspend. But it's actually "mtd: rawnand: Simplify the locking" that > > > allows it to return errors rather than locking, before that commit it would > > > have waited for the rawnand device to resume. > > > > I don't think so, I believe it was broken in the same way but was just > > not returning errors. > > Actually I was wrong, 013e6292aaf5 ("mtd: rawnand: Simplify the > locking") removed the blocking wait (returning -EBUSY when the device > is suspended instead of putting the thread on a waitqueue). At that > time, I assumed all threads would be paused when the device is > suspended, which appeared to be incorrect. So I guess the Fixes tag > should remain, and we might want to consider backporting a less > invasive patch to stable releases (one touching only the raw NAND > layer). Thanks Miquel add Reviewed-By you on the second patch. I'll remove the mentioning of commit ef347c0cfd61 ("mtd: rawnand: gpmi: Implement exec_op") in this commit msg. Is it possible to backport another(less invasive) patch to stable releases? I thought only upstream commits could be backported. /Sean
diff --git a/drivers/mtd/mtdcore.c b/drivers/mtd/mtdcore.c index c8fd7f758938..3c93202e6cbb 100644 --- a/drivers/mtd/mtdcore.c +++ b/drivers/mtd/mtdcore.c @@ -36,6 +36,44 @@ struct backing_dev_info *mtd_bdi; +static void mtd_start_access(struct mtd_info *mtd) +{ + struct mtd_info *master = mtd_get_master(mtd); + + /* + * Don't take the suspend_lock on devices that don't + * implement the suspend hook. Otherwise, lockdep will + * complain about nested locks when trying to suspend MTD + * partitions or MTD devices created by gluebi which are + * backed by real devices. + */ + if (!master->_suspend) + return; + + /* + * Wait until the device is resumed. Should we have a + * non-blocking mode here? + */ + while (1) { + down_read(&master->master.suspend_lock); + if (!master->master.suspended) + return; + + up_read(&master->master.suspend_lock); + wait_event(master->master.resume_wq, master->master.suspended == 0); + } +} + +static void mtd_end_access(struct mtd_info *mtd) +{ + struct mtd_info *master = mtd_get_master(mtd); + + if (!master->_suspend) + return; + + up_read(&master->master.suspend_lock); +} + #ifdef CONFIG_PM_SLEEP static int mtd_cls_suspend(struct device *dev) @@ -1000,6 +1038,9 @@ int mtd_device_parse_register(struct mtd_info *mtd, const char * const *types, ret = mtd_otp_nvmem_add(mtd); + init_waitqueue_head(&mtd->master.resume_wq); + init_rwsem(&mtd->master.suspend_lock); + out: if (ret && device_is_registered(&mtd->dev)) del_mtd_device(mtd); @@ -1241,6 +1282,8 @@ int mtd_erase(struct mtd_info *mtd, struct erase_info *instr) struct erase_info adjinstr; int ret; + mtd_start_access(mtd); + instr->fail_addr = MTD_FAIL_ADDR_UNKNOWN; adjinstr = *instr; @@ -1278,6 +1321,8 @@ int mtd_erase(struct mtd_info *mtd, struct erase_info *instr) } } + mtd_end_access(mtd); + return ret; } EXPORT_SYMBOL_GPL(mtd_erase); @@ -1558,6 +1603,8 @@ int mtd_read_oob(struct mtd_info *mtd, loff_t from, struct mtd_oob_ops *ops) struct mtd_ecc_stats old_stats = master->ecc_stats; int ret_code; + mtd_start_access(mtd); + ops->retlen = ops->oobretlen = 0; ret_code = mtd_check_oob_ops(mtd, from, ops); @@ -1577,6 +1624,8 @@ int mtd_read_oob(struct mtd_info *mtd, loff_t from, struct mtd_oob_ops *ops) mtd_update_ecc_stats(mtd, master, &old_stats); + mtd_end_access(mtd); + /* * In cases where ops->datbuf != NULL, mtd->_read_oob() has semantics * similar to mtd->_read(), returning a non-negative integer @@ -1597,6 +1646,8 @@ int mtd_write_oob(struct mtd_info *mtd, loff_t to, struct mtd_info *master = mtd_get_master(mtd); int ret; + mtd_start_access(mtd); + ops->retlen = ops->oobretlen = 0; if (!(mtd->flags & MTD_WRITEABLE)) @@ -1615,7 +1666,11 @@ int mtd_write_oob(struct mtd_info *mtd, loff_t to, if (mtd->flags & MTD_SLC_ON_MLC_EMULATION) return mtd_io_emulated_slc(mtd, to, false, ops); - return mtd_write_oob_std(mtd, to, ops); + ret = mtd_write_oob_std(mtd, to, ops); + + mtd_end_access(mtd); + + return ret; } EXPORT_SYMBOL_GPL(mtd_write_oob); diff --git a/include/linux/mtd/mtd.h b/include/linux/mtd/mtd.h index 88227044fc86..cfab07b02dc9 100644 --- a/include/linux/mtd/mtd.h +++ b/include/linux/mtd/mtd.h @@ -231,6 +231,8 @@ struct mtd_master { struct mutex partitions_lock; struct mutex chrdev_lock; unsigned int suspended : 1; + wait_queue_head_t resume_wq; + struct rw_semaphore suspend_lock; }; struct mtd_info { @@ -546,30 +548,42 @@ int mtd_block_markbad(struct mtd_info *mtd, loff_t ofs); static inline int mtd_suspend(struct mtd_info *mtd) { struct mtd_info *master = mtd_get_master(mtd); - int ret; + int ret = 0; - if (master->master.suspended) - return 0; - ret = master->_suspend ? master->_suspend(master) : 0; - if (ret) + if (!master->_suspend) return ret; - master->master.suspended = 1; - return 0; + down_write(&master->master.suspend_lock); + if (!master->master.suspended) { + ret = master->_suspend(master); + if (!ret) + master->master.suspended = 1; + } + up_write(&master->master.suspend_lock); + + return ret; } static inline void mtd_resume(struct mtd_info *mtd) { struct mtd_info *master = mtd_get_master(mtd); - if (!master->master.suspended) + if (!master->_suspend) return; - if (master->_resume) - master->_resume(master); - master->master.suspended = 0; + down_write(&master->master.suspend_lock); + if (master->master.suspended) { + if (master->_resume) + master->_resume(master); + + master->master.suspended = 0; + + /* The MTD dev has been resumed, wake up all waiters. */ + wake_up_all(&master->master.resume_wq); + } + up_write(&master->master.suspend_lock); } static inline uint32_t mtd_div_by_eb(uint64_t sz, struct mtd_info *mtd)
This will prevent reading/writing/erasing to a suspended mtd device. It will force mtd_write()/mtd_read()/mtd_erase() to wait for mtd_resume() to unlock access to mtd devices. Exec_op[0] speed things up, so we see this race when rawnand devices going into suspend. But it's actually "mtd: rawnand: Simplify the locking" that allows it to return errors rather than locking, before that commit it would have waited for the rawnand device to resume. Tested on a iMX6ULL. [0]: ef347c0cfd61 ("mtd: rawnand: gpmi: Implement exec_op") Fixes: 013e6292aaf5 ("mtd: rawnand: Simplify the locking") Signed-off-by: Sean Nyekjaer <sean@geanix.com> --- Hope I got it all :) drivers/mtd/mtdcore.c | 57 ++++++++++++++++++++++++++++++++++++++++- include/linux/mtd/mtd.h | 36 ++++++++++++++++++-------- 2 files changed, 81 insertions(+), 12 deletions(-)