Message ID | 20171206025342.7266-3-Zhiqiang.Hou@nxp.com |
---|---|
State | Accepted |
Delegated to: | Cyrille Pitchen |
Headers | show |
Series | mtd: m25p80: restore the addressing mode when exiting | expand |
Le 06/12/2017 à 03:53, Zhiqiang Hou a écrit : > From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> > > Restore the status to be compatible with legacy devices. > Take Freescale eSPI boot for example, it copies (in 3 Byte > addressing mode) the RCW and bootloader images from SPI flash > without firing a reset signal previously, so the reboot command > will fail without reseting the addressing mode of SPI flash. resetting > This patch implement .shutdown function to restore the status implements > in reboot process, and add the same operation to the .remove > function. > > Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> Applied to the spi-nor/next branch of l2-mtd Thanks! (corrected few mistakes in the commit message) > --- > V3: > - Modified the commit to make this patch specific. > > drivers/mtd/devices/m25p80.c | 9 +++++++++ > 1 file changed, 9 insertions(+) > > diff --git a/drivers/mtd/devices/m25p80.c b/drivers/mtd/devices/m25p80.c > index dbe6a1de2bb8..a4e18f6aaa33 100644 > --- a/drivers/mtd/devices/m25p80.c > +++ b/drivers/mtd/devices/m25p80.c > @@ -307,10 +307,18 @@ static int m25p_remove(struct spi_device *spi) > { > struct m25p *flash = spi_get_drvdata(spi); > > + spi_nor_restore(&flash->spi_nor); > + > /* Clean up MTD stuff. */ > return mtd_device_unregister(&flash->spi_nor.mtd); > } > > +static void m25p_shutdown(struct spi_device *spi) > +{ > + struct m25p *flash = spi_get_drvdata(spi); > + > + spi_nor_restore(&flash->spi_nor); > +} > /* > * Do NOT add to this array without reading the following: > * > @@ -386,6 +394,7 @@ static struct spi_driver m25p80_driver = { > .id_table = m25p_ids, > .probe = m25p_probe, > .remove = m25p_remove, > + .shutdown = m25p_shutdown, > > /* REVISIT: many of these chips have deep power-down modes, which > * should clearly be entered on suspend() to minimize power use. >
Hello, I noticed this got merged, but I wanted to put my 2 cents in here: On Wed, Dec 06, 2017 at 10:53:42AM +0800, Zhiqiang Hou wrote: > From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> > > Restore the status to be compatible with legacy devices. > Take Freescale eSPI boot for example, it copies (in 3 Byte > addressing mode) the RCW and bootloader images from SPI flash > without firing a reset signal previously, so the reboot command > will fail without reseting the addressing mode of SPI flash. > This patch implement .shutdown function to restore the status > in reboot process, and add the same operation to the .remove > function. We have previously rejected this patch multiple times, because the above comment demonstrates a broken product. You cannot guarantee that all reboots will invoke the .shutdown() method -- what about crashes? What about watchdog resets? IIUC, those will hit the same broken behavior, and have unexepcted behavior in your bootloader. I suppose one could argue for doing this in remove(), but AIUI you're just papering over system bugs by introducing the shutdown() function here. Thus, I'd prefer we drop the shutdown() method to avoid misleading other users of this driver. Brian > Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> > --- > V3: > - Modified the commit to make this patch specific. > > drivers/mtd/devices/m25p80.c | 9 +++++++++ > 1 file changed, 9 insertions(+) > > diff --git a/drivers/mtd/devices/m25p80.c b/drivers/mtd/devices/m25p80.c > index dbe6a1de2bb8..a4e18f6aaa33 100644 > --- a/drivers/mtd/devices/m25p80.c > +++ b/drivers/mtd/devices/m25p80.c > @@ -307,10 +307,18 @@ static int m25p_remove(struct spi_device *spi) > { > struct m25p *flash = spi_get_drvdata(spi); > > + spi_nor_restore(&flash->spi_nor); > + > /* Clean up MTD stuff. */ > return mtd_device_unregister(&flash->spi_nor.mtd); > } > > +static void m25p_shutdown(struct spi_device *spi) > +{ > + struct m25p *flash = spi_get_drvdata(spi); > + > + spi_nor_restore(&flash->spi_nor); > +} > /* > * Do NOT add to this array without reading the following: > * > @@ -386,6 +394,7 @@ static struct spi_driver m25p80_driver = { > .id_table = m25p_ids, > .probe = m25p_probe, > .remove = m25p_remove, > + .shutdown = m25p_shutdown, > > /* REVISIT: many of these chips have deep power-down modes, which > * should clearly be entered on suspend() to minimize power use. > -- > 2.14.1 >
Hi Brian, On Mon, 23 Jul 2018 11:13:50 -0700 Brian Norris <computersforpeace@gmail.com> wrote: > Hello, > > I noticed this got merged, but I wanted to put my 2 cents in here: I wish you had replied to this thread when it was posted (more than 6 months ago). Reverting the patch now implies making some people unhappy because they'll have to resort to their old out-of-tree hacks :-(. > > On Wed, Dec 06, 2017 at 10:53:42AM +0800, Zhiqiang Hou wrote: > > From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> > > > > Restore the status to be compatible with legacy devices. > > Take Freescale eSPI boot for example, it copies (in 3 Byte > > addressing mode) the RCW and bootloader images from SPI flash > > without firing a reset signal previously, so the reboot command > > will fail without reseting the addressing mode of SPI flash. > > This patch implement .shutdown function to restore the status > > in reboot process, and add the same operation to the .remove > > function. > > We have previously rejected this patch multiple times, because the above > comment demonstrates a broken product. If we were to only support working HW parts, I fear Linux would not support a lot of HW (that's even more true when it comes to flashes :P). > You cannot guarantee that all > reboots will invoke the .shutdown() method -- what about crashes? What > about watchdog resets? IIUC, those will hit the same broken behavior, > and have unexepcted behavior in your bootloader. Yes, there are corner cases that are not addressed with this approach, but it still seems to improve things. Of course, that means the user should try to re-route all HW reset sources to SW ones (RESET input pin muxed to the GPIO controller, watchdog generating an interrupt instead of directly asserting the RESET output pin), which is not always possible, but even when it's not, isn't it better to have a setup that works fine 99% of the time instead of 50% of the time? > > I suppose one could argue for doing this in remove(), but AIUI you're > just papering over system bugs by introducing the shutdown() function > here. Thus, I'd prefer we drop the shutdown() method to avoid misleading > other users of this driver. I understand your point. But if the problem is about making sure people designing new boards get that right, why not complaining at probe time when things are wrong? I mean, spi_nor_restore() seems to only do something on very specific NORs (those on which a SW RESET does not resets the addressing mode). So, how about adding a flag that says "my board has the NOR HW RESET pin wired" (there would be a DT props to set that flag). Then you add a WARN_ON() when this flag is not set and a NOR chip impacted by this bug is detected. This way you make sure people are informed that they're doing something wrong, and for those who can't change their HW (because it's already widely deployed), you have a fix that improve things. Regards, Boris
Hi Boris, On Mon, Jul 23, 2018 at 1:10 PM, Boris Brezillon <boris.brezillon@bootlin.com> wrote: > On Mon, 23 Jul 2018 11:13:50 -0700 > Brian Norris <computersforpeace@gmail.com> wrote: >> I noticed this got merged, but I wanted to put my 2 cents in here: > > I wish you had replied to this thread when it was posted (more than > 6 months ago). Reverting the patch now implies making some people > unhappy because they'll have to resort to their old out-of-tree > hacks :-(. I'd say I'm sorry for not following things closely these days, but I'm not really that sorry. There are plenty of other capable hands. And if y'all shoot yourselves in the foot, so be it. This patch isn't going to blow things up, but now that I did finally notice it (because it happened to show up in a list of backports I was looking at), I thought better late than never to remind you. For way of notification: Marek already noticed that we've started down a slippery slope months ago: https://lkml.org/lkml/2018/4/8/141 Re: [PATCH] mtd: spi-nor: clear Extended Address Reg on switch to 3-byte addressing. I'm not quite sure why that wasn't taken to its logical conclusion -- that the hack should be reverted. This problem has been noted many times already, and we've always stayed on the side of *avoiding* this hack. A few references from a search of my email: http://lists.infradead.org/pipermail/linux-mtd/2013-March/046343.html [PATCH 1/3] mtd: m25p80: utilize dedicated 4-byte addressing commands http://lists.infradead.org/pipermail/barebox/2014-September/020682.html [RFC] MTD m25p80 3-byte addressing and boot problem http://lists.infradead.org/pipermail/linux-mtd/2015-February/057683.html [PATCH 2/2] m25p80: if supported put chip to deep power down if not used >> On Wed, Dec 06, 2017 at 10:53:42AM +0800, Zhiqiang Hou wrote: >> > From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> >> > >> > Restore the status to be compatible with legacy devices. >> > Take Freescale eSPI boot for example, it copies (in 3 Byte >> > addressing mode) the RCW and bootloader images from SPI flash >> > without firing a reset signal previously, so the reboot command >> > will fail without reseting the addressing mode of SPI flash. >> > This patch implement .shutdown function to restore the status >> > in reboot process, and add the same operation to the .remove >> > function. >> >> We have previously rejected this patch multiple times, because the above >> comment demonstrates a broken product. > > If we were to only support working HW parts, I fear Linux would not > support a lot of HW (that's even more true when it comes to flashes :P). You stopped allowing UBI to attach to MLC NAND recently, no? That sounds like almost the same boat -- you've probably killed quite a few shitty products, if they were to use mainline directly. Anyway, that's derailing the issue. Supporting broken hardware isn't something you try to do by applying the same hack to all systems. You normally try to apply your hack as narrowly as possible. You seem to imply that below. So maybe that's a solution to move forward with. But I'd personally be just as happy to see the patch reverted. >> You cannot guarantee that all >> reboots will invoke the .shutdown() method -- what about crashes? What >> about watchdog resets? IIUC, those will hit the same broken behavior, >> and have unexepcted behavior in your bootloader. > > Yes, there are corner cases that are not addressed with this approach, Is a system crash really a corner case? :D > but it still seems to improve things. Of course, that means the > user should try to re-route all HW reset sources to SW ones (RESET input > pin muxed to the GPIO controller, watchdog generating an interrupt > instead of directly asserting the RESET output pin), which is not always > possible, but even when it's not, isn't it better to have a setup that > works fine 99% of the time instead of 50% of the time? Perhaps, but not at the expense of future development. And realistically, no one is doing that if they have this hack. Most people won't even know that this hack is protecting them at all (so again, they won't try to mitigate the problem any further). >> I suppose one could argue for doing this in remove(), but AIUI you're >> just papering over system bugs by introducing the shutdown() function >> here. Thus, I'd prefer we drop the shutdown() method to avoid misleading >> other users of this driver. > > I understand your point. But if the problem is about making sure people > designing new boards get that right, why not complaining at probe time > when things are wrong? > > I mean, spi_nor_restore() seems to only do something on very specific > NORs (those on which a SW RESET does not resets the addressing > mode). The point isn't that SW RESET doesn't reset the addressing mode -- it does on any flash I've seen. The point is that most systems are built around a stateless assumption in these flash. IIRC, there wasn't even a SW RESET command at all until these "huge" flash came around and stateful addressing modes came about. So boot ROMs and bootloaders would have to be updated to start figuring out when/how to do this SW RESET. And once two vendors start doing it differently (I'm not sure: have they done this already? I think so) it's no longer something a boot ROM will get right. The only way to get this stuff right is to have a hardware reset, or else to avoid all of the stateful modes in software. > So, how about adding a flag that says "my board has the NOR HW > RESET pin wired" (there would be a DT props to set that flag). Then you > add a WARN_ON() when this flag is not set and a NOR chip impacted by > this bug is detected. I'd kinda prefer the reverse. There really isn't a need to document anything for a working system (software usually can't control this RESET pin). The burden should be on the b0rked system to document where it needs unsound hacks to survive. > This way you make sure people are informed that > they're doing something wrong, and for those who can't change their HW > (because it's already widely deployed), you have a fix that improve > things. Or even better: put this hack behind a DT flag, so that one has to admit that their board design is broken before it will even do anything. Proposal: "linux,badly-designed-flash-reset". But, I'd prefer just (partially?) reverting this, and let the authors submit something that works. We're not obligated to keep bad hacks in the kernel. Brian
On Mon, Jul 23 2018, Brian Norris wrote: > Hi Boris, > > On Mon, Jul 23, 2018 at 1:10 PM, Boris Brezillon > <boris.brezillon@bootlin.com> wrote: >> On Mon, 23 Jul 2018 11:13:50 -0700 >> Brian Norris <computersforpeace@gmail.com> wrote: >>> I noticed this got merged, but I wanted to put my 2 cents in here: >> >> I wish you had replied to this thread when it was posted (more than >> 6 months ago). Reverting the patch now implies making some people >> unhappy because they'll have to resort to their old out-of-tree >> hacks :-(. > > I'd say I'm sorry for not following things closely these days, but I'm > not really that sorry. There are plenty of other capable hands. And if > y'all shoot yourselves in the foot, so be it. This patch isn't going > to blow things up, but now that I did finally notice it (because it > happened to show up in a list of backports I was looking at), I > thought better late than never to remind you. > > For way of notification: Marek already noticed that we've started down > a slippery slope months ago: > > https://lkml.org/lkml/2018/4/8/141 > Re: [PATCH] mtd: spi-nor: clear Extended Address Reg on switch to > 3-byte addressing. > > I'm not quite sure why that wasn't taken to its logical conclusion -- > that the hack should be reverted. > > This problem has been noted many times already, and we've always > stayed on the side of *avoiding* this hack. A few references from a > search of my email: > > http://lists.infradead.org/pipermail/linux-mtd/2013-March/046343.html > [PATCH 1/3] mtd: m25p80: utilize dedicated 4-byte addressing commands > > http://lists.infradead.org/pipermail/barebox/2014-September/020682.html > [RFC] MTD m25p80 3-byte addressing and boot problem > > http://lists.infradead.org/pipermail/linux-mtd/2015-February/057683.html > [PATCH 2/2] m25p80: if supported put chip to deep power down if not used > >>> On Wed, Dec 06, 2017 at 10:53:42AM +0800, Zhiqiang Hou wrote: >>> > From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> >>> > >>> > Restore the status to be compatible with legacy devices. >>> > Take Freescale eSPI boot for example, it copies (in 3 Byte >>> > addressing mode) the RCW and bootloader images from SPI flash >>> > without firing a reset signal previously, so the reboot command >>> > will fail without reseting the addressing mode of SPI flash. >>> > This patch implement .shutdown function to restore the status >>> > in reboot process, and add the same operation to the .remove >>> > function. >>> >>> We have previously rejected this patch multiple times, because the above >>> comment demonstrates a broken product. >> >> If we were to only support working HW parts, I fear Linux would not >> support a lot of HW (that's even more true when it comes to flashes :P). > > You stopped allowing UBI to attach to MLC NAND recently, no? That > sounds like almost the same boat -- you've probably killed quite a few > shitty products, if they were to use mainline directly. > > Anyway, that's derailing the issue. Supporting broken hardware isn't > something you try to do by applying the same hack to all systems. You > normally try to apply your hack as narrowly as possible. You seem to > imply that below. So maybe that's a solution to move forward with. But > I'd personally be just as happy to see the patch reverted. > >>> You cannot guarantee that all >>> reboots will invoke the .shutdown() method -- what about crashes? What >>> about watchdog resets? IIUC, those will hit the same broken behavior, >>> and have unexepcted behavior in your bootloader. >> >> Yes, there are corner cases that are not addressed with this approach, > > Is a system crash really a corner case? :D > >> but it still seems to improve things. Of course, that means the >> user should try to re-route all HW reset sources to SW ones (RESET input >> pin muxed to the GPIO controller, watchdog generating an interrupt >> instead of directly asserting the RESET output pin), which is not always >> possible, but even when it's not, isn't it better to have a setup that >> works fine 99% of the time instead of 50% of the time? > > Perhaps, but not at the expense of future development. And > realistically, no one is doing that if they have this hack. Most > people won't even know that this hack is protecting them at all (so > again, they won't try to mitigate the problem any further). > >>> I suppose one could argue for doing this in remove(), but AIUI you're >>> just papering over system bugs by introducing the shutdown() function >>> here. Thus, I'd prefer we drop the shutdown() method to avoid misleading >>> other users of this driver. >> >> I understand your point. But if the problem is about making sure people >> designing new boards get that right, why not complaining at probe time >> when things are wrong? >> >> I mean, spi_nor_restore() seems to only do something on very specific >> NORs (those on which a SW RESET does not resets the addressing >> mode). > > The point isn't that SW RESET doesn't reset the addressing mode -- it > does on any flash I've seen. The point is that most systems are built > around a stateless assumption in these flash. IIRC, there wasn't even > a SW RESET command at all until these "huge" flash came around and > stateful addressing modes came about. So boot ROMs and bootloaders > would have to be updated to start figuring out when/how to do this SW > RESET. And once two vendors start doing it differently (I'm not sure: > have they done this already? I think so) it's no longer something a > boot ROM will get right. > > The only way to get this stuff right is to have a hardware reset, or > else to avoid all of the stateful modes in software. > >> So, how about adding a flag that says "my board has the NOR HW >> RESET pin wired" (there would be a DT props to set that flag). Then you >> add a WARN_ON() when this flag is not set and a NOR chip impacted by >> this bug is detected. > > I'd kinda prefer the reverse. There really isn't a need to document > anything for a working system (software usually can't control this > RESET pin). The burden should be on the b0rked system to document > where it needs unsound hacks to survive. > >> This way you make sure people are informed that >> they're doing something wrong, and for those who can't change their HW >> (because it's already widely deployed), you have a fix that improve >> things. > > Or even better: put this hack behind a DT flag, so that one has to > admit that their board design is broken before it will even do > anything. Proposal: "linux,badly-designed-flash-reset". > > But, I'd prefer just (partially?) reverting this, and let the authors > submit something that works. We're not obligated to keep bad hacks in > the kernel. > > Brian One possibility that occurred to me when I was exploring this issue is to revert to 3-byte mode whenever 4-byte was not actively in use. So any access beyond 16Meg is: switch-to-4-byte ; perform IO ; switch to 3-byte or similar. On my hardware it would be more efficient to use the 4-byte opcode to perform the IO, then reset the cached 4th address byte that the NOR chip transparently remembered. This adds a little overhead, but should be fairly robust. It doesn't help if something goes terribly wrong while IO is happening, but I don't think any other software solution does either. How would you see that approach? Thanks, NeilBrown
+Neil On Mon, 23 Jul 2018 15:06:43 -0700 Brian Norris <computersforpeace@gmail.com> wrote: > Hi Boris, > > On Mon, Jul 23, 2018 at 1:10 PM, Boris Brezillon > <boris.brezillon@bootlin.com> wrote: > > On Mon, 23 Jul 2018 11:13:50 -0700 > > Brian Norris <computersforpeace@gmail.com> wrote: > >> I noticed this got merged, but I wanted to put my 2 cents in here: > > > > I wish you had replied to this thread when it was posted (more than > > 6 months ago). Reverting the patch now implies making some people > > unhappy because they'll have to resort to their old out-of-tree > > hacks :-(. > > I'd say I'm sorry for not following things closely these days, but I'm > not really that sorry. There are plenty of other capable hands. And if > y'all shoot yourselves in the foot, so be it. This patch isn't going > to blow things up, but now that I did finally notice it (because it > happened to show up in a list of backports I was looking at), I > thought better late than never to remind you. > > For way of notification: Marek already noticed that we've started down > a slippery slope months ago: > > https://lkml.org/lkml/2018/4/8/141 > Re: [PATCH] mtd: spi-nor: clear Extended Address Reg on switch to > 3-byte addressing. > > I'm not quite sure why that wasn't taken to its logical conclusion -- > that the hack should be reverted. > > This problem has been noted many times already, and we've always > stayed on the side of *avoiding* this hack. A few references from a > search of my email: > > http://lists.infradead.org/pipermail/linux-mtd/2013-March/046343.html > [PATCH 1/3] mtd: m25p80: utilize dedicated 4-byte addressing commands > > http://lists.infradead.org/pipermail/barebox/2014-September/020682.html > [RFC] MTD m25p80 3-byte addressing and boot problem > > http://lists.infradead.org/pipermail/linux-mtd/2015-February/057683.html > [PATCH 2/2] m25p80: if supported put chip to deep power down if not used To my defense, I was not actively following SPI NOR topics at this time, but, in the light of those discussions, I still keep thinking we should try our best to improve support for broken HW. So, ideally, we should find a way to support getting back to 3-byte addressing mode when resetting, while generating enough noise to make people well aware that their board design is broken (I think the proposal of the new DT prop + a WARN_ON() is a good idea). > > >> On Wed, Dec 06, 2017 at 10:53:42AM +0800, Zhiqiang Hou wrote: > >> > From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> > >> > > >> > Restore the status to be compatible with legacy devices. > >> > Take Freescale eSPI boot for example, it copies (in 3 Byte > >> > addressing mode) the RCW and bootloader images from SPI flash > >> > without firing a reset signal previously, so the reboot command > >> > will fail without reseting the addressing mode of SPI flash. > >> > This patch implement .shutdown function to restore the status > >> > in reboot process, and add the same operation to the .remove > >> > function. > >> > >> We have previously rejected this patch multiple times, because the above > >> comment demonstrates a broken product. > > > > If we were to only support working HW parts, I fear Linux would not > > support a lot of HW (that's even more true when it comes to flashes :P). > > You stopped allowing UBI to attach to MLC NAND recently, no? That > sounds like almost the same boat -- you've probably killed quite a few > shitty products, if they were to use mainline directly. Well, that's a bit different in that it's purely a SW issue, and we also try to encourage people to take over the work we started with Richard to fix that problem. But I get your point, we broke setups that were working in an unreliable way, pretty much what you're suggesting to do here. > > Anyway, that's derailing the issue. Supporting broken hardware isn't > something you try to do by applying the same hack to all systems. You > normally try to apply your hack as narrowly as possible. This, I agree with. > You seem to > imply that below. So maybe that's a solution to move forward with. But > I'd personally be just as happy to see the patch reverted. > > >> You cannot guarantee that all > >> reboots will invoke the .shutdown() method -- what about crashes? What > >> about watchdog resets? IIUC, those will hit the same broken behavior, > >> and have unexepcted behavior in your bootloader. > > > > Yes, there are corner cases that are not addressed with this approach, > > Is a system crash really a corner case? :D Well, nothing forces you to reset the platform using a HW reset when that happens :P. > > > but it still seems to improve things. Of course, that means the > > user should try to re-route all HW reset sources to SW ones (RESET input > > pin muxed to the GPIO controller, watchdog generating an interrupt > > instead of directly asserting the RESET output pin), which is not always > > possible, but even when it's not, isn't it better to have a setup that > > works fine 99% of the time instead of 50% of the time? > > Perhaps, but not at the expense of future development. And > realistically, no one is doing that if they have this hack. Most > people won't even know that this hack is protecting them at all (so > again, they won't try to mitigate the problem any further). Unless we add a huge backtrace at probe time which forces them to look closer at what they did wrong (like you seem to suggest below). > > >> I suppose one could argue for doing this in remove(), but AIUI you're > >> just papering over system bugs by introducing the shutdown() function > >> here. Thus, I'd prefer we drop the shutdown() method to avoid misleading > >> other users of this driver. > > > > I understand your point. But if the problem is about making sure people > > designing new boards get that right, why not complaining at probe time > > when things are wrong? > > > > I mean, spi_nor_restore() seems to only do something on very specific > > NORs (those on which a SW RESET does not resets the addressing > > mode). > > The point isn't that SW RESET doesn't reset the addressing mode -- it > does on any flash I've seen. The point is that most systems are built > around a stateless assumption in these flash. IIRC, there wasn't even > a SW RESET command at all until these "huge" flash came around and > stateful addressing modes came about. So boot ROMs and bootloaders > would have to be updated to start figuring out when/how to do this SW > RESET. And once two vendors start doing it differently (I'm not sure: > have they done this already? I think so) it's no longer something a > boot ROM will get right. > > The only way to get this stuff right is to have a hardware reset, or > else to avoid all of the stateful modes in software. Okay. > > > So, how about adding a flag that says "my board has the NOR HW > > RESET pin wired" (there would be a DT props to set that flag). Then you > > add a WARN_ON() when this flag is not set and a NOR chip impacted by > > this bug is detected. > > I'd kinda prefer the reverse. There really isn't a need to document > anything for a working system (software usually can't control this > RESET pin). The burden should be on the b0rked system to document > where it needs unsound hacks to survive. That's true. > > > This way you make sure people are informed that > > they're doing something wrong, and for those who can't change their HW > > (because it's already widely deployed), you have a fix that improve > > things. > > Or even better: put this hack behind a DT flag, so that one has to > admit that their board design is broken before it will even do > anything. Proposal: "linux,badly-designed-flash-reset". I think we can remove the "linux," prefix. If it's badly designed, it applies to all OSes, don't you think? Regards, Boris
On Tue, 24 Jul 2018 08:46:33 +1000 NeilBrown <neilb@suse.com> wrote: > On Mon, Jul 23 2018, Brian Norris wrote: > > > Hi Boris, > > > > On Mon, Jul 23, 2018 at 1:10 PM, Boris Brezillon > > <boris.brezillon@bootlin.com> wrote: > >> On Mon, 23 Jul 2018 11:13:50 -0700 > >> Brian Norris <computersforpeace@gmail.com> wrote: > >>> I noticed this got merged, but I wanted to put my 2 cents in here: > >> > >> I wish you had replied to this thread when it was posted (more than > >> 6 months ago). Reverting the patch now implies making some people > >> unhappy because they'll have to resort to their old out-of-tree > >> hacks :-(. > > > > I'd say I'm sorry for not following things closely these days, but I'm > > not really that sorry. There are plenty of other capable hands. And if > > y'all shoot yourselves in the foot, so be it. This patch isn't going > > to blow things up, but now that I did finally notice it (because it > > happened to show up in a list of backports I was looking at), I > > thought better late than never to remind you. > > > > For way of notification: Marek already noticed that we've started down > > a slippery slope months ago: > > > > https://lkml.org/lkml/2018/4/8/141 > > Re: [PATCH] mtd: spi-nor: clear Extended Address Reg on switch to > > 3-byte addressing. > > > > I'm not quite sure why that wasn't taken to its logical conclusion -- > > that the hack should be reverted. > > > > This problem has been noted many times already, and we've always > > stayed on the side of *avoiding* this hack. A few references from a > > search of my email: > > > > http://lists.infradead.org/pipermail/linux-mtd/2013-March/046343.html > > [PATCH 1/3] mtd: m25p80: utilize dedicated 4-byte addressing commands > > > > http://lists.infradead.org/pipermail/barebox/2014-September/020682.html > > [RFC] MTD m25p80 3-byte addressing and boot problem > > > > http://lists.infradead.org/pipermail/linux-mtd/2015-February/057683.html > > [PATCH 2/2] m25p80: if supported put chip to deep power down if not used > > > >>> On Wed, Dec 06, 2017 at 10:53:42AM +0800, Zhiqiang Hou wrote: > >>> > From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> > >>> > > >>> > Restore the status to be compatible with legacy devices. > >>> > Take Freescale eSPI boot for example, it copies (in 3 Byte > >>> > addressing mode) the RCW and bootloader images from SPI flash > >>> > without firing a reset signal previously, so the reboot command > >>> > will fail without reseting the addressing mode of SPI flash. > >>> > This patch implement .shutdown function to restore the status > >>> > in reboot process, and add the same operation to the .remove > >>> > function. > >>> > >>> We have previously rejected this patch multiple times, because the above > >>> comment demonstrates a broken product. > >> > >> If we were to only support working HW parts, I fear Linux would not > >> support a lot of HW (that's even more true when it comes to flashes :P). > > > > You stopped allowing UBI to attach to MLC NAND recently, no? That > > sounds like almost the same boat -- you've probably killed quite a few > > shitty products, if they were to use mainline directly. > > > > Anyway, that's derailing the issue. Supporting broken hardware isn't > > something you try to do by applying the same hack to all systems. You > > normally try to apply your hack as narrowly as possible. You seem to > > imply that below. So maybe that's a solution to move forward with. But > > I'd personally be just as happy to see the patch reverted. > > > >>> You cannot guarantee that all > >>> reboots will invoke the .shutdown() method -- what about crashes? What > >>> about watchdog resets? IIUC, those will hit the same broken behavior, > >>> and have unexepcted behavior in your bootloader. > >> > >> Yes, there are corner cases that are not addressed with this approach, > > > > Is a system crash really a corner case? :D > > > >> but it still seems to improve things. Of course, that means the > >> user should try to re-route all HW reset sources to SW ones (RESET input > >> pin muxed to the GPIO controller, watchdog generating an interrupt > >> instead of directly asserting the RESET output pin), which is not always > >> possible, but even when it's not, isn't it better to have a setup that > >> works fine 99% of the time instead of 50% of the time? > > > > Perhaps, but not at the expense of future development. And > > realistically, no one is doing that if they have this hack. Most > > people won't even know that this hack is protecting them at all (so > > again, they won't try to mitigate the problem any further). > > > >>> I suppose one could argue for doing this in remove(), but AIUI you're > >>> just papering over system bugs by introducing the shutdown() function > >>> here. Thus, I'd prefer we drop the shutdown() method to avoid misleading > >>> other users of this driver. > >> > >> I understand your point. But if the problem is about making sure people > >> designing new boards get that right, why not complaining at probe time > >> when things are wrong? > >> > >> I mean, spi_nor_restore() seems to only do something on very specific > >> NORs (those on which a SW RESET does not resets the addressing > >> mode). > > > > The point isn't that SW RESET doesn't reset the addressing mode -- it > > does on any flash I've seen. The point is that most systems are built > > around a stateless assumption in these flash. IIRC, there wasn't even > > a SW RESET command at all until these "huge" flash came around and > > stateful addressing modes came about. So boot ROMs and bootloaders > > would have to be updated to start figuring out when/how to do this SW > > RESET. And once two vendors start doing it differently (I'm not sure: > > have they done this already? I think so) it's no longer something a > > boot ROM will get right. > > > > The only way to get this stuff right is to have a hardware reset, or > > else to avoid all of the stateful modes in software. > > > >> So, how about adding a flag that says "my board has the NOR HW > >> RESET pin wired" (there would be a DT props to set that flag). Then you > >> add a WARN_ON() when this flag is not set and a NOR chip impacted by > >> this bug is detected. > > > > I'd kinda prefer the reverse. There really isn't a need to document > > anything for a working system (software usually can't control this > > RESET pin). The burden should be on the b0rked system to document > > where it needs unsound hacks to survive. > > > >> This way you make sure people are informed that > >> they're doing something wrong, and for those who can't change their HW > >> (because it's already widely deployed), you have a fix that improve > >> things. > > > > Or even better: put this hack behind a DT flag, so that one has to > > admit that their board design is broken before it will even do > > anything. Proposal: "linux,badly-designed-flash-reset". > > > > But, I'd prefer just (partially?) reverting this, and let the authors > > submit something that works. We're not obligated to keep bad hacks in > > the kernel. > > > > Brian > > One possibility that occurred to me when I was exploring this issue is > to revert to 3-byte mode whenever 4-byte was not actively in use. > So any access beyond 16Meg is: > switch-to-4-byte ; perform IO ; switch to 3-byte > or similar. On my hardware it would be more efficient to > use the 4-byte opcode to perform the IO, then reset the cached > 4th address byte that the NOR chip transparently remembered. > > This adds a little overhead, but should be fairly robust. > It doesn't help if something goes terribly wrong while IO is happening, > but I don't think any other software solution does either. > > How would you see that approach? I think the problem stands: people that have proper HW mitigation for this problem (NOR chip is reset when the Processor is reset) don't want to pay the overhead. So, even if we go for this approach, we probably want to only do that for broken HW.
On Tue, Jul 24 2018, Boris Brezillon wrote: > On Tue, 24 Jul 2018 08:46:33 +1000 > NeilBrown <neilb@suse.com> wrote: > >> On Mon, Jul 23 2018, Brian Norris wrote: >> >> > Hi Boris, >> > >> > On Mon, Jul 23, 2018 at 1:10 PM, Boris Brezillon >> > <boris.brezillon@bootlin.com> wrote: >> >> On Mon, 23 Jul 2018 11:13:50 -0700 >> >> Brian Norris <computersforpeace@gmail.com> wrote: >> >>> I noticed this got merged, but I wanted to put my 2 cents in here: >> >> >> >> I wish you had replied to this thread when it was posted (more than >> >> 6 months ago). Reverting the patch now implies making some people >> >> unhappy because they'll have to resort to their old out-of-tree >> >> hacks :-(. >> > >> > I'd say I'm sorry for not following things closely these days, but I'm >> > not really that sorry. There are plenty of other capable hands. And if >> > y'all shoot yourselves in the foot, so be it. This patch isn't going >> > to blow things up, but now that I did finally notice it (because it >> > happened to show up in a list of backports I was looking at), I >> > thought better late than never to remind you. >> > >> > For way of notification: Marek already noticed that we've started down >> > a slippery slope months ago: >> > >> > https://lkml.org/lkml/2018/4/8/141 >> > Re: [PATCH] mtd: spi-nor: clear Extended Address Reg on switch to >> > 3-byte addressing. >> > >> > I'm not quite sure why that wasn't taken to its logical conclusion -- >> > that the hack should be reverted. >> > >> > This problem has been noted many times already, and we've always >> > stayed on the side of *avoiding* this hack. A few references from a >> > search of my email: >> > >> > http://lists.infradead.org/pipermail/linux-mtd/2013-March/046343.html >> > [PATCH 1/3] mtd: m25p80: utilize dedicated 4-byte addressing commands >> > >> > http://lists.infradead.org/pipermail/barebox/2014-September/020682.html >> > [RFC] MTD m25p80 3-byte addressing and boot problem >> > >> > http://lists.infradead.org/pipermail/linux-mtd/2015-February/057683.html >> > [PATCH 2/2] m25p80: if supported put chip to deep power down if not used >> > >> >>> On Wed, Dec 06, 2017 at 10:53:42AM +0800, Zhiqiang Hou wrote: >> >>> > From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> >> >>> > >> >>> > Restore the status to be compatible with legacy devices. >> >>> > Take Freescale eSPI boot for example, it copies (in 3 Byte >> >>> > addressing mode) the RCW and bootloader images from SPI flash >> >>> > without firing a reset signal previously, so the reboot command >> >>> > will fail without reseting the addressing mode of SPI flash. >> >>> > This patch implement .shutdown function to restore the status >> >>> > in reboot process, and add the same operation to the .remove >> >>> > function. >> >>> >> >>> We have previously rejected this patch multiple times, because the above >> >>> comment demonstrates a broken product. >> >> >> >> If we were to only support working HW parts, I fear Linux would not >> >> support a lot of HW (that's even more true when it comes to flashes :P). >> > >> > You stopped allowing UBI to attach to MLC NAND recently, no? That >> > sounds like almost the same boat -- you've probably killed quite a few >> > shitty products, if they were to use mainline directly. >> > >> > Anyway, that's derailing the issue. Supporting broken hardware isn't >> > something you try to do by applying the same hack to all systems. You >> > normally try to apply your hack as narrowly as possible. You seem to >> > imply that below. So maybe that's a solution to move forward with. But >> > I'd personally be just as happy to see the patch reverted. >> > >> >>> You cannot guarantee that all >> >>> reboots will invoke the .shutdown() method -- what about crashes? What >> >>> about watchdog resets? IIUC, those will hit the same broken behavior, >> >>> and have unexepcted behavior in your bootloader. >> >> >> >> Yes, there are corner cases that are not addressed with this approach, >> > >> > Is a system crash really a corner case? :D >> > >> >> but it still seems to improve things. Of course, that means the >> >> user should try to re-route all HW reset sources to SW ones (RESET input >> >> pin muxed to the GPIO controller, watchdog generating an interrupt >> >> instead of directly asserting the RESET output pin), which is not always >> >> possible, but even when it's not, isn't it better to have a setup that >> >> works fine 99% of the time instead of 50% of the time? >> > >> > Perhaps, but not at the expense of future development. And >> > realistically, no one is doing that if they have this hack. Most >> > people won't even know that this hack is protecting them at all (so >> > again, they won't try to mitigate the problem any further). >> > >> >>> I suppose one could argue for doing this in remove(), but AIUI you're >> >>> just papering over system bugs by introducing the shutdown() function >> >>> here. Thus, I'd prefer we drop the shutdown() method to avoid misleading >> >>> other users of this driver. >> >> >> >> I understand your point. But if the problem is about making sure people >> >> designing new boards get that right, why not complaining at probe time >> >> when things are wrong? >> >> >> >> I mean, spi_nor_restore() seems to only do something on very specific >> >> NORs (those on which a SW RESET does not resets the addressing >> >> mode). >> > >> > The point isn't that SW RESET doesn't reset the addressing mode -- it >> > does on any flash I've seen. The point is that most systems are built >> > around a stateless assumption in these flash. IIRC, there wasn't even >> > a SW RESET command at all until these "huge" flash came around and >> > stateful addressing modes came about. So boot ROMs and bootloaders >> > would have to be updated to start figuring out when/how to do this SW >> > RESET. And once two vendors start doing it differently (I'm not sure: >> > have they done this already? I think so) it's no longer something a >> > boot ROM will get right. >> > >> > The only way to get this stuff right is to have a hardware reset, or >> > else to avoid all of the stateful modes in software. >> > >> >> So, how about adding a flag that says "my board has the NOR HW >> >> RESET pin wired" (there would be a DT props to set that flag). Then you >> >> add a WARN_ON() when this flag is not set and a NOR chip impacted by >> >> this bug is detected. >> > >> > I'd kinda prefer the reverse. There really isn't a need to document >> > anything for a working system (software usually can't control this >> > RESET pin). The burden should be on the b0rked system to document >> > where it needs unsound hacks to survive. >> > >> >> This way you make sure people are informed that >> >> they're doing something wrong, and for those who can't change their HW >> >> (because it's already widely deployed), you have a fix that improve >> >> things. >> > >> > Or even better: put this hack behind a DT flag, so that one has to >> > admit that their board design is broken before it will even do >> > anything. Proposal: "linux,badly-designed-flash-reset". >> > >> > But, I'd prefer just (partially?) reverting this, and let the authors >> > submit something that works. We're not obligated to keep bad hacks in >> > the kernel. >> > >> > Brian >> >> One possibility that occurred to me when I was exploring this issue is >> to revert to 3-byte mode whenever 4-byte was not actively in use. >> So any access beyond 16Meg is: >> switch-to-4-byte ; perform IO ; switch to 3-byte >> or similar. On my hardware it would be more efficient to >> use the 4-byte opcode to perform the IO, then reset the cached >> 4th address byte that the NOR chip transparently remembered. >> >> This adds a little overhead, but should be fairly robust. >> It doesn't help if something goes terribly wrong while IO is happening, >> but I don't think any other software solution does either. >> >> How would you see that approach? > > I think the problem stands: people that have proper HW mitigation for > this problem (NOR chip is reset when the Processor is reset) don't want > to pay the overhead. So, even if we go for this approach, we probably > want to only do that for broken HW. I agree that a "my-hardware-is-suboptimal" flag is appropriate. I was addressing the suggestion that the current approach doesn't handle all corner cases and was suggesting a different approach that might handle more corner-cases. I should have been more explicit about that. Thanks, NeilBrown
Hi, On Tue, Jul 24, 2018 at 11:51:49AM +1000, NeilBrown wrote: > On Tue, Jul 24 2018, Boris Brezillon wrote: > > On Tue, 24 Jul 2018 08:46:33 +1000 > > NeilBrown <neilb@suse.com> wrote: > >> One possibility that occurred to me when I was exploring this issue is > >> to revert to 3-byte mode whenever 4-byte was not actively in use. > >> So any access beyond 16Meg is: > >> switch-to-4-byte ; perform IO ; switch to 3-byte > >> or similar. On my hardware it would be more efficient to > >> use the 4-byte opcode to perform the IO, then reset the cached > >> 4th address byte that the NOR chip transparently remembered. Do these chips cache the last 4th-byte used? I don't recall reading that, but that would be interesting. It also sounds like that would make things even more difficult to do robustly. > >> This adds a little overhead, but should be fairly robust. > >> It doesn't help if something goes terribly wrong while IO is happening, > >> but I don't think any other software solution does either. > >> > >> How would you see that approach? > > > > I think the problem stands: people that have proper HW mitigation for > > this problem (NOR chip is reset when the Processor is reset) don't want > > to pay the overhead. So, even if we go for this approach, we probably > > want to only do that for broken HW. If it actually saves bytes on the wire to stay in 3-byte mode more often, then that could be helpful to all systems. But otherwise, yes, it doesn't really belong on a properly-designed system. > I agree that a "my-hardware-is-suboptimal" flag is appropriate. > I was addressing the suggestion that the current approach doesn't handle > all corner cases and was suggesting a different approach that might > handle more corner-cases. I should have been more explicit about that. If you want to talk about optimizing the broken-hardware hack, then fine. That's not my interest at all. Brian
Hi Boris, On Tue, Jul 24, 2018 at 01:18:11AM +0200, Boris Brezillon wrote: > On Mon, 23 Jul 2018 15:06:43 -0700 > Brian Norris <computersforpeace@gmail.com> wrote: > > On Mon, Jul 23, 2018 at 1:10 PM, Boris Brezillon > > <boris.brezillon@bootlin.com> wrote: > > > but it still seems to improve things. Of course, that means the > > > user should try to re-route all HW reset sources to SW ones (RESET input > > > pin muxed to the GPIO controller, watchdog generating an interrupt > > > instead of directly asserting the RESET output pin), which is not always > > > possible, but even when it's not, isn't it better to have a setup that > > > works fine 99% of the time instead of 50% of the time? > > > > Perhaps, but not at the expense of future development. And > > realistically, no one is doing that if they have this hack. Most > > people won't even know that this hack is protecting them at all (so > > again, they won't try to mitigate the problem any further). > > Unless we add a huge backtrace at probe time which forces them to look > closer at what they did wrong (like you seem to suggest below). Sure, if we can force a huge backtrace, then maybe that's an acceptable situation. I just predict that someday, somebody will argue that the warning should be downgraded, since there's nothing the average user can do about it. (Well, except not buying from that manufacturer in the future.) > > Or even better: put this hack behind a DT flag, so that one has to > > admit that their board design is broken before it will even do > > anything. Proposal: "linux,badly-designed-flash-reset". > > I think we can remove the "linux," prefix. If it's badly designed, it > applies to all OSes, don't you think? Sure. Shall I send a patch? Brian
On Tue, 24 Jul 2018 12:52:02 -0700 Brian Norris <computersforpeace@gmail.com> wrote: > > > Or even better: put this hack behind a DT flag, so that one has to > > > admit that their board design is broken before it will even do > > > anything. Proposal: "linux,badly-designed-flash-reset". > > > > I think we can remove the "linux," prefix. If it's badly designed, it > > applies to all OSes, don't you think? > > Sure. > > Shall I send a patch? Yep.
On Tue, Jul 24 2018, Brian Norris wrote: > Hi, > > On Tue, Jul 24, 2018 at 11:51:49AM +1000, NeilBrown wrote: >> On Tue, Jul 24 2018, Boris Brezillon wrote: >> > On Tue, 24 Jul 2018 08:46:33 +1000 >> > NeilBrown <neilb@suse.com> wrote: >> >> One possibility that occurred to me when I was exploring this issue is >> >> to revert to 3-byte mode whenever 4-byte was not actively in use. >> >> So any access beyond 16Meg is: >> >> switch-to-4-byte ; perform IO ; switch to 3-byte >> >> or similar. On my hardware it would be more efficient to >> >> use the 4-byte opcode to perform the IO, then reset the cached >> >> 4th address byte that the NOR chip transparently remembered. > > Do these chips cache the last 4th-byte used? I don't recall reading > that, but that would be interesting. It also sounds like that would make > things even more difficult to do robustly. That is how the Winbond W25Q256FV appears to behave in my experiments. Any time you use a 4-byte address, the high byte is saved. Any time you use a 3-byte address, the saved high-byte is used. The data sheet seems to support this understanding, as detailed in Commit: f134fbbb4ff8 ("mtd: spi-nor: clear Winbond Extended Address Reg on switch to 3-byte addressing.") > >> >> This adds a little overhead, but should be fairly robust. >> >> It doesn't help if something goes terribly wrong while IO is happening, >> >> but I don't think any other software solution does either. >> >> >> >> How would you see that approach? >> > >> > I think the problem stands: people that have proper HW mitigation for >> > this problem (NOR chip is reset when the Processor is reset) don't want >> > to pay the overhead. So, even if we go for this approach, we probably >> > want to only do that for broken HW. > > If it actually saves bytes on the wire to stay in 3-byte mode more > often, then that could be helpful to all systems. But otherwise, yes, it > doesn't really belong on a properly-designed system. I'm not sure exactly what you encompass by the term "properly-designed", but with the SOC that I have (mt7621) and the flash chip (Winbond W25Q256FV) it is not possible to wire them up in any way that will work reliably without software support for leaving 3-byte mode. The SOC does not have an out-going reset line that signals a general system reset - it only has one that signals watchdog reset. The flash chip has an incoming reset line, but it shares a pin with "hold", and that is the default use. So we would need to program the flash to listen for reset, and it would only catch a watchdog reset. For any other reset, the CPU is (should be) in complete control (even after a panic!) and it needs to reprogram the flash chip before resetting the system. But maybe you think either the SOC and/or the flash chip is not "properly-designed" - and you may be correct.. Thanks, NeilBrown > >> I agree that a "my-hardware-is-suboptimal" flag is appropriate. >> I was addressing the suggestion that the current approach doesn't handle >> all corner cases and was suggesting a different approach that might >> handle more corner-cases. I should have been more explicit about that. > > If you want to talk about optimizing the broken-hardware hack, then > fine. That's not my interest at all. > > Brian
Hi, On Wed, Jul 25, 2018 at 07:48:21AM +1000, NeilBrown wrote: > On Tue, Jul 24 2018, Brian Norris wrote: > > On Tue, Jul 24, 2018 at 11:51:49AM +1000, NeilBrown wrote: > >> On Tue, Jul 24 2018, Boris Brezillon wrote: > >> > On Tue, 24 Jul 2018 08:46:33 +1000 > >> > NeilBrown <neilb@suse.com> wrote: > >> >> One possibility that occurred to me when I was exploring this issue is > >> >> to revert to 3-byte mode whenever 4-byte was not actively in use. > >> >> So any access beyond 16Meg is: > >> >> switch-to-4-byte ; perform IO ; switch to 3-byte > >> >> or similar. On my hardware it would be more efficient to > >> >> use the 4-byte opcode to perform the IO, then reset the cached > >> >> 4th address byte that the NOR chip transparently remembered. > > > > Do these chips cache the last 4th-byte used? I don't recall reading > > that, but that would be interesting. It also sounds like that would make > > things even more difficult to do robustly. > > That is how the Winbond W25Q256FV appears to behave in my experiments. > Any time you use a 4-byte address, the high byte is saved. Any time you > use a 3-byte address, the saved high-byte is used. > The data sheet seems to support this understanding, as detailed in > > Commit: f134fbbb4ff8 ("mtd: spi-nor: clear Winbond Extended Address Reg on switch to 3-byte addressing.") Thanks for the notes. I never really paid attention to that section of the datasheet, since I always relied on a hardware reset. By the way, doesn't this part support the dedicated (non-stateful) 4-byte address commands? Why not use those and avoid the problem entirely? Although, it's not actually clear to me from the datasheet whether, e.g., Fast Read with 4-byte Address (0Ch) will set the Extended Address Register. > >> >> This adds a little overhead, but should be fairly robust. > >> >> It doesn't help if something goes terribly wrong while IO is happening, > >> >> but I don't think any other software solution does either. > >> >> > >> >> How would you see that approach? > >> > > >> > I think the problem stands: people that have proper HW mitigation for > >> > this problem (NOR chip is reset when the Processor is reset) don't want > >> > to pay the overhead. So, even if we go for this approach, we probably > >> > want to only do that for broken HW. > > > > If it actually saves bytes on the wire to stay in 3-byte mode more > > often, then that could be helpful to all systems. But otherwise, yes, it > > doesn't really belong on a properly-designed system. > > I'm not sure exactly what you encompass by the term "properly-designed", I consider all systems with >16MiB SPI flash that have stateful addressing modes and don't have a useful HW RESET signal to be improperly designed. These SPI flash never had a standardized (or easily-determined) software reset, so it's nearly impossible to write robust software for them that is compatible with a relatively generic boot ROM and/or bootloader. > but with the SOC that I have (mt7621) and the flash chip (Winbond > W25Q256FV) it is not possible to wire them up in any way that will work > reliably without software support for leaving 3-byte mode. > The SOC does not have an out-going reset line that signals a general > system reset - it only has one that signals watchdog reset. > The flash chip has an incoming reset line, but it shares a pin with > "hold", and that is the default use. So we would need to program the (BTW, you could probably choose to reprogram the HOLD# line to RESET# before switching to 4-byte mode. That still doesn't resolve the SoC reset line issue on its own.) > flash to listen for reset, and it would only catch a watchdog reset. > For any other reset, the CPU is (should be) in complete control (even > after a panic!) and it needs to reprogram the flash chip before > resetting the system. "Even after a panic" is a bit dubious. That presumes that the cause of the panic didn't also corrupt your exception code, driver code, driver data structures, etc., that would be needed to reset the SPI chip without double-faulting. And what about a panic in your SPI flash driver? Overall, my point is that the existence of such non-hardware-resettable flash guarantees the possibility of a hard enough crash that you cannot possibly reset it to 3-byte addressing before system reset. > But maybe you think either the SOC and/or the flash chip is not > "properly-designed" - and you may be correct.. Yes, I believe I do. If your watchdog reset pin was routed to the flash HOLD/RESET pin, then I suppose it would be *possible* to have robust software. But otherwise, I believe it's impossible. Brian
diff --git a/drivers/mtd/devices/m25p80.c b/drivers/mtd/devices/m25p80.c index dbe6a1de2bb8..a4e18f6aaa33 100644 --- a/drivers/mtd/devices/m25p80.c +++ b/drivers/mtd/devices/m25p80.c @@ -307,10 +307,18 @@ static int m25p_remove(struct spi_device *spi) { struct m25p *flash = spi_get_drvdata(spi); + spi_nor_restore(&flash->spi_nor); + /* Clean up MTD stuff. */ return mtd_device_unregister(&flash->spi_nor.mtd); } +static void m25p_shutdown(struct spi_device *spi) +{ + struct m25p *flash = spi_get_drvdata(spi); + + spi_nor_restore(&flash->spi_nor); +} /* * Do NOT add to this array without reading the following: * @@ -386,6 +394,7 @@ static struct spi_driver m25p80_driver = { .id_table = m25p_ids, .probe = m25p_probe, .remove = m25p_remove, + .shutdown = m25p_shutdown, /* REVISIT: many of these chips have deep power-down modes, which * should clearly be entered on suspend() to minimize power use.