Message ID | 1354740362-20412-1-git-send-email-w.sang@pengutronix.de |
---|---|
State | Accepted |
Commit | 6f2a6a52560ad8d85710aabd92b7a3239b3a6b07 |
Headers | show |
于 2012年12月06日 04:46, Wolfram Sang 写道:
> correctly after warm rebooting, so the kernel could not find the UBI or
which Soc do you meet this issue? the mx23 or mx28?
It's bad news to me. I ever thought the BCH-reset-issue is gone.
I ever tested many times in mx28(> 10000 times) with freescale's uboot.
I guest you are not use the freescale's uboot. The uboot also will reset
the BCH/GPMI.
thanks
Huang Shijie
On Thu, Dec 06, 2012 at 10:16:08AM +0800, Huang Shijie wrote: > 于 2012年12月06日 04:46, Wolfram Sang 写道: > > correctly after warm rebooting, so the kernel could not find the UBI or > which Soc do you meet this issue? the mx23 or mx28? MX28. > It's bad news to me. I ever thought the BCH-reset-issue is gone. > I ever tested many times in mx28(> 10000 times) with freescale's uboot. Did you power-cycle between each test or reboot? I still need to test if this really makes a difference, but I think the issue shows more often when only rebooting. > I guest you are not use the freescale's uboot. The uboot also will reset > the BCH/GPMI. I just checked its sources. It also has this flaw. GPMI and BCH need to be reset at the same time, i.e. before first commands are sent to the NAND (although they don't need ECC). I use barebox, but that isn't of importance here. If I don't fix barebox, I see NAND issues in barebox or the kernel. If I fix barebox, I still see issues in the kernel, but not in barebox anymore. The setup needs to be done properly to ensure a consistent state at the beginning, especially when somebody used the NAND before (ROM code, bootloader). Fixing the bootloader alone is not enough. Thanks, Wolfram
On Thu, Dec 6, 2012 at 4:52 AM, Wolfram Sang <w.sang@pengutronix.de> wrote: > On Thu, Dec 06, 2012 at 10:16:08AM +0800, Huang Shijie wrote: >> 于 2012年12月06日 04:46, Wolfram Sang 写道: >> > correctly after warm rebooting, so the kernel could not find the UBI or >> which Soc do you meet this issue? the mx23 or mx28? > > MX28. > >> It's bad news to me. I ever thought the BCH-reset-issue is gone. >> I ever tested many times in mx28(> 10000 times) with freescale's uboot. > > Did you power-cycle between each test or reboot? I still need to test if What's the meaning of "power-cycle"? I only tested with the soft reset, the power is never shut down. > this really makes a difference, but I think the issue shows more often > when only rebooting. > >> I guest you are not use the freescale's uboot. The uboot also will reset >> the BCH/GPMI. > > I just checked its sources. It also has this flaw. GPMI and BCH need to > be reset at the same time, i.e. before first commands are sent to the > NAND (although they don't need ECC). > With your patch, we have resetted the BCH twice in the driver. Could we only reset BCH one time? Do you ever remove another reset-bch code, and test it? Frankly speaking, it's strange to reset the BCH twice, and it makes no sense. :( I am afraid that we have to add three-reset-bch in the future, if we can not find the root cause. If you can remove another reset-bch code, i will ack the patch. > I use barebox, but that isn't of importance here. If I don't fix > barebox, I see NAND issues in barebox or the kernel. If I fix barebox, I > still see issues in the kernel, but not in barebox anymore. > I ever met the same case with the freescale's BSP code. I fixed the issue in uboot, and fixed the driver too. > The setup needs to be done properly to ensure a consistent state at the > beginning, especially when somebody used the NAND before (ROM code, At the beginning? it's really interesting. I want to test it myself. could you wait for some time? Best Regards Huang Shijie > bootloader). Fixing the bootloader alone is not enough. > > Thanks, > > Wolfram > > -- > Pengutronix e.K. | Wolfram Sang | > Industrial Linux Solutions | http://www.pengutronix.de/ | > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.10 (GNU/Linux) > > iEYEARECAAYFAlDAas0ACgkQD27XaX1/VRs8iQCgrqb5nADGL5Dyi+csfSdSLWSR > /mUAoJN91gz+1QiqVUsNdHtwX9jpAdZC > =nMk3 > -----END PGP SIGNATURE----- > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel >
> > Did you power-cycle between each test or reboot? I still need to test if > What's the meaning of "power-cycle"? power off - power on > I only tested with the soft reset, the power is never shut down. OK. > With your patch, we have resetted the BCH twice in the driver. Could > we only reset BCH one time? You tell me, please :) Is it needed when changing the layout? I don't have any setup to test that. Unless I can test that, I prefer to not remove it. I am a bit anxious, because the reason for the stalled BCH is not fully understood and all I can reliably say is that resetting twice does not hurt. > Do you ever remove another reset-bch code, and test it? It will probably work, but we don't cover the case of changing the layout? > Frankly speaking, it's strange to reset the BCH twice, and it makes no sense. :( The stalled BCH makes no sense either, currently. > I am afraid that we have to add three-reset-bch in the future, if we > can not find the root cause. Well, ask the IC guys if something can go wrong when the BCH has been active and the GPMI gets reset and issues NAND commands (without needing ECC). > If you can remove another reset-bch code, i will ack the patch. Removing code that late in the cycle without proof sounds dangerous to me. > > The setup needs to be done properly to ensure a consistent state at the > > beginning, especially when somebody used the NAND before (ROM code, > At the beginning? it's really interesting. > I want to test it myself. > > could you wait for some time? What is some time? :) Thanks, Wolfram
于 2012年12月06日 17:52, Wolfram Sang 写道: > The setup needs to be done properly to ensure a consistent state at the > beginning, especially when somebody used the NAND before (ROM code, > bootloader). Fixing the bootloader alone is not enough. I tried to make some time to test it. But i am too busy this week. Anyway, let this patch get merged. Acked-by: Huang Shijie <b32955@freescale.com> thanks Huang Shijie
On Wed, 2012-12-05 at 21:46 +0100, Wolfram Sang wrote: > It could happen (1 out of 100 times) that NAND did not start up > correctly after warm rebooting, so the kernel could not find the UBI or > DMA timed out due to a stalled BCH. When resetting BCH together with > GPMI, the issue could not be observed anymore (after 10000+ reboots). We > probably need the consistent state already before sending any command to > NAND, even when no ECC is needed. I chose to keep the extra reset for > BCH when changing the flash layout to be on the safe side. > > Signed-off-by: Wolfram Sang <w.sang@pengutronix.de> > Cc: Huang Shijie <b32955@freescale.com> > Cc: Artem Bityutskiy <dedekind1@gmail.com> > Cc: David Woodhouse <david.woodhouse@intel.com> > --- > > Would be really great to have in 3.7. Also a stable-candidate IMO. Pushed to l2-mtd.git, thanks If you believe this is suitable for stable, please let me know.
On Wed, Dec 12, 2012 at 05:11:43PM +0200, Artem Bityutskiy wrote: > On Wed, 2012-12-05 at 21:46 +0100, Wolfram Sang wrote: > > It could happen (1 out of 100 times) that NAND did not start up > > correctly after warm rebooting, so the kernel could not find the UBI or > > DMA timed out due to a stalled BCH. When resetting BCH together with > > GPMI, the issue could not be observed anymore (after 10000+ reboots). We > > probably need the consistent state already before sending any command to > > NAND, even when no ECC is needed. I chose to keep the extra reset for > > BCH when changing the flash layout to be on the safe side. > > > > Signed-off-by: Wolfram Sang <w.sang@pengutronix.de> > > Cc: Huang Shijie <b32955@freescale.com> > > Cc: Artem Bityutskiy <dedekind1@gmail.com> > > Cc: David Woodhouse <david.woodhouse@intel.com> > > --- > > > > Would be really great to have in 3.7. Also a stable-candidate IMO. > > Pushed to l2-mtd.git, thanks > > If you believe this is suitable for stable, please let me know. I strongly believe that.
On Wed, 2012-12-12 at 17:01 +0100, Wolfram Sang wrote: > > If you believe this is suitable for stable, please let me know. > > I strongly believe that. OK, added Cc: stable@vger.kernel.org
diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-lib.c b/drivers/mtd/nand/gpmi-nand/gpmi-lib.c index 3502acc..84f0526 100644 --- a/drivers/mtd/nand/gpmi-nand/gpmi-lib.c +++ b/drivers/mtd/nand/gpmi-nand/gpmi-lib.c @@ -166,6 +166,15 @@ int gpmi_init(struct gpmi_nand_data *this) if (ret) goto err_out; + /* + * Reset BCH here, too. We got failures otherwise :( + * See later BCH reset for explanation of MX23 handling + */ + ret = gpmi_reset_block(r->bch_regs, GPMI_IS_MX23(this)); + if (ret) + goto err_out; + + /* Choose NAND mode. */ writel(BM_GPMI_CTRL1_GPMI_MODE, r->gpmi_regs + HW_GPMI_CTRL1_CLR);
It could happen (1 out of 100 times) that NAND did not start up correctly after warm rebooting, so the kernel could not find the UBI or DMA timed out due to a stalled BCH. When resetting BCH together with GPMI, the issue could not be observed anymore (after 10000+ reboots). We probably need the consistent state already before sending any command to NAND, even when no ECC is needed. I chose to keep the extra reset for BCH when changing the flash layout to be on the safe side. Signed-off-by: Wolfram Sang <w.sang@pengutronix.de> Cc: Huang Shijie <b32955@freescale.com> Cc: Artem Bityutskiy <dedekind1@gmail.com> Cc: David Woodhouse <david.woodhouse@intel.com> --- Would be really great to have in 3.7. Also a stable-candidate IMO. drivers/mtd/nand/gpmi-nand/gpmi-lib.c | 9 +++++++++ 1 file changed, 9 insertions(+)