Patchwork NAND: failure reported with mtd_oobtest(read delay violated)

login
register
mail settings
Submitter Sriram
Date Nov. 12, 2009, 3:58 p.m.
Message ID <FCCFB4CDC6E5564B9182F639FC35608702FA5CC012@dbde02.ent.ti.com>
Download mbox | patch
Permalink /patch/38257/
State New, archived
Headers show

Comments

Sriram - Nov. 12, 2009, 3:58 p.m.
Hello,
On the OMAP3 EVM platform we are seeing failures when we run the mtd_oobtest repeatedly (overnight). Usually the failures are reported after 40-45 iterations.

Further analysis with logic analyzer seems to indicate that for the specific instance where failure is observed, the time delay between dispatching the READ0 command and the start of read operation is violated. (udelay loop would start even before the write is reflected on the HW)

OMAP3EVM implementation relies on command delay(chip_delay set to 50ms) and doesn't use wait pin monitoring. By adding a data memory barrier dmb() instruction right after the command phase the issue seems to be resolved.

Have we seen similar behavior on other platforms that rely on command delay and do we specifically require a dmb() after the command is written?

DIFF Details:(will post this a patch once I hear your review comments)



Regards
Sriram
Russell King - ARM Linux - Nov. 12, 2009, 8:59 p.m.
On Thu, Nov 12, 2009 at 09:28:58PM +0530, Govindarajan, Sriramakrishnan wrote:
> OMAP3EVM implementation relies on command delay(chip_delay set to 50ms)
> and doesn't use wait pin monitoring. By adding a data memory barrier
> dmb() instruction right after the command phase the issue seems to be
> resolved.

Clearly adding ARM specific calls into generic code is the wrong thing
to do...  Why can't this dmb() go into whatever NAND driver OMAP is using?
Artem Bityutskiy - Nov. 13, 2009, 8:37 a.m.
On Thu, 2009-11-12 at 21:28 +0530, Govindarajan, Sriramakrishnan wrote:
> Hello,
> On the OMAP3 EVM platform we are seeing failures when we run the mtd_oobtest repeatedly (overnight). Usually the failures are reported after 40-45 iterations.
> 
> Further analysis with logic analyzer seems to indicate that for the specific instance where failure is observed, the time delay between dispatching the READ0 command and the start of read operation is violated. (udelay loop would start even before the write is reflected on the HW)
> 
> OMAP3EVM implementation relies on command delay(chip_delay set to 50ms) and doesn't use wait pin monitoring. By adding a data memory barrier dmb() instruction right after the command phase the issue seems to be resolved.
> 
> Have we seen similar behavior on other platforms that rely on command delay and do we specifically require a dmb() after the command is written?
> 
> DIFF Details:(will post this a patch once I hear your review comments)
> 
> diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
> index 2211386..8df3780 100644
> --- a/drivers/mtd/nand/nand_base.c
> +++ b/drivers/mtd/nand/nand_base.c
> @@ -651,6 +651,7 @@ static void nand_command_lp(struct mtd_info *mtd, unsigned int command,
>                                NAND_NCE | NAND_CLE | NAND_CTRL_CHANGE);
>                 chip->cmd_ctrl(mtd, NAND_CMD_NONE,
>                                NAND_NCE | NAND_CTRL_CHANGE);
> +               dmb();

Did you try using 'wmb()' instead? Not 100% sure, but it looks like the
barrier you need. However, if you read
'Documentation/memory-barriers.txt', you will probably find the right
way to go.
Sriram - Nov. 13, 2009, 11:04 a.m.
> -----Original Message-----
> From: Russell King - ARM Linux [mailto:linux@arm.linux.org.uk]
> Sent: Friday, November 13, 2009 2:29 AM
> To: Govindarajan, Sriramakrishnan
> Cc: 'linux-mtd@lists.infradead.org'; 'linux-arm-
> kernel@lists.infradead.org'
> Subject: Re: NAND: failure reported with mtd_oobtest(read delay violated)
> 
> On Thu, Nov 12, 2009 at 09:28:58PM +0530, Govindarajan, Sriramakrishnan
> wrote:
> > OMAP3EVM implementation relies on command delay(chip_delay set to 50ms)
> > and doesn't use wait pin monitoring. By adding a data memory barrier
> > dmb() instruction right after the command phase the issue seems to be
> > resolved.
> 
> Clearly adding ARM specific calls into generic code is the wrong thing
> to do...  Why can't this dmb() go into whatever NAND driver OMAP is using?

Russell, thanks for the comments. But I posed this question
assuming that this a more generic problem that may show up on other
rchitectures/platforms where the NAND driver is using delay loop
instead of wait pin monitoring. Is there a more generic alternative?

Also, the time delay is implemented in the nand(nand_base.c) core file itself - there aren't platform specific hooks for us to override/customize for OMAP specifically. I am also looking at the other alternative of using wait pin monitoring instead

Regards
Sriram

Patch

diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
index 2211386..8df3780 100644
--- a/drivers/mtd/nand/nand_base.c
+++ b/drivers/mtd/nand/nand_base.c
@@ -651,6 +651,7 @@  static void nand_command_lp(struct mtd_info *mtd, unsigned int command,
                               NAND_NCE | NAND_CLE | NAND_CTRL_CHANGE);
                chip->cmd_ctrl(mtd, NAND_CMD_NONE,
                               NAND_NCE | NAND_CTRL_CHANGE);
+               dmb();

                /* This applies to read commands */
        default: