NAND: failure reported with mtd_oobtest(read delay violated)

Submitted by Sriram on Nov. 12, 2009, 3:58 p.m.

Details

Message ID FCCFB4CDC6E5564B9182F639FC35608702FA5CC012@dbde02.ent.ti.com
State New, archived
Headers show

Commit Message

Sriram Nov. 12, 2009, 3:58 p.m.
Hello,
On the OMAP3 EVM platform we are seeing failures when we run the mtd_oobtest repeatedly (overnight). Usually the failures are reported after 40-45 iterations.

Further analysis with logic analyzer seems to indicate that for the specific instance where failure is observed, the time delay between dispatching the READ0 command and the start of read operation is violated. (udelay loop would start even before the write is reflected on the HW)

OMAP3EVM implementation relies on command delay(chip_delay set to 50ms) and doesn't use wait pin monitoring. By adding a data memory barrier dmb() instruction right after the command phase the issue seems to be resolved.

Have we seen similar behavior on other platforms that rely on command delay and do we specifically require a dmb() after the command is written?

DIFF Details:(will post this a patch once I hear your review comments)



Regards
Sriram

Comments

Russell King - ARM Linux Nov. 12, 2009, 8:59 p.m.
On Thu, Nov 12, 2009 at 09:28:58PM +0530, Govindarajan, Sriramakrishnan wrote:
> OMAP3EVM implementation relies on command delay(chip_delay set to 50ms)
> and doesn't use wait pin monitoring. By adding a data memory barrier
> dmb() instruction right after the command phase the issue seems to be
> resolved.

Clearly adding ARM specific calls into generic code is the wrong thing
to do...  Why can't this dmb() go into whatever NAND driver OMAP is using?
Artem Bityutskiy Nov. 13, 2009, 8:37 a.m.
On Thu, 2009-11-12 at 21:28 +0530, Govindarajan, Sriramakrishnan wrote:
> Hello,
> On the OMAP3 EVM platform we are seeing failures when we run the mtd_oobtest repeatedly (overnight). Usually the failures are reported after 40-45 iterations.
> 
> Further analysis with logic analyzer seems to indicate that for the specific instance where failure is observed, the time delay between dispatching the READ0 command and the start of read operation is violated. (udelay loop would start even before the write is reflected on the HW)
> 
> OMAP3EVM implementation relies on command delay(chip_delay set to 50ms) and doesn't use wait pin monitoring. By adding a data memory barrier dmb() instruction right after the command phase the issue seems to be resolved.
> 
> Have we seen similar behavior on other platforms that rely on command delay and do we specifically require a dmb() after the command is written?
> 
> DIFF Details:(will post this a patch once I hear your review comments)
> 
> diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
> index 2211386..8df3780 100644
> --- a/drivers/mtd/nand/nand_base.c
> +++ b/drivers/mtd/nand/nand_base.c
> @@ -651,6 +651,7 @@ static void nand_command_lp(struct mtd_info *mtd, unsigned int command,
>                                NAND_NCE | NAND_CLE | NAND_CTRL_CHANGE);
>                 chip->cmd_ctrl(mtd, NAND_CMD_NONE,
>                                NAND_NCE | NAND_CTRL_CHANGE);
> +               dmb();

Did you try using 'wmb()' instead? Not 100% sure, but it looks like the
barrier you need. However, if you read
'Documentation/memory-barriers.txt', you will probably find the right
way to go.
Sriram Nov. 13, 2009, 11:04 a.m.
> -----Original Message-----
> From: Russell King - ARM Linux [mailto:linux@arm.linux.org.uk]
> Sent: Friday, November 13, 2009 2:29 AM
> To: Govindarajan, Sriramakrishnan
> Cc: 'linux-mtd@lists.infradead.org'; 'linux-arm-
> kernel@lists.infradead.org'
> Subject: Re: NAND: failure reported with mtd_oobtest(read delay violated)
> 
> On Thu, Nov 12, 2009 at 09:28:58PM +0530, Govindarajan, Sriramakrishnan
> wrote:
> > OMAP3EVM implementation relies on command delay(chip_delay set to 50ms)
> > and doesn't use wait pin monitoring. By adding a data memory barrier
> > dmb() instruction right after the command phase the issue seems to be
> > resolved.
> 
> Clearly adding ARM specific calls into generic code is the wrong thing
> to do...  Why can't this dmb() go into whatever NAND driver OMAP is using?

Russell, thanks for the comments. But I posed this question
assuming that this a more generic problem that may show up on other
rchitectures/platforms where the NAND driver is using delay loop
instead of wait pin monitoring. Is there a more generic alternative?

Also, the time delay is implemented in the nand(nand_base.c) core file itself - there aren't platform specific hooks for us to override/customize for OMAP specifically. I am also looking at the other alternative of using wait pin monitoring instead

Regards
Sriram

Patch hide | download patch | download mbox

diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
index 2211386..8df3780 100644
--- a/drivers/mtd/nand/nand_base.c
+++ b/drivers/mtd/nand/nand_base.c
@@ -651,6 +651,7 @@  static void nand_command_lp(struct mtd_info *mtd, unsigned int command,
                               NAND_NCE | NAND_CLE | NAND_CTRL_CHANGE);
                chip->cmd_ctrl(mtd, NAND_CMD_NONE,
                               NAND_NCE | NAND_CTRL_CHANGE);
+               dmb();

                /* This applies to read commands */
        default: