Patchwork [v2] mtd-nand: davinci: correct 4-bit error correction

login
register
mail settings
Submitter Rajashekhara, Sudhakar
Date July 15, 2010, 5:10 a.m.
Message ID <1279170641-24494-1-git-send-email-sudhakar.raj@ti.com>
Download mbox | patch
Permalink /patch/58963/
State New
Headers show

Comments

Rajashekhara, Sudhakar - July 15, 2010, 5:10 a.m.
On TI's DA830/OMAP-L137, DA850/OMAP-L138 and DM365, after setting the
4BITECC_ADD_CALC_START bit in the NAND Flash control register to 1 and
before waiting for the NAND Flash status register to be equal to 1, 2 or
3, we have to wait till the ECC HW goes to correction state.  Without this
wait, ECC correction calculations will not be proper.

This has been tested on DA830/OMAP-L137, DA850/OMAP-L138, DM355 and DM365
EVMs.

Signed-off-by: Sudhakar Rajashekhara <sudhakar.raj@ti.com>
Acked-by: Sneha Narnakaje <nsnehaprabha@ti.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
Since v1:
a. Timeout has been changed from 100 msec to 100 usec.
b. Comment above the do, while loop was not matching the code.
   This has been corrected.
c. Initialization of 'timeo' variable has been moved down.
d. It was observed that, while calculating the time in the loop,
   if there is a context switch between setting the 4BITECC_ADD_CALC_START
   bit and reading of ECC_STATE field, then the loop will not come out
   until the timeout happens. To prevent the context switch, spin_lock_irqsave
   and spin_unlock_irqrestore are used.

 drivers/mtd/nand/davinci_nand.c |   22 ++++++++++++++++++++++
 1 files changed, 22 insertions(+), 0 deletions(-)
Jon Povey - July 15, 2010, 11:01 a.m.
Sudhakar Rajashekhara wrote:
> On TI's DA830/OMAP-L137, DA850/OMAP-L138 and DM365, after setting the
> 4BITECC_ADD_CALC_START bit in the NAND Flash control register to 1 and
> before waiting for the NAND Flash status register to be equal to 1, 2
> or 3, we have to wait till the ECC HW goes to correction state.
> Without this wait, ECC correction calculations will not be proper.
>
> This has been tested on DA830/OMAP-L137, DA850/OMAP-L138, DM355 and
> DM365 EVMs.
>
> Signed-off-by: Sudhakar Rajashekhara <sudhakar.raj@ti.com>
> Acked-by: Sneha Narnakaje <nsnehaprabha@ti.com>
> Cc: David Woodhouse <dwmw2@infradead.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Have these people acked and signed off this new version of the patch?

> Since v1:
> a. Timeout has been changed from 100 msec to 100 usec.
> b. Comment above the do, while loop was not matching the code.
>    This has been corrected.
> c. Initialization of 'timeo' variable has been moved down.
> d. It was observed that, while calculating the time in the loop,
>    if there is a context switch between setting the
>    4BITECC_ADD_CALC_START bit and reading of ECC_STATE field, then
>    the loop will not come out until the timeout happens. To prevent
>    the context switch, spin_lock_irqsave and spin_unlock_irqrestore
> are used.

d. means interrupts are disabled for up to 100us while waiting for ECC.
Doesn't sound good if it can be avoided.

How about instead of spinlock, set timeo first time inside the loop?

From this:

+       timeo = jiffies + usecs_to_jiffies(100);
+       do {
+               ecc_state = (davinci_nand_readl(info,
+                               NANDFSR_OFFSET) >> 8) & 0x0f;
+               cpu_relax();
+       } while ((ecc_state < 4) && time_before(jiffies, timeo));
+       spin_unlock_irqrestore(&ecc_spin_lock, flags);

To something like:

+       timeo = 0;
+       do {
+               ecc_state = (davinci_nand_readl(info,
+                               NANDFSR_OFFSET) >> 8) & 0x0f;
+               if (!timeo)
+                       timeo = jiffies + usecs_to_jiffies(100);
+               cpu_relax();
+       } while ((ecc_state < 4) && time_before(jiffies, timeo));

Sorry if my mailer has messed up the formatting. Hopefuly readable.

--
Jon Povey
jon.povey@racelogic.co.uk

Racelogic is a limited company registered in England. Registered number 2743719 .
Registered Office Unit 10, Swan Business Centre, Osier Way, Buckingham, Bucks, MK18 1TB .

The information contained in this electronic mail transmission is intended by Racelogic Ltd for the use of the named individual or entity to which it is directed and may contain information that is confidential or privileged. If you have received this electronic mail transmission in error, please delete it from your system without copying or forwarding it, and notify the sender of the error by reply email so that the sender's address records can be corrected. The views expressed by the sender of this communication do not necessarily represent those of Racelogic Ltd. Please note that Racelogic reserves the right to monitor e-mail communications passing through its network
Rajashekhara, Sudhakar - July 15, 2010, 11:41 a.m.
Hi,

On Thu, Jul 15, 2010 at 16:31:19, Jon Povey wrote:
> Sudhakar Rajashekhara wrote:
> > On TI's DA830/OMAP-L137, DA850/OMAP-L138 and DM365, after setting the
> > 4BITECC_ADD_CALC_START bit in the NAND Flash control register to 1 and
> > before waiting for the NAND Flash status register to be equal to 1, 2
> > or 3, we have to wait till the ECC HW goes to correction state.
> > Without this wait, ECC correction calculations will not be proper.
> >
> > This has been tested on DA830/OMAP-L137, DA850/OMAP-L138, DM355 and
> > DM365 EVMs.
> >
> > Signed-off-by: Sudhakar Rajashekhara <sudhakar.raj@ti.com>
> > Acked-by: Sneha Narnakaje <nsnehaprabha@ti.com>
> > Cc: David Woodhouse <dwmw2@infradead.org>
> > Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> 
> Have these people acked and signed off this new version of the patch?
> 

No. Andrew Morton has not signed off this version. I'll remove
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

> > Since v1:
> > a. Timeout has been changed from 100 msec to 100 usec.
> > b. Comment above the do, while loop was not matching the code.
> >    This has been corrected.
> > c. Initialization of 'timeo' variable has been moved down.
> > d. It was observed that, while calculating the time in the loop,
> >    if there is a context switch between setting the
> >    4BITECC_ADD_CALC_START bit and reading of ECC_STATE field, then
> >    the loop will not come out until the timeout happens. To prevent
> >    the context switch, spin_lock_irqsave and spin_unlock_irqrestore
> > are used.
> 
> d. means interrupts are disabled for up to 100us while waiting for ECC.
> Doesn't sound good if it can be avoided.
> 

I wanted to avoid context switching between setting 4BITECC_ADD_CALC_START
bit:

	davinci_nand_writel(info, NANDFCR_OFFSET,
 			davinci_nand_readl(info, NANDFCR_OFFSET) | BIT(13));

And reading of ECC_STATE field inside the loop:

+		ecc_state = (davinci_nand_readl(info,
+				NANDFSR_OFFSET) >> 8) & 0x0f;

Because even if there was a slight delay after setting 4BITECC_ADD_CALC_START,
ECC_STATE would read < 4(meaning ECC engine has completed Error correction)
and I would wait in the loop for up to 100us.

> How about instead of spinlock, set timeo first time inside the loop?

I don't think this would also solve the problem which I stated above.

Adding spinlock was a blunder as that would cause jiffies not to increment.
I'll be removing the spinlocks and will be resubmitting the patch.

Regards,
Sudhakar
Rajashekhara, Sudhakar - July 15, 2010, 12:03 p.m.
On Thu, Jul 15, 2010 at 17:11:32, Sudhakar Rajashekhara wrote:
> Hi,
> 
> On Thu, Jul 15, 2010 at 16:31:19, Jon Povey wrote:
> > Sudhakar Rajashekhara wrote:
> > > On TI's DA830/OMAP-L137, DA850/OMAP-L138 and DM365, after setting the
> > > 4BITECC_ADD_CALC_START bit in the NAND Flash control register to 1 and
> > > before waiting for the NAND Flash status register to be equal to 1, 2
> > > or 3, we have to wait till the ECC HW goes to correction state.
> > > Without this wait, ECC correction calculations will not be proper.
> > >
> > > This has been tested on DA830/OMAP-L137, DA850/OMAP-L138, DM355 and
> > > DM365 EVMs.
> > >
> > > Signed-off-by: Sudhakar Rajashekhara <sudhakar.raj@ti.com>
> > > Acked-by: Sneha Narnakaje <nsnehaprabha@ti.com>
> > > Cc: David Woodhouse <dwmw2@infradead.org>
> > > Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> > 
> > Have these people acked and signed off this new version of the patch?
> > 
> 
> No. Andrew Morton has not signed off this version. I'll remove
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> 

Andrew Morton had signed off an earlier version of this patch and
it was present in -mm tree for a long time. He has not yet commented
on v2 version of this patch. But I thought I can carry forward the
Sign-offs from previous version to the next version. What's the common
practice?

Thanks,
Sudhakar
Andrew Morton - July 21, 2010, 10 p.m.
On Thu, 15 Jul 2010 17:33:03 +0530
"Sudhakar Rajashekhara" <sudhakar.raj@ti.com> wrote:

> On Thu, Jul 15, 2010 at 17:11:32, Sudhakar Rajashekhara wrote:
> > Hi,
> > 
> > On Thu, Jul 15, 2010 at 16:31:19, Jon Povey wrote:
> > > Sudhakar Rajashekhara wrote:
> > > > On TI's DA830/OMAP-L137, DA850/OMAP-L138 and DM365, after setting the
> > > > 4BITECC_ADD_CALC_START bit in the NAND Flash control register to 1 and
> > > > before waiting for the NAND Flash status register to be equal to 1, 2
> > > > or 3, we have to wait till the ECC HW goes to correction state.
> > > > Without this wait, ECC correction calculations will not be proper.
> > > >
> > > > This has been tested on DA830/OMAP-L137, DA850/OMAP-L138, DM355 and
> > > > DM365 EVMs.
> > > >
> > > > Signed-off-by: Sudhakar Rajashekhara <sudhakar.raj@ti.com>
> > > > Acked-by: Sneha Narnakaje <nsnehaprabha@ti.com>
> > > > Cc: David Woodhouse <dwmw2@infradead.org>
> > > > Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> > > 
> > > Have these people acked and signed off this new version of the patch?
> > > 
> > 
> > No. Andrew Morton has not signed off this version. I'll remove
> > Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> > 
> 
> Andrew Morton had signed off an earlier version of this patch and
> it was present in -mm tree for a long time. He has not yet commented
> on v2 version of this patch. But I thought I can carry forward the
> Sign-offs from previous version to the next version. What's the common
> practice?
> 


I've lost the plot on this patch and I think I'll drop my copy.  Please
update and resend, cc'ing everyone.

I've been sitting on this thing since November last year!  It fixes a
bug!  Where the heck are the MTD maintainers and why aren't they
running around with hair on fire getting this thing finalised and
merged?!?!
Rajashekhara, Sudhakar - July 23, 2010, 9:49 a.m.
Hi,

On Thu, Jul 22, 2010 at 03:30:58, Andrew Morton wrote:
> On Thu, 15 Jul 2010 17:33:03 +0530
> "Sudhakar Rajashekhara" <sudhakar.raj@ti.com> wrote:
> 
> > On Thu, Jul 15, 2010 at 17:11:32, Sudhakar Rajashekhara wrote:
> > > Hi,
> > > 
> > > On Thu, Jul 15, 2010 at 16:31:19, Jon Povey wrote:
> > > > Sudhakar Rajashekhara wrote:
> > > > > On TI's DA830/OMAP-L137, DA850/OMAP-L138 and DM365, after setting the
> > > > > 4BITECC_ADD_CALC_START bit in the NAND Flash control register to 1 and
> > > > > before waiting for the NAND Flash status register to be equal to 1, 2
> > > > > or 3, we have to wait till the ECC HW goes to correction state.
> > > > > Without this wait, ECC correction calculations will not be proper.
> > > > >
> > > > > This has been tested on DA830/OMAP-L137, DA850/OMAP-L138, DM355 and
> > > > > DM365 EVMs.
> > > > >
> > > > > Signed-off-by: Sudhakar Rajashekhara <sudhakar.raj@ti.com>
> > > > > Acked-by: Sneha Narnakaje <nsnehaprabha@ti.com>
> > > > > Cc: David Woodhouse <dwmw2@infradead.org>
> > > > > Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> > > > 
> > > > Have these people acked and signed off this new version of the patch?
> > > > 
> > > 
> > > No. Andrew Morton has not signed off this version. I'll remove
> > > Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> > > 
> > 
> > Andrew Morton had signed off an earlier version of this patch and
> > it was present in -mm tree for a long time. He has not yet commented
> > on v2 version of this patch. But I thought I can carry forward the
> > Sign-offs from previous version to the next version. What's the common
> > practice?
> > 
> 
> 
> I've lost the plot on this patch and I think I'll drop my copy.  Please
> update and resend, cc'ing everyone.
> 
> I've been sitting on this thing since November last year!  It fixes a
> bug!  Where the heck are the MTD maintainers and why aren't they
> running around with hair on fire getting this thing finalised and
> merged?!?!
> 

I got an e-mail from Artem Bityutskiy(dedekind1@gmail.com) saying that
he has pushed this patch to his l2-mtd-2.6/master tree. May be it will
get to upstream from there.

Thanks,
Sudhakar

Patch

diff --git a/drivers/mtd/nand/davinci_nand.c b/drivers/mtd/nand/davinci_nand.c
index 9c9d893..1e2657c 100644
--- a/drivers/mtd/nand/davinci_nand.c
+++ b/drivers/mtd/nand/davinci_nand.c
@@ -311,7 +311,11 @@  static int nand_davinci_correct_4bit(struct mtd_info *mtd,
 	unsigned short ecc10[8];
 	unsigned short *ecc16;
 	u32 syndrome[4];
+	u32 ecc_state;
 	unsigned num_errors, corrected;
+	unsigned long timeo;
+	DEFINE_SPINLOCK(ecc_spin_lock);
+	unsigned long flags;
 
 	/* All bytes 0xff?  It's an erased page; ignore its ECC. */
 	for (i = 0; i < 10; i++) {
@@ -355,12 +359,30 @@  compare:
 	 */
 	davinci_nand_readl(info, NAND_ERR_ADD1_OFFSET);
 
+	spin_lock_irqsave(&ecc_spin_lock, flags);
 	/* Start address calculation, and wait for it to complete.
 	 * We _could_ start reading more data while this is working,
 	 * to speed up the overall page read.
 	 */
 	davinci_nand_writel(info, NANDFCR_OFFSET,
 			davinci_nand_readl(info, NANDFCR_OFFSET) | BIT(13));
+
+	/*
+	 * ECC_STATE field reads 0x3 (Error correction complete) immediately
+	 * after setting the 4BITECC_ADD_CALC_START bit. So if you immediately
+	 * begin trying to poll for the state, you may fall right out of your
+	 * loop without any of the correction calculations having taken place.
+	 * The recommendation from the hardware team is to wait till ECC_STATE
+	 * reads >= 4, which means ECC HW has entered correction state.
+	 */
+	timeo = jiffies + usecs_to_jiffies(100);
+	do {
+		ecc_state = (davinci_nand_readl(info,
+				NANDFSR_OFFSET) >> 8) & 0x0f;
+		cpu_relax();
+	} while ((ecc_state < 4) && time_before(jiffies, timeo));
+	spin_unlock_irqrestore(&ecc_spin_lock, flags);
+
 	for (;;) {
 		u32	fsr = davinci_nand_readl(info, NANDFSR_OFFSET);