diff mbox

[13/15] mtd: nand: samsung: retrieve ECC requirements from extended ID

Message ID 1464353701-23233-14-git-send-email-boris.brezillon@free-electrons.com
State Superseded
Headers show

Commit Message

Boris Brezillon May 27, 2016, 12:54 p.m. UTC
From: Hans de Goede <hdegoede@redhat.com>

On some nand controllers with hw-ecc the controller code wants to know
the ecc strength and size and having these as 0, 0 is not accepted.

Specifying these in devicetree is possible but undesirable as the nand
may be different in different production runs of the same board, so it
is better to get this info from the nand id where possible.

This commit adds code to read the ecc strength and size from the nand
for Samsung extended-id nands. This code is based on the info for the 5th
id byte in the datasheets for the following Samsung nands: K9GAG08U0E,
K9GAG08U0F, K9GAG08X0D, K9GBG08U0A, K9GBG08U0B. These all use these bits
in the exact same way.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
---
 drivers/mtd/nand/nand_samsung.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

Comments

Valdis Kl ē tnieks May 30, 2016, 12:20 a.m. UTC | #1
On Fri, 27 May 2016 14:54:59 +0200, Boris Brezillon said:
> From: Hans de Goede <hdegoede@redhat.com>
>
> On some nand controllers with hw-ecc the controller code wants to know
> the ecc strength and size and having these as 0, 0 is not accepted.
>
> Specifying these in devicetree is possible but undesirable as the nand
> may be different in different production runs of the same board, so it
> is better to get this info from the nand id where possible.
>
> This commit adds code to read the ecc strength and size from the nand
> for Samsung extended-id nands. This code is based on the info for the 5th
> id byte in the datasheets for the following Samsung nands: K9GAG08U0E,
> K9GAG08U0F, K9GAG08X0D, K9GBG08U0A, K9GBG08U0B. These all use these bits
> in the exact same way.

Is this correct for all Samsung nand devices supported by this driver?

(If this driver only covers those 5 specific parts, it's OK.  If there's
others, more research is needed....)
Boris Brezillon May 30, 2016, 7:44 a.m. UTC | #2
Hi Valdis,

On Sun, 29 May 2016 20:20:35 -0400
Valdis.Kletnieks@vt.edu wrote:

> On Fri, 27 May 2016 14:54:59 +0200, Boris Brezillon said:
> > From: Hans de Goede <hdegoede@redhat.com>
> >
> > On some nand controllers with hw-ecc the controller code wants to know
> > the ecc strength and size and having these as 0, 0 is not accepted.
> >
> > Specifying these in devicetree is possible but undesirable as the nand
> > may be different in different production runs of the same board, so it
> > is better to get this info from the nand id where possible.
> >
> > This commit adds code to read the ecc strength and size from the nand
> > for Samsung extended-id nands. This code is based on the info for the 5th
> > id byte in the datasheets for the following Samsung nands: K9GAG08U0E,
> > K9GAG08U0F, K9GAG08X0D, K9GBG08U0A, K9GBG08U0B. These all use these bits
> > in the exact same way.  
> 
> Is this correct for all Samsung nand devices supported by this driver?
> 
> (If this driver only covers those 5 specific parts, it's OK.  If there's
> others, more research is needed....)

Actually, that was my first reaction [1], but the more I think about it
the more I realize it's a non-issue.
AFAICT, there's no full-id entries for Samsung NANDs in the nand_ids
table, so this either means there's no real users of Samsung MLCs or
NAND controller drivers connecting to those chips don't care about the
->ecc_{step_ds,strength_ds} fields.

I agree that the solution is not perfect, but I'd prefer seeing the
NAND detection code iteratively improved than rejecting everything
until we're 100% sure that all cases are correctly handled (which might
never happen since NAND vendors introduce new NAND ID scheme if they
need to).

BTW, do you have Samsung datasheets describing a different NAND ID
format, or is it purely hypothetical?

Regards,

Boris

[1]http://lists.infradead.org/pipermail/linux-mtd/2015-July/060582.html
Valdis Kl ē tnieks May 30, 2016, 8:56 p.m. UTC | #3
On Mon, 30 May 2016 09:44:46 +0200, Boris Brezillon said:
> Hi Valdis,

> Actually, that was my first reaction [1], but the more I think about it
> the more I realize it's a non-issue.
> AFAICT, there's no full-id entries for Samsung NANDs in the nand_ids
> table, so this either means there's no real users of Samsung MLCs or
> NAND controller drivers connecting to those chips don't care about the
> ->ecc_{step_ds,strength_ds} fields.

I'm mostly, though not totally convinced (not having looked closely at
the existing code).  There's still a possible issue with the distinction
between:

A) "driver never references the variable" and

B) driver check if it's zero, and acts like it doesn't care if it is, but if
it's non-zero, it goes ahead and uses it, with possible hilarity ensuing if the
value is wrong.

Should be pretty easy for somebody who knows the code better than I to rule
out case B fairly quickly...

> I agree that the solution is not perfect, but I'd prefer seeing the
> NAND detection code iteratively improved than rejecting everything
> until we're 100% sure that all cases are correctly handled (which might
> never happen since NAND vendors introduce new NAND ID scheme if they
> need to).
>
> BTW, do you have Samsung datasheets describing a different NAND ID
> format, or is it purely hypothetical?

Mostly hypothetical.  I've just seen too many patches that assume "all chips
from  vendor XYZ do *this*" that were not at all corrrect.
Boris Brezillon May 30, 2016, 10:28 p.m. UTC | #4
On Mon, 30 May 2016 16:56:09 -0400
Valdis.Kletnieks@vt.edu wrote:

> On Mon, 30 May 2016 09:44:46 +0200, Boris Brezillon said:
> > Hi Valdis,  
> 
> > Actually, that was my first reaction [1], but the more I think about it
> > the more I realize it's a non-issue.
> > AFAICT, there's no full-id entries for Samsung NANDs in the nand_ids
> > table, so this either means there's no real users of Samsung MLCs or
> > NAND controller drivers connecting to those chips don't care about the  
> > ->ecc_{step_ds,strength_ds} fields.  
> 
> I'm mostly, though not totally convinced (not having looked closely at
> the existing code).  There's still a possible issue with the distinction
> between:
> 
> A) "driver never references the variable" and
> 
> B) driver check if it's zero, and acts like it doesn't care if it is, but if
> it's non-zero, it goes ahead and uses it, with possible hilarity ensuing if the
> value is wrong.
> 
> Should be pretty easy for somebody who knows the code better than I to rule
> out case B fairly quickly...

Ok, so I had a quick look, and only 4 drivers are actually using the
->ecc_{strength,step}_ds fields, and AFAICT, all of them are already
broken with the existing implementation, even if those fields are set
to 0.

- the atmel driver uses a default ECC config (2bits/512bytes) if
  those fields are set to 0, and this config is clearly not suitable
  for the MLC NANDs we are talking about (note that SLC NANDs seem to
  all use the 4 bytes extended ID scheme, which seems to be common to
  all vendors).

- the gpmi driver either returns an error if one of these fields
  are set to zero and the 'fsl,use-minimum-ecc' DT property is defined,
  or tries to fill the whole OOB area with ECC bytes if the property is
  not defined. The 2nd solution could work, if only we were sure about
  the encoding of the OOB size, but, as the ECC requirements field, it
  depends on the extended ID scheme. So, in the end, it's broken too.

- the pxa and sunxi drivers are just blindly relying on those fields if
  the 'nand-ecc-strength' and 'nand-ecc-step-size' properties are
  undefined. The pxa default to 1bit/512bytes if ecc strength or ecc
  step appear to be set to 0, while the sunxi driver completely rejects
  the NAND chip.
  In both cases, the current implementation is broken, either because
  you will use an unsuitable ECC config or because your NAND chip won't
  be registered.

So, as you can see, we're just moving from a broken state to another
broken state, except the new infrastructures allows one to extend the
detection logic and thus allow for correct detection of more chips.

> 
> > I agree that the solution is not perfect, but I'd prefer seeing the
> > NAND detection code iteratively improved than rejecting everything
> > until we're 100% sure that all cases are correctly handled (which might
> > never happen since NAND vendors introduce new NAND ID scheme if they
> > need to).
> >
> > BTW, do you have Samsung datasheets describing a different NAND ID
> > format, or is it purely hypothetical?  
> 
> Mostly hypothetical.  I've just seen too many patches that assume "all chips
> from  vendor XYZ do *this*" that were not at all corrrect.
> 

Yep, that's true, except I'm not promising anything here, I just say
that this patch adds code to detect a range of Samsung chips, and that
it can be extended to properly detect chips that do not use this format
if we appear to find some (which is very likely to happen).

Of course, we could decide to leave everything as is and add full-id
entries to the nand_ids table each time we want to support a new chip
that does not expose a valid ONFI of JEDEC parameter table. But that
means adding more and more info to the nand_flash_dev structure and
polluting the nand_ids table with a bunch of NAND chips that could
otherwise be handled by the same detection code.
And as detailed above, this solution is just as broken as mine but in a
different way (in both cases, NANDs that are not already supported by
the kernel will either be rejected or used ).
Boris Brezillon May 30, 2016, 10:32 p.m. UTC | #5
On Tue, 31 May 2016 00:28:24 +0200
Boris Brezillon <boris.brezillon@free-electrons.com> wrote:

> On Mon, 30 May 2016 16:56:09 -0400
> Valdis.Kletnieks@vt.edu wrote:
> 
> > On Mon, 30 May 2016 09:44:46 +0200, Boris Brezillon said:  
> > > Hi Valdis,    
> >   
> > > Actually, that was my first reaction [1], but the more I think about it
> > > the more I realize it's a non-issue.
> > > AFAICT, there's no full-id entries for Samsung NANDs in the nand_ids
> > > table, so this either means there's no real users of Samsung MLCs or
> > > NAND controller drivers connecting to those chips don't care about the    
> > > ->ecc_{step_ds,strength_ds} fields.    
> > 
> > I'm mostly, though not totally convinced (not having looked closely at
> > the existing code).  There's still a possible issue with the distinction
> > between:
> > 
> > A) "driver never references the variable" and
> > 
> > B) driver check if it's zero, and acts like it doesn't care if it is, but if
> > it's non-zero, it goes ahead and uses it, with possible hilarity ensuing if the
> > value is wrong.
> > 
> > Should be pretty easy for somebody who knows the code better than I to rule
> > out case B fairly quickly...  
> 
> Ok, so I had a quick look, and only 4 drivers are actually using the
> ->ecc_{strength,step}_ds fields, and AFAICT, all of them are already  
> broken with the existing implementation, even if those fields are set
> to 0.
> 
> - the atmel driver uses a default ECC config (2bits/512bytes) if
>   those fields are set to 0, and this config is clearly not suitable
>   for the MLC NANDs we are talking about (note that SLC NANDs seem to
>   all use the 4 bytes extended ID scheme, which seems to be common to
>   all vendors).
> 
> - the gpmi driver either returns an error if one of these fields
>   are set to zero and the 'fsl,use-minimum-ecc' DT property is defined,
>   or tries to fill the whole OOB area with ECC bytes if the property is
>   not defined. The 2nd solution could work, if only we were sure about
>   the encoding of the OOB size, but, as the ECC requirements field, it
>   depends on the extended ID scheme. So, in the end, it's broken too.
> 
> - the pxa and sunxi drivers are just blindly relying on those fields if
>   the 'nand-ecc-strength' and 'nand-ecc-step-size' properties are
>   undefined. The pxa default to 1bit/512bytes if ecc strength or ecc
>   step appear to be set to 0, while the sunxi driver completely rejects
>   the NAND chip.
>   In both cases, the current implementation is broken, either because
>   you will use an unsuitable ECC config or because your NAND chip won't
>   be registered.
> 
> So, as you can see, we're just moving from a broken state to another
> broken state, except the new infrastructures allows one to extend the
> detection logic and thus allow for correct detection of more chips.
> 
> >   
> > > I agree that the solution is not perfect, but I'd prefer seeing the
> > > NAND detection code iteratively improved than rejecting everything
> > > until we're 100% sure that all cases are correctly handled (which might
> > > never happen since NAND vendors introduce new NAND ID scheme if they
> > > need to).
> > >
> > > BTW, do you have Samsung datasheets describing a different NAND ID
> > > format, or is it purely hypothetical?    
> > 
> > Mostly hypothetical.  I've just seen too many patches that assume "all chips
> > from  vendor XYZ do *this*" that were not at all corrrect.
> >   
> 
> Yep, that's true, except I'm not promising anything here, I just say
> that this patch adds code to detect a range of Samsung chips, and that
> it can be extended to properly detect chips that do not use this format
> if we appear to find some (which is very likely to happen).
> 
> Of course, we could decide to leave everything as is and add full-id
> entries to the nand_ids table each time we want to support a new chip
> that does not expose a valid ONFI of JEDEC parameter table. But that
> means adding more and more info to the nand_flash_dev structure and
> polluting the nand_ids table with a bunch of NAND chips that could
> otherwise be handled by the same detection code.
> And as detailed above, this solution is just as broken as mine but in a
> different way (in both cases, NANDs that are not already supported by
> the kernel will either be rejected or used ).
                                            ^ inappropriately
diff mbox

Patch

diff --git a/drivers/mtd/nand/nand_samsung.c b/drivers/mtd/nand/nand_samsung.c
index 27ede41..39448ff 100644
--- a/drivers/mtd/nand/nand_samsung.c
+++ b/drivers/mtd/nand/nand_samsung.c
@@ -61,6 +61,26 @@  static int samsung_nand_decode_id(struct nand_chip *chip)
 		extid >>= 2;
 		mtd->erasesize = (128 * 1024) <<
 				 (((extid >> 1) & 0x04) | (extid & 0x03));
+
+		/* Extract ECC requirements from 5th id byte*/
+		extid = (chip->id.data[4] >> 4) & 0x07;
+		if (extid < 5) {
+			chip->ecc_step_ds = 512;
+			chip->ecc_strength_ds = 1 << extid;
+		} else {
+			chip->ecc_step_ds = 1024;
+			switch (extid) {
+			case 5:
+				chip->ecc_strength_ds = 24;
+				break;
+			case 6:
+				chip->ecc_strength_ds = 40;
+				break;
+			case 7:
+				chip->ecc_strength_ds = 60;
+				break;
+			}
+		}
 	}
 
 	if (mtd->writesize > 512)