diff mbox

[U-Boot,1/4] spl: nand: simple: replace readb() with chip specific read_buf()

Message ID 1437003228-14746-2-git-send-email-vz@mleia.com
State Superseded
Headers show

Commit Message

Vladimir Zapolskiy July 15, 2015, 11:33 p.m. UTC
Some NAND controllers define custom functions to read data out,
respect this in order to correctly support bad block handling in
simple SPL NAND framework.

NAND controller specific read_buf() is used even to read 1 byte in
case of connected 8-bit NAND device, it turns out that read_byte()
may become outdated.

Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
---
 drivers/mtd/nand/nand_spl_simple.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

Comments

Albert ARIBAUD July 16, 2015, 8:02 a.m. UTC | #1
Hello Vladimir,

On Thu, 16 Jul 2015 02:33:45 +0300, Vladimir Zapolskiy <vz@mleia.com>
wrote:
> Some NAND controllers define custom functions to read data out,
> respect this in order to correctly support bad block handling in
> simple SPL NAND framework.
> 
> NAND controller specific read_buf() is used even to read 1 byte in
> case of connected 8-bit NAND device, it turns out that read_byte()
> may become outdated.
> 
> Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
> ---
>  drivers/mtd/nand/nand_spl_simple.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/mtd/nand/nand_spl_simple.c b/drivers/mtd/nand/nand_spl_simple.c
> index 700ca32..e69f662 100644
> --- a/drivers/mtd/nand/nand_spl_simple.c
> +++ b/drivers/mtd/nand/nand_spl_simple.c
> @@ -115,6 +115,7 @@ static int nand_command(int block, int page, uint32_t offs,
>  static int nand_is_bad_block(int block)
>  {
>  	struct nand_chip *this = mtd.priv;
> +	u_char bb_data[2];
>  
>  	nand_command(block, 0, CONFIG_SYS_NAND_BAD_BLOCK_POS,
>  		NAND_CMD_READOOB);
> @@ -123,10 +124,12 @@ static int nand_is_bad_block(int block)
>  	 * Read one byte (or two if it's a 16 bit chip).
>  	 */
>  	if (this->options & NAND_BUSWIDTH_16) {
> -		if (readw(this->IO_ADDR_R) != 0xffff)
> +		this->read_buf(&mtd, bb_data, 2);
> +		if (bb_data[0] != 0xff || bb_data[1] != 0xff)
>  			return 1;
>  	} else {
> -		if (readb(this->IO_ADDR_R) != 0xff)
> +		this->read_buf(&mtd, bb_data, 1);
> +		if (bb_data[0] != 0xff)
>  			return 1;
>  	}
>  
> -- 
> 2.1.4
> 

The way you describe this patch, it looks like a bug that should have
bitten way more boards than lpc32xx. Did you have a look at some other
boards to see why this did not affect them? Conversively, what is the
actual failure scenario that led you to writing this patch? 

Amicalement,
Vladimir Zapolskiy July 16, 2015, 11:31 a.m. UTC | #2
Hello Albert,

On 16.07.2015 11:02, Albert ARIBAUD wrote:
> Hello Vladimir,
> 
> On Thu, 16 Jul 2015 02:33:45 +0300, Vladimir Zapolskiy <vz@mleia.com>
> wrote:
>> Some NAND controllers define custom functions to read data out,
>> respect this in order to correctly support bad block handling in
>> simple SPL NAND framework.
>>
>> NAND controller specific read_buf() is used even to read 1 byte in
>> case of connected 8-bit NAND device, it turns out that read_byte()
>> may become outdated.
>>
>> Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
>> ---
>>  drivers/mtd/nand/nand_spl_simple.c | 7 +++++--
>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/mtd/nand/nand_spl_simple.c b/drivers/mtd/nand/nand_spl_simple.c
>> index 700ca32..e69f662 100644
>> --- a/drivers/mtd/nand/nand_spl_simple.c
>> +++ b/drivers/mtd/nand/nand_spl_simple.c
>> @@ -115,6 +115,7 @@ static int nand_command(int block, int page, uint32_t offs,
>>  static int nand_is_bad_block(int block)
>>  {
>>  	struct nand_chip *this = mtd.priv;
>> +	u_char bb_data[2];
>>  
>>  	nand_command(block, 0, CONFIG_SYS_NAND_BAD_BLOCK_POS,
>>  		NAND_CMD_READOOB);
>> @@ -123,10 +124,12 @@ static int nand_is_bad_block(int block)
>>  	 * Read one byte (or two if it's a 16 bit chip).
>>  	 */
>>  	if (this->options & NAND_BUSWIDTH_16) {
>> -		if (readw(this->IO_ADDR_R) != 0xffff)
>> +		this->read_buf(&mtd, bb_data, 2);
>> +		if (bb_data[0] != 0xff || bb_data[1] != 0xff)
>>  			return 1;
>>  	} else {
>> -		if (readb(this->IO_ADDR_R) != 0xff)
>> +		this->read_buf(&mtd, bb_data, 1);
>> +		if (bb_data[0] != 0xff)
>>  			return 1;
>>  	}
>>  
>> -- 
>> 2.1.4
>>
> 
> The way you describe this patch, it looks like a bug that should have
> bitten way more boards than lpc32xx. Did you have a look at some other
> boards to see why this did not affect them?

Yes, it is a bug IMHO.

Grepping for CONFIG_SPL_NAND_SIMPLE I see that TI and Tegra boards may
be impacted (positively or negatively):

* CONFIG_NAND_OMAP_GPMC --- own .read_buf(), default .read_byte()
* CONFIG_NAND_DAVINCI   --- own .read_buf(), default .read_byte()
* CONFIG_TEGRA_NAND     --- own .read_byte(), own .read_buf()

> Conversively, what is the actual failure scenario that led you to 
> writing this patch?

The failure scenario is quite simple, the ARM core gets stuck on first
attempt to access SLC NAND data register -- traced with JTAG.

The same happens, if I remove custom .read_byte()/.read_buf() from SLC
NAND driver. The only difference between custom .read_byte() and shared
read_byte() is in readb()/readl() access to the data register, it is
essential to have only 32-bit wide access to SLC NAND registers.

--
With best wishes,
Vladimir
Albert ARIBAUD July 16, 2015, 12:43 p.m. UTC | #3
Hello Vladimir,

On Thu, 16 Jul 2015 14:31:26 +0300, Vladimir Zapolskiy <vz@mleia.com>
wrote:
> Hello Albert,
> 
> On 16.07.2015 11:02, Albert ARIBAUD wrote:
> > Hello Vladimir,
> > 
> > On Thu, 16 Jul 2015 02:33:45 +0300, Vladimir Zapolskiy <vz@mleia.com>
> > wrote:
> >> Some NAND controllers define custom functions to read data out,
> >> respect this in order to correctly support bad block handling in
> >> simple SPL NAND framework.
> >>
> >> NAND controller specific read_buf() is used even to read 1 byte in
> >> case of connected 8-bit NAND device, it turns out that read_byte()
> >> may become outdated.
> >>
> >> Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
> >> ---
> >>  drivers/mtd/nand/nand_spl_simple.c | 7 +++++--
> >>  1 file changed, 5 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/mtd/nand/nand_spl_simple.c b/drivers/mtd/nand/nand_spl_simple.c
> >> index 700ca32..e69f662 100644
> >> --- a/drivers/mtd/nand/nand_spl_simple.c
> >> +++ b/drivers/mtd/nand/nand_spl_simple.c
> >> @@ -115,6 +115,7 @@ static int nand_command(int block, int page, uint32_t offs,
> >>  static int nand_is_bad_block(int block)
> >>  {
> >>  	struct nand_chip *this = mtd.priv;
> >> +	u_char bb_data[2];
> >>  
> >>  	nand_command(block, 0, CONFIG_SYS_NAND_BAD_BLOCK_POS,
> >>  		NAND_CMD_READOOB);
> >> @@ -123,10 +124,12 @@ static int nand_is_bad_block(int block)
> >>  	 * Read one byte (or two if it's a 16 bit chip).
> >>  	 */
> >>  	if (this->options & NAND_BUSWIDTH_16) {
> >> -		if (readw(this->IO_ADDR_R) != 0xffff)
> >> +		this->read_buf(&mtd, bb_data, 2);
> >> +		if (bb_data[0] != 0xff || bb_data[1] != 0xff)
> >>  			return 1;
> >>  	} else {
> >> -		if (readb(this->IO_ADDR_R) != 0xff)
> >> +		this->read_buf(&mtd, bb_data, 1);
> >> +		if (bb_data[0] != 0xff)
> >>  			return 1;
> >>  	}
> >>  
> >> -- 
> >> 2.1.4
> >>
> > 
> > The way you describe this patch, it looks like a bug that should have
> > bitten way more boards than lpc32xx. Did you have a look at some other
> > boards to see why this did not affect them?
> 
> Yes, it is a bug IMHO.

If it is, then it has hit no board which defines CONFIG_SPL_NAND_SIMPLE
and we should understand why.

> Grepping for CONFIG_SPL_NAND_SIMPLE I see that TI and Tegra boards may
> be impacted (positively or negatively):
> 
> * CONFIG_NAND_OMAP_GPMC --- own .read_buf(), default .read_byte()
> * CONFIG_NAND_DAVINCI   --- own .read_buf(), default .read_byte()
> * CONFIG_TEGRA_NAND     --- own .read_byte(), own .read_buf()

They may be impacted by your change, but they are working now -- they
are not obscure models. Let's dig a bit...

> > Conversively, what is the actual failure scenario that led you to 
> > writing this patch?
> 
> The failure scenario is quite simple, the ARM core gets stuck on first
> attempt to access SLC NAND data register -- traced with JTAG.
> 
> The same happens, if I remove custom .read_byte()/.read_buf() from SLC
> NAND driver. The only difference between custom .read_byte() and shared
> read_byte() is in readb()/readl() access to the data register, it is
> essential to have only 32-bit wide access to SLC NAND registers.

README describes CONFIG_SPL_NAND_SIMPLE as "Support for NAND boot using
simple NAND drivers that expose the cmd_ctrl() interface". The cmd_ctrl
interface actually contains the cmd_ctrl() function *and* the
IO_ADDR_[RW] fields. IOW, a simple NAND driver provides byte or word
access to the NAND's I/O lines on which command, address and data are
passed. If the NAND is 8 bits, then there are 8 lines; if it is 16
bit, then there are 16 lines. 

I reckon that the OMAP/DAVINCI and TEGRA simple drivers just do that:
set IO_ADDR_[RW] to the IP register through which direct access to the
NAND's I/O lines can be performed, byte or word depending on the chip
width.

As such, there is no bug in the way nand_simple.c uses IO_ADDR_[RW].

OTOH, the SLC IP does not provide direct access to the NAND I/O lines
through a general register; what it provides is a set of three
specialized registers one for commands, one for addresses and one
for data. Plus, even though only 8 bit NANDs are supported, the data
register does not physically support 8-bit wide accesses (NXP: I am
still struggling to understand what you were trying to achieve there).

All this basically makes the SLC controller a 'not simple NAND IP', and
its driver a 'not simple NAND driver' incompatible with nand_simple.c.

Your solution to this incompatibility is to change nand_simple.c to
support other types of drivers; but by doing that, you're changing the
API between nand_simple.c and all simple drivers, possibly changing the
established behavior.

I personally don't think this is the right way; nand_simple.c should be
left unchanged and the board should simply not use it since it does not
have a simple NAND controller, and instead it should provide its own
nand_spl_load_image().

But hey, I'm not then NAND custodian. Scott: your call. :)

> --
> With best wishes,
> Vladimir

Amicalement,
Vladimir Zapolskiy July 16, 2015, 1:48 p.m. UTC | #4
Hello Albert,

On 16.07.2015 15:43, Albert ARIBAUD wrote:
> Hello Vladimir,
> 
> On Thu, 16 Jul 2015 14:31:26 +0300, Vladimir Zapolskiy <vz@mleia.com>
> wrote:
>> Hello Albert,
>>
>> On 16.07.2015 11:02, Albert ARIBAUD wrote:
>>> Hello Vladimir,
>>>
>>> On Thu, 16 Jul 2015 02:33:45 +0300, Vladimir Zapolskiy <vz@mleia.com>
>>> wrote:
>>>> Some NAND controllers define custom functions to read data out,
>>>> respect this in order to correctly support bad block handling in
>>>> simple SPL NAND framework.
>>>>
>>>> NAND controller specific read_buf() is used even to read 1 byte in
>>>> case of connected 8-bit NAND device, it turns out that read_byte()
>>>> may become outdated.
>>>>
>>>> Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
>>>> ---
>>>>  drivers/mtd/nand/nand_spl_simple.c | 7 +++++--
>>>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/mtd/nand/nand_spl_simple.c b/drivers/mtd/nand/nand_spl_simple.c
>>>> index 700ca32..e69f662 100644
>>>> --- a/drivers/mtd/nand/nand_spl_simple.c
>>>> +++ b/drivers/mtd/nand/nand_spl_simple.c
>>>> @@ -115,6 +115,7 @@ static int nand_command(int block, int page, uint32_t offs,
>>>>  static int nand_is_bad_block(int block)
>>>>  {
>>>>  	struct nand_chip *this = mtd.priv;
>>>> +	u_char bb_data[2];
>>>>  
>>>>  	nand_command(block, 0, CONFIG_SYS_NAND_BAD_BLOCK_POS,
>>>>  		NAND_CMD_READOOB);
>>>> @@ -123,10 +124,12 @@ static int nand_is_bad_block(int block)
>>>>  	 * Read one byte (or two if it's a 16 bit chip).
>>>>  	 */
>>>>  	if (this->options & NAND_BUSWIDTH_16) {
>>>> -		if (readw(this->IO_ADDR_R) != 0xffff)
>>>> +		this->read_buf(&mtd, bb_data, 2);
>>>> +		if (bb_data[0] != 0xff || bb_data[1] != 0xff)
>>>>  			return 1;
>>>>  	} else {
>>>> -		if (readb(this->IO_ADDR_R) != 0xff)
>>>> +		this->read_buf(&mtd, bb_data, 1);
>>>> +		if (bb_data[0] != 0xff)
>>>>  			return 1;
>>>>  	}
>>>>  
>>>> -- 
>>>> 2.1.4
>>>>
>>>
>>> The way you describe this patch, it looks like a bug that should have
>>> bitten way more boards than lpc32xx. Did you have a look at some other
>>> boards to see why this did not affect them?
>>
>> Yes, it is a bug IMHO.
> 
> If it is, then it has hit no board which defines CONFIG_SPL_NAND_SIMPLE
> and we should understand why.
> 
>> Grepping for CONFIG_SPL_NAND_SIMPLE I see that TI and Tegra boards may
>> be impacted (positively or negatively):
>>
>> * CONFIG_NAND_OMAP_GPMC --- own .read_buf(), default .read_byte()
>> * CONFIG_NAND_DAVINCI   --- own .read_buf(), default .read_byte()
>> * CONFIG_TEGRA_NAND     --- own .read_byte(), own .read_buf()
> 
> They may be impacted by your change, but they are working now -- they
> are not obscure models. Let's dig a bit...

For simplicity let's neglect 16-bit NANDs at the moment, readb() can be
replaced by readw() etc. in my message.

You may notice that nand_spl_simple.c exploits NAND driver specific
.read_buf(), therefore the latter one must be defined (by driver or NAND
framework), can it happen that .read_buf(..., 1) returns a different
result from readb()?

I hope in all cases above .read_buf(..., 1) is equal to common readb(),
so okay, this is not a bug fix for currently supported drivers, but a
misfeature, which does not allow to reuse nand_spl_simple.c for a
slightly different NAND controller.

Fortunately the proposed generalization of nand_spl_simple.c is
straightforward.

>>> Conversively, what is the actual failure scenario that led you to 
>>> writing this patch?
>>
>> The failure scenario is quite simple, the ARM core gets stuck on first
>> attempt to access SLC NAND data register -- traced with JTAG.
>>
>> The same happens, if I remove custom .read_byte()/.read_buf() from SLC
>> NAND driver. The only difference between custom .read_byte() and shared
>> read_byte() is in readb()/readl() access to the data register, it is
>> essential to have only 32-bit wide access to SLC NAND registers.
> 
> README describes CONFIG_SPL_NAND_SIMPLE as "Support for NAND boot using
> simple NAND drivers that expose the cmd_ctrl() interface". The cmd_ctrl
> interface actually contains the cmd_ctrl() function *and* the
> IO_ADDR_[RW] fields. IOW, a simple NAND driver provides byte or word
> access to the NAND's I/O lines on which command, address and data are
> passed. If the NAND is 8 bits, then there are 8 lines; if it is 16
> bit, then there are 16 lines. 

Please see my note above about read_buf().

nand_spl_simple.c is designed in such a way that it uses NAND driver
specific interfaces (ECC specific interfaces are omitted here):

* cmd_ctrl()
* dev_ready()
* read_buf()

I agree that some of these interfaces may be populated by default NAND
framework, if it is a deliberate intention to have only default
functions, it might be better to hardcode default functions into
nand_spl_simple.c , but this destroys flexibility.

I would say README description of CONFIG_SPL_NAND_SIMPLE is not
precisely correct.

> I reckon that the OMAP/DAVINCI and TEGRA simple drivers just do that:
> set IO_ADDR_[RW] to the IP register through which direct access to the
> NAND's I/O lines can be performed, byte or word depending on the chip
> width.
> 
> As such, there is no bug in the way nand_simple.c uses IO_ADDR_[RW].

Okay, there is no bug in currently supported drivers, I believe.

> OTOH, the SLC IP does not provide direct access to the NAND I/O lines
> through a general register; what it provides is a set of three
> specialized registers one for commands, one for addresses and one
> for data. Plus, even though only 8 bit NANDs are supported, the data
> register does not physically support 8-bit wide accesses (NXP: I am
> still struggling to understand what you were trying to achieve there).
> 
> All this basically makes the SLC controller a 'not simple NAND IP', and
> its driver a 'not simple NAND driver' incompatible with nand_simple.c.
> 
> Your solution to this incompatibility is to change nand_simple.c to
> support other types of drivers; but by doing that, you're changing the
> API between nand_simple.c and all simple drivers, possibly changing the
> established behavior.

Could my statement be confirmed or rejected?

   One byte data read is the same if .read_buf(..., 1), .read_byte() or
readb() is used on TI and Tegra platforms, and the interfaces can be
interchanged.

I haven't tested it, but I will be surprised, if my statement is wrong.

> I personally don't think this is the right way; nand_simple.c should be
> left unchanged and the board should simply not use it since it does not
> have a simple NAND controller, and instead it should provide its own
> nand_spl_load_image().

For me an alternative change to the proposed one is to duplicate
nand_spl_simple.c functionality in LPC32xx SLC NAND driver. From
maintenance point of view this is not the best thing to do IMHO.

> But hey, I'm not then NAND custodian. Scott: your call. :)

Anyway thank you for review and comments.

--
With best wishes,
Vladimir
Albert ARIBAUD July 16, 2015, 2:12 p.m. UTC | #5
Hello Vladimir,

On Thu, 16 Jul 2015 16:48:26 +0300, Vladimir Zapolskiy <vz@mleia.com>
wrote:

(cutting short to the essential point remaining)

> > I personally don't think this is the right way; nand_simple.c should be
> > left unchanged and the board should simply not use it since it does not
> > have a simple NAND controller, and instead it should provide its own
> > nand_spl_load_image().
> 
> For me an alternative change to the proposed one is to duplicate
> nand_spl_simple.c functionality in LPC32xx SLC NAND driver. From
> maintenance point of view this is not the best thing to do IMHO.

You're right that some of the functionality present in nand_simple.c is
duplicated elsewhere; however, that functionality is not the one specific
to nand_simple; actually, it is nand_spl_load_image(), the overall load
functionality which should be in no driver at all but of which I see ten
incarnations in ten drivers.

There should be only one nand_spl_load_image(), in its own file, with
placeholders for driver-specific actions such as page read, bad page
check, etc. One day, maybe, there will be a patch for that.

Anyway: you propose changes in the design of the NAND subsystem so that
the SLC controller driver can use the "simple" model, while I think the
SLC controller is not "simple" so it should not be managed as one. This
dilemma is about the NAND subsystem design, which is waaaay outside my
province -- hence my deferring to Scott, who will decide which way this
should go.

> > But hey, I'm not then NAND custodian. Scott: your call. :)
> 
> Anyway thank you for review and comments.

You're welcome!

> --
> With best wishes,
> Vladimir

Amicalement,
Scott Wood July 16, 2015, 7:30 p.m. UTC | #6
On Thu, 2015-07-16 at 16:12 +0200, Albert ARIBAUD wrote:
> Hello Vladimir,
> 
> On Thu, 16 Jul 2015 16:48:26 +0300, Vladimir Zapolskiy <vz@mleia.com>
> wrote:
> 
> (cutting short to the essential point remaining)
> 
> > > I personally don't think this is the right way; nand_simple.c should be
> > > left unchanged and the board should simply not use it since it does not
> > > have a simple NAND controller, and instead it should provide its own
> > > nand_spl_load_image().

I don't have a problem with expanding the definition of "simple" to include 
custom read_byte and such, as long as we make sure that existing targets 
don't break.  For drivers that don't supply these functions, where are the 
default functions coming from?  I don't see nand_spl_simple's nand_init() 
initializing them.

> > For me an alternative change to the proposed one is to duplicate
> > nand_spl_simple.c functionality in LPC32xx SLC NAND driver. From
> > maintenance point of view this is not the best thing to do IMHO.
> 
> You're right that some of the functionality present in nand_simple.c is
> duplicated elsewhere; however, that functionality is not the one specific
> to nand_simple; actually, it is nand_spl_load_image(), the overall load
> functionality which should be in no driver at all but of which I see ten
> incarnations in ten drivers.
> 
> There should be only one nand_spl_load_image(), in its own file, with
> placeholders for driver-specific actions such as page read, bad page
> check, etc. One day, maybe, there will be a patch for that.

You'll never get it down to just one, since there are some targets that rely 
on the tight integration within a single file to squeak in under a 4 KiB size 
limit.  Maybe some of the implementations could be replaced with an optional 
generic version, though.

-Scott
diff mbox

Patch

diff --git a/drivers/mtd/nand/nand_spl_simple.c b/drivers/mtd/nand/nand_spl_simple.c
index 700ca32..e69f662 100644
--- a/drivers/mtd/nand/nand_spl_simple.c
+++ b/drivers/mtd/nand/nand_spl_simple.c
@@ -115,6 +115,7 @@  static int nand_command(int block, int page, uint32_t offs,
 static int nand_is_bad_block(int block)
 {
 	struct nand_chip *this = mtd.priv;
+	u_char bb_data[2];
 
 	nand_command(block, 0, CONFIG_SYS_NAND_BAD_BLOCK_POS,
 		NAND_CMD_READOOB);
@@ -123,10 +124,12 @@  static int nand_is_bad_block(int block)
 	 * Read one byte (or two if it's a 16 bit chip).
 	 */
 	if (this->options & NAND_BUSWIDTH_16) {
-		if (readw(this->IO_ADDR_R) != 0xffff)
+		this->read_buf(&mtd, bb_data, 2);
+		if (bb_data[0] != 0xff || bb_data[1] != 0xff)
 			return 1;
 	} else {
-		if (readb(this->IO_ADDR_R) != 0xff)
+		this->read_buf(&mtd, bb_data, 1);
+		if (bb_data[0] != 0xff)
 			return 1;
 	}