diff mbox

[U-Boot,1/3] nand: lpc32xx: add SLC NAND driver

Message ID 1437166134-21204-2-git-send-email-slemieux.tyco@gmail.com
State Superseded
Headers show

Commit Message

Sylvain Lemieux July 17, 2015, 8:48 p.m. UTC
From: Sylvain Lemieux <slemieux@tycoint.com>

Incorporate NAND SLC drivers from legacy LPCLinux NXP BSP.
The files taken from the legacy patch are:
- lpc32xx SLC NAND driver
- lpc3250 header file SLC NAND registers definition.

Updated the driver to integrate with the latest u-boot:
1) Fixed checkpatch script output in legacy code.
   A single warning remaining.
2) Use LPC32xx definition from "cpu.h" and "clk.h".
3) Incorporate NAND specific register definition from "lpc3250.h" header file
   from legacy BSP patch from LPCLinux.
4) Use u-boot API for register access to remove the use of volatile in
   register definition taken from "lpc3250.h" header file.
5) Add support to configure BBT options.
6) Add support to configure ECC strength.
7) Add slc nand clock control register bits (clk.h).
8) Add functions to initialize the SLC NAND clock.

The following warning from the legacy code is still present:
lpc32xx_nand.c:195: WARNING: Use of volatile is usually wrong: see Documentation/volatile-considered-harmful.txt

The legacy BSP patch (u-boot-2009.03_lpc32x0-v1.07.patch.tar.bz2)
was downloaded from the LPCLinux Web site.

Signed-off-by: Sylvain Lemieux <slemieux@tycoint.com>
---
 arch/arm/cpu/arm926ejs/lpc32xx/devices.c      |   6 +
 arch/arm/include/asm/arch-lpc32xx/clk.h       |   2 +
 arch/arm/include/asm/arch-lpc32xx/sys_proto.h |   1 +
 drivers/mtd/nand/Makefile                     |   1 +
 drivers/mtd/nand/lpc32xx_nand_slc.c           | 501 ++++++++++++++++++++++++++
 5 files changed, 511 insertions(+)
 create mode 100644 drivers/mtd/nand/lpc32xx_nand_slc.c

Comments

LEMIEUX, SYLVAIN July 17, 2015, 10:24 p.m. UTC | #1
Hi Albert,

Thanks for the feedback.

> From: U-Boot [mailto:u-boot-bounces@lists.denx.de] On Behalf Of Albert ARIBAUD
> Sent: 17-Jul-15 5:20 PM
>
> Hello Sylvain,
>
> On Fri, 17 Jul 2015 16:48:52 -0400, slemieux.tyco@gmail.com
> <slemieux.tyco@gmail.com> wrote:
>
> > 1) Fixed checkpatch script output in legacy code.
> >    A single warning remaining.
>
> > The following warning from the legacy code is still present:
> > lpc32xx_nand.c:195: WARNING: Use of volatile is usually wrong: see Documentation/volatile-considered-harmful.txt
>
> > +static u_char lpc32xx_read_byte(struct mtd_info *mtd)
> > +{
> > +   struct nand_chip *this = mtd->priv;
> > +   unsigned long *preg = (unsigned long *)this->IO_ADDR_R;
> > +   volatile unsigned long tmp32;
> > +   tmp32 = *preg;
> > +   return (u_char)tmp32;
> > +}
>
> The volatile above has no reason to exist; the warning is justified
> here as we have accessors that guarantee that the access will not be
> optimized away or reordered, juste like the 'volatile' above tries to
> do (and yes, these accessors *use* 'volatile'. All the more a reason
> not to use it again here).
>
> Besides, the code is quite verbose and not precise enough. Yes,
> 'unsigned long' is 32-bit-ish, but in U-Boot, when something is 32-bit,
> that is explicit.
>
> All in all, the whole function could be expressed as:
>
>       static u_char lpc32xx_read_byte(struct mtd_info *mtd)
>       {
>               struct nand_chip *this = mtd->priv;
>
>               return (u_char)readl(this->IO_ADDR_R);
>       }
>
> BTW, isn't IO_ADDR_R pointing to the data register, and isn't the data
> register 16-bits? And if so, then why the 32-bits read?
>

The register is 16 bits; this implementation is the porting of the initial code.
I will wait for feedback and see how we want to approach this
(add DMA and HW ECC to the NAND SLC driver sent by Vladimir or
update the driver as part of the porting effort).

> Also, I did not find where the IO_ADDR_R field is assigned. Did I miss
> it?

"CONFIG_SYS_NAND_SELF_INIT" is not define; the legacy BSP driver is using the
default initialization done within "nand_init_chip()" function inside "drivers/mtd/nand/nand.c".

>
> This is just one case, but I suspect in many places, the code is
> unnecessarily complex. I understand it was ported, not written from
> scratch, but I think porting code should not prevent us from making it
> smaller, more efficient and more maintainable.
>
> Amicalement,
> --
> Albert.
>

Sylvain
Scott Wood July 17, 2015, 11:01 p.m. UTC | #2
On Fri, 2015-07-17 at 22:24 +0000, LEMIEUX, SYLVAIN wrote:
> Hi Albert,
> 
> Thanks for the feedback.
> 
> > From: U-Boot [mailto:u-boot-bounces@lists.denx.de] On Behalf Of Albert 
> > ARIBAUD
> > Sent: 17-Jul-15 5:20 PM
> > 
> > Hello Sylvain,
> > 
> > On Fri, 17 Jul 2015 16:48:52 -0400, slemieux.tyco@gmail.com
> > <slemieux.tyco@gmail.com> wrote:
> > 
> > > 1) Fixed checkpatch script output in legacy code.
> > >    A single warning remaining.
> > 
> > > The following warning from the legacy code is still present:
> > > lpc32xx_nand.c:195: WARNING: Use of volatile is usually wrong: see 
> > > Documentation/volatile-considered-harmful.txt
> > 
> > > +static u_char lpc32xx_read_byte(struct mtd_info *mtd)
> > > +{
> > > +   struct nand_chip *this = mtd->priv;
> > > +   unsigned long *preg = (unsigned long *)this->IO_ADDR_R;
> > > +   volatile unsigned long tmp32;
> > > +   tmp32 = *preg;
> > > +   return (u_char)tmp32;
> > > +}
> > 
> > The volatile above has no reason to exist; the warning is justified
> > here as we have accessors that guarantee that the access will not be
> > optimized away or reordered, juste like the 'volatile' above tries to
> > do (and yes, these accessors *use* 'volatile'. All the more a reason
> > not to use it again here).
> > 
> > Besides, the code is quite verbose and not precise enough. Yes,
> > 'unsigned long' is 32-bit-ish, but in U-Boot, when something is 32-bit,
> > that is explicit.
> > 
> > All in all, the whole function could be expressed as:
> > 
> >       static u_char lpc32xx_read_byte(struct mtd_info *mtd)
> >       {
> >               struct nand_chip *this = mtd->priv;
> > 
> >               return (u_char)readl(this->IO_ADDR_R);
> >       }
> > 
> > BTW, isn't IO_ADDR_R pointing to the data register, and isn't the data
> > register 16-bits? And if so, then why the 32-bits read?
> > 
> 
> The register is 16 bits; this implementation is the porting of the initial 
> code.
> I will wait for feedback and see how we want to approach this
> (add DMA and HW ECC to the NAND SLC driver sent by Vladimir or
> update the driver as part of the porting effort).

If the register is 16-bit, then you should use readw(), not readl().

Why are there two different versions of this driver being submitted in 
parallel?

-Scott
Vladimir Zapolskiy July 17, 2015, 11:10 p.m. UTC | #3
Hi Sylvain, Albert,

On 18.07.2015 01:24, LEMIEUX, SYLVAIN wrote:
> Hi Albert,
> 
> Thanks for the feedback.
> 
>> From: U-Boot [mailto:u-boot-bounces@lists.denx.de] On Behalf Of Albert ARIBAUD
>> Sent: 17-Jul-15 5:20 PM
>>
>> Hello Sylvain,
>>
>> On Fri, 17 Jul 2015 16:48:52 -0400, slemieux.tyco@gmail.com
>> <slemieux.tyco@gmail.com> wrote:
>>
>>> 1) Fixed checkpatch script output in legacy code.
>>>    A single warning remaining.
>>
>>> The following warning from the legacy code is still present:
>>> lpc32xx_nand.c:195: WARNING: Use of volatile is usually wrong: see Documentation/volatile-considered-harmful.txt
>>
>>> +static u_char lpc32xx_read_byte(struct mtd_info *mtd)
>>> +{
>>> +   struct nand_chip *this = mtd->priv;
>>> +   unsigned long *preg = (unsigned long *)this->IO_ADDR_R;
>>> +   volatile unsigned long tmp32;
>>> +   tmp32 = *preg;
>>> +   return (u_char)tmp32;
>>> +}
>>
>> The volatile above has no reason to exist; the warning is justified
>> here as we have accessors that guarantee that the access will not be
>> optimized away or reordered, juste like the 'volatile' above tries to
>> do (and yes, these accessors *use* 'volatile'. All the more a reason
>> not to use it again here).
>>
>> Besides, the code is quite verbose and not precise enough. Yes,
>> 'unsigned long' is 32-bit-ish, but in U-Boot, when something is 32-bit,
>> that is explicit.
>>
>> All in all, the whole function could be expressed as:
>>
>>       static u_char lpc32xx_read_byte(struct mtd_info *mtd)
>>       {
>>               struct nand_chip *this = mtd->priv;
>>
>>               return (u_char)readl(this->IO_ADDR_R);
>>       }
>>
>> BTW, isn't IO_ADDR_R pointing to the data register, and isn't the data
>> register 16-bits? And if so, then why the 32-bits read?
>>
> 
> The register is 16 bits; this implementation is the porting of the initial code.

hmm, you remember it was discussed yesterday that the data register is
32-bit...

----8<----
5.2 Data FIFO

There is only one Data FIFO. The Data FIFO is configured either in Read
or in Write mode.

1. When the Data FIFO is configured in Read mode, the sequencer reads
data from the NAND flash, and stores the data in the Data FIFO. The FIFO
is then emptied by 32-bit reads on the AHB bus from either the ARM or
the DMA.

2. When the Data FIFO is configured in Write mode, the ARM or the DMA
writes data to the FIFO with 32-bit AHB bus writes. The sequencer then
takes data out of the FIFO 8 bits at a time, and writes data to the NAND
flash.

----8<----

> I will wait for feedback and see how we want to approach this
> (add DMA and HW ECC to the NAND SLC driver sent by Vladimir or
> update the driver as part of the porting effort).

now when I see the code I still haven't changed my opinion, I would
propose to add HW ECC processing on top of my trivial change.

Some general reasons:

* I agree with Albert that the code is a bit overcomplicated and can be
improved, basic functions like read_byte(), cmd_ctrl() etc are better in
my version IMHO --- for example just compare Kevin's monstrous
lpc32xx_nand_hwcontrol() and my lpc32xx_nand_cmd_ctrl() --- so if my
version is reviewed and accepted, then there is no need to fix all the
same issues in this legacy forward-ported code,

* this change strictly depends on DMA driver (the driver simply does not
work, if DMA is disabled), this means that DMA driver must be 1/3 and
SLC NAND should go as 2/3, this implies that DMA driver is reviewed and
accepted by maintainers firstly,

* I don't see any users of this new code, this addresses Albert's notice
about adding dead code ---
http://lists.denx.de/pipermail/u-boot/2015-July/219124.html

* What about functional support in SPL? Is it correct, that if I want to
have this code in SPL, then I have to pull in DMA driver as a mandatory
dependency to tiny SPL?

>> Also, I did not find where the IO_ADDR_R field is assigned. Did I miss
>> it?
> 
> "CONFIG_SYS_NAND_SELF_INIT" is not define; the legacy BSP driver is using the
> default initialization done within "nand_init_chip()" function inside "drivers/mtd/nand/nand.c".
> 
>>
>> This is just one case, but I suspect in many places, the code is
>> unnecessarily complex. I understand it was ported, not written from
>> scratch, but I think porting code should not prevent us from making it
>> smaller, more efficient and more maintainable.
>>
>> Amicalement,
>> --
>> Albert.
>>
> 
> Sylvain
>
LEMIEUX, SYLVAIN July 17, 2015, 11:46 p.m. UTC | #4
Hi Vladimir,

> From: Vladimir Zapolskiy [mailto:vz@mleia.com]
> Sent: 17-Jul-15 7:10 PM
>
> Hi Sylvain, Albert,
>
> On 18.07.2015 01:24, LEMIEUX, SYLVAIN wrote:
> > Hi Albert,
> >
> > Thanks for the feedback.
> >
> >> From: U-Boot [mailto:u-boot-bounces@lists.denx.de] On Behalf Of Albert ARIBAUD
> >> Sent: 17-Jul-15 5:20 PM
> >>
> >> Hello Sylvain,
> >>
> >> On Fri, 17 Jul 2015 16:48:52 -0400, slemieux.tyco@gmail.com
> >> <slemieux.tyco@gmail.com> wrote:
> >>
> >>> 1) Fixed checkpatch script output in legacy code.
> >>>    A single warning remaining.
> >>
> >>> The following warning from the legacy code is still present:
> >>> lpc32xx_nand.c:195: WARNING: Use of volatile is usually wrong: see Documentation/volatile-considered-harmful.txt
> >>
> >>> +static u_char lpc32xx_read_byte(struct mtd_info *mtd)
> >>> +{
> >>> +   struct nand_chip *this = mtd->priv;
> >>> +   unsigned long *preg = (unsigned long *)this->IO_ADDR_R;
> >>> +   volatile unsigned long tmp32;
> >>> +   tmp32 = *preg;
> >>> +   return (u_char)tmp32;
> >>> +}
> >>
> >> The volatile above has no reason to exist; the warning is justified
> >> here as we have accessors that guarantee that the access will not be
> >> optimized away or reordered, juste like the 'volatile' above tries to
> >> do (and yes, these accessors *use* 'volatile'. All the more a reason
> >> not to use it again here).
> >>
> >> Besides, the code is quite verbose and not precise enough. Yes,
> >> 'unsigned long' is 32-bit-ish, but in U-Boot, when something is 32-bit,
> >> that is explicit.
> >>
> >> All in all, the whole function could be expressed as:
> >>
> >>       static u_char lpc32xx_read_byte(struct mtd_info *mtd)
> >>       {
> >>               struct nand_chip *this = mtd->priv;
> >>
> >>               return (u_char)readl(this->IO_ADDR_R);
> >>       }
> >>
> >> BTW, isn't IO_ADDR_R pointing to the data register, and isn't the data
> >> register 16-bits? And if so, then why the 32-bits read?
> >>
> >
> > The register is 16 bits; this implementation is the porting of the initial code.
>
> hmm, you remember it was discussed yesterday that the data register is
> 32-bit...
>
> ----8<----
> 5.2 Data FIFO
>
> There is only one Data FIFO. The Data FIFO is configured either in Read
> or in Write mode.
>
> 1. When the Data FIFO is configured in Read mode, the sequencer reads
> data from the NAND flash, and stores the data in the Data FIFO. The FIFO
> is then emptied by 32-bit reads on the AHB bus from either the ARM or
> the DMA.
>
> 2. When the Data FIFO is configured in Write mode, the ARM or the DMA
> writes data to the FIFO with 32-bit AHB bus writes. The sequencer then
> takes data out of the FIFO 8 bits at a time, and writes data to the NAND
> flash.
>
> ----8<----
>
> > I will wait for feedback and see how we want to approach this
> > (add DMA and HW ECC to the NAND SLC driver sent by Vladimir or
> > update the driver as part of the porting effort).
>
> now when I see the code I still haven't changed my opinion, I would
> propose to add HW ECC processing on top of my trivial change.
>
> Some general reasons:
>
> * I agree with Albert that the code is a bit overcomplicated and can be
> improved, basic functions like read_byte(), cmd_ctrl() etc are better in
> my version IMHO --- for example just compare Kevin's monstrous
> lpc32xx_nand_hwcontrol() and my lpc32xx_nand_cmd_ctrl() --- so if my
> version is reviewed and accepted, then there is no need to fix all the
> same issues in this legacy forward-ported code,
>
> * this change strictly depends on DMA driver (the driver simply does not
> work, if DMA is disabled), this means that DMA driver must be 1/3 and
> SLC NAND should go as 2/3, this implies that DMA driver is reviewed and
> accepted by maintainers firstly,
>

DMA can be the first patch of the series.
No problem, I can try to add the support for hardware ECC and DMA inside
your driver. However, I will not be able to this until the following week.

> * I don't see any users of this new code, this addresses Albert's notice
> about adding dead code ---
> http://lists.denx.de/pipermail/u-boot/2015-July/219124.html
>

We did the porting of the legacy NXP BSP driver internally, as we are using it
on a custom LPC32xx board running u-boot.

As LPC32xx driver became available in u-boot (thanks to Albert and Vladimir),
we start using them (I2C, Ethernet, GPIO). After migrating to those drivers,
we did the porting of the remaining legacy drivers to the latest u-boot version.

The intent of those patches was to bring the remaining legacy drivers,
not wet available in u-boot.

> * What about functional support in SPL? Is it correct, that if I want to
> have this code in SPL, then I have to pull in DMA driver as a mandatory
> dependency to tiny SPL?
>

We are not using the SPL;
may be the SLC NAND driver can use the DMA only for non-SPL build.

> >> Also, I did not find where the IO_ADDR_R field is assigned. Did I miss
> >> it?
> >
> > "CONFIG_SYS_NAND_SELF_INIT" is not define; the legacy BSP driver is using the
> > default initialization done within "nand_init_chip()" function inside "drivers/mtd/nand/nand.c".
> >
> >>
> >> This is just one case, but I suspect in many places, the code is
> >> unnecessarily complex. I understand it was ported, not written from
> >> scratch, but I think porting code should not prevent us from making it
> >> smaller, more efficient and more maintainable.
> >>
> >> Amicalement,
> >> --
> >> Albert.
> >>
> >
> > Sylvain
> >
Albert ARIBAUD July 18, 2015, 5:50 a.m. UTC | #5
Hello Vladimir,

On Sat, 18 Jul 2015 02:10:01 +0300, Vladimir Zapolskiy <vz@mleia.com>
wrote:
> Hi Sylvain, Albert,
> 
> On 18.07.2015 01:24, LEMIEUX, SYLVAIN wrote:
> > Hi Albert,
> > 
> > Thanks for the feedback.
> > 
> >> From: U-Boot [mailto:u-boot-bounces@lists.denx.de] On Behalf Of Albert ARIBAUD
> >> Sent: 17-Jul-15 5:20 PM
> >>
> >> Hello Sylvain,
> >>
> >> On Fri, 17 Jul 2015 16:48:52 -0400, slemieux.tyco@gmail.com
> >> <slemieux.tyco@gmail.com> wrote:
> >>
> >>> 1) Fixed checkpatch script output in legacy code.
> >>>    A single warning remaining.
> >>
> >>> The following warning from the legacy code is still present:
> >>> lpc32xx_nand.c:195: WARNING: Use of volatile is usually wrong: see Documentation/volatile-considered-harmful.txt
> >>
> >>> +static u_char lpc32xx_read_byte(struct mtd_info *mtd)
> >>> +{
> >>> +   struct nand_chip *this = mtd->priv;
> >>> +   unsigned long *preg = (unsigned long *)this->IO_ADDR_R;
> >>> +   volatile unsigned long tmp32;
> >>> +   tmp32 = *preg;
> >>> +   return (u_char)tmp32;
> >>> +}
> >>
> >> The volatile above has no reason to exist; the warning is justified
> >> here as we have accessors that guarantee that the access will not be
> >> optimized away or reordered, juste like the 'volatile' above tries to
> >> do (and yes, these accessors *use* 'volatile'. All the more a reason
> >> not to use it again here).
> >>
> >> Besides, the code is quite verbose and not precise enough. Yes,
> >> 'unsigned long' is 32-bit-ish, but in U-Boot, when something is 32-bit,
> >> that is explicit.
> >>
> >> All in all, the whole function could be expressed as:
> >>
> >>       static u_char lpc32xx_read_byte(struct mtd_info *mtd)
> >>       {
> >>               struct nand_chip *this = mtd->priv;
> >>
> >>               return (u_char)readl(this->IO_ADDR_R);
> >>       }
> >>
> >> BTW, isn't IO_ADDR_R pointing to the data register, and isn't the data
> >> register 16-bits? And if so, then why the 32-bits read?
> >>
> > 
> > The register is 16 bits; this implementation is the porting of the initial code.
> 
> hmm, you remember it was discussed yesterday that the data register is
> 32-bit...
> 
> ----8<----
> 5.2 Data FIFO
> 
> There is only one Data FIFO. The Data FIFO is configured either in Read
> or in Write mode.
> 
> 1. When the Data FIFO is configured in Read mode, the sequencer reads
> data from the NAND flash, and stores the data in the Data FIFO. The FIFO
> is then emptied by 32-bit reads on the AHB bus from either the ARM or
> the DMA.
> 
> 2. When the Data FIFO is configured in Write mode, the ARM or the DMA
> writes data to the FIFO with 32-bit AHB bus writes. The sequencer then
> takes data out of the FIFO 8 bits at a time, and writes data to the NAND
> flash.
> 
> ----8<----

This is about the width of the data FIFO bus transfers, not about the
width of the DATA register.

The width of the data register is defined in UM10326 rev. 1 chapter 9,
section 6.1, page 192/193.

While, for other registers only bits 0..7 are specified, for SLC_DATA,
described as "a 16-bit wide register", bits 0..15 are specified, with
bits 8..15 being described as "Reserved, user software should not write
ones to reserved bits. The value read from a reserved bit is not
defined".

This makes me interpret the 'word' in the statement "SLC_DATA must be
accessed as a word register, although only 8 bits of data are used
during a write or provided during a read" as being a 16-, not 32-bit
quantity.

> > I will wait for feedback and see how we want to approach this
> > (add DMA and HW ECC to the NAND SLC driver sent by Vladimir or
> > update the driver as part of the porting effort).
> 
> now when I see the code I still haven't changed my opinion, I would
> propose to add HW ECC processing on top of my trivial change.
> 
> Some general reasons:
> 
> * I agree with Albert that the code is a bit overcomplicated and can be
> improved, basic functions like read_byte(), cmd_ctrl() etc are better in
> my version IMHO --- for example just compare Kevin's monstrous
> lpc32xx_nand_hwcontrol() and my lpc32xx_nand_cmd_ctrl() --- so if my
> version is reviewed and accepted, then there is no need to fix all the
> same issues in this legacy forward-ported code,

Ok, so there is a slight misunderstanding: when I said which version
should be submitted was to be discussed between you and Sylvain, I
meant that you come out with a single proposal.

> * this change strictly depends on DMA driver (the driver simply does not
> work, if DMA is disabled), this means that DMA driver must be 1/3 and
> SLC NAND should go as 2/3, this implies that DMA driver is reviewed and
> accepted by maintainers firstly,

That's what patch series are for. Just submit the series and it should
be accepted as a whole and in order, or rejected as a whole (unless some
patch in it is really independent, in which case it can be applied
independently (and any subsequent version of the series will have one
less patch).

> * I don't see any users of this new code, this addresses Albert's notice
> about adding dead code ---
> http://lists.denx.de/pipermail/u-boot/2015-July/219124.html

Actually, my notice about dead code is not that there should already
be a use for new code, it is that there must appear a use for new code
in the same patch series where the new code is introduced. I don't want
to see independent patches for the code and use, because the first one
might get applied and the second one dropped and then we have dead code.

> * What about functional support in SPL? Is it correct, that if I want to
> have this code in SPL, then I have to pull in DMA driver as a mandatory
> dependency to tiny SPL?

You have to ensure DMA, but whether you pull the whole BSP driver or
hard-code the necessary parts into a single self-contained SLC NAND +
DMA file is a design decision. That said, I would personally go for
pulling the driver *and* adding preprocessing out any part of it not
necessary for SPL -- not so much for size (the compiler and linker
should optimize useless parts away on their own anyway) than for clarity
and maintenance (readers of the driver code would know which part is
needed for SPL and which is not).

Amicalement,
LEMIEUX, SYLVAIN July 27, 2015, 9:45 p.m. UTC | #6
Hi Vladimir and Albert,

See comment, questions and test results below;

> From: Albert ARIBAUD [mailto:albert.u.boot@aribaud.net]
> Sent: 18-Jul-15 1:50 AM
>
> Hello Vladimir,
>
> On Sat, 18 Jul 2015 02:10:01 +0300, Vladimir Zapolskiy <vz@mleia.com>
> wrote:
> > Hi Sylvain, Albert,
> >
> > On 18.07.2015 01:24, LEMIEUX, SYLVAIN wrote:
> > > Hi Albert,
> > >
> > > Thanks for the feedback.
> > >
> > >> From: U-Boot [mailto:u-boot-bounces@lists.denx.de] On Behalf Of Albert ARIBAUD
> > >> Sent: 17-Jul-15 5:20 PM
> > >>
> > >> Hello Sylvain,
> > >>
> > >> On Fri, 17 Jul 2015 16:48:52 -0400, slemieux.tyco@gmail.com
> > >> <slemieux.tyco@gmail.com> wrote:
> > >>
> > >>>

 ....

>
> > * What about functional support in SPL? Is it correct, that if I want to
> > have this code in SPL, then I have to pull in DMA driver as a mandatory
> > dependency to tiny SPL?
>
> You have to ensure DMA, but whether you pull the whole BSP driver or
> hard-code the necessary parts into a single self-contained SLC NAND +
> DMA file is a design decision. That said, I would personally go for
> pulling the driver *and* adding preprocessing out any part of it not
> necessary for SPL -- not so much for size (the compiler and linker
> should optimize useless parts away on their own anyway) than for clarity
> and maintenance (readers of the driver code would know which part is
> needed for SPL and which is not).
>

I am in the process of putting a patch together to add the hardware ECC
support on top of the SLC NAND driver patch
(https://patchwork.ozlabs.org/patch/497308/).

A have a few questions:
1) The LPC32xx SLC NAND driver is using a custom NAND ECC Layout for small page;
    You can refer to the Kernel driver:
     https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/tree/drivers/mtd/nand/lpc32xx_slc.c?id=refs/tags/v4.1.3

    Are you planning to update your driver with the change or should I submit a separate
    patch for it before adding the support for DMA and hardware ECC?

2) As suggested by Albert, I will add a conditional compile option to ensure the
     original version of the driver (no DMA) can be used for SPL.

    I was planning to do it using the configuration option use to select the LPC32xx
    DMA driver (CONFIG_DMA_LPC32XX). Are you OK with this approach?

3) Should I separate the NAND SLC /DMA & the USB into 2 separate patch?


Test result (full command log below):

Clock configuration:
CPU clock: 266MHz / AHB bus clock: 133MHz / Peripheral clock: 13MHz

1) NAND SLC driver: https://patchwork.ozlabs.org/patch/497308/
     NAND read:  51904512 bytes
     read.raw: 1.444 MiB per second / read: 0.625 MiB per second

2) NAND SLC driver: https://patchwork.ozlabs.org/patch/497308/
     & Legacy BSP porting (hardware ECC & DMA)
     NAND read:  51904512 bytes
     read.raw: 2.272 MiB per second / read: 2.139 MiB per second

------------------
Full log:

NAND SLC driver: https://patchwork.ozlabs.org/patch/497308/

==> gettime; nand read.raw 0x80000000 0xd00000 0x6000; gettime
Timer val: 259607
Seconds : 259
Remainder : 607
sys_hz = 1000

NAND read:  51904512 bytes read: OK
Timer val: 293885
Seconds : 293
Remainder : 885
sys_hz = 1000
--> 1.444 MiB per second

==> gettime; nand read.e 0x80000000 0xd00000 0x3180000; gettime
Timer val: 11304
Seconds : 11
Remainder : 304
sys_hz = 1000

NAND read: device 0 offset 0xd00000, size 0x3180000
 51904512 bytes read: OK
Timer val: 90495
Seconds : 90
Remainder : 495
sys_hz = 1000
--> 0.625 MiB per second

NAND SLC driver: https://patchwork.ozlabs.org/patch/497308/
& Legacy BSP porting (hardware ECC & DMA)

==> gettime; nand read.raw 0x80000000 0xd00000 0x6000; gettime
Timer val: 49705
Seconds : 49
Remainder : 705
sys_hz = 1000

NAND read:  51904512 bytes read: OK
Timer val: 71483
Seconds : 71
Remainder : 483
sys_hz = 1000
--> 2.272 MiB per second

Timer val: 280282
Seconds : 280
Remainder : 282
sys_hz = 1000

NAND read: device 0 offset 0xd00000, size 0x3180000
 51904512 bytes read: OK
Timer val: 303423
Seconds : 303
Remainder : 423
sys_hz = 1000
--> 2.139 MiB per second

Sylvain

> Amicalement,
> --
> Albert.
Vladimir Zapolskiy July 28, 2015, 12:21 a.m. UTC | #7
Hi Sylvain,

On 28.07.2015 00:45, LEMIEUX, SYLVAIN wrote:
> Hi Vladimir and Albert,
> 
> See comment, questions and test results below;
> 
>> From: Albert ARIBAUD [mailto:albert.u.boot@aribaud.net]
>> Sent: 18-Jul-15 1:50 AM
>>
>> Hello Vladimir,
>>
>> On Sat, 18 Jul 2015 02:10:01 +0300, Vladimir Zapolskiy <vz@mleia.com>
>> wrote:
>>> Hi Sylvain, Albert,
>>>
>>> On 18.07.2015 01:24, LEMIEUX, SYLVAIN wrote:
>>>> Hi Albert,
>>>>
>>>> Thanks for the feedback.
>>>>
>>>>> From: U-Boot [mailto:u-boot-bounces@lists.denx.de] On Behalf Of Albert ARIBAUD
>>>>> Sent: 17-Jul-15 5:20 PM
>>>>>
>>>>> Hello Sylvain,
>>>>>
>>>>> On Fri, 17 Jul 2015 16:48:52 -0400, slemieux.tyco@gmail.com
>>>>> <slemieux.tyco@gmail.com> wrote:
>>>>>
>>>>>>
> 
>  ....
> 
>>
>>> * What about functional support in SPL? Is it correct, that if I want to
>>> have this code in SPL, then I have to pull in DMA driver as a mandatory
>>> dependency to tiny SPL?
>>
>> You have to ensure DMA, but whether you pull the whole BSP driver or
>> hard-code the necessary parts into a single self-contained SLC NAND +
>> DMA file is a design decision. That said, I would personally go for
>> pulling the driver *and* adding preprocessing out any part of it not
>> necessary for SPL -- not so much for size (the compiler and linker
>> should optimize useless parts away on their own anyway) than for clarity
>> and maintenance (readers of the driver code would know which part is
>> needed for SPL and which is not).
>>
> 
> I am in the process of putting a patch together to add the hardware ECC
> support on top of the SLC NAND driver patch
> (https://patchwork.ozlabs.org/patch/497308/).
> 
> A have a few questions:
> 1) The LPC32xx SLC NAND driver is using a custom NAND ECC Layout for small page;
>     You can refer to the Kernel driver:
>      https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/tree/drivers/mtd/nand/lpc32xx_slc.c?id=refs/tags/v4.1.3
> 
>     Are you planning to update your driver with the change or should I submit a separate
>     patch for it before adding the support for DMA and hardware ECC?

if you don't object to rebase your changes on top of mine (see my
closing comment), I'm fine, if custom NAND ECC Layout for small page
chip is added in a separate patch. For your information on my side I
won't be able to test it though.

> 2) As suggested by Albert, I will add a conditional compile option to ensure the
>      original version of the driver (no DMA) can be used for SPL.

Great, thank you.

>     I was planning to do it using the configuration option use to select the LPC32xx
>     DMA driver (CONFIG_DMA_LPC32XX). Are you OK with this approach?

I'm not sure, if DMA is really needed in SPL binary, but if you want to
add this option, I don't object. To separate code for SPL and normal
image please use CONFIG_SPL_BUILD macro.

> 3) Should I separate the NAND SLC /DMA & the USB into 2 separate patch?

I would say DMA driver as a prerequisite should go first, then in
arbitrary order NAND SLC updates specific to available/compiled DMA and
USB changes.

> 
> Test result (full command log below):
> 
> Clock configuration:
> CPU clock: 266MHz / AHB bus clock: 133MHz / Peripheral clock: 13MHz
> 
> 1) NAND SLC driver: https://patchwork.ozlabs.org/patch/497308/
>      NAND read:  51904512 bytes
>      read.raw: 1.444 MiB per second / read: 0.625 MiB per second
> 
> 2) NAND SLC driver: https://patchwork.ozlabs.org/patch/497308/
>      & Legacy BSP porting (hardware ECC & DMA)
>      NAND read:  51904512 bytes
>      read.raw: 2.272 MiB per second / read: 2.139 MiB per second
> 
> ------------------
> Full log:
> 
> NAND SLC driver: https://patchwork.ozlabs.org/patch/497308/
> 
> ==> gettime; nand read.raw 0x80000000 0xd00000 0x6000; gettime
> Timer val: 259607
> Seconds : 259
> Remainder : 607
> sys_hz = 1000
> 
> NAND read:  51904512 bytes read: OK
> Timer val: 293885
> Seconds : 293
> Remainder : 885
> sys_hz = 1000
> --> 1.444 MiB per second
> 
> ==> gettime; nand read.e 0x80000000 0xd00000 0x3180000; gettime
> Timer val: 11304
> Seconds : 11
> Remainder : 304
> sys_hz = 1000
> 
> NAND read: device 0 offset 0xd00000, size 0x3180000
>  51904512 bytes read: OK
> Timer val: 90495
> Seconds : 90
> Remainder : 495
> sys_hz = 1000
> --> 0.625 MiB per second
> 
> NAND SLC driver: https://patchwork.ozlabs.org/patch/497308/
> & Legacy BSP porting (hardware ECC & DMA)
> 
> ==> gettime; nand read.raw 0x80000000 0xd00000 0x6000; gettime
> Timer val: 49705
> Seconds : 49
> Remainder : 705
> sys_hz = 1000
> 
> NAND read:  51904512 bytes read: OK
> Timer val: 71483
> Seconds : 71
> Remainder : 483
> sys_hz = 1000
> --> 2.272 MiB per second
> 
> Timer val: 280282
> Seconds : 280
> Remainder : 282
> sys_hz = 1000
> 
> NAND read: device 0 offset 0xd00000, size 0x3180000
>  51904512 bytes read: OK
> Timer val: 303423
> Seconds : 303
> Remainder : 423
> sys_hz = 1000
> --> 2.139 MiB per second
> 

Thank you for tests. So, as expected if DMA and hardware ECC is enabled,
then the performance is 1.5 (raw read) or 3.5 times (read & ecc) better.

I do recognize and appreciate this work, advocating my change I can
repeat my last arguments:
* it has a ready valid user in U-boot, my change is not a dead code,
* it is more functional -- read from NAND in SPL is supported,
* it is extremely simple (180 LoC vs. 500 LoC + ~200 LoC of DMA driver),
* it has been reviewed by Scott, all review comments are fixed and
published in v4, and it is ready for inclusion to the next release IMHO.

If there is no more review comments to my version of the driver, I would
kindly ask you to rebase legacy SLC NAND driver on top of my version of
the driver, it is a doable and simple task in my opinion, performance
optimization with the associated code complexity may be added later on.

--
With best wishes,
Vladimir
LEMIEUX, SYLVAIN July 28, 2015, 4:28 p.m. UTC | #8
Hi Vladimir,

See comment below.


> From: Vladimir Zapolskiy [mailto:vz@mleia.com]
> Sent: 27-Jul-15 8:22 PM
>
> Hi Sylvain,
>
> On 28.07.2015 00:45, LEMIEUX, SYLVAIN wrote:
> > Hi Vladimir and Albert,
> >
> > See comment, questions and test results below;
> >
> >> From: Albert ARIBAUD [mailto:albert.u.boot@aribaud.net]
> >> Sent: 18-Jul-15 1:50 AM
> >>
> >> Hello Vladimir,
> >>
> >> On Sat, 18 Jul 2015 02:10:01 +0300, Vladimir Zapolskiy <vz@mleia.com>
> >> wrote:
> >>> Hi Sylvain, Albert,
> >>>
> >>> On 18.07.2015 01:24, LEMIEUX, SYLVAIN wrote:
> >>>> Hi Albert,
> >>>>
> >>>> Thanks for the feedback.
> >>>>
> >>>>> From: U-Boot [mailto:u-boot-bounces@lists.denx.de] On Behalf Of Albert ARIBAUD
> >>>>> Sent: 17-Jul-15 5:20 PM
> >>>>>
> >>>>> Hello Sylvain,
> >>>>>
> >>>>> On Fri, 17 Jul 2015 16:48:52 -0400, slemieux.tyco@gmail.com
> >>>>> <slemieux.tyco@gmail.com> wrote:
> >>>>>
> >>>>>>
> >
> >  ....
> >
> >>
> >>> * What about functional support in SPL? Is it correct, that if I want to
> >>> have this code in SPL, then I have to pull in DMA driver as a mandatory
> >>> dependency to tiny SPL?
> >>
> >> You have to ensure DMA, but whether you pull the whole BSP driver or
> >> hard-code the necessary parts into a single self-contained SLC NAND +
> >> DMA file is a design decision. That said, I would personally go for
> >> pulling the driver *and* adding preprocessing out any part of it not
> >> necessary for SPL -- not so much for size (the compiler and linker
> >> should optimize useless parts away on their own anyway) than for clarity
> >> and maintenance (readers of the driver code would know which part is
> >> needed for SPL and which is not).
> >>
> >
> > I am in the process of putting a patch together to add the hardware ECC
> > support on top of the SLC NAND driver patch
> > (https://patchwork.ozlabs.org/patch/497308/).
> >
> > A have a few questions:
> > 1) The LPC32xx SLC NAND driver is using a custom NAND ECC Layout for small page;
> >     You can refer to the Kernel driver:
> >      https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/tree/drivers/mtd/nand/lpc32xx_slc.c?id=refs/tags/v4.1.3
> >
> >     Are you planning to update your driver with the change or should I submit a separate
> >     patch for it before adding the support for DMA and hardware ECC?
>
> if you don't object to rebase your changes on top of mine (see my
> closing comment), I'm fine, if custom NAND ECC Layout for small page
> chip is added in a separate patch. For your information on my side I
> won't be able to test it though.

Already done, I will update the change based on the comment bellow
and submit a second version of the patch;

The test result, listed below, are with the NXP legacy BSP code on top
of your patch.

I will add a patch from the small page change; the change are taken
from the legacy NXP BSP and the mapping is matching the kernel driver.

Who will provide feedback on the DMA patch?

>
> > 2) As suggested by Albert, I will add a conditional compile option to ensure the
> >      original version of the driver (no DMA) can be used for SPL.
>
> Great, thank you.
>
> >     I was planning to do it using the configuration option use to select the LPC32xx
> >     DMA driver (CONFIG_DMA_LPC32XX). Are you OK with this approach?
>
> I'm not sure, if DMA is really needed in SPL binary, but if you want to
> add this option, I don't object. To separate code for SPL and normal
> image please use CONFIG_SPL_BUILD macro.
>
> > 3) Should I separate the NAND SLC /DMA & the USB into 2 separate patch?
>
> I would say DMA driver as a prerequisite should go first, then in
> arbitrary order NAND SLC updates specific to available/compiled DMA and
> USB changes.
>
> >
> > Test result (full command log below):
> >
> > Clock configuration:
> > CPU clock: 266MHz / AHB bus clock: 133MHz / Peripheral clock: 13MHz
> >
> > 1) NAND SLC driver: https://patchwork.ozlabs.org/patch/497308/
> >      NAND read:  51904512 bytes
> >      read.raw: 1.444 MiB per second / read: 0.625 MiB per second
> >
> > 2) NAND SLC driver: https://patchwork.ozlabs.org/patch/497308/
> >      & Legacy BSP porting (hardware ECC & DMA)
> >      NAND read:  51904512 bytes
> >      read.raw: 2.272 MiB per second / read: 2.139 MiB per second
> >
> > ------------------
> > Full log:
> >
> > NAND SLC driver: https://patchwork.ozlabs.org/patch/497308/
> >
> > ==> gettime; nand read.raw 0x80000000 0xd00000 0x6000; gettime
> > Timer val: 259607
> > Seconds : 259
> > Remainder : 607
> > sys_hz = 1000
> >
> > NAND read:  51904512 bytes read: OK
> > Timer val: 293885
> > Seconds : 293
> > Remainder : 885
> > sys_hz = 1000
> > --> 1.444 MiB per second
> >
> > ==> gettime; nand read.e 0x80000000 0xd00000 0x3180000; gettime
> > Timer val: 11304
> > Seconds : 11
> > Remainder : 304
> > sys_hz = 1000
> >
> > NAND read: device 0 offset 0xd00000, size 0x3180000
> >  51904512 bytes read: OK
> > Timer val: 90495
> > Seconds : 90
> > Remainder : 495
> > sys_hz = 1000
> > --> 0.625 MiB per second
> >
> > NAND SLC driver: https://patchwork.ozlabs.org/patch/497308/
> > & Legacy BSP porting (hardware ECC & DMA)
> >
> > ==> gettime; nand read.raw 0x80000000 0xd00000 0x6000; gettime
> > Timer val: 49705
> > Seconds : 49
> > Remainder : 705
> > sys_hz = 1000
> >
> > NAND read:  51904512 bytes read: OK
> > Timer val: 71483
> > Seconds : 71
> > Remainder : 483
> > sys_hz = 1000
> > --> 2.272 MiB per second
> >
> > Timer val: 280282
> > Seconds : 280
> > Remainder : 282
> > sys_hz = 1000
> >
> > NAND read: device 0 offset 0xd00000, size 0x3180000
> >  51904512 bytes read: OK
> > Timer val: 303423
> > Seconds : 303
> > Remainder : 423
> > sys_hz = 1000
> > --> 2.139 MiB per second
> >
>
> Thank you for tests. So, as expected if DMA and hardware ECC is enabled,
> then the performance is 1.5 (raw read) or 3.5 times (read & ecc) better.
>
> I do recognize and appreciate this work, advocating my change I can
> repeat my last arguments:
> * it has a ready valid user in U-boot, my change is not a dead code,
> * it is more functional -- read from NAND in SPL is supported,
> * it is extremely simple (180 LoC vs. 500 LoC + ~200 LoC of DMA driver),
> * it has been reviewed by Scott, all review comments are fixed and
> published in v4, and it is ready for inclusion to the next release IMHO.
>
> If there is no more review comments to my version of the driver, I would
> kindly ask you to rebase legacy SLC NAND driver on top of my version of
> the driver, it is a doable and simple task in my opinion, performance
> optimization with the associated code complexity may be added later on.
>
> --
> With best wishes,
> Vladimir
diff mbox

Patch

diff --git a/arch/arm/cpu/arm926ejs/lpc32xx/devices.c b/arch/arm/cpu/arm926ejs/lpc32xx/devices.c
index 9c8d655..c0c9c6c 100644
--- a/arch/arm/cpu/arm926ejs/lpc32xx/devices.c
+++ b/arch/arm/cpu/arm926ejs/lpc32xx/devices.c
@@ -59,6 +59,12 @@  void lpc32xx_mlc_nand_init(void)
 	writel(CLK_NAND_MLC | CLK_NAND_MLC_INT, &clk->flashclk_ctrl);
 }
 
+void lpc32xx_slc_nand_init(void)
+{
+	/* Enable SLC NAND interface */
+	writel(CLK_NAND_SLC | CLK_NAND_SLC_SELECT, &clk->flashclk_ctrl);
+}
+
 void lpc32xx_i2c_init(unsigned int devnum)
 {
 	/* Enable I2C interface */
diff --git a/arch/arm/include/asm/arch-lpc32xx/clk.h b/arch/arm/include/asm/arch-lpc32xx/clk.h
index 9449869..010211a 100644
--- a/arch/arm/include/asm/arch-lpc32xx/clk.h
+++ b/arch/arm/include/asm/arch-lpc32xx/clk.h
@@ -153,7 +153,9 @@  struct clk_pm_regs {
 #define CLK_DMA_ENABLE			(1 << 0)
 
 /* NAND Clock Control Register bits */
+#define CLK_NAND_SLC			(1 << 0)
 #define CLK_NAND_MLC			(1 << 1)
+#define CLK_NAND_SLC_SELECT		(1 << 2)
 #define CLK_NAND_MLC_INT		(1 << 5)
 
 /* SSP Clock Control Register bits */
diff --git a/arch/arm/include/asm/arch-lpc32xx/sys_proto.h b/arch/arm/include/asm/arch-lpc32xx/sys_proto.h
index c3d890d..0845f83 100644
--- a/arch/arm/include/asm/arch-lpc32xx/sys_proto.h
+++ b/arch/arm/include/asm/arch-lpc32xx/sys_proto.h
@@ -12,6 +12,7 @@ 
 void lpc32xx_uart_init(unsigned int uart_id);
 void lpc32xx_mac_init(void);
 void lpc32xx_mlc_nand_init(void);
+void lpc32xx_slc_nand_init(void);
 void lpc32xx_i2c_init(unsigned int devnum);
 void lpc32xx_ssp_init(void);
 #if defined(CONFIG_SPL_BUILD)
diff --git a/drivers/mtd/nand/Makefile b/drivers/mtd/nand/Makefile
index 347ea62..e2dc99a 100644
--- a/drivers/mtd/nand/Makefile
+++ b/drivers/mtd/nand/Makefile
@@ -53,6 +53,7 @@  obj-$(CONFIG_NAND_KB9202) += kb9202_nand.o
 obj-$(CONFIG_NAND_KIRKWOOD) += kirkwood_nand.o
 obj-$(CONFIG_NAND_KMETER1) += kmeter1_nand.o
 obj-$(CONFIG_NAND_LPC32XX_MLC) += lpc32xx_nand_mlc.o
+obj-$(CONFIG_NAND_LPC32XX_SLC) += lpc32xx_nand_slc.o
 obj-$(CONFIG_NAND_MPC5121_NFC) += mpc5121_nfc.o
 obj-$(CONFIG_NAND_VF610_NFC) += vf610_nfc.o
 obj-$(CONFIG_NAND_MXC) += mxc_nand.o
diff --git a/drivers/mtd/nand/lpc32xx_nand_slc.c b/drivers/mtd/nand/lpc32xx_nand_slc.c
new file mode 100644
index 0000000..c3cb983
--- /dev/null
+++ b/drivers/mtd/nand/lpc32xx_nand_slc.c
@@ -0,0 +1,501 @@ 
+/*
+ * Copyright (C) 2008-2015 by NXP Semiconductors
+ * All rights reserved.
+ *
+ * @Author: Kevin Wells
+ * @Descr: LPC3250 SLC NAND controller interface support functions
+ *
+ * See file CREDITS for list of people who contributed to this
+ * project.
+ *
+ * SPDX-License-Identifier:	GPL-2.0+
+ */
+
+#include <common.h>
+#include <asm/arch/dma.h>
+#include <asm/arch/cpu.h>
+#include <asm/arch/clk.h>
+#include <nand.h>
+#include <linux/mtd/nand_ecc.h>
+#include <asm/errno.h>
+#include <asm/io.h>
+
+#define	NAND_ALE_OFFS	4
+#define	NAND_CLE_OFFS	8
+
+#define NAND_LARGE_BLOCK_PAGE_SIZE	2048
+#define NAND_SMALL_BLOCK_PAGE_SIZE	512
+
+/* SLC NAND controller module register structures */
+struct slc_nand_reg {
+	unsigned int slc_data;      /* SLC NAND data reg */
+	unsigned int slc_addr;      /* SLC NAND address register */
+	unsigned int slc_cmd;       /* SLC NAND command reg */
+	unsigned int slc_stop;      /* SLC NAND stop register */
+	unsigned int slc_ctrl;      /* SLC NAND control reg */
+	unsigned int slc_cfg;       /* SLC NAND config register */
+	unsigned int slc_stat;      /* SLC NAND status register */
+	unsigned int slc_int_stat;  /* SLC NAND int status register */
+	unsigned int slc_ien;       /* SLC NAND int enable register */
+	unsigned int slc_isr;       /* SLC NAND int set register */
+	unsigned int slc_icr;       /* SLC NAND int clear register */
+	unsigned int slc_tac;       /* SLC NAND timing register */
+	unsigned int slc_tc;        /* SLC NAND transfer count reg */
+	unsigned int slc_ecc;       /* SLC NAND parity register */
+	unsigned int slc_dma_data;  /* SLC NAND DMA data register */
+};
+
+/* slc_ctrl register definitions */
+#define SLCCTRL_SW_RESET	(1 << 2) /* Reset the NAND controller bit */
+#define SLCCTRL_ECC_CLEAR	(1 << 1) /* Reset ECC bit */
+#define SLCCTRL_DMA_START	(1 << 0) /* Start DMA channel bit */
+
+/* slc_cfg register definitions */
+#define SLCCFG_CE_LOW		(1 << 5) /* Force CE low bit */
+#define SLCCFG_DMA_ECC		(1 << 4) /* Enable DMA ECC bit */
+#define SLCCFG_ECC_EN		(1 << 3) /* ECC enable bit */
+#define SLCCFG_DMA_BURST	(1 << 2) /* DMA burst bit */
+#define SLCCFG_DMA_DIR		(1 << 1) /* DMA write(0)/read(1) bit */
+
+/* slc_stat register definitions */
+#define SLCSTAT_DMA_FIFO	(1 << 2) /* DMA FIFO has data bit */
+#define SLCSTAT_NAND_READY	(1 << 0) /* NAND device is ready bit */
+
+/* slc_int_stat, slc_ien, slc_isr, and slc_icr register definitions */
+#define SLCSTAT_INT_TC		(1 << 1) /* Transfer count bit */
+#define SLCSTAT_INT_RDY_EN	(1 << 0) /* Ready interrupt bit */
+
+/* slc_tac register definitions
+ * SLCTAC_WDR: Clock setting for RDY write sample wait time in 2*n clocks
+ * SLCTAC_WWIDTH: Write pulse width in clocks cycles, 1 to 16 clocks
+ * SLCTAC_WHOLD: Write hold time of control and data signals, 1 to 16 clocks
+ * SLCTAC_WSETUP: Write setup time of control and data signals, 1 to 16 clocks
+ * SLCTAC_RDR: Clock setting for RDY read sample wait time in 2*n clocks
+ * SLCTAC_RWIDTH: Read pulse width in clocks cycles, 1 to 16 clocks
+ * SLCTAC_RHOLD: Read hold time of control and data signals, 1 to 16 clocks
+ * SLCTAC_RSETUP: Read setup time of control and data signals, 1 to 16 clocks */
+#define SLCTAC_WDR(n)		(((n) & 0xF) << 28)
+#define SLCTAC_WWIDTH(n)	(((n) & 0xF) << 24)
+#define SLCTAC_WHOLD(n)		(((n) & 0xF) << 20)
+#define SLCTAC_WSETUP(n)	(((n) & 0xF) << 16)
+#define SLCTAC_RDR(n)		(((n) & 0xF) << 12)
+#define SLCTAC_RWIDTH(n)	(((n) & 0xF) << 8)
+#define SLCTAC_RHOLD(n)		(((n) & 0xF) << 4)
+#define SLCTAC_RSETUP(n)	(((n) & 0xF) << 0)
+
+/* control register definitions */
+#define DMAC_CHAN_INT_TC_EN	(1 << 31) /* channel terminal count interrupt */
+#define DMAC_CHAN_DEST_AUTOINC	(1 << 27) /* automatic destination increment */
+#define DMAC_CHAN_SRC_AUTOINC	(1 << 26) /* automatic source increment */
+#define DMAC_CHAN_DEST_AHB1	(1 << 25) /* AHB1 master for dest. transfer */
+#define DMAC_CHAN_DEST_WIDTH_32	(1 << 22) /* Destination data width selection */
+#define DMAC_CHAN_SRC_WIDTH_32	(1 << 19) /* Source data width selection */
+#define DMAC_CHAN_DEST_BURST_1	0
+#define DMAC_CHAN_DEST_BURST_4	(1 << 15) /* Destination data burst size */
+#define DMAC_CHAN_SRC_BURST_1	0
+#define DMAC_CHAN_SRC_BURST_4	(1 << 12) /* Source data burst size */
+
+/* config_ch register definitions
+ * DMAC_CHAN_FLOW_D_xxx: flow control with DMA as the controller
+ * DMAC_DEST_PERIP: Macro for loading destination peripheral
+ * DMAC_SRC_PERIP: Macro for loading source peripheral */
+#define DMAC_CHAN_FLOW_D_M2P	(0x1 << 11)
+#define DMAC_CHAN_FLOW_D_P2M	(0x2 << 11)
+#define DMAC_DEST_PERIP(n)	(((n) & 0x1F) << 6)
+#define DMAC_SRC_PERIP(n)	(((n) & 0x1F) << 1)
+
+/* config_ch register definitions
+ * (source and destination peripheral ID numbers).
+ * These can be used with the DMAC_DEST_PERIP and DMAC_SRC_PERIP macros.*/
+#define DMA_PERID_NAND1		1
+
+/* Channel enable bit */
+#define DMAC_CHAN_ENABLE	(1 << 0)
+
+static struct nand_ecclayout lpc32xx_nand_oob_16 = {
+	.eccbytes = 6,
+	.eccpos = {10, 11, 12, 13, 14, 15},
+	.oobfree = {
+		{.offset = 0,
+		 . length = 4},
+		{.offset = 6,
+		 . length = 4}
+		}
+};
+
+/*
+ * DMA Descriptors
+ * For Large Block: 17 descriptors = ((16 Data and ECC Read) + 1 Spare Area)
+ * For Small Block: 5 descriptors = ((4 Data and ECC Read) + 1 Spare Area)
+ */
+static dmac_ll_t dmalist[(CONFIG_SYS_NAND_ECCSIZE/256) * 2 + 1];
+static uint32_t ecc_buffer[8]; /* MAX ECC size */
+static int dmachan = -1;
+
+static struct slc_nand_reg *slc_nand = (struct slc_nand_reg *)SLC_NAND_BASE;
+
+#define XFER_PENDING ((readl(&slc_nand->slc_stat) & SLCSTAT_DMA_FIFO) | \
+		      readl(&slc_nand->slc_tc))
+
+static void lpc32xx_nand_init(void)
+{
+	/* SLC NAND clock are enable by "lpc32xx_slc_nand_init()"
+	 * and should be call by board "board_early_init_f()" function. */
+
+	/* Reset SLC NAND controller & clear ECC */
+	writel(SLCCTRL_SW_RESET | SLCCTRL_ECC_CLEAR, &slc_nand->slc_ctrl);
+
+	/* 8-bit bus, no DMA, CE normal */
+	writel(0, &slc_nand->slc_cfg);
+
+	/* Interrupts disabled and cleared */
+	writel(0, &slc_nand->slc_ien);
+	writel(SLCSTAT_INT_TC | SLCSTAT_INT_RDY_EN, &slc_nand->slc_icr);
+
+	writel(LPC32XX_SLC_NAND_TIMING, &slc_nand->slc_tac);
+}
+
+static void lpc32xx_nand_hwcontrol(struct mtd_info *mtd, int cmd,
+				   unsigned int ctrl)
+{
+	struct nand_chip *this = mtd->priv;
+	ulong  IO_ADDR_W;
+
+	if (ctrl & NAND_CTRL_CHANGE) {
+		IO_ADDR_W = (ulong) this->IO_ADDR_W;
+		IO_ADDR_W &= ~(NAND_CLE_OFFS | NAND_ALE_OFFS);
+
+		if (ctrl & NAND_CLE)
+			IO_ADDR_W |= NAND_CLE_OFFS;
+		else if (ctrl & NAND_ALE)
+			IO_ADDR_W |= NAND_ALE_OFFS;
+
+		if (ctrl & NAND_NCE)
+			setbits_le32(&slc_nand->slc_cfg, SLCCFG_CE_LOW);
+		else
+			clrbits_le32(&slc_nand->slc_cfg, SLCCFG_CE_LOW);
+
+		this->IO_ADDR_W = (void *)IO_ADDR_W;
+	}
+
+	if (cmd != NAND_CMD_NONE)
+		writel(cmd, this->IO_ADDR_W);
+}
+
+static int lpc32xx_nand_ready(struct mtd_info *mtd)
+{
+	/* Check the SLC NAND controller status */
+	return readl(&slc_nand->slc_stat) & SLCSTAT_NAND_READY;
+}
+
+static u_char lpc32xx_read_byte(struct mtd_info *mtd)
+{
+	struct nand_chip *this = mtd->priv;
+	unsigned long *preg = (unsigned long *)this->IO_ADDR_R;
+	volatile unsigned long tmp32;
+	tmp32 = *preg;
+	return (u_char)tmp32;
+}
+
+/* Prepares DMA descriptors for NAND RD/WR operations */
+/* If the size is < 256 Bytes then it is assumed to be
+ * an OOB transfer */
+static void lpc32xx_nand_dma_configure(struct nand_chip *chip,
+				       const void *buffer, int size, int read)
+{
+	uint32_t i, dmasrc, ctrl, ecc_ctrl, oob_ctrl, dmadst;
+	void __iomem *base = chip->IO_ADDR_R;
+	uint32_t *ecc_gen = ecc_buffer;
+
+	/*
+	 * CTRL descriptor entry for reading ECC
+	 * Copy Multiple times to sync DMA with Flash Controller
+	 */
+	ecc_ctrl =  (0x5 |
+			DMAC_CHAN_SRC_BURST_1 |
+			DMAC_CHAN_DEST_BURST_1 |
+			DMAC_CHAN_SRC_WIDTH_32 |
+			DMAC_CHAN_DEST_WIDTH_32 |
+			DMAC_CHAN_DEST_AHB1);
+
+	/* CTRL descriptor entry for reading/writing Data */
+	ctrl =  64 | /* 256/4 */
+			DMAC_CHAN_SRC_BURST_4 |
+			DMAC_CHAN_DEST_BURST_4 |
+			DMAC_CHAN_SRC_WIDTH_32 |
+			DMAC_CHAN_DEST_WIDTH_32 |
+			DMAC_CHAN_DEST_AHB1;
+
+	/* CTRL descriptor entry for reading/writing Spare Area */
+	oob_ctrl =  ((CONFIG_SYS_NAND_OOBSIZE / 4) |
+			DMAC_CHAN_SRC_BURST_4 |
+			DMAC_CHAN_DEST_BURST_4 |
+			DMAC_CHAN_SRC_WIDTH_32 |
+			DMAC_CHAN_DEST_WIDTH_32 |
+			DMAC_CHAN_DEST_AHB1);
+
+	if (read) {
+		dmasrc = (uint32_t)(base + offsetof(struct slc_nand_reg,
+						    slc_dma_data));
+		dmadst = (uint32_t)(buffer);
+		ctrl |= DMAC_CHAN_DEST_AUTOINC;
+	} else {
+		dmadst = (uint32_t)(base + offsetof(struct slc_nand_reg,
+						    slc_dma_data));
+		dmasrc = (uint32_t)(buffer);
+		ctrl |= DMAC_CHAN_SRC_AUTOINC;
+	}
+
+	/*
+	 * Write Operation Sequence for Small Block NAND
+	 * ----------------------------------------------------------
+	 * 1. X'fer 256 bytes of data from Memory to Flash.
+	 * 2. Copy generated ECC data from Register to Spare Area
+	 * 3. X'fer next 256 bytes of data from Memory to Flash.
+	 * 4. Copy generated ECC data from Register to Spare Area.
+	 * 5. X'fer 16 byets of Spare area from Memory to Flash.
+	 * Read Operation Sequence for Small Block NAND
+	 * ----------------------------------------------------------
+	 * 1. X'fer 256 bytes of data from Flash to Memory.
+	 * 2. Copy generated ECC data from Register to ECC calc Buffer.
+	 * 3. X'fer next 256 bytes of data from Flash to Memory.
+	 * 4. Copy generated ECC data from Register to ECC calc Buffer.
+	 * 5. X'fer 16 bytes of Spare area from Flash to Memory.
+	 * Write Operation Sequence for Large Block NAND
+	 * ----------------------------------------------------------
+	 * 1. Steps(1-4) of Write Operations repeate for four times
+	 * which generates 16 DMA descriptors to X'fer 2048 bytes of
+	 * data & 32 bytes of ECC data.
+	 * 2. X'fer 64 bytes of Spare area from Memory to Flash.
+	 * Read Operation Sequence for Large Block NAND
+	 * ----------------------------------------------------------
+	 * 1. Steps(1-4) of Read Operations repeate for four times
+	 * which generates 16 DMA descriptors to X'fer 2048 bytes of
+	 * data & 32 bytes of ECC data.
+	 * 2. X'fer 64 bytes of Spare area from Flash to Memory.
+	 */
+
+	for (i = 0; i < size/256; i++) {
+		dmalist[i*2].dma_src = (read ? (dmasrc) : (dmasrc + (i*256)));
+		dmalist[i*2].dma_dest = (read ? (dmadst + (i*256)) : dmadst);
+		dmalist[i*2].next_lli = (uint32_t)&dmalist[(i*2)+1];
+		dmalist[i*2].next_ctrl = ctrl;
+
+		dmalist[(i*2) + 1].dma_src =
+			(uint32_t)(base + offsetof(struct slc_nand_reg,
+						   slc_ecc));
+		dmalist[(i*2) + 1].dma_dest = (uint32_t)&ecc_gen[i];
+		dmalist[(i*2) + 1].next_lli = (uint32_t)&dmalist[(i*2)+2];
+		dmalist[(i*2) + 1].next_ctrl = ecc_ctrl;
+	}
+
+	if (i) { /* Data only transfer */
+		dmalist[(i*2) - 1].next_lli = 0;
+		dmalist[(i*2) - 1].next_ctrl |= DMAC_CHAN_INT_TC_EN;
+		return;
+	}
+
+	/* OOB only transfer */
+	if (read) {
+		dmasrc = (uint32_t)(base + offsetof(struct slc_nand_reg,
+						    slc_dma_data));
+		dmadst = (uint32_t)(buffer);
+		oob_ctrl |= DMAC_CHAN_DEST_AUTOINC;
+	} else {
+		dmadst = (uint32_t)(base + offsetof(struct slc_nand_reg,
+						    slc_dma_data));
+		dmasrc = (uint32_t)(buffer);
+		oob_ctrl |= DMAC_CHAN_SRC_AUTOINC;
+	}
+
+	/* Read/ Write Spare Area Data To/From Flash */
+	dmalist[i*2].dma_src = dmasrc;
+	dmalist[i*2].dma_dest = dmadst;
+	dmalist[i*2].next_lli = 0;
+	dmalist[i*2].next_ctrl = (oob_ctrl | DMAC_CHAN_INT_TC_EN);
+}
+
+static void lpc32xx_nand_xfer(struct mtd_info *mtd, const u_char *buf,
+			      int len, int read)
+{
+	struct nand_chip *chip = mtd->priv;
+	uint32_t config;
+
+	/* DMA Channel Configuration */
+	config = (read ? DMAC_CHAN_FLOW_D_P2M : DMAC_CHAN_FLOW_D_M2P) |
+		(read ? DMAC_DEST_PERIP(0) : DMAC_DEST_PERIP(DMA_PERID_NAND1)) |
+		(read ? DMAC_SRC_PERIP(DMA_PERID_NAND1) : DMAC_SRC_PERIP(0)) |
+		DMAC_CHAN_ENABLE;
+
+	/* Prepare DMA descriptors */
+	lpc32xx_nand_dma_configure(chip, buf, len, read);
+
+	/* Setup SLC controller and start transfer */
+	if (read)
+		setbits_le32(&slc_nand->slc_cfg, SLCCFG_DMA_DIR);
+	else  /* NAND_ECC_WRITE */
+		clrbits_le32(&slc_nand->slc_cfg, SLCCFG_DMA_DIR);
+	setbits_le32(&slc_nand->slc_cfg, SLCCFG_DMA_BURST);
+
+	/* Write length for new transfers */
+	if (!XFER_PENDING) {
+		int tmp = (len != mtd->oobsize) ? mtd->oobsize : 0;
+		writel(len + tmp, &slc_nand->slc_tc);
+	}
+
+	setbits_le32(&slc_nand->slc_ctrl, SLCCTRL_DMA_START);
+
+	/* Start DMA transfers */
+	lpc32xx_dma_start_xfer(dmachan, dmalist, config);
+
+	/* Wait for NAND to be ready */
+	while (!lpc32xx_nand_ready(mtd))
+		;
+
+	/* Wait till DMA transfer is DONE */
+	if (lpc32xx_dma_wait_status(dmachan))
+		pr_err("NAND DMA transfer error!\r\n");
+
+	/* Stop DMA & HW ECC */
+	clrbits_le32(&slc_nand->slc_ctrl, SLCCTRL_DMA_START);
+	clrbits_le32(&slc_nand->slc_cfg,
+		     SLCCFG_DMA_DIR | SLCCFG_DMA_BURST |
+		     SLCCFG_ECC_EN | SLCCFG_DMA_ECC);
+}
+
+static uint32_t slc_ecc_copy_to_buffer(uint8_t *spare, const uint32_t *ecc,
+				       int count)
+{
+	int i;
+	for (i = 0; i < (count * 3); i += 3) {
+		uint32_t ce = ecc[i/3];
+		ce = ~(ce << 2) & 0xFFFFFF;
+		spare[i+2] = (uint8_t)(ce & 0xFF); ce >>= 8;
+		spare[i+1] = (uint8_t)(ce & 0xFF); ce >>= 8;
+		spare[i]   = (uint8_t)(ce & 0xFF);
+	}
+	return 0;
+}
+
+static int lpc32xx_ecc_calculate(struct mtd_info *mtd, const uint8_t *dat,
+					     uint8_t *ecc_code)
+{
+	return slc_ecc_copy_to_buffer(ecc_code, ecc_buffer,
+				      CONFIG_SYS_NAND_ECCSIZE
+				      == NAND_LARGE_BLOCK_PAGE_SIZE ? 8 : 2);
+}
+
+/*
+ * Enables and prepares SLC NAND controller
+ * for doing data transfers with H/W ECC enabled.
+ */
+static void lpc32xx_hwecc_enable(struct mtd_info *mtd, int mode)
+{
+	/* Clear ECC */
+	writel(SLCCTRL_ECC_CLEAR, &slc_nand->slc_ctrl);
+
+	/* Setup SLC controller for H/W ECC operations */
+	setbits_le32(&slc_nand->slc_cfg, SLCCFG_ECC_EN | SLCCFG_DMA_ECC);
+}
+
+/*
+ * lpc32xx_write_buf - [DEFAULT] write buffer to chip
+ * mtd:	MTD device structure
+ * buf:	data buffer
+ * len:	number of bytes to write
+ *
+ * Default write function for 8bit buswith
+ */
+static void lpc32xx_write_buf(struct mtd_info *mtd, const u_char *buf, int len)
+{
+	lpc32xx_nand_xfer(mtd, buf, len, 0);
+}
+
+
+/**
+ * lpc32xx_correct_data - [NAND Interface] Detect and correct bit error(s)
+ * mtd:	MTD block structure
+ * dat:	raw data read from the chip
+ * read_ecc:	ECC from the chip
+ * calc_ecc:	the ECC calculated from raw data
+ *
+ * Detect and correct a 1 bit error for 256 byte block
+ *
+ */
+int lpc32xx_correct_data(struct mtd_info *mtd, u_char *dat,
+		      u_char *read_ecc, u_char *calc_ecc)
+{
+	uint8_t i, nb_ecc256;
+	int	ret1, ret2 = 0;
+	u_char	*r = read_ecc;
+	u_char	*c = calc_ecc;
+	uint16_t	data_offset = 0;
+
+	nb_ecc256 = (CONFIG_SYS_NAND_ECCSIZE == NAND_LARGE_BLOCK_PAGE_SIZE
+		     ? 8 : 2);
+
+	for (i = 0 ; i < nb_ecc256 ; i++ , r += 3, c += 3, data_offset += 256) {
+		ret1 = nand_correct_data(mtd, dat + data_offset, r, c);
+
+		if (ret1 < 0)
+			return -EBADMSG;
+		else
+			ret2 += ret1;
+	}
+
+	return ret2;
+}
+
+/*
+ * lpc32xx_read_buf - [DEFAULT] read chip data into buffer
+ * mtd:	MTD device structure
+ * buf:	buffer to store date
+ * len:	number of bytes to read
+ *
+ * Default read function for 8bit buswith
+ */
+static void lpc32xx_read_buf(struct mtd_info *mtd, u_char *buf, int len)
+{
+	lpc32xx_nand_xfer(mtd, buf, len, 1);
+}
+
+int board_nand_init(struct nand_chip *nand)
+{
+	/* Initial NAND interface */
+	lpc32xx_nand_init();
+
+	/* Acquire a channel for our use */
+	dmachan = lpc32xx_dma_get_channel();
+	if (unlikely(dmachan < 0)) {
+		pr_info("Unable to get free DMA channel for NAND transfers\n");
+		return -1;
+	}
+
+#ifdef CONFIG_SYS_NAND_USE_FLASH_BBT
+	nand->bbt_options |= NAND_BBT_USE_FLASH;
+#endif
+
+	/* ECC mode and size */
+	nand->ecc.mode = NAND_ECC_HW;
+	nand->ecc.bytes	= CONFIG_SYS_NAND_ECCBYTES;
+	nand->ecc.size = CONFIG_SYS_NAND_ECCSIZE;
+
+	/* max number of correctible bits per ECC step */
+	nand->ecc.strength = CONFIG_SYS_NAND_ECCSTRENGTH;
+
+	if (CONFIG_SYS_NAND_ECCSIZE != NAND_LARGE_BLOCK_PAGE_SIZE)
+		nand->ecc.layout = &lpc32xx_nand_oob_16;
+
+	nand->ecc.calculate = lpc32xx_ecc_calculate;
+	nand->ecc.correct = lpc32xx_correct_data;
+	nand->ecc.hwctl = lpc32xx_hwecc_enable;
+	nand->cmd_ctrl = lpc32xx_nand_hwcontrol;
+	nand->dev_ready = lpc32xx_nand_ready;
+	nand->chip_delay = 2000;
+
+	nand->read_buf = lpc32xx_read_buf;
+	nand->write_buf = lpc32xx_write_buf;
+	nand->read_byte = lpc32xx_read_byte;
+
+	return 0;
+}