Patchwork [U-Boot,1/4] Optimized nand_read_buf for kirkwood

Submitter Phil Sutter
Date Nov. 26, 2012, 10:33 a.m.
Message ID <1353925988-6859-1-git-send-email-phil.sutter@viprinet.com>
Permalink /patch/201647/
State Changes Requested
Delegated to: Prafulla Wadaskar

Comments

Phil Sutter - Nov. 26, 2012, 10:33 a.m.
The basic idea is taken from the linux-kernel, but further optimized.

First align the buffer to 8 bytes, then use ldrd/strd to read and store
in 8 byte quantities, then do the final bytes.

Tested using: 'date ; nand read.raw 0xE00000 0x0 0x10000 ; date'.
Without this patch, NAND read of 132MB took 49s (~2.69MB/s). With this
patch in place, reading the same amount of data was done in 27s
(~4.89MB/s). So read performance is increased by ~80%!

Signed-off-by: Nico Erfurth <ne@erfurth.eu>
Tested-by: Phil Sutter <phil.sutter@viprinet.com>
Cc: Prafulla Wadaskar <prafulla@marvell.com>
---
 drivers/mtd/nand/kirkwood_nand.c |   29 +++++++++++++++++++++++++++++
 1 files changed, 29 insertions(+), 0 deletions(-)
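
The figures quoted above can be re-derived with a short calculation. The sketch below is only an illustration: the function names are hypothetical, and the page geometry (2048 data bytes plus 64 OOB bytes per raw page) is an assumption, since the thread does not state it. Under that assumption, a raw read of 0x10000 pages comes to 132 MiB, which matches the size given in the commit message.

```c
#include <assert.h>
#include <stdio.h>

/* Size in MiB of a raw read of `pages` pages of `page_bytes` bytes each. */
static double raw_read_mib(unsigned long pages, unsigned long page_bytes)
{
	return (double)pages * (double)page_bytes / (1024.0 * 1024.0);
}

/* Throughput in MB/s for `mbytes` transferred in `seconds`. */
static double throughput(double mbytes, double seconds)
{
	assert(seconds > 0.0);
	return mbytes / seconds;
}

/* Relative gain of `after` over `before`, in percent. */
static double gain_percent(double before, double after)
{
	assert(before > 0.0);
	return (after / before - 1.0) * 100.0;
}

/* Prints the three derived figures for the numbers in the commit message. */
static void print_summary(void)
{
	/* Assumed geometry: 2048 data bytes + 64 OOB bytes per raw page. */
	double size = raw_read_mib(0x10000, 2048 + 64);
	double before = throughput(size, 49.0);
	double after = throughput(size, 27.0);

	printf("size: %.0f MiB\n", size);
	printf("before: %.2f MB/s, after: %.2f MB/s\n", before, after);
	printf("gain: %.1f%%\n", gain_percent(before, after));
}
```

Computed from the unrounded times, the gain is slightly above 81%, consistent with the "~80%" stated in the commit message.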
Prafulla Wadaskar - Dec. 20, 2012, 6:44 a.m.
> -----Original Message-----
> From: u-boot-bounces@lists.denx.de [mailto:u-boot-bounces@lists.denx.de] On Behalf Of Scott Wood
> Sent: 27 November 2012 05:09
> To: Phil Sutter
> Cc: u-boot@lists.denx.de; Nico Erfurth
> Subject: Re: [U-Boot] [PATCH 1/4] Optimized nand_read_buf for kirkwood
> 
> On 11/26/2012 04:33:08 AM, Phil Sutter wrote:
> > The basic idea is taken from the linux-kernel, but further optimized.
> >
> > First align the buffer to 8 bytes, then use ldrd/strd to read and store
> > in 8 byte quantities, then do the final bytes.
> >
> > Tested using: 'date ; nand read.raw 0xE00000 0x0 0x10000 ; date'.
> > Without this patch, NAND read of 132MB took 49s (~2.69MB/s). With this
> > patch in place, reading the same amount of data was done in 27s
> > (~4.89MB/s). So read performance is increased by ~80%!
> >
> > Signed-off-by: Nico Erfurth <ne@erfurth.eu>
> > Tested-by: Phil Sutter <phil.sutter@viprinet.com>
> > Cc: Prafulla Wadaskar <prafulla@marvell.com>
> > ---
> >  drivers/mtd/nand/kirkwood_nand.c |   29 +++++++++++++++++++++++++++++
> >  1 files changed, 29 insertions(+), 0 deletions(-)
> >
> > diff --git a/drivers/mtd/nand/kirkwood_nand.c b/drivers/mtd/nand/kirkwood_nand.c
> > index bdab5aa..e04a59f 100644
> > --- a/drivers/mtd/nand/kirkwood_nand.c
> > +++ b/drivers/mtd/nand/kirkwood_nand.c
> > @@ -38,6 +38,34 @@ struct kwnandf_registers {
> >  static struct kwnandf_registers *nf_reg =
> >  	(struct kwnandf_registers *)KW_NANDF_BASE;
> >
> > +
> > +/* The basic idea is stolen from the linux kernel, but the inner loop is optimized a bit more */
> > +static void kw_nand_read_buf(struct mtd_info *mtd, uint8_t *buf, int len)
> > +{
> > +	struct nand_chip *chip = mtd->priv;
> > +
> > +	while (len && (unsigned long)buf & 7)
> > +	{
> 
> Brace goes on the previous line.
> 
> > +		*buf++ = readb(chip->IO_ADDR_R);
> > +		len--;
> > +	};
> > +
> > +	asm volatile (
> > +		".LFlashLoop:\n"
> > +		"  subs\t%0, #8\n"
> > +		"  ldrpld\tr2, [%2]\n" // Read 2 words
> > +		"  strpld\tr2, [%1], #8\n" // Store 2 words
> > +		"  bpl\t.LFlashLoop\n" // This results in one additional loop if len%8 <> 0
> > +		"  addne\t%0, #8\n"
> > +		: "+&r" (len), "+&r" (buf)
> > +		: "r" (chip->IO_ADDR_R)
> > +		: "r2", "r3", "memory", "cc"
> > +	);
> 
> Use a real tab (or a space) rather than \t (which only helps
> readability in the asm output, rather than the C source that people
> actually look at).
> 
> Should probably use a numeric label to avoid any possibility of
> conflict.
> 
> Would this make more sense as a more generic optimized memcpy_fromio()
> or similar?

Hi Phil

For your next post of this patch, please do not forget to add version info and a changelog to the patch.

Regards...
Prafulla . . .
 
> 
> -Scott
Phil Sutter - Dec. 20, 2012, 10:55 a.m.
Hi Prafulla,

On Wed, Dec 19, 2012 at 10:44:01PM -0800, Prafulla Wadaskar wrote:
> For your next post of this patch, please do not forget to add version info and a changelog to the patch.

Ah yes, indeed! Thanks a lot for the hint and sorry for the confusion
caused.

Best wishes, Phil

Patch

diff --git a/drivers/mtd/nand/kirkwood_nand.c b/drivers/mtd/nand/kirkwood_nand.c
index bdab5aa..e04a59f 100644
--- a/drivers/mtd/nand/kirkwood_nand.c
+++ b/drivers/mtd/nand/kirkwood_nand.c
@@ -38,6 +38,34 @@  struct kwnandf_registers {
 static struct kwnandf_registers *nf_reg =
 	(struct kwnandf_registers *)KW_NANDF_BASE;
 
+
+/* The basic idea is stolen from the linux kernel, but the inner loop is optimized a bit more */
+static void kw_nand_read_buf(struct mtd_info *mtd, uint8_t *buf, int len)
+{
+	struct nand_chip *chip = mtd->priv;
+
+	while (len && (unsigned long)buf & 7)
+	{
+		*buf++ = readb(chip->IO_ADDR_R);
+		len--;
+	};
+
+	asm volatile (
+		".LFlashLoop:\n"
+		"  subs\t%0, #8\n"
+		"  ldrpld\tr2, [%2]\n" // Read 2 words
+		"  strpld\tr2, [%1], #8\n" // Store 2 words
+		"  bpl\t.LFlashLoop\n" // This results in one additional loop if len%8 <> 0
+		"  addne\t%0, #8\n"
+		: "+&r" (len), "+&r" (buf)
+		: "r" (chip->IO_ADDR_R)
+		: "r2", "r3", "memory", "cc"
+	);
+
+	while (len--)
+		*buf++ = readb(chip->IO_ADDR_R);
+}
+
 /*
  * hardware specific access to control-lines/bits
  */
@@ -76,6 +104,7 @@  int board_nand_init(struct nand_chip *nand)
 	nand->options = NAND_COPYBACK | NAND_CACHEPRG | NAND_NO_PADDING;
 	nand->ecc.mode = NAND_ECC_SOFT;
 	nand->cmd_ctrl = kw_nand_hwcontrol;
+	nand->read_buf = kw_nand_read_buf;
 	nand->chip_delay = 40;
 	nand->select_chip = kw_nand_select_chip;
 	return 0;
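
For readers who want to follow the control flow without the ARM assembly, the three phases described in the commit message (align the buffer to 8 bytes, copy in 8-byte quantities, then do the final bytes) can be sketched in portable C. This is not the code from the patch: `struct fake_port`, `port_read8`, `port_read64` and `read_buf_aligned` are hypothetical names, and the data port is simulated by a byte stream in memory. The sketch also assumes that an 8-byte read of the port yields the next 8 bytes in stream order, which is how the patch uses ldrd on chip->IO_ADDR_R.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the NAND data port: every read consumes
 * the next bytes of a simulated flash stream. On real hardware this
 * would be an access to chip->IO_ADDR_R instead. */
struct fake_port {
	const uint8_t *data;
	size_t pos;
};

static uint8_t port_read8(struct fake_port *p)
{
	return p->data[p->pos++];
}

static uint64_t port_read64(struct fake_port *p)
{
	uint64_t v;

	memcpy(&v, p->data + p->pos, sizeof(v));
	p->pos += sizeof(v);
	return v;
}

/* Three-phase copy: single bytes until buf is 8-byte aligned, then
 * 8-byte chunks, then the remaining tail bytes. */
static void read_buf_aligned(struct fake_port *p, uint8_t *buf, size_t len)
{
	/* Phase 1: head bytes up to the next 8-byte boundary. */
	while (len && ((uintptr_t)buf & 7)) {
		*buf++ = port_read8(p);
		len--;
	}

	/* Phase 2: bulk copy, 8 bytes per iteration. */
	while (len >= 8) {
		uint64_t v = port_read64(p);

		assert(((uintptr_t)buf & 7) == 0);
		memcpy(buf, &v, sizeof(v));
		buf += 8;
		len -= 8;
	}

	/* Phase 3: fewer than 8 bytes are left. */
	while (len--)
		*buf++ = port_read8(p);
}
```

Because the helper only takes a source port, a destination and a length, the same shape would also fit the more generic memcpy_fromio()-style routine suggested in the review, although the thread does not settle that question.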