| Submitter | Phil Sutter |
|---|---|
| Date | Nov. 26, 2012, 10:33 a.m. |
| Message ID | <1353925988-6859-1-git-send-email-phil.sutter@viprinet.com> |
| Permalink | /patch/201647/ |
| State | Changes Requested |
| Delegated to: | Prafulla Wadaskar |
Comments
> -----Original Message-----
> From: u-boot-bounces@lists.denx.de [mailto:u-boot-bounces@lists.denx.de] On Behalf Of Scott Wood
> Sent: 27 November 2012 05:09
> To: Phil Sutter
> Cc: u-boot@lists.denx.de; Nico Erfurth
> Subject: Re: [U-Boot] [PATCH 1/4] Optimized nand_read_buf for kirkwood
>
> On 11/26/2012 04:33:08 AM, Phil Sutter wrote:
> > The basic idea is taken from the linux-kernel, but further optimized.
> >
> > First align the buffer to 8 bytes, then use ldrd/strd to read and store
> > in 8 byte quantities, then do the final bytes.
> >
> > Tested using: 'date ; nand read.raw 0xE00000 0x0 0x10000 ; date'.
> > Without this patch, NAND read of 132MB took 49s (~2.69MB/s). With this
> > patch in place, reading the same amount of data was done in 27s
> > (~4.89MB/s). So read performance is increased by ~80%!
> >
> > Signed-off-by: Nico Erfurth <ne@erfurth.eu>
> > Tested-by: Phil Sutter <phil.sutter@viprinet.com>
> > Cc: Prafulla Wadaskar <prafulla@marvell.com>
> > ---
> >  drivers/mtd/nand/kirkwood_nand.c |   29 +++++++++++++++++++++++++++++
> >  1 files changed, 29 insertions(+), 0 deletions(-)
> >
> > diff --git a/drivers/mtd/nand/kirkwood_nand.c b/drivers/mtd/nand/kirkwood_nand.c
> > index bdab5aa..e04a59f 100644
> > --- a/drivers/mtd/nand/kirkwood_nand.c
> > +++ b/drivers/mtd/nand/kirkwood_nand.c
> > @@ -38,6 +38,34 @@ struct kwnandf_registers {
> >  static struct kwnandf_registers *nf_reg =
> >  	(struct kwnandf_registers *)KW_NANDF_BASE;
> >
> > +
> > +/* The basic idea is stolen from the linux kernel, but the inner loop is optimized a bit more */
> > +static void kw_nand_read_buf(struct mtd_info *mtd, uint8_t *buf, int len)
> > +{
> > +	struct nand_chip *chip = mtd->priv;
> > +
> > +	while (len && (unsigned long)buf & 7)
> > +	{
>
> Brace goes on the previous line.
>
> > +		*buf++ = readb(chip->IO_ADDR_R);
> > +		len--;
> > +	};
> > +
> > +	asm volatile (
> > +		".LFlashLoop:\n"
> > +		" subs\t%0, #8\n"
> > +		" ldrpld\tr2, [%2]\n" // Read 2 words
> > +		" strpld\tr2, [%1], #8\n" // Read 2 words
> > +		" bpl\t.LFlashLoop\n" // This results in one additional loop if len%8 <> 0
> > +		" addne\t%0, #8\n"
> > +		: "+&r" (len), "+&r" (buf)
> > +		: "r" (chip->IO_ADDR_R)
> > +		: "r2", "r3", "memory", "cc"
> > +	);
>
> Use a real tab (or a space) rather than \t (which only helps
> readability in the asm output, rather than the C source that people
> actually look at).
>
> Should probably use a numeric label to avoid any possibility of
> conflict.
>
> Would this make more sense as a more generic optimized memcpy_fromio()
> or similar?

Hi Phil

For your next post of this patch, please do not forget to add version info and changelog to the patch.

Regards...
Prafulla . . .

>
> -Scott
> _______________________________________________
> U-Boot mailing list
> U-Boot@lists.denx.de
> http://lists.denx.de/mailman/listinfo/u-boot
Hi Prafulla,
On Wed, Dec 19, 2012 at 10:44:01PM -0800, Prafulla Wadaskar wrote:
> For your next post of this patch, please do not forget to add version info and changelog to the patch.
Ah yes, indeed! Thanks a lot for the hint and sorry for the confusion
caused.
Best wishes, Phil
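The throughput figures in the quoted commit message can be cross-checked with a few lines of arithmetic. The sketch below assumes that the count argument of `nand read.raw` is a number of pages and that the chip uses 2048-byte pages with 64 bytes of OOB; neither is stated in the thread, but under those assumptions 0x10000 pages come to exactly 132 MiB. The helper names are made up for this illustration.

```c
#include <stdint.h>

/* Assumption (not stated in the thread): 2048-byte page plus 64 bytes
 * of OOB, so a raw page read transfers 2112 bytes. */
#define RAW_PAGE_BYTES (2048u + 64u)

/* Total bytes moved by a raw read of `pages` pages. */
static uint64_t raw_read_bytes(uint32_t pages)
{
	return (uint64_t)pages * RAW_PAGE_BYTES;
}

/* Throughput in MiB/s for `bytes` transferred in `seconds`. */
static double mib_per_s(uint64_t bytes, double seconds)
{
	return (double)bytes / (1024.0 * 1024.0) / seconds;
}

/* Relative throughput gain in percent when the same transfer
 * drops from before_s to after_s seconds. */
static double speedup_percent(double before_s, double after_s)
{
	return (before_s / after_s - 1.0) * 100.0;
}
```

With these assumptions, `raw_read_bytes(0x10000)` is 138412032 bytes (132 MiB), which gives roughly 2.69 MiB/s at 49 s and 4.89 MiB/s at 27 s, a gain of a little over 81%, in line with the "~80%" in the commit message.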
Patch
diff --git a/drivers/mtd/nand/kirkwood_nand.c b/drivers/mtd/nand/kirkwood_nand.c
index bdab5aa..e04a59f 100644
--- a/drivers/mtd/nand/kirkwood_nand.c
+++ b/drivers/mtd/nand/kirkwood_nand.c
@@ -38,6 +38,34 @@ struct kwnandf_registers {
 static struct kwnandf_registers *nf_reg =
 	(struct kwnandf_registers *)KW_NANDF_BASE;
 
+
+/* The basic idea is stolen from the linux kernel, but the inner loop is optimized a bit more */
+static void kw_nand_read_buf(struct mtd_info *mtd, uint8_t *buf, int len)
+{
+	struct nand_chip *chip = mtd->priv;
+
+	while (len && (unsigned long)buf & 7)
+	{
+		*buf++ = readb(chip->IO_ADDR_R);
+		len--;
+	};
+
+	asm volatile (
+		".LFlashLoop:\n"
+		" subs\t%0, #8\n"
+		" ldrpld\tr2, [%2]\n" // Read 2 words
+		" strpld\tr2, [%1], #8\n" // Read 2 words
+		" bpl\t.LFlashLoop\n" // This results in one additional loop if len%8 <> 0
+		" addne\t%0, #8\n"
+		: "+&r" (len), "+&r" (buf)
+		: "r" (chip->IO_ADDR_R)
+		: "r2", "r3", "memory", "cc"
+	);
+
+	while (len--)
+		*buf++ = readb(chip->IO_ADDR_R);
+}
+
 /*
  * hardware specific access to control-lines/bits
  */
@@ -76,6 +104,7 @@ int board_nand_init(struct nand_chip *nand)
 	nand->options = NAND_COPYBACK | NAND_CACHEPRG | NAND_NO_PADDING;
 	nand->ecc.mode = NAND_ECC_SOFT;
 	nand->cmd_ctrl = kw_nand_hwcontrol;
+	nand->read_buf = kw_nand_read_buf;
 	nand->chip_delay = 40;
 	nand->select_chip = kw_nand_select_chip;
 	return 0;
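
For readers who want to follow the control flow of the patch without ARM assembly, the three phases it describes (byte reads until the buffer is 8-byte aligned, 8-byte bulk transfers, byte reads for the tail) can be modelled in portable C. This is only an illustrative sketch, not the submitted code: `fake_nand`, `fake_readb`, `fake_read8`, `model_read_buf` and `model_roundtrip_ok` are hypothetical names, and the NAND data register is simulated by a cursor into an in-memory array rather than a fixed MMIO address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the NAND data register: each read
 * returns the next byte(s) of a simulated flash page. */
struct fake_nand {
	const uint8_t *data;
	size_t pos;
};

static uint8_t fake_readb(struct fake_nand *n)
{
	return n->data[n->pos++];
}

/* Models one 8-byte load from the data register (the ldrd step). */
static uint64_t fake_read8(struct fake_nand *n)
{
	uint64_t v;

	memcpy(&v, n->data + n->pos, sizeof(v));
	n->pos += sizeof(v);
	return v;
}

static void model_read_buf(struct fake_nand *n, uint8_t *buf, int len)
{
	assert(len >= 0);

	/* Phase 1: single bytes until buf is 8-byte aligned. */
	while (len > 0 && ((uintptr_t)buf & 7)) {
		*buf++ = fake_readb(n);
		len--;
	}

	/* Phase 2: 8 bytes per iteration (the strd step stores them). */
	while (len >= 8) {
		uint64_t v = fake_read8(n);

		memcpy(buf, &v, sizeof(v));
		buf += 8;
		len -= 8;
	}

	/* Phase 3: the remaining 0..7 bytes. */
	while (len-- > 0)
		*buf++ = fake_readb(n);
}

/* Returns 1 if reading len bytes (0..64) into a buffer that starts
 * offset bytes (0..7) past an 8-byte boundary reproduces the source
 * and does not write past the end. */
static int model_roundtrip_ok(int offset, int len)
{
	uint8_t src[64];
	uint64_t backing[10]; /* 80 bytes, 8-byte aligned */
	uint8_t *dst = (uint8_t *)backing;
	struct fake_nand n = { src, 0 };
	int i;

	for (i = 0; i < 64; i++)
		src[i] = (uint8_t)(i * 7 + 1);
	memset(backing, 0xAA, sizeof(backing));

	model_read_buf(&n, dst + offset, len);

	return n.pos == (size_t)len &&
	       memcmp(dst + offset, src, (size_t)len) == 0 &&
	       dst[offset + len] == 0xAA;
}
```

Calling `model_roundtrip_ok(3, 29)`, for example, exercises all three phases: five byte reads to reach alignment, then three 8-byte transfers, and no tail. The model says nothing about timing; the speedup reported in the thread comes from the real hardware doing fewer bus transactions.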