From patchwork Wed Aug 13 11:20:35 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stefan Agner X-Patchwork-Id: 379602 X-Patchwork-Delegate: scottwood@freescale.com Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from theia.denx.de (theia.denx.de [85.214.87.163]) by ozlabs.org (Postfix) with ESMTP id BC7C11400B9 for ; Wed, 13 Aug 2014 21:21:11 +1000 (EST) Received: from localhost (localhost [127.0.0.1]) by theia.denx.de (Postfix) with ESMTP id A4E764A027; Wed, 13 Aug 2014 13:21:09 +0200 (CEST) X-Virus-Scanned: Debian amavisd-new at theia.denx.de Received: from theia.denx.de ([127.0.0.1]) by localhost (theia.denx.de [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id umbROc-VcnWj; Wed, 13 Aug 2014 13:21:09 +0200 (CEST) Received: from theia.denx.de (localhost [127.0.0.1]) by theia.denx.de (Postfix) with ESMTP id 6234A4A03B; Wed, 13 Aug 2014 13:21:08 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by theia.denx.de (Postfix) with ESMTP id C9ACB4A03B for ; Wed, 13 Aug 2014 13:21:05 +0200 (CEST) X-Virus-Scanned: Debian amavisd-new at theia.denx.de Received: from theia.denx.de ([127.0.0.1]) by localhost (theia.denx.de [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id eV-mHWJh1Px8 for ; Wed, 13 Aug 2014 13:21:02 +0200 (CEST) X-policyd-weight: NOT_IN_SBL_XBL_SPAMHAUS=-1.5 NOT_IN_SPAMCOP=-1.5 NOT_IN_BL_NJABL=-1.5 (only DNSBL check requested) Received: from mail.kmu-office.ch (mail.kmu-office.ch [178.209.48.102]) by theia.denx.de (Postfix) with ESMTP id 6375A4A027 for ; Wed, 13 Aug 2014 13:20:59 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by mail.kmu-office.ch (Postfix) with ESMTP id 5BD878D148 for ; Wed, 13 Aug 2014 13:20:38 +0200 (CEST) X-Virus-Scanned: by amavisd-new at kmu-office.ch Received: from mail.kmu-office.ch ([127.0.0.1]) by localhost (mail.kmu-office.ch [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id ERI0Ymwq6pRs for ; Wed, 13 Aug 2014 13:20:37 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by mail.kmu-office.ch (Postfix) with ESMTP id 047CC8D14F for ; Wed, 13 Aug 2014 13:20:37 +0200 (CEST) Received: from webmail.kmu-office.ch (unknown [178.209.48.103]) (Authenticated sender: stefan@agner.ch) by mail.kmu-office.ch (Postfix) with ESMTPSA id D0AF58D143; Wed, 13 Aug 2014 13:20:36 +0200 (CEST) MIME-Version: 1.0 Date: Wed, 13 Aug 2014 13:20:35 +0200 From: Stefan Agner To: Bill Pringlemeir In-Reply-To: <87wqadl3h6.fsf@nbsps.com> References: <4ea124ab817db6e3dee2067aadd6db14643990f5.1407312577.git.stefan@agner.ch> <1407796426.7427.100.camel@snotra.buserror.net> <1380dbe3110cbf9359220a78d69e1896@agner.ch> <1407881820.7427.137.camel@snotra.buserror.net> <87wqadl3h6.fsf@nbsps.com> Message-ID: <7b41b54fe3f11a943b0ec63f2e44f573@agner.ch> X-Sender: stefan@agner.ch User-Agent: Roundcube Webmail/1.0.1 Cc: Scott Wood , u-boot@lists.denx.de Subject: Re: [U-Boot] [PATCH 3/4] mtd: nand: add Freescale NFC driver X-BeenThere: u-boot@lists.denx.de X-Mailman-Version: 2.1.11 Precedence: list List-Id: U-Boot discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: u-boot-bounces@lists.denx.de Errors-To: u-boot-bounces@lists.denx.de Am 2014-08-13 00:58, schrieb Bill Pringlemeir: [snip] >>>>> +static u32 nfc_read(struct mtd_info *mtd, uint reg) >>>>> +{ >>>>> + struct fsl_nfc *nfc = mtd_to_nfc(mtd); >>>>> + >>>>> + if (reg == NFC_FLASH_STATUS1 || >>>>> + reg == NFC_FLASH_STATUS2 || >>>>> + reg == NFC_IRQ_STATUS) >>>>> + return __raw_readl(nfc->regs + reg); >>>>> + /* Gang read/writes together for most registers. */ >>>>> + else >>>>> + return *(u32 *)(nfc->regs + reg); >>>>> +} >>>>> + >>>>> +static void nfc_write(struct mtd_info *mtd, uint reg, u32 val) >>>>> +{ >>>>> + struct fsl_nfc *nfc = mtd_to_nfc(mtd); >>>>> + >>>>> + if (reg == NFC_FLASH_STATUS1 || >>>>> + reg == NFC_FLASH_STATUS2 || >>>>> + reg == NFC_IRQ_STATUS) >>>>> + __raw_writel(val, nfc->regs + reg); >>>>> + /* Gang read/writes together for most registers. */ >>>>> + else >>>>> + *(u32 *)(nfc->regs + reg) = val; >>>>> +} >>>> >>>> You should always be using raw I/O accessors. If the intent is to >>>> bypass I/O ordering for certain registers, the raw accessors should >>>> already be skipping that. > >>> Ok, will do. > >> Sorry, the above should say "always be using I/O accesors", not always >> raw ones. > > This is probably because of me. There are lines like this for > configuration, > > /* Set configuration register. */ > nfc_clear(mtd, NFC_FLASH_CONFIG, CONFIG_ADDR_AUTO_INCR_BIT); > nfc_clear(mtd, NFC_FLASH_CONFIG, CONFIG_BUFNO_AUTO_INCR_BIT); > nfc_clear(mtd, NFC_FLASH_CONFIG, CONFIG_BOOT_MODE_BIT); > nfc_clear(mtd, NFC_FLASH_CONFIG, CONFIG_DMA_REQ_BIT); > nfc_set(mtd, NFC_FLASH_CONFIG, CONFIG_FAST_FLASH_BIT); > > If you use some sort of 'volatile', you are saying to the compiler that > these lines are *not*, > > ldr r0, [r1, #CONFIG] # read previous value > ldr r2, =~MASK > orr r0, #FAST_FLASH_BIT # set one bit. > and r0, r2 # clear all bits at once. > str r0, [r1, #CONFIG] # write it back. > > Instead it will change into five different load/modify/stores. The > memory itself is just like SRAM except a few registers that are actually > volatile; ie changed via the hardware. > > Particularly bad are the memcpy_io() on the ARM. Using these results > in about 1/2 to 1/4 of the performance of the drivers through-put on the > Vybrid. I can see that using accessors is good, but just replacing this > with 'writel' will result in significantly more code and slower > throughput. > > If people insist on this, then something like, > > val = nfc_read(mtd, NFC_REG); > val = nfc_clear(val, CONFIG_ADDR_AUTO_INCR_BIT); > val = nfc_clear(val, CONFIG_BUFNO_AUTO_INCR_BIT); > val = nfc_clear(val, CONFIG_BOOT_MODE_BIT); > val = nfc_clear(val, CONFIG_DMA_REQ_BIT); > val = nfc_set(val, CONFIG_FAST_FLASH_BIT); > nfc_write(mtd, NFC_REG, val); > > would result in the same code with 'nfc_read()' and 'nfc_write()' using > the I/O accessors. I found this more difficult to read for the higher > level functions. Instead some some standard macros for setting and > clearing bits could be used. The original driver was using 'nfc_set()' > and 'nfc_clear()'. Maybe they should just go. > I just applied this change: static void nfc_set(struct mtd_info *mtd, uint reg, u32 bits) And did some performance measurement: => nand read ${loadaddr} 0x0 0x2000000 ===> With if/else NAND read: device 0 offset 0x0, size 0x2000000 33554432 bytes read in 8494 ms: OK (3.8 MiB/s) ===> With __raw_readl only... NAND read: device 0 offset 0x0, size 0x2000000 33554432 bytes read in 8184 ms: OK (3.9 MiB/s) => nand write ${loadaddr} rootfs-ubi 0x2000000 ===> With if/else NAND write: device 0 offset 0xa00000, size 0x2000000 33554432 bytes written in 18200 ms: OK (1.8 MiB/s) ===> With __raw_readl only... NAND write: device 0 offset 0xa00000, size 0x2000000 33554432 bytes written in 15095 ms: OK (2.1 MiB/s) These values are reproducible. I guess by removing the conditional branch in the nfc_read/nfc_write functions we even gain performance. IMHO we should use the raw_writel only and "hand optimize" for functions which are used often. For the initialization/configuration functions, there is little value to save some register access. --- Stefan diff --git a/drivers/mtd/nand/fsl_nfc.c b/drivers/mtd/nand/fsl_nfc.c index df2c8be..01c010f 100644 --- a/drivers/mtd/nand/fsl_nfc.c +++ b/drivers/mtd/nand/fsl_nfc.c @@ -191,26 +191,14 @@ static u32 nfc_read(struct mtd_info *mtd, uint reg) { struct fsl_nfc *nfc = mtd_to_nfc(mtd); - if (reg == NFC_FLASH_STATUS1 || - reg == NFC_FLASH_STATUS2 || - reg == NFC_IRQ_STATUS) - return __raw_readl(nfc->regs + reg); - /* Gang read/writes together for most registers. */ - else - return *(u32 *)(nfc->regs + reg); + return __raw_readl(nfc->regs + reg); } static void nfc_write(struct mtd_info *mtd, uint reg, u32 val) { struct fsl_nfc *nfc = mtd_to_nfc(mtd); - if (reg == NFC_FLASH_STATUS1 || - reg == NFC_FLASH_STATUS2 || - reg == NFC_IRQ_STATUS) - __raw_writel(val, nfc->regs + reg); - /* Gang read/writes together for most registers. */ - else - *(u32 *)(nfc->regs + reg) = val; + __raw_writel(val, nfc->regs + reg); }