Message ID | 1391033909-6563-1-git-send-email-computersforpeace@gmail.com |
---|---|
State | Accepted |
Commit | 3dad2344e92c6e1aeae42df1c4824f307c51bcc7 |
Headers | show |
On Wed, Jan 29, 2014 at 02:18:28PM -0800, Brian Norris wrote: > The NAND command helpers tend to automatically shift the column address > for x16 bus devices, since most commands expect a word address, not a > byte address. The Read ID command, however, expects an 8-bit address > (i.e., 0x00, 0x20, or 0x40 should not be translated to 0x00, 0x10, or > 0x20). > > This fixes the column address for a few drivers which imitate the > nand_base defaults. Note that I don't touch sh_flctl.c, since it already > handles this problem slightly differently (note its comment "READID is > always performed using an 8-bit bus"). > > I have not tested this patch, as I only have x8 parts up for testing at > this point. Hopefully that can change soon... > > Signed-off-by: Brian Norris <computersforpeace@gmail.com> I applied both patches and tested on my AM335x board (omap2-nand driver). Both 8-bit and 16-bit devices get ONFI-probed and pass a nandtest round. Tested-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com> Also, checked that without these patches, the 16-bit device would be ID-probed, but not detected as ONFI-compliant. [..] > + > +/** > + * Check if the opcode's address should be sent only on the lower 8 bits > + * @command: opcode to check > + */ > +static inline int nand_opcode_8bits(unsigned int command) > +{ > + return command == NAND_CMD_READID; > +} > + > #endif /* __LINUX_MTD_NAND_H */ With the introduction of this function, I think all the problems we've discussed. The solution looks good to me so. Nice job!
On Thu, Jan 30, 2014 at 09:17:04AM -0300, Ezequiel Garcia wrote: > On Wed, Jan 29, 2014 at 02:18:28PM -0800, Brian Norris wrote: > > The NAND command helpers tend to automatically shift the column address > > for x16 bus devices, since most commands expect a word address, not a > > byte address. The Read ID command, however, expects an 8-bit address > > (i.e., 0x00, 0x20, or 0x40 should not be translated to 0x00, 0x10, or > > 0x20). > > > > This fixes the column address for a few drivers which imitate the > > nand_base defaults. Note that I don't touch sh_flctl.c, since it already > > handles this problem slightly differently (note its comment "READID is > > always performed using an 8-bit bus"). > > > > I have not tested this patch, as I only have x8 parts up for testing at > > this point. Hopefully that can change soon... > > > > Signed-off-by: Brian Norris <computersforpeace@gmail.com> > > I applied both patches and tested on my AM335x board (omap2-nand driver). > Both 8-bit and 16-bit devices get ONFI-probed and pass a nandtest round. > > Tested-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com> Thanks for the quick testing! > [..] > > + > > +/** > > + * Check if the opcode's address should be sent only on the lower 8 bits > > + * @command: opcode to check > > + */ > > +static inline int nand_opcode_8bits(unsigned int command) > > +{ > > + return command == NAND_CMD_READID; > > +} > > + > > #endif /* __LINUX_MTD_NAND_H */ > > With the introduction of this function, I think all the problems we've > discussed. I think so too. But I forgot to pose this question: are there any other commands which should be treated similarly? NAND_CMD_PARAM only takes column address 0, and I think Change Read Column (or NAND_CMD_RNDOUT, used for extended parameter pages) takes a full buswidth address, according to the spec; I suppose ONFI presumes the host has read the full parameter page and determined the bus width by the time it wants to skip to the extended parameter pages; or else it just continues to read out bytes sequentially). > The solution looks good to me so. Nice job! Thanks. Glad to get it out of the way. Brian
Hi Brian, >From: Brian Norris [mailto:computersforpeace@gmail.com] >diff --git a/include/linux/mtd/nand.h b/include/linux/mtd/nand.h >index a719686c9cce..c034dc4224cb 100644 >--- a/include/linux/mtd/nand.h >+++ b/include/linux/mtd/nand.h >@@ -832,4 +832,14 @@ static inline bool nand_is_slc(struct nand_chip *chip) > { > return chip->bits_per_cell == 1; > } >+ >+/** >+ * Check if the opcode's address should be sent only on the lower 8 bits >+ * @command: opcode to check >+ */ >+static inline int nand_opcode_8bits(unsigned int command) >+{ >+ return command == NAND_CMD_READID; >+} >+ > Can 'nand_opcode_8bits, made a macro instead of inline function ? #define IS_8BIT_CMD(cmd) (unlikely(cmd == NAND_CMD_READID)) Because 'nand_opcode_8bits' is used in nand_command() and nand_command_lp() which is performance critical code (chip->cmd is called multiple times for fetching page data and OOB). Though we should expect compiler to treat this inline function same as macro here, But just to be doubly sure for future changes also. [1] http://stackoverflow.com/questions/1571392/pros-and-cons-of-different-macro-function-inline-methods-in-c [2] http://stackoverflow.com/questions/5226803/inline-function-v-macro-in-c-whats-the-overhead-memory-speed with regards, pekon
On Thu, Jan 30, 2014 at 07:17:29PM +0000, Gupta, Pekon wrote: > Hi Brian, > > >From: Brian Norris [mailto:computersforpeace@gmail.com] > >diff --git a/include/linux/mtd/nand.h b/include/linux/mtd/nand.h > >index a719686c9cce..c034dc4224cb 100644 > >--- a/include/linux/mtd/nand.h > >+++ b/include/linux/mtd/nand.h > >@@ -832,4 +832,14 @@ static inline bool nand_is_slc(struct nand_chip *chip) > > { > > return chip->bits_per_cell == 1; > > } > >+ > >+/** > >+ * Check if the opcode's address should be sent only on the lower 8 bits > >+ * @command: opcode to check > >+ */ > >+static inline int nand_opcode_8bits(unsigned int command) > >+{ > >+ return command == NAND_CMD_READID; > >+} > >+ > > > Can 'nand_opcode_8bits, made a macro instead of inline function ? > #define IS_8BIT_CMD(cmd) (unlikely(cmd == NAND_CMD_READID)) > > Because 'nand_opcode_8bits' is used in nand_command() and nand_command_lp() > which is performance critical code (chip->cmd is called multiple times for fetching > page data and OOB). Though we should expect compiler to treat this inline function > same as macro here, But just to be doubly sure for future changes also. > I'm not a compiler guru, but I'd be very surprised if the compiler would make a difference here, given the function is static, inline and small. Besides, an inline function is more readable and type-aware. I'd say it's the Right Thing to do. Nevertheless, I did a small check on my platform (handbuilt ARM GCC 4.7.2) and both macro and inline is compiled into a "cmp" instruction. The unlikely doesn't seem to have any effect.
Hi Ezequiel, >> >+static inline int nand_opcode_8bits(unsigned int command) >> >+{ >> >+ return command == NAND_CMD_READID; >> >+} >> >+ >> > >> Can 'nand_opcode_8bits, made a macro instead of inline function ? >> #define IS_8BIT_CMD(cmd) (unlikely(cmd == NAND_CMD_READID)) >> >> Because 'nand_opcode_8bits' is used in nand_command() and nand_command_lp() >> which is performance critical code (chip->cmd is called multiple times for fetching >> page data and OOB). Though we should expect compiler to treat this inline function >> same as macro here, But just to be doubly sure for future changes also. >> > >I'm not a compiler guru, but I'd be very surprised if the compiler would make >a difference here, given the function is static, inline and small. Besides, >an inline function is more readable and type-aware. I'd say it's the Right >Thing to do. > >Nevertheless, I did a small check on my platform (handbuilt ARM GCC 4.7.2) and >both macro and inline is compiled into a "cmp" instruction. The unlikely >doesn't seem to have any effect. > Please check the JMP instruction just after CMP instruction.. In-case of (unlikely(x == y)) JMP/JE will be used to branch when x == y. where as in-case of (likely(x == y)) JMP/JNE will be used to branch x != y. As in most cases 'command != NAND_CMD_READID' so with unlikely() Compiler should lay down the instructions in such an order that program executes _without_ branching. Thus JMP will point to the code when command == NAND_CMD_READID. (this is my understanding, however may be for small branches this has no effect because both paths can be fetched simultaneously into pipeline). with regards, pekon
On Thu, Jan 30, 2014 at 04:51:08PM -0300, Ezequiel Garcia wrote: > On Thu, Jan 30, 2014 at 07:17:29PM +0000, Gupta, Pekon wrote: > > >From: Brian Norris [mailto:computersforpeace@gmail.com] > > >diff --git a/include/linux/mtd/nand.h b/include/linux/mtd/nand.h > > >index a719686c9cce..c034dc4224cb 100644 > > >--- a/include/linux/mtd/nand.h > > >+++ b/include/linux/mtd/nand.h > > >@@ -832,4 +832,14 @@ static inline bool nand_is_slc(struct nand_chip *chip) > > > { > > > return chip->bits_per_cell == 1; > > > } > > >+ > > >+/** > > >+ * Check if the opcode's address should be sent only on the lower 8 bits > > >+ * @command: opcode to check > > >+ */ > > >+static inline int nand_opcode_8bits(unsigned int command) > > >+{ > > >+ return command == NAND_CMD_READID; > > >+} > > >+ > > > > > Can 'nand_opcode_8bits, made a macro instead of inline function ? > > #define IS_8BIT_CMD(cmd) (unlikely(cmd == NAND_CMD_READID)) > > > > Because 'nand_opcode_8bits' is used in nand_command() and nand_command_lp() > > which is performance critical code (chip->cmd is called multiple times for fetching > > page data and OOB). First of all, I really don't think micro-architectural optimizations are significant at this point. The overhead (if any exists) is likely very small, especially relative to other sorts of optimization obstacles (e.g., use of function pointers). But in any case, optimization must be informed by data, not simply speculation. > > Though we should expect compiler to treat this inline function > > same as macro here, But just to be doubly sure for future changes also. > > > > I'm not a compiler guru, but I'd be very surprised if the compiler would make > a difference here, given the function is static, inline and small. Besides, > an inline function is more readable and type-aware. I'd say it's the Right > Thing to do. I agree that, unless there is a significant demonstrable benefit to do otherwise, static inline is the way to go (for reasons of type safety, for one). > Nevertheless, I did a small check on my platform (handbuilt ARM GCC 4.7.2) and > both macro and inline is compiled into a "cmp" instruction. The unlikely > doesn't seem to have any effect. And so we have data. I'm sure there are other data points to consider (different compilers, different ARCH, nand_opcode_8bits() used in different contexts, etc.), but it's not worth it IMO. Thanks, Brian
On Thu, Jan 30, 2014 at 08:18:07PM +0000, Pekon Gupta wrote: > >> >+static inline int nand_opcode_8bits(unsigned int command) > >> >+{ > >> >+ return command == NAND_CMD_READID; > >> >+} > >> >+ > >> > > >> Can 'nand_opcode_8bits, made a macro instead of inline function ? > >> #define IS_8BIT_CMD(cmd) (unlikely(cmd == NAND_CMD_READID)) > >> > >> Because 'nand_opcode_8bits' is used in nand_command() and nand_command_lp() > >> which is performance critical code (chip->cmd is called multiple times for fetching > >> page data and OOB). Though we should expect compiler to treat this inline function > >> same as macro here, But just to be doubly sure for future changes also. > >> > > > >I'm not a compiler guru, but I'd be very surprised if the compiler would make > >a difference here, given the function is static, inline and small. Besides, > >an inline function is more readable and type-aware. I'd say it's the Right > >Thing to do. > > > >Nevertheless, I did a small check on my platform (handbuilt ARM GCC 4.7.2) and > >both macro and inline is compiled into a "cmp" instruction. The unlikely > >doesn't seem to have any effect. > > > Please check the JMP instruction just after CMP instruction.. > In-case of (unlikely(x == y)) JMP/JE will be used to branch when x == y. > where as in-case of (likely(x == y)) JMP/JNE will be used to branch x != y. ... > (this is my understanding, however may be for small branches this has > no effect because both paths can be fetched simultaneously into pipeline). With small branches, ARM just turns them into very compact conditional instructions, rather than true branches. This program generates identical code for functionA() and functionB() when compiled with -O2 on ARM: /* ------------- BEGIN PROGRAM ---------------- */ #define unlikely(x) __builtin_expect(!!(x), 0) #define NAND_CMD_READID 17 #define NAND_OPCODE_8BITS(x) (unlikely(command == NAND_CMD_READID)) static inline int nand_opcode_8bits(unsigned int command) { return unlikely(command == NAND_CMD_READID); } int functionA(int command, int col) { if (nand_opcode_8bits(command)) return col; return col >> 1; } int functionB(int command, int col) { if (NAND_OPCODE_8BITS(command)) return col; return col >> 1; } /* ------------- END PROGRAM ---------------- */ Disassembly: 00000000 <functionA>: 0: e3500011 cmp r0, #17 4: 11a010c1 asrne r1, r1, #1 8: e1a00001 mov r0, r1 c: e12fff1e bx lr 00000010 <functionB>: 10: e3500011 cmp r0, #17 14: 11a010c1 asrne r1, r1, #1 18: e1a00001 mov r0, r1 1c: e12fff1e bx lr Let's stop the nonsense then :) If you really want to pursue this, give me (1) an example where they compile differently with (2) real, significant performance differences. The burden is on you! Brian
>From: Brian Norris [mailto:computersforpeace@gmail.com] > >Let's stop the nonsense then :) > Ok .. :-) Though its redundant, as Ezequiel has already tested these patches on AM335x using both x8 and x16 Micron devices. But still I re-tested them on different OMAP boards. Tested-By: Pekon Gupta <pekon@ti.com> with regards, pekon
+ Huang On Fri, Jan 31, 2014 at 06:55:49AM +0000, Pekon Gupta wrote: > Though its redundant, as Ezequiel has already tested these > patches on AM335x using both x8 and x16 Micron devices. > But still I re-tested them on different OMAP boards. Still worthwhile. > Tested-By: Pekon Gupta <pekon@ti.com> Thanks! Pushed both to l2-mtd.git/next. BTW, I think Huang's JEDEC parameter page support should be modified similar to patch 1. I'll comment there. Brian
diff --git a/drivers/mtd/nand/atmel_nand.c b/drivers/mtd/nand/atmel_nand.c index c36e9b84487c..1955389c8fa6 100644 --- a/drivers/mtd/nand/atmel_nand.c +++ b/drivers/mtd/nand/atmel_nand.c @@ -1659,8 +1659,8 @@ static void nfc_select_chip(struct mtd_info *mtd, int chip) nfc_writel(host->nfc->hsmc_regs, CTRL, NFC_CTRL_ENABLE); } -static int nfc_make_addr(struct mtd_info *mtd, int column, int page_addr, - unsigned int *addr1234, unsigned int *cycle0) +static int nfc_make_addr(struct mtd_info *mtd, int command, int column, + int page_addr, unsigned int *addr1234, unsigned int *cycle0) { struct nand_chip *chip = mtd->priv; @@ -1674,7 +1674,8 @@ static int nfc_make_addr(struct mtd_info *mtd, int column, int page_addr, *addr1234 = 0; if (column != -1) { - if (chip->options & NAND_BUSWIDTH_16) + if (chip->options & NAND_BUSWIDTH_16 && + !nand_opcode_8bits(command)) column >>= 1; addr_bytes[acycle++] = column & 0xff; if (mtd->writesize > 512) @@ -1787,8 +1788,8 @@ static void nfc_nand_command(struct mtd_info *mtd, unsigned int command, } if (do_addr) - acycle = nfc_make_addr(mtd, column, page_addr, &addr1234, - &cycle0); + acycle = nfc_make_addr(mtd, command, column, page_addr, + &addr1234, &cycle0); nfc_addr_cmd = cmd1 | cmd2 | vcmd2 | acycle | csid | dataen | nfcwr; nfc_send_command(host, nfc_addr_cmd, addr1234, cycle0); diff --git a/drivers/mtd/nand/au1550nd.c b/drivers/mtd/nand/au1550nd.c index 7d84c4e4bf43..bc5c518828d2 100644 --- a/drivers/mtd/nand/au1550nd.c +++ b/drivers/mtd/nand/au1550nd.c @@ -307,7 +307,8 @@ static void au1550_command(struct mtd_info *mtd, unsigned command, int column, i /* Serially input address */ if (column != -1) { /* Adjust columns for 16 bit buswidth */ - if (this->options & NAND_BUSWIDTH_16) + if (this->options & NAND_BUSWIDTH_16 && + !nand_opcode_8bits(command)) column >>= 1; ctx->write_byte(mtd, column); } diff --git a/drivers/mtd/nand/diskonchip.c b/drivers/mtd/nand/diskonchip.c index fec31d71b84e..b9b4db607850 100644 --- a/drivers/mtd/nand/diskonchip.c +++ b/drivers/mtd/nand/diskonchip.c @@ -698,7 +698,8 @@ static void doc2001plus_command(struct mtd_info *mtd, unsigned command, int colu /* Serially input address */ if (column != -1) { /* Adjust columns for 16 bit buswidth */ - if (this->options & NAND_BUSWIDTH_16) + if (this->options & NAND_BUSWIDTH_16 && + !nand_opcode_8bits(command)) column >>= 1; WriteDOC(column, docptr, Mplus_FlashAddress); } diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c index 9111a4c7c7a2..c69a1f7ce0a2 100644 --- a/drivers/mtd/nand/nand_base.c +++ b/drivers/mtd/nand/nand_base.c @@ -589,7 +589,8 @@ static void nand_command(struct mtd_info *mtd, unsigned int command, /* Serially input address */ if (column != -1) { /* Adjust columns for 16 bit buswidth */ - if (chip->options & NAND_BUSWIDTH_16) + if (chip->options & NAND_BUSWIDTH_16 && + !nand_opcode_8bits(command)) column >>= 1; chip->cmd_ctrl(mtd, column, ctrl); ctrl &= ~NAND_CTRL_CHANGE; @@ -680,7 +681,8 @@ static void nand_command_lp(struct mtd_info *mtd, unsigned int command, /* Serially input address */ if (column != -1) { /* Adjust columns for 16 bit buswidth */ - if (chip->options & NAND_BUSWIDTH_16) + if (chip->options & NAND_BUSWIDTH_16 && + !nand_opcode_8bits(command)) column >>= 1; chip->cmd_ctrl(mtd, column, ctrl); ctrl &= ~NAND_CTRL_CHANGE; diff --git a/drivers/mtd/nand/nuc900_nand.c b/drivers/mtd/nand/nuc900_nand.c index 90c99ea5184a..331fccbdde61 100644 --- a/drivers/mtd/nand/nuc900_nand.c +++ b/drivers/mtd/nand/nuc900_nand.c @@ -151,7 +151,8 @@ static void nuc900_nand_command_lp(struct mtd_info *mtd, unsigned int command, if (column != -1 || page_addr != -1) { if (column != -1) { - if (chip->options & NAND_BUSWIDTH_16) + if (chip->options & NAND_BUSWIDTH_16 && + !nand_opcode_8bits(command)) column >>= 1; write_addr_reg(nand, column); write_addr_reg(nand, column >> 8 | ENDADDR); diff --git a/include/linux/mtd/nand.h b/include/linux/mtd/nand.h index a719686c9cce..c034dc4224cb 100644 --- a/include/linux/mtd/nand.h +++ b/include/linux/mtd/nand.h @@ -832,4 +832,14 @@ static inline bool nand_is_slc(struct nand_chip *chip) { return chip->bits_per_cell == 1; } + +/** + * Check if the opcode's address should be sent only on the lower 8 bits + * @command: opcode to check + */ +static inline int nand_opcode_8bits(unsigned int command) +{ + return command == NAND_CMD_READID; +} + #endif /* __LINUX_MTD_NAND_H */
The NAND command helpers tend to automatically shift the column address for x16 bus devices, since most commands expect a word address, not a byte address. The Read ID command, however, expects an 8-bit address (i.e., 0x00, 0x20, or 0x40 should not be translated to 0x00, 0x10, or 0x20). This fixes the column address for a few drivers which imitate the nand_base defaults. Note that I don't touch sh_flctl.c, since it already handles this problem slightly differently (note its comment "READID is always performed using an 8-bit bus"). I have not tested this patch, as I only have x8 parts up for testing at this point. Hopefully that can change soon... Signed-off-by: Brian Norris <computersforpeace@gmail.com> --- drivers/mtd/nand/atmel_nand.c | 11 ++++++----- drivers/mtd/nand/au1550nd.c | 3 ++- drivers/mtd/nand/diskonchip.c | 3 ++- drivers/mtd/nand/nand_base.c | 6 ++++-- drivers/mtd/nand/nuc900_nand.c | 3 ++- include/linux/mtd/nand.h | 10 ++++++++++ 6 files changed, 26 insertions(+), 10 deletions(-)