diff mbox

[1/2] mtd: nand: force NAND_CMD_READID onto 8-bit bus

Message ID 1391033909-6563-1-git-send-email-computersforpeace@gmail.com
State Accepted
Commit 3dad2344e92c6e1aeae42df1c4824f307c51bcc7
Headers show

Commit Message

Brian Norris Jan. 29, 2014, 10:18 p.m. UTC
The NAND command helpers tend to automatically shift the column address
for x16 bus devices, since most commands expect a word address, not a
byte address. The Read ID command, however, expects an 8-bit address
(i.e., 0x00, 0x20, or 0x40 should not be translated to 0x00, 0x10, or
0x20).

This fixes the column address for a few drivers which imitate the
nand_base defaults. Note that I don't touch sh_flctl.c, since it already
handles this problem slightly differently (note its comment "READID is
always performed using an 8-bit bus").

I have not tested this patch, as I only have x8 parts up for testing at
this point. Hopefully that can change soon...

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
---
 drivers/mtd/nand/atmel_nand.c  | 11 ++++++-----
 drivers/mtd/nand/au1550nd.c    |  3 ++-
 drivers/mtd/nand/diskonchip.c  |  3 ++-
 drivers/mtd/nand/nand_base.c   |  6 ++++--
 drivers/mtd/nand/nuc900_nand.c |  3 ++-
 include/linux/mtd/nand.h       | 10 ++++++++++
 6 files changed, 26 insertions(+), 10 deletions(-)

Comments

Ezequiel Garcia Jan. 30, 2014, 12:17 p.m. UTC | #1
On Wed, Jan 29, 2014 at 02:18:28PM -0800, Brian Norris wrote:
> The NAND command helpers tend to automatically shift the column address
> for x16 bus devices, since most commands expect a word address, not a
> byte address. The Read ID command, however, expects an 8-bit address
> (i.e., 0x00, 0x20, or 0x40 should not be translated to 0x00, 0x10, or
> 0x20).
> 
> This fixes the column address for a few drivers which imitate the
> nand_base defaults. Note that I don't touch sh_flctl.c, since it already
> handles this problem slightly differently (note its comment "READID is
> always performed using an 8-bit bus").
> 
> I have not tested this patch, as I only have x8 parts up for testing at
> this point. Hopefully that can change soon...
> 
> Signed-off-by: Brian Norris <computersforpeace@gmail.com>

I applied both patches and tested on my AM335x board (omap2-nand driver).
Both 8-bit and 16-bit devices get ONFI-probed and pass a nandtest round.

Tested-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>

Also, checked that without these patches, the 16-bit device would be ID-probed,
but not detected as ONFI-compliant.

[..]
> +
> +/**
> + * Check if the opcode's address should be sent only on the lower 8 bits
> + * @command: opcode to check
> + */
> +static inline int nand_opcode_8bits(unsigned int command)
> +{
> +	return command == NAND_CMD_READID;
> +}
> +
>  #endif /* __LINUX_MTD_NAND_H */

With the introduction of this function, I think all the problems we've
discussed. The solution looks good to me so. Nice job!
Brian Norris Jan. 30, 2014, 6 p.m. UTC | #2
On Thu, Jan 30, 2014 at 09:17:04AM -0300, Ezequiel Garcia wrote:
> On Wed, Jan 29, 2014 at 02:18:28PM -0800, Brian Norris wrote:
> > The NAND command helpers tend to automatically shift the column address
> > for x16 bus devices, since most commands expect a word address, not a
> > byte address. The Read ID command, however, expects an 8-bit address
> > (i.e., 0x00, 0x20, or 0x40 should not be translated to 0x00, 0x10, or
> > 0x20).
> > 
> > This fixes the column address for a few drivers which imitate the
> > nand_base defaults. Note that I don't touch sh_flctl.c, since it already
> > handles this problem slightly differently (note its comment "READID is
> > always performed using an 8-bit bus").
> > 
> > I have not tested this patch, as I only have x8 parts up for testing at
> > this point. Hopefully that can change soon...
> > 
> > Signed-off-by: Brian Norris <computersforpeace@gmail.com>
> 
> I applied both patches and tested on my AM335x board (omap2-nand driver).
> Both 8-bit and 16-bit devices get ONFI-probed and pass a nandtest round.
> 
> Tested-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>

Thanks for the quick testing!

> [..]
> > +
> > +/**
> > + * Check if the opcode's address should be sent only on the lower 8 bits
> > + * @command: opcode to check
> > + */
> > +static inline int nand_opcode_8bits(unsigned int command)
> > +{
> > +	return command == NAND_CMD_READID;
> > +}
> > +
> >  #endif /* __LINUX_MTD_NAND_H */
> 
> With the introduction of this function, I think all the problems we've
> discussed.

I think so too.

But I forgot to pose this question: are there any other commands which
should be treated similarly? NAND_CMD_PARAM only takes column address 0,
and I think Change Read Column (or NAND_CMD_RNDOUT, used for extended
parameter pages) takes a full buswidth address, according to the spec; I
suppose ONFI presumes the host has read the full parameter page and
determined the bus width by the time it wants to skip to the extended
parameter pages; or else it just continues to read out bytes
sequentially).

> The solution looks good to me so. Nice job!

Thanks. Glad to get it out of the way.

Brian
pekon gupta Jan. 30, 2014, 7:17 p.m. UTC | #3
Hi Brian,

>From: Brian Norris [mailto:computersforpeace@gmail.com]
>diff --git a/include/linux/mtd/nand.h b/include/linux/mtd/nand.h
>index a719686c9cce..c034dc4224cb 100644
>--- a/include/linux/mtd/nand.h
>+++ b/include/linux/mtd/nand.h
>@@ -832,4 +832,14 @@ static inline bool nand_is_slc(struct nand_chip *chip)
> {
> 	return chip->bits_per_cell == 1;
> }
>+
>+/**
>+ * Check if the opcode's address should be sent only on the lower 8 bits
>+ * @command: opcode to check
>+ */
>+static inline int nand_opcode_8bits(unsigned int command)
>+{
>+	return command == NAND_CMD_READID;
>+}
>+
>
Can 'nand_opcode_8bits, made a macro instead of inline function ?
#define IS_8BIT_CMD(cmd)  (unlikely(cmd == NAND_CMD_READID))

Because 'nand_opcode_8bits' is used in nand_command() and nand_command_lp()
which is performance critical code (chip->cmd is called multiple times for fetching
page data and OOB). Though we should expect compiler to treat this inline function
same as macro here, But just to be doubly sure for future changes also.

[1] http://stackoverflow.com/questions/1571392/pros-and-cons-of-different-macro-function-inline-methods-in-c
[2] http://stackoverflow.com/questions/5226803/inline-function-v-macro-in-c-whats-the-overhead-memory-speed

with regards, pekon
Ezequiel Garcia Jan. 30, 2014, 7:51 p.m. UTC | #4
On Thu, Jan 30, 2014 at 07:17:29PM +0000, Gupta, Pekon wrote:
> Hi Brian,
> 
> >From: Brian Norris [mailto:computersforpeace@gmail.com]
> >diff --git a/include/linux/mtd/nand.h b/include/linux/mtd/nand.h
> >index a719686c9cce..c034dc4224cb 100644
> >--- a/include/linux/mtd/nand.h
> >+++ b/include/linux/mtd/nand.h
> >@@ -832,4 +832,14 @@ static inline bool nand_is_slc(struct nand_chip *chip)
> > {
> > 	return chip->bits_per_cell == 1;
> > }
> >+
> >+/**
> >+ * Check if the opcode's address should be sent only on the lower 8 bits
> >+ * @command: opcode to check
> >+ */
> >+static inline int nand_opcode_8bits(unsigned int command)
> >+{
> >+	return command == NAND_CMD_READID;
> >+}
> >+
> >
> Can 'nand_opcode_8bits, made a macro instead of inline function ?
> #define IS_8BIT_CMD(cmd)  (unlikely(cmd == NAND_CMD_READID))
> 
> Because 'nand_opcode_8bits' is used in nand_command() and nand_command_lp()
> which is performance critical code (chip->cmd is called multiple times for fetching
> page data and OOB). Though we should expect compiler to treat this inline function
> same as macro here, But just to be doubly sure for future changes also.
> 

I'm not a compiler guru, but I'd be very surprised if the compiler would make
a difference here, given the function is static, inline and small. Besides,
an inline function is more readable and type-aware. I'd say it's the Right
Thing to do.

Nevertheless, I did a small check on my platform (handbuilt ARM GCC 4.7.2) and
both macro and inline is compiled into a "cmp" instruction. The unlikely
doesn't seem to have any effect.
pekon gupta Jan. 30, 2014, 8:18 p.m. UTC | #5
Hi Ezequiel,

>> >+static inline int nand_opcode_8bits(unsigned int command)
>> >+{
>> >+	return command == NAND_CMD_READID;
>> >+}
>> >+
>> >
>> Can 'nand_opcode_8bits, made a macro instead of inline function ?
>> #define IS_8BIT_CMD(cmd)  (unlikely(cmd == NAND_CMD_READID))
>>
>> Because 'nand_opcode_8bits' is used in nand_command() and nand_command_lp()
>> which is performance critical code (chip->cmd is called multiple times for fetching
>> page data and OOB). Though we should expect compiler to treat this inline function
>> same as macro here, But just to be doubly sure for future changes also.
>>
>
>I'm not a compiler guru, but I'd be very surprised if the compiler would make
>a difference here, given the function is static, inline and small. Besides,
>an inline function is more readable and type-aware. I'd say it's the Right
>Thing to do.
>
>Nevertheless, I did a small check on my platform (handbuilt ARM GCC 4.7.2) and
>both macro and inline is compiled into a "cmp" instruction. The unlikely
>doesn't seem to have any effect.
>
Please check the JMP instruction just after CMP instruction..
In-case of (unlikely(x == y)) JMP/JE will be used to branch when x == y.
where as in-case of (likely(x == y)) JMP/JNE will be used to branch x != y.

As in most cases 'command != NAND_CMD_READID' so with unlikely() 
Compiler should lay down the instructions in such an order that program
executes _without_ branching.  Thus JMP will point to the code when
command == NAND_CMD_READID.

(this is my understanding, however may be for small branches this has
 no effect because both paths can be fetched simultaneously into pipeline).


with regards, pekon
Brian Norris Jan. 30, 2014, 8:39 p.m. UTC | #6
On Thu, Jan 30, 2014 at 04:51:08PM -0300, Ezequiel Garcia wrote:
> On Thu, Jan 30, 2014 at 07:17:29PM +0000, Gupta, Pekon wrote:
> > >From: Brian Norris [mailto:computersforpeace@gmail.com]
> > >diff --git a/include/linux/mtd/nand.h b/include/linux/mtd/nand.h
> > >index a719686c9cce..c034dc4224cb 100644
> > >--- a/include/linux/mtd/nand.h
> > >+++ b/include/linux/mtd/nand.h
> > >@@ -832,4 +832,14 @@ static inline bool nand_is_slc(struct nand_chip *chip)
> > > {
> > > 	return chip->bits_per_cell == 1;
> > > }
> > >+
> > >+/**
> > >+ * Check if the opcode's address should be sent only on the lower 8 bits
> > >+ * @command: opcode to check
> > >+ */
> > >+static inline int nand_opcode_8bits(unsigned int command)
> > >+{
> > >+	return command == NAND_CMD_READID;
> > >+}
> > >+
> > >
> > Can 'nand_opcode_8bits, made a macro instead of inline function ?
> > #define IS_8BIT_CMD(cmd)  (unlikely(cmd == NAND_CMD_READID))
> > 
> > Because 'nand_opcode_8bits' is used in nand_command() and nand_command_lp()
> > which is performance critical code (chip->cmd is called multiple times for fetching
> > page data and OOB).

First of all, I really don't think micro-architectural optimizations are
significant at this point. The overhead (if any exists) is likely very
small, especially relative to other sorts of optimization obstacles
(e.g., use of function pointers). But in any case, optimization must be
informed by data, not simply speculation.

> > Though we should expect compiler to treat this inline function
> > same as macro here, But just to be doubly sure for future changes also.
> > 
> 
> I'm not a compiler guru, but I'd be very surprised if the compiler would make
> a difference here, given the function is static, inline and small. Besides,
> an inline function is more readable and type-aware. I'd say it's the Right
> Thing to do.

I agree that, unless there is a significant demonstrable benefit to do
otherwise, static inline is the way to go (for reasons of type safety,
for one).

> Nevertheless, I did a small check on my platform (handbuilt ARM GCC 4.7.2) and
> both macro and inline is compiled into a "cmp" instruction. The unlikely
> doesn't seem to have any effect.

And so we have data. I'm sure there are other data points to consider
(different compilers, different ARCH, nand_opcode_8bits() used in
different contexts, etc.), but it's not worth it IMO.

Thanks,
Brian
Brian Norris Jan. 30, 2014, 8:47 p.m. UTC | #7
On Thu, Jan 30, 2014 at 08:18:07PM +0000, Pekon Gupta wrote:
> >> >+static inline int nand_opcode_8bits(unsigned int command)
> >> >+{
> >> >+	return command == NAND_CMD_READID;
> >> >+}
> >> >+
> >> >
> >> Can 'nand_opcode_8bits, made a macro instead of inline function ?
> >> #define IS_8BIT_CMD(cmd)  (unlikely(cmd == NAND_CMD_READID))
> >>
> >> Because 'nand_opcode_8bits' is used in nand_command() and nand_command_lp()
> >> which is performance critical code (chip->cmd is called multiple times for fetching
> >> page data and OOB). Though we should expect compiler to treat this inline function
> >> same as macro here, But just to be doubly sure for future changes also.
> >>
> >
> >I'm not a compiler guru, but I'd be very surprised if the compiler would make
> >a difference here, given the function is static, inline and small. Besides,
> >an inline function is more readable and type-aware. I'd say it's the Right
> >Thing to do.
> >
> >Nevertheless, I did a small check on my platform (handbuilt ARM GCC 4.7.2) and
> >both macro and inline is compiled into a "cmp" instruction. The unlikely
> >doesn't seem to have any effect.
> >
> Please check the JMP instruction just after CMP instruction..
> In-case of (unlikely(x == y)) JMP/JE will be used to branch when x == y.
> where as in-case of (likely(x == y)) JMP/JNE will be used to branch x != y.
...
> (this is my understanding, however may be for small branches this has
>  no effect because both paths can be fetched simultaneously into pipeline).

With small branches, ARM just turns them into very compact conditional
instructions, rather than true branches.

This program generates identical code for functionA() and functionB()
when compiled with -O2 on ARM:

/* ------------- BEGIN PROGRAM ---------------- */
#define unlikely(x)    __builtin_expect(!!(x), 0)
#define NAND_CMD_READID	17

#define NAND_OPCODE_8BITS(x) (unlikely(command == NAND_CMD_READID))

static inline int nand_opcode_8bits(unsigned int command)
{
	return unlikely(command == NAND_CMD_READID);
}

int functionA(int command, int col)
{
	if (nand_opcode_8bits(command))
		return col;
	return col >> 1;
}

int functionB(int command, int col)
{
	if (NAND_OPCODE_8BITS(command))
		return col;
	return col >> 1;
}
/* ------------- END PROGRAM ---------------- */

Disassembly:

00000000 <functionA>:
   0:   e3500011        cmp     r0, #17
   4:   11a010c1        asrne   r1, r1, #1
   8:   e1a00001        mov     r0, r1
   c:   e12fff1e        bx      lr

00000010 <functionB>:
  10:   e3500011        cmp     r0, #17
  14:   11a010c1        asrne   r1, r1, #1
  18:   e1a00001        mov     r0, r1
  1c:   e12fff1e        bx      lr

Let's stop the nonsense then :) If you really want to pursue this, give me (1)
an example where they compile differently with (2) real, significant
performance differences. The burden is on you!

Brian
pekon gupta Jan. 31, 2014, 6:55 a.m. UTC | #8
>From: Brian Norris [mailto:computersforpeace@gmail.com]
>
>Let's stop the nonsense then :)
>
Ok .. :-)

Though its redundant, as Ezequiel has already tested these
patches on AM335x using both x8 and x16 Micron devices.
But still I re-tested them on different OMAP boards.

Tested-By: Pekon Gupta <pekon@ti.com>

with regards, pekon
Brian Norris Jan. 31, 2014, 6:04 p.m. UTC | #9
+ Huang

On Fri, Jan 31, 2014 at 06:55:49AM +0000, Pekon Gupta wrote:
> Though its redundant, as Ezequiel has already tested these
> patches on AM335x using both x8 and x16 Micron devices.
> But still I re-tested them on different OMAP boards.

Still worthwhile.

> Tested-By: Pekon Gupta <pekon@ti.com>

Thanks! Pushed both to l2-mtd.git/next.

BTW, I think Huang's JEDEC parameter page support should be modified
similar to patch 1. I'll comment there.

Brian
diff mbox

Patch

diff --git a/drivers/mtd/nand/atmel_nand.c b/drivers/mtd/nand/atmel_nand.c
index c36e9b84487c..1955389c8fa6 100644
--- a/drivers/mtd/nand/atmel_nand.c
+++ b/drivers/mtd/nand/atmel_nand.c
@@ -1659,8 +1659,8 @@  static void nfc_select_chip(struct mtd_info *mtd, int chip)
 		nfc_writel(host->nfc->hsmc_regs, CTRL, NFC_CTRL_ENABLE);
 }
 
-static int nfc_make_addr(struct mtd_info *mtd, int column, int page_addr,
-		unsigned int *addr1234, unsigned int *cycle0)
+static int nfc_make_addr(struct mtd_info *mtd, int command, int column,
+		int page_addr, unsigned int *addr1234, unsigned int *cycle0)
 {
 	struct nand_chip *chip = mtd->priv;
 
@@ -1674,7 +1674,8 @@  static int nfc_make_addr(struct mtd_info *mtd, int column, int page_addr,
 	*addr1234 = 0;
 
 	if (column != -1) {
-		if (chip->options & NAND_BUSWIDTH_16)
+		if (chip->options & NAND_BUSWIDTH_16 &&
+				!nand_opcode_8bits(command))
 			column >>= 1;
 		addr_bytes[acycle++] = column & 0xff;
 		if (mtd->writesize > 512)
@@ -1787,8 +1788,8 @@  static void nfc_nand_command(struct mtd_info *mtd, unsigned int command,
 	}
 
 	if (do_addr)
-		acycle = nfc_make_addr(mtd, column, page_addr, &addr1234,
-				&cycle0);
+		acycle = nfc_make_addr(mtd, command, column, page_addr,
+				&addr1234, &cycle0);
 
 	nfc_addr_cmd = cmd1 | cmd2 | vcmd2 | acycle | csid | dataen | nfcwr;
 	nfc_send_command(host, nfc_addr_cmd, addr1234, cycle0);
diff --git a/drivers/mtd/nand/au1550nd.c b/drivers/mtd/nand/au1550nd.c
index 7d84c4e4bf43..bc5c518828d2 100644
--- a/drivers/mtd/nand/au1550nd.c
+++ b/drivers/mtd/nand/au1550nd.c
@@ -307,7 +307,8 @@  static void au1550_command(struct mtd_info *mtd, unsigned command, int column, i
 		/* Serially input address */
 		if (column != -1) {
 			/* Adjust columns for 16 bit buswidth */
-			if (this->options & NAND_BUSWIDTH_16)
+			if (this->options & NAND_BUSWIDTH_16 &&
+					!nand_opcode_8bits(command))
 				column >>= 1;
 			ctx->write_byte(mtd, column);
 		}
diff --git a/drivers/mtd/nand/diskonchip.c b/drivers/mtd/nand/diskonchip.c
index fec31d71b84e..b9b4db607850 100644
--- a/drivers/mtd/nand/diskonchip.c
+++ b/drivers/mtd/nand/diskonchip.c
@@ -698,7 +698,8 @@  static void doc2001plus_command(struct mtd_info *mtd, unsigned command, int colu
 		/* Serially input address */
 		if (column != -1) {
 			/* Adjust columns for 16 bit buswidth */
-			if (this->options & NAND_BUSWIDTH_16)
+			if (this->options & NAND_BUSWIDTH_16 &&
+					!nand_opcode_8bits(command))
 				column >>= 1;
 			WriteDOC(column, docptr, Mplus_FlashAddress);
 		}
diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
index 9111a4c7c7a2..c69a1f7ce0a2 100644
--- a/drivers/mtd/nand/nand_base.c
+++ b/drivers/mtd/nand/nand_base.c
@@ -589,7 +589,8 @@  static void nand_command(struct mtd_info *mtd, unsigned int command,
 	/* Serially input address */
 	if (column != -1) {
 		/* Adjust columns for 16 bit buswidth */
-		if (chip->options & NAND_BUSWIDTH_16)
+		if (chip->options & NAND_BUSWIDTH_16 &&
+				!nand_opcode_8bits(command))
 			column >>= 1;
 		chip->cmd_ctrl(mtd, column, ctrl);
 		ctrl &= ~NAND_CTRL_CHANGE;
@@ -680,7 +681,8 @@  static void nand_command_lp(struct mtd_info *mtd, unsigned int command,
 		/* Serially input address */
 		if (column != -1) {
 			/* Adjust columns for 16 bit buswidth */
-			if (chip->options & NAND_BUSWIDTH_16)
+			if (chip->options & NAND_BUSWIDTH_16 &&
+					!nand_opcode_8bits(command))
 				column >>= 1;
 			chip->cmd_ctrl(mtd, column, ctrl);
 			ctrl &= ~NAND_CTRL_CHANGE;
diff --git a/drivers/mtd/nand/nuc900_nand.c b/drivers/mtd/nand/nuc900_nand.c
index 90c99ea5184a..331fccbdde61 100644
--- a/drivers/mtd/nand/nuc900_nand.c
+++ b/drivers/mtd/nand/nuc900_nand.c
@@ -151,7 +151,8 @@  static void nuc900_nand_command_lp(struct mtd_info *mtd, unsigned int command,
 	if (column != -1 || page_addr != -1) {
 
 		if (column != -1) {
-			if (chip->options & NAND_BUSWIDTH_16)
+			if (chip->options & NAND_BUSWIDTH_16 &&
+					!nand_opcode_8bits(command))
 				column >>= 1;
 			write_addr_reg(nand, column);
 			write_addr_reg(nand, column >> 8 | ENDADDR);
diff --git a/include/linux/mtd/nand.h b/include/linux/mtd/nand.h
index a719686c9cce..c034dc4224cb 100644
--- a/include/linux/mtd/nand.h
+++ b/include/linux/mtd/nand.h
@@ -832,4 +832,14 @@  static inline bool nand_is_slc(struct nand_chip *chip)
 {
 	return chip->bits_per_cell == 1;
 }
+
+/**
+ * Check if the opcode's address should be sent only on the lower 8 bits
+ * @command: opcode to check
+ */
+static inline int nand_opcode_8bits(unsigned int command)
+{
+	return command == NAND_CMD_READID;
+}
+
 #endif /* __LINUX_MTD_NAND_H */