diff mbox series

[RFC] mtd: spinand: fix detection of GD5FxGQ4xA flash

Message ID 20191016013845.23508-1-gch981213@gmail.com
State Superseded, archived
Headers show
Series [RFC] mtd: spinand: fix detection of GD5FxGQ4xA flash | expand

Commit Message

Chuanhong Guo Oct. 16, 2019, 1:38 a.m. UTC
GD5FxGQ4xA didn't follow the SPI spec to keep MISO low while slave is
reading, and instead MISO is kept high. As a result, the first byte
of id becomes 0xFF.
Since the first byte isn't supposed to be checked at all, this patch
just removed that check.

While at it, redo the comment above to better explain what's happening.

Fixes: cfd93d7c908e ("mtd: spinand: Add support for GigaDevice GD5F1GQ4UFxxG")
Signed-off-by: Chuanhong Guo <gch981213@gmail.com>
CC: Jeff Kletsky <git-commits@allycomm.com>
---
RFC:
I doubt whether this patch is a proper fix for the underlying problem:
The actual problem is that we have two different implementation of read id
command: One replies immediately after master sending 0x9f and the other
need to send 0x9f and an offset byte (found in winbond and early GD flashes.)
Current code only works if SPI master is properly implemented (i.e. keep MOSI
low while reading.)
I'm wondering if it worths to split the implementation of read_id into two
variants and assign corresponding ID tables to each variant, or we could
trust all SPI controllers and this fix is sufficient.

 drivers/mtd/nand/spi/gigadevice.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

Comments

Miquel Raynal Oct. 28, 2019, 4:41 p.m. UTC | #1
Hello,

Chuanhong Guo <gch981213@gmail.com> wrote on Wed, 16 Oct 2019 09:38:24
+0800:

> GD5FxGQ4xA didn't follow the SPI spec to keep MISO low while slave is
> reading, and instead MISO is kept high. As a result, the first byte
> of id becomes 0xFF.
> Since the first byte isn't supposed to be checked at all, this patch
> just removed that check.
> 
> While at it, redo the comment above to better explain what's happening.
> 
> Fixes: cfd93d7c908e ("mtd: spinand: Add support for GigaDevice GD5F1GQ4UFxxG")
> Signed-off-by: Chuanhong Guo <gch981213@gmail.com>
> CC: Jeff Kletsky <git-commits@allycomm.com>
> ---
> RFC:
> I doubt whether this patch is a proper fix for the underlying problem:
> The actual problem is that we have two different implementation of read id
> command: One replies immediately after master sending 0x9f and the other
> need to send 0x9f and an offset byte (found in winbond and early GD flashes.)
> Current code only works if SPI master is properly implemented (i.e. keep MOSI
> low while reading.)

I am not entirely against the fix, but this is a SPI host controller
issue, right? Can you try to fix the controller driver instead?

Thanks,
Miquèl
Chuanhong Guo Nov. 3, 2019, 12:03 p.m. UTC | #2
Hi!

On Tue, Oct 29, 2019 at 12:41 AM Miquel Raynal
<miquel.raynal@bootlin.com> wrote:
>
> Hello,
>
> Chuanhong Guo <gch981213@gmail.com> wrote on Wed, 16 Oct 2019 09:38:24
> +0800:
>
> > GD5FxGQ4xA didn't follow the SPI spec to keep MISO low while slave is
> > reading, and instead MISO is kept high. As a result, the first byte
> > of id becomes 0xFF.
> > Since the first byte isn't supposed to be checked at all, this patch
> > just removed that check.
> >
> > While at it, redo the comment above to better explain what's happening.
> >
> > Fixes: cfd93d7c908e ("mtd: spinand: Add support for GigaDevice GD5F1GQ4UFxxG")
> > Signed-off-by: Chuanhong Guo <gch981213@gmail.com>
> > CC: Jeff Kletsky <git-commits@allycomm.com>
> > ---
> > RFC:
> > I doubt whether this patch is a proper fix for the underlying problem:
> > The actual problem is that we have two different implementation of read id
> > command: One replies immediately after master sending 0x9f and the other
> > need to send 0x9f and an offset byte (found in winbond and early GD flashes.)

Correction: Only early GigaDevice nand chips uses this implementation.
Winbond chips uses a dummy byte instead of an address byte so there's
no problem for Winbond chips.

> > Current code only works if SPI master is properly implemented (i.e. keep MOSI
> > low while reading.)
>
> I am not entirely against the fix, but this is a SPI host controller
> issue, right? Can you try to fix the controller driver instead?

I think this is a spi nand framework issue. GigaDevice uses an unusual
READ ID implementation, and as a result, both host controller and chip
are reading during the first byte after 0x9f command: chip is reading
the address/offset byte and host is expecting the first ID byte.
Here lies two problems:
1. According to the sequence diagram in their datasheet, MISO pin is
in High-Z state during the 0x9f command and the offset byte, and host
could read anything during this time instead of a fixed 0x0 or 0xff
byte, so the check of first byte should be removed. This is what this
patch is doing.
2. If there's a buggy SPI host controller that didn't keep MOSI low
during reading operation, the chip will get 0xff as ID offset, causing
the read vendor/device ID to be swapped. I never met such a controller
so far, but if there is one, it will be a silicon bug that can't be
fixed by software. To fix this one, we'll have to make a second
read-id implementation in spi nand framework.

The second problem only exist in theory, so my preference is to apply
this patch and fix only the first problem for now.

Regards,
Chuanhong Guo
Boris Brezillon Nov. 3, 2019, 1:27 p.m. UTC | #3
On Sun, 3 Nov 2019 20:03:21 +0800
Chuanhong Guo <gch981213@gmail.com> wrote:

> Hi!
> 
> On Tue, Oct 29, 2019 at 12:41 AM Miquel Raynal
> <miquel.raynal@bootlin.com> wrote:
> >
> > Hello,
> >
> > Chuanhong Guo <gch981213@gmail.com> wrote on Wed, 16 Oct 2019 09:38:24
> > +0800:
> >  
> > > GD5FxGQ4xA didn't follow the SPI spec to keep MISO low while slave is
> > > reading, and instead MISO is kept high. As a result, the first byte
> > > of id becomes 0xFF.
> > > Since the first byte isn't supposed to be checked at all, this patch
> > > just removed that check.
> > >
> > > While at it, redo the comment above to better explain what's happening.
> > >
> > > Fixes: cfd93d7c908e ("mtd: spinand: Add support for GigaDevice GD5F1GQ4UFxxG")
> > > Signed-off-by: Chuanhong Guo <gch981213@gmail.com>
> > > CC: Jeff Kletsky <git-commits@allycomm.com>
> > > ---
> > > RFC:
> > > I doubt whether this patch is a proper fix for the underlying problem:
> > > The actual problem is that we have two different implementation of read id
> > > command: One replies immediately after master sending 0x9f and the other
> > > need to send 0x9f and an offset byte (found in winbond and early GD flashes.)  
> 
> Correction: Only early GigaDevice nand chips uses this implementation.
> Winbond chips uses a dummy byte instead of an address byte so there's
> no problem for Winbond chips.
> 
> > > Current code only works if SPI master is properly implemented (i.e. keep MOSI
> > > low while reading.)  
> >
> > I am not entirely against the fix, but this is a SPI host controller
> > issue, right? Can you try to fix the controller driver instead?  
> 
> I think this is a spi nand framework issue. GigaDevice uses an unusual
> READ ID implementation, and as a result, both host controller and chip
> are reading during the first byte after 0x9f command: chip is reading
> the address/offset byte and host is expecting the first ID byte.
> Here lies two problems:
> 1. According to the sequence diagram in their datasheet, MISO pin is
> in High-Z state during the 0x9f command and the offset byte, and host
> could read anything during this time instead of a fixed 0x0 or 0xff
> byte, so the check of first byte should be removed. This is what this
> patch is doing.
> 2. If there's a buggy SPI host controller that didn't keep MOSI low
> during reading operation, the chip will get 0xff as ID offset, causing
> the read vendor/device ID to be swapped. I never met such a controller
> so far, but if there is one, it will be a silicon bug that can't be
> fixed by software. To fix this one, we'll have to make a second
> read-id implementation in spi nand framework.

I realize how fragile this ID-based detection is when manufacturers
decide to not follow the standard READID semantic (one 0x9f command byte
followed by 1 or more input cycles encoding the ID). Let's imagine you
have a valid manuf ID byte in ID[0], and the device ID (ID[1]) matches
the Winbond or Gigadevice manufacturer ID, and ID[3] (extended Device ID
byte?) matches a valid Winbond/Gigadevice device ID. If you skip the
check on ID[0] you might erroneously detect a Winbond or Gigadevice
NAND, while it's actually something else.

Note that I don't really have a solution to make this detection more
robust.

> 
> The second problem only exist in theory, so my preference is to apply
> this patch and fix only the first problem for now.

I think we should fix that problem now. Maybe by doing a 3 steps
detection:

1/ READID + ID[]
2/ READID + DUMMY + ID[]
3/ READID + ADDR + ID[]

At each step we would check if the returned ID matches a valid NAND,
and if it does, stop there.
Miquel Raynal Nov. 5, 2019, 6:20 p.m. UTC | #4
Hi Boris,

Boris Brezillon <boris.brezillon@collabora.com> wrote on Sun, 3 Nov
2019 14:27:41 +0100:

> On Sun, 3 Nov 2019 20:03:21 +0800
> Chuanhong Guo <gch981213@gmail.com> wrote:
> 
> > Hi!
> > 
> > On Tue, Oct 29, 2019 at 12:41 AM Miquel Raynal
> > <miquel.raynal@bootlin.com> wrote:  
> > >
> > > Hello,
> > >
> > > Chuanhong Guo <gch981213@gmail.com> wrote on Wed, 16 Oct 2019 09:38:24
> > > +0800:
> > >    
> > > > GD5FxGQ4xA didn't follow the SPI spec to keep MISO low while slave is
> > > > reading, and instead MISO is kept high. As a result, the first byte
> > > > of id becomes 0xFF.
> > > > Since the first byte isn't supposed to be checked at all, this patch
> > > > just removed that check.
> > > >
> > > > While at it, redo the comment above to better explain what's happening.
> > > >
> > > > Fixes: cfd93d7c908e ("mtd: spinand: Add support for GigaDevice GD5F1GQ4UFxxG")
> > > > Signed-off-by: Chuanhong Guo <gch981213@gmail.com>
> > > > CC: Jeff Kletsky <git-commits@allycomm.com>
> > > > ---
> > > > RFC:
> > > > I doubt whether this patch is a proper fix for the underlying problem:
> > > > The actual problem is that we have two different implementation of read id
> > > > command: One replies immediately after master sending 0x9f and the other
> > > > need to send 0x9f and an offset byte (found in winbond and early GD flashes.)    
> > 
> > Correction: Only early GigaDevice nand chips uses this implementation.
> > Winbond chips uses a dummy byte instead of an address byte so there's
> > no problem for Winbond chips.
> >   
> > > > Current code only works if SPI master is properly implemented (i.e. keep MOSI
> > > > low while reading.)    
> > >
> > > I am not entirely against the fix, but this is a SPI host controller
> > > issue, right? Can you try to fix the controller driver instead?    
> > 
> > I think this is a spi nand framework issue. GigaDevice uses an unusual
> > READ ID implementation, and as a result, both host controller and chip
> > are reading during the first byte after 0x9f command: chip is reading
> > the address/offset byte and host is expecting the first ID byte.
> > Here lies two problems:
> > 1. According to the sequence diagram in their datasheet, MISO pin is
> > in High-Z state during the 0x9f command and the offset byte, and host
> > could read anything during this time instead of a fixed 0x0 or 0xff
> > byte, so the check of first byte should be removed. This is what this
> > patch is doing.
> > 2. If there's a buggy SPI host controller that didn't keep MOSI low
> > during reading operation, the chip will get 0xff as ID offset, causing
> > the read vendor/device ID to be swapped. I never met such a controller
> > so far, but if there is one, it will be a silicon bug that can't be
> > fixed by software. To fix this one, we'll have to make a second
> > read-id implementation in spi nand framework.  
> 
> I realize how fragile this ID-based detection is when manufacturers
> decide to not follow the standard READID semantic (one 0x9f command byte
> followed by 1 or more input cycles encoding the ID). Let's imagine you
> have a valid manuf ID byte in ID[0], and the device ID (ID[1]) matches
> the Winbond or Gigadevice manufacturer ID, and ID[3] (extended Device ID
> byte?) matches a valid Winbond/Gigadevice device ID. If you skip the
> check on ID[0] you might erroneously detect a Winbond or Gigadevice
> NAND, while it's actually something else.
> 
> Note that I don't really have a solution to make this detection more
> robust.
> 
> > 
> > The second problem only exist in theory, so my preference is to apply
> > this patch and fix only the first problem for now.  
> 
> I think we should fix that problem now. Maybe by doing a 3 steps
> detection:
> 
> 1/ READID + ID[]
> 2/ READID + DUMMY + ID[]
> 3/ READID + ADDR + ID[]
> 
> At each step we would check if the returned ID matches a valid NAND,
> and if it does, stop there.

I like the idea. That will be way more robust.

Thanks,
Miquèl
diff mbox series

Patch

diff --git a/drivers/mtd/nand/spi/gigadevice.c b/drivers/mtd/nand/spi/gigadevice.c
index e99d425aa93f..ab0e53b09f0c 100644
--- a/drivers/mtd/nand/spi/gigadevice.c
+++ b/drivers/mtd/nand/spi/gigadevice.c
@@ -249,13 +249,14 @@  static int gigadevice_spinand_detect(struct spinand_device *spinand)
 	int ret;
 
 	/*
-	 * Earlier GDF5-series devices (A,E) return [0][MID][DID]
-	 * Later (F) devices return [MID][DID1][DID2]
+	 * Earlier GDF5-series devices (A,E) need sending an extra offset
+	 * byte before replying flash ID, so the first byte is undetermined.
+	 * Later (F) devices don't need that.
 	 */
 
 	if (id[0] == SPINAND_MFR_GIGADEVICE)
 		did = (id[1] << 8) + id[2];
-	else if (id[0] == 0 && id[1] == SPINAND_MFR_GIGADEVICE)
+	else if (id[1] == SPINAND_MFR_GIGADEVICE)
 		did = id[2];
 	else
 		return 0;