Message ID | 89ae32a0-9b19-4735-90eb-4ffa22aad704@kernel.org |
---|---|
State | Not Applicable |
Delegated to: | Miquel Raynal |
Series | GPMI iMX6ull timeout on DMA |
Hi Greg,

One question below.

+Michael
+Sascha

Hello Michael, here is a similar issue to yours. I know you did not
have enough time to share your solution, but here we have someone else
reproducing the issue. Would you mind sharing a branch or a patch, even
a WIP one, just to help debugging?

Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 16:41:51 +1000:

> Hi Miquel,
>
> I am experiencing a problem with NAND flash DMA timeouts on
> iMX6ull based boards. The problem is very similar to that
> described in:
>
> https://linux-mtd.infradead.narkive.com/JIUulfFB/gpmi-imx6ull-timeout-on-dma
>
> That didn't come to any specific resolution that I could see
> in that thread.
>
> The boot trace on the console for me looks like this:
>
> nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
> nand: Micron MT29F2G08ABAEAWP
> nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
> gpmi-nand 1806000.gpmi-nand: DMA timeout, last DMA
> gpmi-nand 1806000.gpmi-nand: Show GPMI registers :
> gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x20830002
> gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x01c6800c
> gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00010101
> gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0xe0000000
> gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x23023336
> gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x000001ee
> gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0xff000001
> gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x00000001
> gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x05020000
> gpmi-nand 1806000.gpmi-nand: Show BCH registers :
> gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x00000100
> gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000010
> gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0x030a2080
> gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x083e2080
> gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x070a4080
> gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0x10da4080
> gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x070a4080
> gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x10da4080
> gpmi-nand 1806000.gpmi-nand: offset 0x0e0 : 0x070a4080
> gpmi-nand 1806000.gpmi-nand: offset 0x0f0 : 0x10da4080
> gpmi-nand 1806000.gpmi-nand: offset 0x100 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x110 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x120 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x130 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x140 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x150 : 0x20484342
> gpmi-nand 1806000.gpmi-nand: offset 0x160 : 0x01000000
> gpmi-nand 1806000.gpmi-nand: offset 0x170 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: BCH Geometry :
> GF length : 13
> ECC Strength : 8
> Page Size in Bytes : 2110
> Metadata Size in Bytes : 10
> ECC Chunk0 Size in Bytes: 512
> ECC Chunkn Size in Bytes: 512
> ECC Chunk Count : 4
> Payload Size in Bytes : 2048
> Auxiliary Size in Bytes: 16
> Auxiliary Status Offset: 12
> Block Mark Byte Offset : 1999
> Block Mark Bit Offset : 0
> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -110
> nand: timing mode 5 not acknowledged by the NAND chip

What is the final timing mode used? Most of us tested in mode 5, I
guess; maybe mode 4 is broken (I don't know if this is the one used
here, nor why mode 5 is refused). Can you please try limiting the mode
to 0, 1, 2... until, hopefully, we narrow it down to the failing mode?

> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
> Scanning device for bad blocks
> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
> ....
> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
> 5 fixed-partitions partitions found on MTD device gpmi-nand
> Creating 5 MTD partitions on "gpmi-nand":
> 0x000000000000-0x000000500000 : "u-boot"
> 0x000000500000-0x000000600000 : "u-boot-env"
> 0x000000600000-0x000000800000 : "log"
> 0x000000800000-0x000010000000 : "flash"
> 0x000000000000-0x000010000000 : "all"
> gpmi-nand 1806000.gpmi-nand: driver registered.
>
>
> This is using a linux kernel v5.1.14. I have seen this happen on
> a number of boards I have here - but it is only occasional. It
> only happens once in a while on boot, maybe 1 in 40 or more times.
> So it can take quite a while to reproduce (using a boot loop setup).

That's strange... I don't see what would produce such an unstable issue.

>
> As per the email thread I pointed to above I looked at reverting
> those patches, but that was not at all easy given how much the gpmi
> driver code had moved. So instead I modified the code with this:
>
> --- a/linux/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
> +++ b/linux/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
> @@ -481,6 +481,7 @@ static void gpmi_nfc_compute_timings(struct gpmi_nand_data *this,
> void gpmi_nfc_apply_timings(struct gpmi_nand_data *this)
> {
> +#if 0
> struct gpmi_nfc_hardware_timing *hw = &this->hw;
> struct resources *r = &this->resources;
> void __iomem *gpmi_regs = r->gpmi_regs;
> @@ -505,6 +512,7 @@ void gpmi_nfc_apply_timings(struct gpmi_nand_data *this)
> /* Wait for the DLL to settle. */
> udelay(dll_wait_time_us);
> +#endif
> }
> int gpmi_setup_data_interface(struct nand_chip *chip, int chipnr,
>
> So far after a couple of days of testing with this I no longer
> see the DMA timeout.
>
> Any thoughts?
>
> Regards
> Greg
>

Thanks,
Miquèl
Hi all

On Mon, Jul 29, 2019 at 10:36 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
>
> Hello Michael, here is a similar issue to yours, I know you did not
> have enough time to share your solution but here we have someone else
> reproducing the issue, would you mind sharing a branch or a patch, even
> a WIP one, just to help debugging?
>

I have the patches reverted, as I mentioned in the email. The steps to
reproduce are simple.

Just reboot after every successful boot.

Michael
Hi Michael,

On 29/7/19 6:42 pm, Michael Nazzareno Trimarchi wrote:
> I have the patches reverted, as I mentioned in the email. The steps to
> reproduce are simple.
>
> Just reboot after every successful boot.

Testing like this, how often does it occur?

Regards
Greg
Hi

On Mon, Jul 29, 2019 at 2:19 PM Greg Ungerer <gerg@kernel.org> wrote:
>
> Testing like this, how often does it occur?
>

Not more than 60 reboots ;). The problem is that, the way the code is
done, the system cannot reboot without a hardware watchdog.

Michael
Hi Miquel, On 29/7/19 6:36 pm, Miquel Raynal wrote: > Hi Greg, > > One question below. > > +Michael > +Sascha > > Hello Michael, here is a similar issue to yours, I know you did not > have enough time to share your solution but here we have someone else > reproducing the issue, would you mind sharing a branch or a patch, even > a WIP one, just to help debugging? > > Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 16:41:51 +1000: > >> Hi Miquel, >> >> I am experiencing a problem with NAND flash DMA timeouts on >> iMX6ull based boards. The problem is very similar to that >> described in: >> >> https://linux-mtd.infradead.narkive.com/JIUulfFB/gpmi-imx6ull-timeout-on-dma >> >> That didn't come to any specific resolution that I could see >> in that thread. >> >> The boot trace on the console for me looks like this: >> >> nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda >> nand: Micron MT29F2G08ABAEAWP >> nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64 >> gpmi-nand 1806000.gpmi-nand: DMA timeout, last DMA >> gpmi-nand 1806000.gpmi-nand: Show GPMI registers : >> gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x20830002 >> gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000000 >> gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000 >> gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000 >> gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000 >> gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000 >> gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x01c6800c >> gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00010101 >> gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0xe0000000 >> gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x23023336 >> gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x000001ee >> gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0xff000001 >> gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x00000001 >> gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x05020000 >> gpmi-nand 1806000.gpmi-nand: Show BCH registers : >> gpmi-nand 
1806000.gpmi-nand: offset 0x000 : 0x00000100 >> gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000010 >> gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000 >> gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000 >> gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000 >> gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000 >> gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x00000000 >> gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00000000 >> gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0x030a2080 >> gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x083e2080 >> gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x070a4080 >> gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0x10da4080 >> gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x070a4080 >> gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x10da4080 >> gpmi-nand 1806000.gpmi-nand: offset 0x0e0 : 0x070a4080 >> gpmi-nand 1806000.gpmi-nand: offset 0x0f0 : 0x10da4080 >> gpmi-nand 1806000.gpmi-nand: offset 0x100 : 0x00000000 >> gpmi-nand 1806000.gpmi-nand: offset 0x110 : 0x00000000 >> gpmi-nand 1806000.gpmi-nand: offset 0x120 : 0x00000000 >> gpmi-nand 1806000.gpmi-nand: offset 0x130 : 0x00000000 >> gpmi-nand 1806000.gpmi-nand: offset 0x140 : 0x00000000 >> gpmi-nand 1806000.gpmi-nand: offset 0x150 : 0x20484342 >> gpmi-nand 1806000.gpmi-nand: offset 0x160 : 0x01000000 >> gpmi-nand 1806000.gpmi-nand: offset 0x170 : 0x00000000 >> gpmi-nand 1806000.gpmi-nand: BCH Geometry : >> GF length : 13 >> ECC Strength : 8 >> Page Size in Bytes : 2110 >> Metadata Size in Bytes : 10 >> ECC Chunk0 Size in Bytes: 512 >> ECC Chunkn Size in Bytes: 512 >> ECC Chunk Count : 4 >> Payload Size in Bytes : 2048 >> Auxiliary Size in Bytes: 16 >> Auxiliary Status Offset: 12 >> Block Mark Byte Offset : 1999 >> Block Mark Bit Offset : 0 >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -110 >> nand: timing mode 5 not acknowledged by the NAND chip > > What is the final timing mode used? 
Most of us tested in mode 5 I > guess, maybe mode 4 is broken (don't know if this is the one used here, > neither why mode 5 is refused). Can you please try by limiting the mode > to 0, 1, 2... until, hopefully, we narrow down to the failing mode. Sure, how to do that? >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 >> Scanning device for bad blocks >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 >> .... >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 >> 5 fixed-partitions partitions found on MTD device gpmi-nand >> Creating 5 MTD partitions on "gpmi-nand": >> 0x000000000000-0x000000500000 : "u-boot" >> 0x000000500000-0x000000600000 : "u-boot-env" >> 0x000000600000-0x000000800000 : "log" >> 0x000000800000-0x000010000000 : "flash" >> 0x000000000000-0x000010000000 : "all" >> gpmi-nand 1806000.gpmi-nand: driver registered. >> >> >> This is using a linux kernel v5.1.14. I have seen this happen on >> a number of boards I have here - but it is only occasional. It >> only happens once in a while on boot, maybe 1 in 40 or more times. >> So it can take quite a while to reproduce (using a boot loop setup). > > That's strange... I don't get what would produce such unstable issue. My initial guess is that the calculated timing is very marginal. The problem seems more likely to happen if flash write activity had been occurring just before a soft reboot. Its not a guarantee, just more likely. Interesting observation is that Michael was using Micron flash, and boards that I have with the problem also have Micron flash. Both a form of Micron MT29F2G08. I have similar boards, iMX6ull based, with different brands of NAND flash and I have not seen any problem on them. 
Regards Greg >> As per the email thread I pointed to above I looked at reverting >> those patches, but that was not at all easy given how much the gpmi >> driver code had moved. So instead I modified the code with this: >> >> --- a/linux/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c >> +++ b/linux/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c >> @@ -481,6 +481,7 @@ static void gpmi_nfc_compute_timings(struct gpmi_nand_data *this, >> void gpmi_nfc_apply_timings(struct gpmi_nand_data *this) >> { >> +#if 0 >> struct gpmi_nfc_hardware_timing *hw = &this->hw; >> struct resources *r = &this->resources; >> void __iomem *gpmi_regs = r->gpmi_regs; >> @@ -505,6 +512,7 @@ void gpmi_nfc_apply_timings(struct gpmi_nand_data *this) >> /* Wait for the DLL to settle. */ >> udelay(dll_wait_time_us); >> +#endif >> } >> int gpmi_setup_data_interface(struct nand_chip *chip, int chipnr, >> >> So far after a couple of days of testing with this I no longer >> see the DMA timeout. >> >> Any thoughts? >> >> Regards >> Greg >> > > Thanks, > Miquèl >
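A note on the "BCH Geometry" block in the trace quoted above: the printed numbers can be cross-checked against each other with a little arithmetic. The following is only a sketch in plain C, not driver code. The struct and function names are made up for illustration, and the formulas are an assumption about how the printed values relate to one another.

```c
/* Hypothetical container for the inputs behind the "BCH Geometry" printout. */
struct bch_inputs {
	unsigned int gf_len;        /* GF length in bits (13 in the trace) */
	unsigned int ecc_strength;  /* correctable bits per chunk (8) */
	unsigned int metadata_size; /* metadata bytes (10) */
	unsigned int chunk_count;   /* ECC chunks per page (4) */
	unsigned int payload_size;  /* NAND page size in bytes (2048) */
};

/* Round x up to the next multiple of 4. */
static unsigned int round_up4(unsigned int x)
{
	return (x + 3u) & ~3u;
}

/* "Page Size in Bytes": payload + metadata + parity for every chunk. */
static unsigned int bch_page_size(const struct bch_inputs *in)
{
	unsigned int parity_bits = in->gf_len * in->ecc_strength * in->chunk_count;

	return in->payload_size + in->metadata_size + parity_bits / 8;
}

/* Combined bit position behind the "Block Mark Byte/Bit Offset" lines. */
static unsigned int block_mark_bit_position(const struct bch_inputs *in)
{
	unsigned int parity_before =
		in->gf_len * in->ecc_strength * (in->chunk_count - 1);

	return in->payload_size * 8 - (parity_before + in->metadata_size * 8);
}

/* "Auxiliary Status Offset": metadata rounded up to 4 bytes. */
static unsigned int aux_status_offset(const struct bch_inputs *in)
{
	return round_up4(in->metadata_size);
}

/* "Auxiliary Size in Bytes": aligned metadata plus one status byte per chunk, aligned. */
static unsigned int aux_size(const struct bch_inputs *in)
{
	return round_up4(in->metadata_size) + round_up4(in->chunk_count);
}
```

With the values from the trace (13, 8, 10, 4, 2048) these functions give 2110, byte 1999 with bit 0, 12 and 16, which matches the printout. If the assumed formulas are right, the geometry is self-consistent and is probably not the part that fails.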
Hi Greg, + Boris Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 22:33:56 +1000: > Hi Miquel, > > On 29/7/19 6:36 pm, Miquel Raynal wrote: > > Hi Greg, > > > > One question below. > > > > +Michael > > +Sascha > > > > Hello Michael, here is a similar issue to yours, I know you did not > > have enough time to share your solution but here we have someone else > > reproducing the issue, would you mind sharing a branch or a patch, even > > a WIP one, just to help debugging? > > > > Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 16:41:51 +1000: > > > >> Hi Miquel, > >> > >> I am experiencing a problem with NAND flash DMA timeouts on > >> iMX6ull based boards. The problem is very similar to that > >> described in: > >> > >> https://linux-mtd.infradead.narkive.com/JIUulfFB/gpmi-imx6ull-timeout-on-dma > >> > >> That didn't come to any specific resolution that I could see > >> in that thread. > >> > >> The boot trace on the console for me looks like this: > >> > >> nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda > >> nand: Micron MT29F2G08ABAEAWP > >> nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64 > >> gpmi-nand 1806000.gpmi-nand: DMA timeout, last DMA > >> gpmi-nand 1806000.gpmi-nand: Show GPMI registers : > >> gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x20830002 > >> gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000000 > >> gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000 > >> gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000 > >> gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000 > >> gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000 > >> gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x01c6800c > >> gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00010101 > >> gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0xe0000000 > >> gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x23023336 > >> gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x000001ee > >> gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0xff000001 > >> 
gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x00000001 > >> gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x05020000 > >> gpmi-nand 1806000.gpmi-nand: Show BCH registers : > >> gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x00000100 > >> gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000010 > >> gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000 > >> gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000 > >> gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000 > >> gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000 > >> gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x00000000 > >> gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00000000 > >> gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0x030a2080 > >> gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x083e2080 > >> gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x070a4080 > >> gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0x10da4080 > >> gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x070a4080 > >> gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x10da4080 > >> gpmi-nand 1806000.gpmi-nand: offset 0x0e0 : 0x070a4080 > >> gpmi-nand 1806000.gpmi-nand: offset 0x0f0 : 0x10da4080 > >> gpmi-nand 1806000.gpmi-nand: offset 0x100 : 0x00000000 > >> gpmi-nand 1806000.gpmi-nand: offset 0x110 : 0x00000000 > >> gpmi-nand 1806000.gpmi-nand: offset 0x120 : 0x00000000 > >> gpmi-nand 1806000.gpmi-nand: offset 0x130 : 0x00000000 > >> gpmi-nand 1806000.gpmi-nand: offset 0x140 : 0x00000000 > >> gpmi-nand 1806000.gpmi-nand: offset 0x150 : 0x20484342 > >> gpmi-nand 1806000.gpmi-nand: offset 0x160 : 0x01000000 > >> gpmi-nand 1806000.gpmi-nand: offset 0x170 : 0x00000000 > >> gpmi-nand 1806000.gpmi-nand: BCH Geometry : > >> GF length : 13 > >> ECC Strength : 8 > >> Page Size in Bytes : 2110 > >> Metadata Size in Bytes : 10 > >> ECC Chunk0 Size in Bytes: 512 > >> ECC Chunkn Size in Bytes: 512 > >> ECC Chunk Count : 4 > >> Payload Size in Bytes : 2048 > >> Auxiliary Size in Bytes: 16 > >> Auxiliary Status Offset: 12 > >> Block Mark Byte Offset : 1999 > >> 
Block Mark Bit Offset : 0 > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -110 > >> nand: timing mode 5 not acknowledged by the NAND chip > > > > What is the final timing mode used? Most of us tested in mode 5 I > > guess, maybe mode 4 is broken (don't know if this is the one used here, > > neither why mode 5 is refused). Can you please try by limiting the mode > > to 0, 1, 2... until, hopefully, we narrow down to the failing mode. > > Sure, how to do that? This loop [1] tries to configure each mode (5, 4, ...) until one succeeds (default is 0: must always work). Please try to limit mode to 0, 1, etc. Mode 0 should work. [1] https://elixir.bootlin.com/linux/v5.3-rc1/source/drivers/mtd/nand/raw/nand_base.c#L933 > > > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > >> Scanning device for bad blocks > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > >> .... > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > >> 5 fixed-partitions partitions found on MTD device gpmi-nand > >> Creating 5 MTD partitions on "gpmi-nand": > >> 0x000000000000-0x000000500000 : "u-boot" > >> 0x000000500000-0x000000600000 : "u-boot-env" > >> 0x000000600000-0x000000800000 : "log" > >> 0x000000800000-0x000010000000 : "flash" > >> 0x000000000000-0x000010000000 : "all" > >> gpmi-nand 1806000.gpmi-nand: driver registered. > >> > >> > >> This is using a linux kernel v5.1.14. I have seen this happen on > >> a number of boards I have here - but it is only occasional. It > >> only happens once in a while on boot, maybe 1 in 40 or more times. > >> So it can take quite a while to reproduce (using a boot loop setup). > > > > That's strange... I don't get what would produce such unstable issue. 
> > My initial guess is that the calculated timing is very marginal. What do you mean by "marginal"? > The problem seems more likely to happen if flash write activity > had been occurring just before a soft reboot. Its not a guarantee, > just more likely. That's really disturbing. I doubt this is the real cause though. > > Interesting observation is that Michael was using Micron flash, > and boards that I have with the problem also have Micron flash. > Both a form of Micron MT29F2G08. > > I have similar boards, iMX6ull based, with different brands of > NAND flash and I have not seen any problem on them. That's great to narrow down the root cause. Maybe these chips have tighter timing constraints. > > Regards > Greg > > > > >> As per the email thread I pointed to above I looked at reverting > >> those patches, but that was not at all easy given how much the gpmi > >> driver code had moved. So instead I modified the code with this: > >> > >> --- a/linux/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c > >> +++ b/linux/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c > >> @@ -481,6 +481,7 @@ static void gpmi_nfc_compute_timings(struct gpmi_nand_data *this, > >> void gpmi_nfc_apply_timings(struct gpmi_nand_data *this) > >> { > >> +#if 0 > >> struct gpmi_nfc_hardware_timing *hw = &this->hw; > >> struct resources *r = &this->resources; > >> void __iomem *gpmi_regs = r->gpmi_regs; > >> @@ -505,6 +512,7 @@ void gpmi_nfc_apply_timings(struct gpmi_nand_data *this) > >> /* Wait for the DLL to settle. */ > >> udelay(dll_wait_time_us); > >> +#endif > >> } > >> int gpmi_setup_data_interface(struct nand_chip *chip, int chipnr, > >> > >> So far after a couple of days of testing with this I no longer > >> see the DMA timeout. > >> > >> Any thoughts? > >> > >> Regards > >> Greg > >> > > > > Thanks, > > Miquèl > > Thanks, Miquèl
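To make the "limit the mode" suggestion above concrete, here is a standalone model of a highest-first mode selection loop with an added cap. It is a sketch only and is not the code at [1]: `pick_timing_mode`, `max_mode` and the two callbacks are hypothetical names, and the assumption is that the real loop walks the supported modes from the highest down and keeps the first one the controller accepts.

```c
#include <stdbool.h>

/* Hypothetical callback: returns true if the controller accepts this mode. */
typedef bool (*try_mode_fn)(int mode);

/* Index of the highest set bit, or -1 if mask is 0. */
static int highest_mode(unsigned int mask)
{
	int mode = -1;

	while (mask) {
		mode++;
		mask >>= 1;
	}
	return mode;
}

/*
 * Walk the supported modes from the highest one (capped at max_mode)
 * down to 0 and return the first one that try_mode() accepts.
 */
static int pick_timing_mode(unsigned int supported_mask, int max_mode,
			    try_mode_fn try_mode)
{
	int mode = highest_mode(supported_mask);

	if (mode > max_mode)
		mode = max_mode;

	for (; mode >= 0; mode--) {
		if (!(supported_mask & (1u << mode)))
			continue; /* chip does not advertise this mode */
		if (try_mode(mode))
			return mode;
	}
	return 0; /* mode 0 is the default */
}

/* Example stubs standing in for the controller check. */
static bool accept_any(int mode)
{
	(void)mode;
	return true;
}

static bool reject_mode5(int mode)
{
	return mode != 5;
}
```

Calling `pick_timing_mode(0x3f, 2, accept_any)` models a boot capped at mode 2. Raising or lowering `max_mode` between boot-loop runs is one way to bisect which mode triggers the timeout.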
Hi Miguel On Mon, Jul 29, 2019 at 2:47 PM Miquel Raynal <miquel.raynal@bootlin.com> wrote: > > Hi Greg, > > + Boris > > Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 22:33:56 +1000: > > > Hi Miquel, > > > > On 29/7/19 6:36 pm, Miquel Raynal wrote: > > > Hi Greg, > > > > > > One question below. > > > > > > +Michael > > > +Sascha > > > > > > Hello Michael, here is a similar issue to yours, I know you did not > > > have enough time to share your solution but here we have someone else > > > reproducing the issue, would you mind sharing a branch or a patch, even > > > a WIP one, just to help debugging? > > > > > > Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 16:41:51 +1000: > > > > > >> Hi Miquel, > > >> > > >> I am experiencing a problem with NAND flash DMA timeouts on > > >> iMX6ull based boards. The problem is very similar to that > > >> described in: > > >> > > >> https://linux-mtd.infradead.narkive.com/JIUulfFB/gpmi-imx6ull-timeout-on-dma > > >> > > >> That didn't come to any specific resolution that I could see > > >> in that thread. 
> > >> > > >> The boot trace on the console for me looks like this: > > >> > > >> nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda > > >> nand: Micron MT29F2G08ABAEAWP > > >> nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64 > > >> gpmi-nand 1806000.gpmi-nand: DMA timeout, last DMA > > >> gpmi-nand 1806000.gpmi-nand: Show GPMI registers : > > >> gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x20830002 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000000 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x01c6800c > > >> gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00010101 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0xe0000000 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x23023336 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x000001ee > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0xff000001 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x00000001 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x05020000 > > >> gpmi-nand 1806000.gpmi-nand: Show BCH registers : > > >> gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x00000100 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000010 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x00000000 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00000000 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0x030a2080 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x083e2080 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x070a4080 > > >> gpmi-nand 
1806000.gpmi-nand: offset 0x0b0 : 0x10da4080 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x070a4080 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x10da4080 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0e0 : 0x070a4080 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0f0 : 0x10da4080 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x100 : 0x00000000 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x110 : 0x00000000 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x120 : 0x00000000 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x130 : 0x00000000 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x140 : 0x00000000 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x150 : 0x20484342 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x160 : 0x01000000 > > >> gpmi-nand 1806000.gpmi-nand: offset 0x170 : 0x00000000 > > >> gpmi-nand 1806000.gpmi-nand: BCH Geometry : > > >> GF length : 13 > > >> ECC Strength : 8 > > >> Page Size in Bytes : 2110 > > >> Metadata Size in Bytes : 10 > > >> ECC Chunk0 Size in Bytes: 512 > > >> ECC Chunkn Size in Bytes: 512 > > >> ECC Chunk Count : 4 > > >> Payload Size in Bytes : 2048 > > >> Auxiliary Size in Bytes: 16 > > >> Auxiliary Status Offset: 12 > > >> Block Mark Byte Offset : 1999 > > >> Block Mark Bit Offset : 0 > > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -110 > > >> nand: timing mode 5 not acknowledged by the NAND chip > > > > > > What is the final timing mode used? Most of us tested in mode 5 I > > > guess, maybe mode 4 is broken (don't know if this is the one used here, > > > neither why mode 5 is refused). Can you please try by limiting the mode > > > to 0, 1, 2... until, hopefully, we narrow down to the failing mode. > > > > Sure, how to do that? > > This loop [1] tries to configure each mode (5, 4, ...) until one > succeeds (default is 0: must always work). Please try to limit mode to > 0, 1, etc. > > Mode 0 should work. > This is not correct. When all the mode fail it fallback to 0 that does not work. 
I already checked. So the fallback is created for this situation > [1] https://elixir.bootlin.com/linux/v5.3-rc1/source/drivers/mtd/nand/raw/nand_base.c#L933 > > > > > > > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > > >> Scanning device for bad blocks > > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > > >> .... > > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > > >> 5 fixed-partitions partitions found on MTD device gpmi-nand > > >> Creating 5 MTD partitions on "gpmi-nand": > > >> 0x000000000000-0x000000500000 : "u-boot" > > >> 0x000000500000-0x000000600000 : "u-boot-env" > > >> 0x000000600000-0x000000800000 : "log" > > >> 0x000000800000-0x000010000000 : "flash" > > >> 0x000000000000-0x000010000000 : "all" > > >> gpmi-nand 1806000.gpmi-nand: driver registered. > > >> > > >> > > >> This is using a linux kernel v5.1.14. I have seen this happen on > > >> a number of boards I have here - but it is only occasional. It > > >> only happens once in a while on boot, maybe 1 in 40 or more times. > > >> So it can take quite a while to reproduce (using a boot loop setup). > > > > > > That's strange... I don't get what would produce such unstable issue. > > > > My initial guess is that the calculated timing is very marginal. > > What do you mean by "marginal"? > I don't think it is the timing calculation. I have tried to use the same timing as before but when those are applied. Is it possible? Michael > > The problem seems more likely to happen if flash write activity > > had been occurring just before a soft reboot. Its not a guarantee, > > just more likely. > > That's really disturbing. I doubt this is the real cause though. 
> > > > > Interesting observation is that Michael was using Micron flash, > > and boards that I have with the problem also have Micron flash. > > Both a form of Micron MT29F2G08. > > > > I have similar boards, iMX6ull based, with different brands of > > NAND flash and I have not seen any problem on them. > > That's great to narrow down the root cause. Maybe these chips have > tighter timing constraints. > > > > > Regards > > Greg > > > > > > > > >> As per the email thread I pointed to above I looked at reverting > > >> those patches, but that was not at all easy given how much the gpmi > > >> driver code had moved. So instead I modified the code with this: > > >> > > >> --- a/linux/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c > > >> +++ b/linux/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c > > >> @@ -481,6 +481,7 @@ static void gpmi_nfc_compute_timings(struct gpmi_nand_data *this, > > >> void gpmi_nfc_apply_timings(struct gpmi_nand_data *this) > > >> { > > >> +#if 0 > > >> struct gpmi_nfc_hardware_timing *hw = &this->hw; > > >> struct resources *r = &this->resources; > > >> void __iomem *gpmi_regs = r->gpmi_regs; > > >> @@ -505,6 +512,7 @@ void gpmi_nfc_apply_timings(struct gpmi_nand_data *this) > > >> /* Wait for the DLL to settle. */ > > >> udelay(dll_wait_time_us); > > >> +#endif > > >> } > > >> int gpmi_setup_data_interface(struct nand_chip *chip, int chipnr, > > >> > > >> So far after a couple of days of testing with this I no longer > > >> see the DMA timeout. > > >> > > >> Any thoughts? > > >> > > >> Regards > > >> Greg > > >> > > > > > > Thanks, > > > Miquèl > > > > > Thanks, > Miquèl
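For readers following the quoted trace: the numbers after "Error" are negated Linux errno values. A small lookup covering only the two codes that appear in this thread is sketched below. The helper names are made up for illustration, and the numeric values are those of the generic Linux errno headers, which ARM uses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Map the error numbers seen in the trace to their errno names. */
static const char *trace_errno_name(int err)
{
	switch (err) {
	case -22:
		return "EINVAL";    /* invalid argument */
	case -110:
		return "ETIMEDOUT"; /* operation timed out */
	default:
		return NULL;        /* not one of the codes in this trace */
	}
}

/* True if err is one of the known trace codes and has the given name. */
static bool trace_error_is(int err, const char *name)
{
	const char *found = trace_errno_name(err);

	return found != NULL && strcmp(found, name) == 0;
}
```

Read this way, the trace shows a single timeout (-110) right after the DMA timeout message, followed by a run of invalid-argument errors (-22) before and during the bad block scan.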
Hi Michael, Michael Nazzareno Trimarchi <michael@amarulasolutions.com> wrote on Mon, 29 Jul 2019 14:49:19 +0200: > Hi Miguel > > On Mon, Jul 29, 2019 at 2:47 PM Miquel Raynal <miquel.raynal@bootlin.com> wrote: > > > > Hi Greg, > > > > + Boris > > > > Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 22:33:56 +1000: > > > > > Hi Miquel, > > > > > > On 29/7/19 6:36 pm, Miquel Raynal wrote: > > > > Hi Greg, > > > > > > > > One question below. > > > > > > > > +Michael > > > > +Sascha > > > > > > > > Hello Michael, here is a similar issue to yours, I know you did not > > > > have enough time to share your solution but here we have someone else > > > > reproducing the issue, would you mind sharing a branch or a patch, even > > > > a WIP one, just to help debugging? > > > > > > > > Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 16:41:51 +1000: > > > > > > > >> Hi Miquel, > > > >> > > > >> I am experiencing a problem with NAND flash DMA timeouts on > > > >> iMX6ull based boards. The problem is very similar to that > > > >> described in: > > > >> > > > >> https://linux-mtd.infradead.narkive.com/JIUulfFB/gpmi-imx6ull-timeout-on-dma > > > >> > > > >> That didn't come to any specific resolution that I could see > > > >> in that thread. 
> > > >> > > > >> The boot trace on the console for me looks like this: > > > >> > > > >> nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda > > > >> nand: Micron MT29F2G08ABAEAWP > > > >> nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64 > > > >> gpmi-nand 1806000.gpmi-nand: DMA timeout, last DMA > > > >> gpmi-nand 1806000.gpmi-nand: Show GPMI registers : > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x20830002 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000000 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x01c6800c > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00010101 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0xe0000000 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x23023336 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x000001ee > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0xff000001 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x00000001 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x05020000 > > > >> gpmi-nand 1806000.gpmi-nand: Show BCH registers : > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x00000100 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000010 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x00000000 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00000000 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0x030a2080 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x083e2080 > > > >> gpmi-nand 
1806000.gpmi-nand: offset 0x0a0 : 0x070a4080 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0x10da4080 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x070a4080 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x10da4080 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0e0 : 0x070a4080 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0f0 : 0x10da4080 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x100 : 0x00000000 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x110 : 0x00000000 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x120 : 0x00000000 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x130 : 0x00000000 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x140 : 0x00000000 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x150 : 0x20484342 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x160 : 0x01000000 > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x170 : 0x00000000 > > > >> gpmi-nand 1806000.gpmi-nand: BCH Geometry : > > > >> GF length : 13 > > > >> ECC Strength : 8 > > > >> Page Size in Bytes : 2110 > > > >> Metadata Size in Bytes : 10 > > > >> ECC Chunk0 Size in Bytes: 512 > > > >> ECC Chunkn Size in Bytes: 512 > > > >> ECC Chunk Count : 4 > > > >> Payload Size in Bytes : 2048 > > > >> Auxiliary Size in Bytes: 16 > > > >> Auxiliary Status Offset: 12 > > > >> Block Mark Byte Offset : 1999 > > > >> Block Mark Bit Offset : 0 > > > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -110 > > > >> nand: timing mode 5 not acknowledged by the NAND chip > > > > > > > > What is the final timing mode used? Most of us tested in mode 5 I > > > > guess, maybe mode 4 is broken (don't know if this is the one used here, > > > > neither why mode 5 is refused). Can you please try by limiting the mode > > > > to 0, 1, 2... until, hopefully, we narrow down to the failing mode. > > > > > > Sure, how to do that? > > > > This loop [1] tries to configure each mode (5, 4, ...) until one > > succeeds (default is 0: must always work). Please try to limit mode to > > 0, 1, etc. 
> > > > Mode 0 should work. > > > > This is not correct. When all the mode fail it fallback to 0 that does > not work. Already check > So the fallback is created for this situation Sorry but I don't understand what you are saying. Are you telling me that you already tried mode 0 and that it did not work better than other timings? > > > [1] https://elixir.bootlin.com/linux/v5.3-rc1/source/drivers/mtd/nand/raw/nand_base.c#L933 > > > > > > > > > > > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > > > >> Scanning device for bad blocks > > > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > > > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > > > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > > > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > > > >> .... > > > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > > > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > > > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > > > >> 5 fixed-partitions partitions found on MTD device gpmi-nand > > > >> Creating 5 MTD partitions on "gpmi-nand": > > > >> 0x000000000000-0x000000500000 : "u-boot" > > > >> 0x000000500000-0x000000600000 : "u-boot-env" > > > >> 0x000000600000-0x000000800000 : "log" > > > >> 0x000000800000-0x000010000000 : "flash" > > > >> 0x000000000000-0x000010000000 : "all" > > > >> gpmi-nand 1806000.gpmi-nand: driver registered. > > > >> > > > >> > > > >> This is using a linux kernel v5.1.14. I have seen this happen on > > > >> a number of boards I have here - but it is only occasional. It > > > >> only happens once in a while on boot, maybe 1 in 40 or more times. > > > >> So it can take quite a while to reproduce (using a boot loop setup). > > > > > > > > That's strange... I don't get what would produce such unstable issue. > > > > > > My initial guess is that the calculated timing is very marginal. > > > > What do you mean by "marginal"? > > > > I don't think that is timing calculation. 
I have tried to use the same timing > as before but when those are applide. Is it possible? ^ I suppose the end of the sentence is missing? Thanks, Miquèl
Hi Miquel On Mon, Jul 29, 2019 at 2:55 PM Miquel Raynal <miquel.raynal@bootlin.com> wrote: > > Hi Michael, > > Michael Nazzareno Trimarchi <michael@amarulasolutions.com> wrote on > Mon, 29 Jul 2019 14:49:19 +0200: > > > Hi Miguel > > > > On Mon, Jul 29, 2019 at 2:47 PM Miquel Raynal <miquel.raynal@bootlin.com> wrote: > > > > > > Hi Greg, > > > > > > + Boris > > > > > > Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 22:33:56 +1000: > > > > > > > Hi Miquel, > > > > > > > > On 29/7/19 6:36 pm, Miquel Raynal wrote: > > > > > Hi Greg, > > > > > > > > > > One question below. > > > > > > > > > > +Michael > > > > > +Sascha > > > > > > > > > > Hello Michael, here is a similar issue to yours, I know you did not > > > > > have enough time to share your solution but here we have someone else > > > > > reproducing the issue, would you mind sharing a branch or a patch, even > > > > > a WIP one, just to help debugging? > > > > > > > > > > Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 16:41:51 +1000: > > > > > > > > > >> Hi Miquel, > > > > >> > > > > >> I am experiencing a problem with NAND flash DMA timeouts on > > > > >> iMX6ull based boards. The problem is very similar to that > > > > >> described in: > > > > >> > > > > >> https://linux-mtd.infradead.narkive.com/JIUulfFB/gpmi-imx6ull-timeout-on-dma > > > > >> > > > > >> That didn't come to any specific resolution that I could see > > > > >> in that thread. 
> > > > >> > > > > >> The boot trace on the console for me looks like this: > > > > >> > > > > >> nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda > > > > >> nand: Micron MT29F2G08ABAEAWP > > > > >> nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64 > > > > >> gpmi-nand 1806000.gpmi-nand: DMA timeout, last DMA > > > > >> gpmi-nand 1806000.gpmi-nand: Show GPMI registers : > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x20830002 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000000 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x01c6800c > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00010101 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0xe0000000 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x23023336 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x000001ee > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0xff000001 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x00000001 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x05020000 > > > > >> gpmi-nand 1806000.gpmi-nand: Show BCH registers : > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x00000100 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000010 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x00000000 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00000000 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0x030a2080 > > > > >> 
gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x083e2080 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x070a4080 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0x10da4080 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x070a4080 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x10da4080 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0e0 : 0x070a4080 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0f0 : 0x10da4080 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x100 : 0x00000000 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x110 : 0x00000000 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x120 : 0x00000000 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x130 : 0x00000000 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x140 : 0x00000000 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x150 : 0x20484342 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x160 : 0x01000000 > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x170 : 0x00000000 > > > > >> gpmi-nand 1806000.gpmi-nand: BCH Geometry : > > > > >> GF length : 13 > > > > >> ECC Strength : 8 > > > > >> Page Size in Bytes : 2110 > > > > >> Metadata Size in Bytes : 10 > > > > >> ECC Chunk0 Size in Bytes: 512 > > > > >> ECC Chunkn Size in Bytes: 512 > > > > >> ECC Chunk Count : 4 > > > > >> Payload Size in Bytes : 2048 > > > > >> Auxiliary Size in Bytes: 16 > > > > >> Auxiliary Status Offset: 12 > > > > >> Block Mark Byte Offset : 1999 > > > > >> Block Mark Bit Offset : 0 > > > > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -110 > > > > >> nand: timing mode 5 not acknowledged by the NAND chip > > > > > > > > > > What is the final timing mode used? Most of us tested in mode 5 I > > > > > guess, maybe mode 4 is broken (don't know if this is the one used here, > > > > > neither why mode 5 is refused). Can you please try by limiting the mode > > > > > to 0, 1, 2... until, hopefully, we narrow down to the failing mode. > > > > > > > > Sure, how to do that? 
> > > > > > This loop [1] tries to configure each mode (5, 4, ...) until one > > > succeeds (default is 0: must always work). Please try to limit mode to > > > 0, 1, etc. > > > > > > Mode 0 should work. > > > > > > > This is not correct. When all the mode fail it fallback to 0 that does > > not work. Already check > > So the fallback is created for this situation > > Sorry but I don't understand what you are saying. > I said that when a timing mode is not acknowledged, the MTD stack should send a reset command and fall back to timing mode 0. The NAND does not respond anymore. > Are you telling me that you already tried mode 0 and that it did not > work better than other timings? > I only forced it to use different modes but never tried mode 0 ;) just because it should be the normal fallback Michael > > > > > [1] https://elixir.bootlin.com/linux/v5.3-rc1/source/drivers/mtd/nand/raw/nand_base.c#L933 > > > > > > > > > > > > > > > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > > > > >> Scanning device for bad blocks > > > > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > > > > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > > > > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > > > > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > > > > >> .... > > > > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > > > > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > > > > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 > > > > >> 5 fixed-partitions partitions found on MTD device gpmi-nand > > > > >> Creating 5 MTD partitions on "gpmi-nand": > > > > >> 0x000000000000-0x000000500000 : "u-boot" > > > > >> 0x000000500000-0x000000600000 : "u-boot-env" > > > > >> 0x000000600000-0x000000800000 : "log" > > > > >> 0x000000800000-0x000010000000 : "flash" > > > > >> 0x000000000000-0x000010000000 : "all" > > > > >> gpmi-nand 1806000.gpmi-nand: driver registered. > > > > >> > > > > >> > > > > >> This is using a linux kernel v5.1.14. 
I have seen this happen on
> > > > >> a number of boards I have here - but it is only occasional. It
> > > > >> only happens once in a while on boot, maybe 1 in 40 or more times.
> > > > >> So it can take quite a while to reproduce (using a boot loop setup).
> > > > >
> > > > > That's strange... I don't get what would produce such unstable issue.
> > > >
> > > > My initial guess is that the calculated timing is very marginal.
> > >
> > > What do you mean by "marginal"?
> > >
> >
> > I don't think that is timing calculation. I have tried to use the same timing
> > as before but when those are applied. Is it possible?
>
> ^
> I suppose the end of the sentence is missing?
>
>
> Thanks,
> Miquèl
Hi Michael, Michael Nazzareno Trimarchi <michael@amarulasolutions.com> wrote on Mon, 29 Jul 2019 15:00:04 +0200: > Hi Miquel > > On Mon, Jul 29, 2019 at 2:55 PM Miquel Raynal <miquel.raynal@bootlin.com> wrote: > > > > Hi Michael, > > > > Michael Nazzareno Trimarchi <michael@amarulasolutions.com> wrote on > > Mon, 29 Jul 2019 14:49:19 +0200: > > > > > Hi Miguel > > > > > > On Mon, Jul 29, 2019 at 2:47 PM Miquel Raynal <miquel.raynal@bootlin.com> wrote: > > > > > > > > Hi Greg, > > > > > > > > + Boris > > > > > > > > Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 22:33:56 +1000: > > > > > > > > > Hi Miquel, > > > > > > > > > > On 29/7/19 6:36 pm, Miquel Raynal wrote: > > > > > > Hi Greg, > > > > > > > > > > > > One question below. > > > > > > > > > > > > +Michael > > > > > > +Sascha > > > > > > > > > > > > Hello Michael, here is a similar issue to yours, I know you did not > > > > > > have enough time to share your solution but here we have someone else > > > > > > reproducing the issue, would you mind sharing a branch or a patch, even > > > > > > a WIP one, just to help debugging? > > > > > > > > > > > > Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 16:41:51 +1000: > > > > > > > > > > > >> Hi Miquel, > > > > > >> > > > > > >> I am experiencing a problem with NAND flash DMA timeouts on > > > > > >> iMX6ull based boards. The problem is very similar to that > > > > > >> described in: > > > > > >> > > > > > >> https://linux-mtd.infradead.narkive.com/JIUulfFB/gpmi-imx6ull-timeout-on-dma > > > > > >> > > > > > >> That didn't come to any specific resolution that I could see > > > > > >> in that thread. 
> > > > > >> > > > > > >> The boot trace on the console for me looks like this: > > > > > >> > > > > > >> nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda > > > > > >> nand: Micron MT29F2G08ABAEAWP > > > > > >> nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64 > > > > > >> gpmi-nand 1806000.gpmi-nand: DMA timeout, last DMA > > > > > >> gpmi-nand 1806000.gpmi-nand: Show GPMI registers : > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x20830002 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000000 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x01c6800c > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00010101 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0xe0000000 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x23023336 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x000001ee > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0xff000001 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x00000001 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x05020000 > > > > > >> gpmi-nand 1806000.gpmi-nand: Show BCH registers : > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x00000100 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000010 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x00000000 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00000000 > > > > > >> gpmi-nand 
1806000.gpmi-nand: offset 0x080 : 0x030a2080 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x083e2080 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x070a4080 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0x10da4080 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x070a4080 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x10da4080 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0e0 : 0x070a4080 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0f0 : 0x10da4080 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x100 : 0x00000000 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x110 : 0x00000000 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x120 : 0x00000000 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x130 : 0x00000000 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x140 : 0x00000000 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x150 : 0x20484342 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x160 : 0x01000000 > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x170 : 0x00000000 > > > > > >> gpmi-nand 1806000.gpmi-nand: BCH Geometry : > > > > > >> GF length : 13 > > > > > >> ECC Strength : 8 > > > > > >> Page Size in Bytes : 2110 > > > > > >> Metadata Size in Bytes : 10 > > > > > >> ECC Chunk0 Size in Bytes: 512 > > > > > >> ECC Chunkn Size in Bytes: 512 > > > > > >> ECC Chunk Count : 4 > > > > > >> Payload Size in Bytes : 2048 > > > > > >> Auxiliary Size in Bytes: 16 > > > > > >> Auxiliary Status Offset: 12 > > > > > >> Block Mark Byte Offset : 1999 > > > > > >> Block Mark Bit Offset : 0 > > > > > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -110 > > > > > >> nand: timing mode 5 not acknowledged by the NAND chip > > > > > > > > > > > > What is the final timing mode used? Most of us tested in mode 5 I > > > > > > guess, maybe mode 4 is broken (don't know if this is the one used here, > > > > > > neither why mode 5 is refused). Can you please try by limiting the mode > > > > > > to 0, 1, 2... 
until, hopefully, we narrow down to the failing mode. > > > > > > > > > > Sure, how to do that? > > > > > > > > This loop [1] tries to configure each mode (5, 4, ...) until one > > > > succeeds (default is 0: must always work). Please try to limit mode to > > > > 0, 1, etc. > > > > > > > > Mode 0 should work. > > > > > > > > > > This is not correct. When all the mode fail it fallback to 0 that does > > > not work. Already check > > > So the fallback is created for this situation > > > > Sorry but I don't understand what you are saying. > > > > I said that where a timing mode is not ackolege then the mtd stack should > send a reset command and fallback to timeing mode 0. The nand does not > respond anymore. It depends on what you define by "not acknowledged". What you describe is the current situation: if either the NAND controller or the NAND chip do not support the mode requested by the core, the core will try another (slower) mode until either we found one or we are at timing mode 0. Unfortunately, we cannot check that "all operation with these timings will work" at boot time, it would be very time consuming; especially for something that is very likely to be a controller driver issue, and that is what happens here: both the controller and the chip acknowledge the new timings. > > > Are you telling me that you already tried mode 0 and that it did not > > work better than other timings? > > > > I force only to use different mode but never try mode 0 ;) just > because should be > the normal fallback Unless there is a timing calculation issue in the controller driver. Greg, can you please find the quickest working mode (starting from 0, of course, to ensure mode 0 is stable). [...] > > > > > > > > > > > > That's strange... I don't get what would produce such unstable issue. > > > > > > > > > > My initial guess is that the calculated timing is very marginal. > > > > > > > > What do you mean by "marginal"? > > > > > > > > > > I don't think that is timing calculation. 
I have tried to use the same timing
> > > as before but when those are applied. Is it possible?
> >
> > ^
> > I suppose the end of the sentence is missing?

Michael, what did you mean here?

Thanks,
Miquèl
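For readers following the thread, the mode selection described above (try the fastest mode first, step down to slower ones, end at mode 0) can be modelled in a few lines of standalone C. This is only a sketch under stated assumptions, not the kernel code: `pick_timing_mode()`, `fls_u32()` and the two bitmasks are hypothetical names, with bit N standing for timing mode N. It models nothing more than an accept/reject check, which is the limitation discussed above: a mode can be accepted by both the chip and the controller and still fail at run time.

```c
/*
 * Hypothetical stand-in for the kernel's fls(): 1-based index of the
 * highest set bit, or 0 when no bit is set.
 */
static int fls_u32(unsigned int x)
{
	int pos = 0;

	while (x) {
		pos++;
		x >>= 1;
	}
	return pos;
}

/*
 * Hypothetical model of the fallback walk.
 * chip_modes: bitmask of modes the chip advertises (bit N = mode N).
 * ctrl_modes: bitmask of modes the controller accepts.
 * Returns the fastest mode accepted by both, or 0 as the default.
 */
static int pick_timing_mode(unsigned int chip_modes, unsigned int ctrl_modes)
{
	int selected = 0;	/* mode 0 is the default */
	int mode;

	for (mode = fls_u32(chip_modes) - 1; mode >= 0; mode--) {
		if (!(chip_modes & (1u << mode)))
			continue;	/* chip does not advertise this mode */
		if (ctrl_modes & (1u << mode)) {
			selected = mode;	/* both sides accept it */
			break;
		}
	}
	return selected;
}
```

With both masks set to 0x3f the sketch selects mode 5; with an empty controller mask it ends up at mode 0.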
Hi Miquel sorry was difficult day ;). My answer below On Mon, Jul 29, 2019 at 3:22 PM Miquel Raynal <miquel.raynal@bootlin.com> wrote: > > Hi Michael, > > Michael Nazzareno Trimarchi <michael@amarulasolutions.com> wrote on > Mon, 29 Jul 2019 15:00:04 +0200: > > > Hi Miquel > > > > On Mon, Jul 29, 2019 at 2:55 PM Miquel Raynal <miquel.raynal@bootlin.com> wrote: > > > > > > Hi Michael, > > > > > > Michael Nazzareno Trimarchi <michael@amarulasolutions.com> wrote on > > > Mon, 29 Jul 2019 14:49:19 +0200: > > > > > > > Hi Miguel > > > > > > > > On Mon, Jul 29, 2019 at 2:47 PM Miquel Raynal <miquel.raynal@bootlin.com> wrote: > > > > > > > > > > Hi Greg, > > > > > > > > > > + Boris > > > > > > > > > > Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 22:33:56 +1000: > > > > > > > > > > > Hi Miquel, > > > > > > > > > > > > On 29/7/19 6:36 pm, Miquel Raynal wrote: > > > > > > > Hi Greg, > > > > > > > > > > > > > > One question below. > > > > > > > > > > > > > > +Michael > > > > > > > +Sascha > > > > > > > > > > > > > > Hello Michael, here is a similar issue to yours, I know you did not > > > > > > > have enough time to share your solution but here we have someone else > > > > > > > reproducing the issue, would you mind sharing a branch or a patch, even > > > > > > > a WIP one, just to help debugging? > > > > > > > > > > > > > > Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 16:41:51 +1000: > > > > > > > > > > > > > >> Hi Miquel, > > > > > > >> > > > > > > >> I am experiencing a problem with NAND flash DMA timeouts on > > > > > > >> iMX6ull based boards. The problem is very similar to that > > > > > > >> described in: > > > > > > >> > > > > > > >> https://linux-mtd.infradead.narkive.com/JIUulfFB/gpmi-imx6ull-timeout-on-dma > > > > > > >> > > > > > > >> That didn't come to any specific resolution that I could see > > > > > > >> in that thread. 
> > > > > > >> > > > > > > >> The boot trace on the console for me looks like this: > > > > > > >> > > > > > > >> nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda > > > > > > >> nand: Micron MT29F2G08ABAEAWP > > > > > > >> nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64 > > > > > > >> gpmi-nand 1806000.gpmi-nand: DMA timeout, last DMA > > > > > > >> gpmi-nand 1806000.gpmi-nand: Show GPMI registers : > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x20830002 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000000 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x01c6800c > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00010101 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0xe0000000 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x23023336 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x000001ee > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0xff000001 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x00000001 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x05020000 > > > > > > >> gpmi-nand 1806000.gpmi-nand: Show BCH registers : > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x00000100 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000010 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x00000000 > > > > > > >> gpmi-nand 
1806000.gpmi-nand: offset 0x070 : 0x00000000 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0x030a2080 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x083e2080 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x070a4080 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0x10da4080 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x070a4080 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x10da4080 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0e0 : 0x070a4080 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0f0 : 0x10da4080 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x100 : 0x00000000 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x110 : 0x00000000 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x120 : 0x00000000 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x130 : 0x00000000 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x140 : 0x00000000 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x150 : 0x20484342 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x160 : 0x01000000 > > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x170 : 0x00000000 > > > > > > >> gpmi-nand 1806000.gpmi-nand: BCH Geometry : > > > > > > >> GF length : 13 > > > > > > >> ECC Strength : 8 > > > > > > >> Page Size in Bytes : 2110 > > > > > > >> Metadata Size in Bytes : 10 > > > > > > >> ECC Chunk0 Size in Bytes: 512 > > > > > > >> ECC Chunkn Size in Bytes: 512 > > > > > > >> ECC Chunk Count : 4 > > > > > > >> Payload Size in Bytes : 2048 > > > > > > >> Auxiliary Size in Bytes: 16 > > > > > > >> Auxiliary Status Offset: 12 > > > > > > >> Block Mark Byte Offset : 1999 > > > > > > >> Block Mark Bit Offset : 0 > > > > > > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -110 > > > > > > >> nand: timing mode 5 not acknowledged by the NAND chip > > > > > > > > > > > > > > What is the final timing mode used? 
Most of us tested in mode 5 I > > > > > > > guess, maybe mode 4 is broken (don't know if this is the one used here, > > > > > > > neither why mode 5 is refused). Can you please try by limiting the mode > > > > > > > to 0, 1, 2... until, hopefully, we narrow down to the failing mode. > > > > > > > > > > > > Sure, how to do that? > > > > > > > > > > This loop [1] tries to configure each mode (5, 4, ...) until one > > > > > succeeds (default is 0: must always work). Please try to limit mode to > > > > > 0, 1, etc. > > > > > > > > > > Mode 0 should work. > > > > > > > > > > > > > This is not correct. When all the mode fail it fallback to 0 that does > > > > not work. Already check > > > > So the fallback is created for this situation > > > > > > Sorry but I don't understand what you are saying. > > > > > > > I said that where a timing mode is not ackolege then the mtd stack should > > send a reset command and fallback to timeing mode 0. The nand does not > > respond anymore. > > It depends on what you define by "not acknowledged". What you describe > is the current situation: if either the NAND controller or the NAND > chip do not support the mode requested by the core, the core will try > another (slower) mode until either we found one or we are at timing > mode 0. > > Unfortunately, we cannot check that "all operation with these timings > will work" at boot time, it would be very time consuming; especially > for something that is very likely to be a controller driver issue, and > that is what happens here: both the controller and the chip acknowledge > the new timings. > > > > > > Are you telling me that you already tried mode 0 and that it did not > > > work better than other timings? > > > > > > > I force only to use different mode but never try mode 0 ;) just > > because should be > > the normal fallback > > Unless there is a timing calculation issue in the controller driver. 
> Greg, can you please find the quickest working mode (starting from 0,
> of course, to ensure mode 0 is stable).
>
> [...]
>
> > > > > > >
> > > > > > > That's strange... I don't get what would produce such unstable issue.
> > > > > >
> > > > > > My initial guess is that the calculated timing is very marginal.
> > > > >
> > > > > What do you mean by "marginal"?
> > > > >
> > > >
> > > > I don't think that is timing calculation. I have tried to use the same timing
> > > > as before but when those are applied. Is it possible?
> > >
> > > ^
> > > I suppose the end of the sentence is missing?
>
> Michael, what did you mean here?
>

commit 02c786627b93b3c3286570f793294816286ff397
Author: Michael Trimarchi <michael@amarulasolutions.com>
Date:   Fri Oct 5 09:46:29 2018 +0200

    Revert "mtd: rawnand: gpmi: use core timings instead of an empirical derivation"

    This reverts commit b1206122069aadabe1a8c50789277a978aaa4df7.

    Change-Id: Icd0ddcd5e3ac7d82932bbf412299cca424cbc571
    Jira-Id: WAN-50
    Signed-off-by: Michael Trimarchi <michael@amarulasolutions.com>

Reverting this one does not fix the problem. Right now I have to revert this one and

commit 6ab543c1924f77957004994bd6806a9daa45f903 (tag: MMI_004_011_R02)
Author: Michael Trimarchi <michael@amarulasolutions.com>
Date:   Fri Oct 5 09:46:44 2018 +0200

    Revert "mtd: rawnand: gpmi: support ->setup_data_interface()"

    This reverts commit 76e1a0086a0c3276b384f77905345e0fcc886fdd.

    Change-Id: I60fb6f874364d1deeda3424d4508553a38ac9b1a
    Jira-Id: WAN-50
    Signed-off-by: Michael Trimarchi <michael@amarulasolutions.com>

I did not have time to finish understanding why this was fixing my problem.

Michael

>
> Thanks,
> Miquèl
Hi Michael,

Michael Nazzareno Trimarchi <michael@amarulasolutions.com> wrote on
Mon, 29 Jul 2019 22:00:25 +0200:

> Hi Miquel
>
> sorry was difficult day ;). My answer below
>

No problem ;)

[...]

> > > > > > > > That's strange... I don't get what would produce such unstable issue.
> > > > > > >
> > > > > > > My initial guess is that the calculated timing is very marginal.
> > > > > >
> > > > > > What do you mean by "marginal"?
> > > > > >
> > > > >
> > > > > I don't think that is timing calculation. I have tried to use the same timing
> > > > > as before but when those are applied. Is it possible?
> > > >
> > > > ^
> > > > I suppose the end of the sentence is missing?
> >
> > Michael, what did you mean here?
> >
> commit 02c786627b93b3c3286570f793294816286ff397
> Author: Michael Trimarchi <michael@amarulasolutions.com>
> Date:   Fri Oct 5 09:46:29 2018 +0200
>
>     Revert "mtd: rawnand: gpmi: use core timings instead of an empirical derivation"
>
>     This reverts commit b1206122069aadabe1a8c50789277a978aaa4df7.
>
>     Change-Id: Icd0ddcd5e3ac7d82932bbf412299cca424cbc571
>     Jira-Id: WAN-50
>     Signed-off-by: Michael Trimarchi <michael@amarulasolutions.com>
>
> Reverting this one does not fix the problem. Right now I have to revert
> this one and
>
> commit 6ab543c1924f77957004994bd6806a9daa45f903 (tag: MMI_004_011_R02)
> Author: Michael Trimarchi <michael@amarulasolutions.com>
> Date:   Fri Oct 5 09:46:44 2018 +0200
>
>     Revert "mtd: rawnand: gpmi: support ->setup_data_interface()"
>
>     This reverts commit 76e1a0086a0c3276b384f77905345e0fcc886fdd.
>
>     Change-Id: I60fb6f874364d1deeda3424d4508553a38ac9b1a
>     Jira-Id: WAN-50
>     Signed-off-by: Michael Trimarchi <michael@amarulasolutions.com>
>
> I did not have time to finish understanding why this was fixing my problem

Ok so I am pretty convinced that this is still a timings issue then;
but not entirely due to the different timings calculation. Interesting.
I'm waiting for Greg's results now.

Thanks,
Miquèl
Hi Miquel, On 29/7/19 10:47 pm, Miquel Raynal wrote: > Hi Greg, > > + Boris > > Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 22:33:56 +1000: > >> Hi Miquel, >> >> On 29/7/19 6:36 pm, Miquel Raynal wrote: >>> Hi Greg, >>> >>> One question below. >>> >>> +Michael >>> +Sascha >>> >>> Hello Michael, here is a similar issue to yours, I know you did not >>> have enough time to share your solution but here we have someone else >>> reproducing the issue, would you mind sharing a branch or a patch, even >>> a WIP one, just to help debugging? >>> >>> Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 16:41:51 +1000: >>> >>>> Hi Miquel, >>>> >>>> I am experiencing a problem with NAND flash DMA timeouts on >>>> iMX6ull based boards. The problem is very similar to that >>>> described in: >>>> >>>> https://linux-mtd.infradead.narkive.com/JIUulfFB/gpmi-imx6ull-timeout-on-dma >>>> >>>> That didn't come to any specific resolution that I could see >>>> in that thread. >>>> >>>> The boot trace on the console for me looks like this: >>>> >>>> nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda >>>> nand: Micron MT29F2G08ABAEAWP >>>> nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64 >>>> gpmi-nand 1806000.gpmi-nand: DMA timeout, last DMA >>>> gpmi-nand 1806000.gpmi-nand: Show GPMI registers : >>>> gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x20830002 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000000 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x01c6800c >>>> gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00010101 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0xe0000000 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x23023336 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x000001ee 
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0xff000001 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x00000001 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x05020000 >>>> gpmi-nand 1806000.gpmi-nand: Show BCH registers : >>>> gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x00000100 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000010 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x00000000 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00000000 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0x030a2080 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x083e2080 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x070a4080 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0x10da4080 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x070a4080 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x10da4080 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x0e0 : 0x070a4080 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x0f0 : 0x10da4080 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x100 : 0x00000000 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x110 : 0x00000000 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x120 : 0x00000000 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x130 : 0x00000000 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x140 : 0x00000000 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x150 : 0x20484342 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x160 : 0x01000000 >>>> gpmi-nand 1806000.gpmi-nand: offset 0x170 : 0x00000000 >>>> gpmi-nand 1806000.gpmi-nand: BCH Geometry : >>>> GF length : 13 >>>> ECC Strength : 8 >>>> Page Size in Bytes : 2110 >>>> Metadata Size in Bytes : 10 >>>> ECC Chunk0 Size in Bytes: 512 >>>> ECC Chunkn Size in Bytes: 512 >>>> ECC Chunk Count : 4 >>>> Payload Size in Bytes : 2048 >>>> Auxiliary Size in Bytes: 16 >>>> 
Auxiliary Status Offset: 12 >>>> Block Mark Byte Offset : 1999 >>>> Block Mark Bit Offset : 0 >>>> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -110 >>>> nand: timing mode 5 not acknowledged by the NAND chip >>> >>> What is the final timing mode used? Most of us tested in mode 5 I >>> guess, maybe mode 4 is broken (don't know if this is the one used here, >>> neither why mode 5 is refused). Can you please try by limiting the mode >>> to 0, 1, 2... until, hopefully, we narrow down to the failing mode. >> >> Sure, how to do that? > > This loop [1] tries to configure each mode (5, 4, ...) until one > succeeds (default is 0: must always work). Please try to limit mode to > 0, 1, etc. > > Mode 0 should work. > > [1] https://elixir.bootlin.com/linux/v5.3-rc1/source/drivers/mtd/nand/raw/nand_base.c#L933 The normal behavior - which usually works - has chip->onfi_timing_mode_default=5 here. So in other words on the first pass through this loop it is checking mode 5, and setting it as the default. I am running a test/reboot loop now waiting for failure to see if it is still using mode 5 in that case. Regards Greg >>>> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 >>>> Scanning device for bad blocks >>>> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 >>>> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 >>>> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 >>>> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 >>>> .... >>>> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 >>>> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 >>>> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22 >>>> 5 fixed-partitions partitions found on MTD device gpmi-nand >>>> Creating 5 MTD partitions on "gpmi-nand": >>>> 0x000000000000-0x000000500000 : "u-boot" >>>> 0x000000500000-0x000000600000 : "u-boot-env" >>>> 0x000000600000-0x000000800000 : "log" >>>> 0x000000800000-0x000010000000 : "flash" >>>> 0x000000000000-0x000010000000 : "all" >>>> gpmi-nand 1806000.gpmi-nand: driver registered. 
>>>> >>>> >>>> This is using a linux kernel v5.1.14. I have seen this happen on >>>> a number of boards I have here - but it is only occasional. It >>>> only happens once in a while on boot, maybe 1 in 40 or more times. >>>> So it can take quite a while to reproduce (using a boot loop setup). >>> >>> That's strange... I don't get what would produce such unstable issue. >> >> My initial guess is that the calculated timing is very marginal. > > What do you mean by "marginal"? > >> The problem seems more likely to happen if flash write activity >> had been occurring just before a soft reboot. Its not a guarantee, >> just more likely. > > That's really disturbing. I doubt this is the real cause though. > >> >> Interesting observation is that Michael was using Micron flash, >> and boards that I have with the problem also have Micron flash. >> Both a form of Micron MT29F2G08. >> >> I have similar boards, iMX6ull based, with different brands of >> NAND flash and I have not seen any problem on them. > > That's great to narrow down the root cause. Maybe these chips have > tighter timing constraints. > >> >> Regards >> Greg >> >> >> >>>> As per the email thread I pointed to above I looked at reverting >>>> those patches, but that was not at all easy given how much the gpmi >>>> driver code had moved. So instead I modified the code with this: >>>> >>>> --- a/linux/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c >>>> +++ b/linux/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c >>>> @@ -481,6 +481,7 @@ static void gpmi_nfc_compute_timings(struct gpmi_nand_data *this, >>>> void gpmi_nfc_apply_timings(struct gpmi_nand_data *this) >>>> { >>>> +#if 0 >>>> struct gpmi_nfc_hardware_timing *hw = &this->hw; >>>> struct resources *r = &this->resources; >>>> void __iomem *gpmi_regs = r->gpmi_regs; >>>> @@ -505,6 +512,7 @@ void gpmi_nfc_apply_timings(struct gpmi_nand_data *this) >>>> /* Wait for the DLL to settle. 
*/ >>>> udelay(dll_wait_time_us); >>>> +#endif >>>> } >>>> int gpmi_setup_data_interface(struct nand_chip *chip, int chipnr, >>>> >>>> So far after a couple of days of testing with this I no longer >>>> see the DMA timeout. >>>> >>>> Any thoughts? >>>> >>>> Regards >>>> Greg >>>> >>> >>> Thanks, >>> Miquèl >>> > > Thanks, > Miquèl >
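Since the failure shows up only about once in 40 boots, it may help to estimate how many clean boots are needed before concluding that a change really removed it. The sketch below is a rough back-of-the-envelope model, assuming each boot fails independently with a fixed probability; `boots_needed()` is a hypothetical helper, and the 1-in-40 rate is only the approximate figure quoted above.

```c
/*
 * Hypothetical helper: smallest number of consecutive clean boots n such
 * that (1 - p_fail)^n <= alpha, i.e. the chance of seeing n clean boots
 * by luck alone, with the bug still present, is at most alpha.
 * Returns -1 when either argument lies outside the open interval (0, 1).
 */
static int boots_needed(double p_fail, double alpha)
{
	double p_all_clean = 1.0;
	int n = 0;

	if (p_fail <= 0.0 || p_fail >= 1.0 || alpha <= 0.0 || alpha >= 1.0)
		return -1;

	while (p_all_clean > alpha) {
		p_all_clean *= 1.0 - p_fail;	/* one more clean boot */
		n++;
	}
	return n;
}
```

With a 1-in-40 failure rate and a 5% threshold this gives 119 clean boots, so under these assumptions a boot loop needs to run well past a hundred iterations before the absence of the timeout means much.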
Hi Miquel,

On 30/7/19 10:28 am, Greg Ungerer wrote:
> On 29/7/19 10:47 pm, Miquel Raynal wrote:
>> Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 22:33:56 +1000:
>>> On 29/7/19 6:36 pm, Miquel Raynal wrote:
>>>> Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 16:41:51 +1000:
[snip]
>>>>> nand: timing mode 5 not acknowledged by the NAND chip
>>>>
>>>> What is the final timing mode used? Most of us tested in mode 5 I
>>>> guess, maybe mode 4 is broken (don't know if this is the one used here,
>>>> neither why mode 5 is refused). Can you please try by limiting the mode
>>>> to 0, 1, 2... until, hopefully, we narrow down to the failing mode.
>>>
>>> Sure, how to do that?
>>
>> This loop [1] tries to configure each mode (5, 4, ...) until one
>> succeeds (default is 0: must always work). Please try to limit mode to
>> 0, 1, etc.
>>
>> Mode 0 should work.
>>
>> [1] https://elixir.bootlin.com/linux/v5.3-rc1/source/drivers/mtd/nand/raw/nand_base.c#L933
>
> The normal behavior - which usually works - has
> chip->onfi_timing_mode_default=5 here. So in other words on the first pass
> through this loop it is checking mode 5, and setting it as the default.
>
> I am running a test/reboot loop now waiting for failure to see
> if it is still using mode 5 in that case.
With this trace in place:

--- a/linux/drivers/mtd/nand/raw/nand_base.c
+++ b/linux/drivers/mtd/nand/raw/nand_base.c
@@ -910,6 +910,7 @@ static int nand_init_data_interface(struct nand_chip *chip)
 	}
 
 	for (mode = fls(modes) - 1; mode >= 0; mode--) {
+		printk("%s(%d): checking mode=%d\n", __FILE__, __LINE__, mode);
 		ret = onfi_fill_data_interface(chip, NAND_SDR_IFACE, mode);
 		if (ret)
 			continue;
@@ -923,10 +924,12 @@ static int nand_init_data_interface(struct nand_chip *chip)
 						 &chip->data_interface);
 		if (!ret) {
 			chip->onfi_timing_mode_default = mode;
+			printk("%s(%d): BREAKING AT mode=%d\n", __FILE__, __LINE__, mode);
 			break;
 		}
 	}
 
+	printk("%s(%d): chip->onfi_timing_mode_default=%d\n", __FILE__, __LINE__, chip->onfi_timing_mode_default);
 	return 0;
 }

First NAND failure gives this:

nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
nand: Micron MT29F2G08ABAEAWP
nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
gpmi-nand 1806000.gpmi-nand: use legacy bch geometry
drivers/mtd/nand/raw/nand_base.c(913): checking mode=5
drivers/mtd/nand/raw/nand_base.c(927): BREAKING AT mode=5
drivers/mtd/nand/raw/nand_base.c(932): chip->onfi_timing_mode_default=5
gpmi-nand 1806000.gpmi-nand: DMA timeout, last DMA
gpmi-nand 1806000.gpmi-nand: Show GPMI registers :
gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x20830002
gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x01c6800c
gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00010101
gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0xe0000000
gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x23023336
gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x000001ee
gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0xff000001
gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x00000100
gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x05020000
gpmi-nand 1806000.gpmi-nand: Show BCH registers :
gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x00000100
gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000010
gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0x030a2080
gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x083e2080
gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x070a4080
gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0x10da4080
gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x070a4080
gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x10da4080
gpmi-nand 1806000.gpmi-nand: offset 0x0e0 : 0x070a4080
gpmi-nand 1806000.gpmi-nand: offset 0x0f0 : 0x10da4080
gpmi-nand 1806000.gpmi-nand: offset 0x100 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x110 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x120 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x130 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x140 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x150 : 0x20484342
gpmi-nand 1806000.gpmi-nand: offset 0x160 : 0x01000000
gpmi-nand 1806000.gpmi-nand: offset 0x170 : 0x00000000
gpmi-nand 1806000.gpmi-nand: BCH Geometry :
GF length : 13
ECC Strength : 8
Page Size in Bytes : 2110
Metadata Size in Bytes : 10
ECC Chunk0 Size in Bytes: 512
ECC Chunkn Size in Bytes: 512
ECC Chunk Count : 4
Payload Size in Bytes : 2048
Auxiliary Size in Bytes: 16
Auxiliary Status Offset: 12
Block Mark Byte Offset : 1999
Block Mark Bit Offset : 0
gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -110
nand: timing mode 5 not acknowledged by the NAND chip
gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22

Regards
Greg
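One way to run the experiment requested earlier in the thread (limit the mode to 0, then 1, and so on) would be to clamp the mode bitmask before the loop walks it. The helper below is a hypothetical standalone sketch, not an existing kernel function: `cap_timing_modes()` and `max_mode` are made-up names, and it assumes, as the `fls(modes) - 1` expression in the trace patch suggests, that bit N of the mask stands for timing mode N.

```c
/*
 * Hypothetical helper: keep only timing modes 0..max_mode in 'modes'.
 * Assumes bit N of 'modes' means "mode N is supported".
 */
static unsigned int cap_timing_modes(unsigned int modes, int max_mode)
{
	if (max_mode < 0)
		return 0;	/* no mode allowed at all */
	if (max_mode >= 31)
		return modes;	/* nothing to clamp; avoids an oversized shift */

	return modes & ((1u << (max_mode + 1)) - 1);
}
```

Applied to a mask of 0x3f, a cap of 0 leaves only mode 0 (0x01) and a cap of 2 leaves modes 0 to 2 (0x07). Where exactly to hook this into the kernel loop is left open here; masking `modes` just before the `for` statement is an untested assumption.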
Hi Miquel,

On 30/7/19 10:41 am, Greg Ungerer wrote:
> On 30/7/19 10:28 am, Greg Ungerer wrote:
>> On 29/7/19 10:47 pm, Miquel Raynal wrote:
>>> Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 22:33:56 +1000:
>>>> On 29/7/19 6:36 pm, Miquel Raynal wrote:
>>>>> Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 16:41:51 +1000:
> [snip]
>>>>>> nand: timing mode 5 not acknowledged by the NAND chip
>>>>>
>>>>> What is the final timing mode used? Most of us tested in mode 5 I
>>>>> guess, maybe mode 4 is broken (don't know if this is the one used here,
>>>>> neither why mode 5 is refused). Can you please try by limiting the mode
>>>>> to 0, 1, 2... until, hopefully, we narrow down to the failing mode.
>>>>
>>>> Sure, how to do that?
>>>
>>> This loop [1] tries to configure each mode (5, 4, ...) until one
>>> succeeds (default is 0: must always work). Please try to limit mode to
>>> 0, 1, etc.
>>>
>>> Mode 0 should work.
>>>
>>> [1] https://elixir.bootlin.com/linux/v5.3-rc1/source/drivers/mtd/nand/raw/nand_base.c#L933
>>
>> The normal behavior - which usually works - has
>> chip->onfi_timing_mode_default=5 here. So in other words on the first pass
>> through this loop it is checking mode 5, and setting it as the default.
>>
>> I am running a test/reboot loop now waiting for failure to see
>> if it is still using mode 5 in that case.
>
> With this trace in place:
>
> [snip: debug patch and first-failure trace, quoted unchanged from the previous message]

Not sure if this is a useful data point... But I modified that
nand_init_data_interface() loop to start checking from data mode 4.
So now on every boot it defaults to mode 4. That has been running
most of the day, up to 900 boot cycles now, no failures.

Regards
Greg
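The experiment described above (walk the supported modes from fastest to slowest, optionally starting lower than the fastest one) can be sketched outside the kernel as follows. This is only an illustration: `fls_u32`, `pick_timing_mode`, the `setup_fn` callback and the `max_mode` cap are hypothetical names, not the nand_base.c code, and the real loop calls into the controller driver instead of a callback.

```c
/* Hypothetical stand-in for the controller accepting (0) or rejecting (non-zero) a mode. */
typedef int (*setup_fn)(int mode);

/* 1-based index of the highest set bit, 0 if no bit is set (same contract as the kernel's fls()). */
int fls_u32(unsigned int x)
{
	int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/*
 * Try modes from the fastest advertised one down to 0 and return the first
 * mode the callback accepts. max_mode caps the starting point, which is the
 * kind of limit used in the "start checking from mode 4" test.
 * Mode 0 is the fallback when nothing is accepted.
 */
int pick_timing_mode(unsigned int modes, int max_mode, setup_fn setup)
{
	int mode = fls_u32(modes) - 1;

	if (mode > max_mode)
		mode = max_mode;

	for (; mode >= 0; mode--) {
		if (setup(mode) == 0)
			return mode;
	}
	return 0;
}

/* Example callbacks: one accepts everything, one refuses modes above 3. */
int accept_all(int mode) { (void)mode; return 0; }
int reject_above_3(int mode) { return mode > 3 ? -1 : 0; }
```

Assuming a chip that advertises modes 0 to 5 (bitmask 0x3f), `pick_timing_mode(0x3f, 5, accept_all)` selects mode 5 as in the normal boot, while a cap of 4 selects mode 4 as in the 900-cycle test.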
Hi Greg,

Greg Ungerer <gerg@kernel.org> wrote on Tue, 30 Jul 2019 16:06:55 +1000:

> [snip]
> > > nand: timing mode 5 not acknowledged by the NAND chip
> > > gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
>
> Not sure if this is a useful data point... But I modified that
> nand_init_data_interface() loop to start checking from data mode 4.
> So now on every boot it defaults to mode 4. That has been running
> most of the day, up to 900 boot cycles now, no failures.

Ok so after having chatted quite a bit with Boris, it is very likely
that, for these chips, the timings in mode 5 are too tight. It could
fail the GET_FEATURES once in mode 5. Can you please dump every single
intermediate value in gpmi_nfc_compute_timings() (period, *_cycles,
use of half periods, tRP, sample delay, etc) as well as the content
of /sys/kernel/debug/clk/clk_summary (you'll need debugfs support
enabled and mounted).

Also, can you be sure that the NAND chip is powered with 3.3V?

Thanks,
Miquèl
On Tue, 30 Jul 2019 10:38:22 +0200 Miquel Raynal <miquel.raynal@bootlin.com> wrote:

> [snip]
> Ok so after having chatted quite a bit with Boris, it is very likely
> that, for these chips, the timings in mode 5 are too tight. It could
> fail the GET_FEATURES once in mode 5. Can you please dump every single
> intermediate value in gpmi_nfc_compute_timings() (period, *_cycles,
> use of half periods, tRP, sample delay, etc) as well as the content
> of /sys/kernel/debug/clk/clk_summary (you'll need debugfs support
> enabled and mounted).

Not sure the clk will stay at the rate it was set during the timing
selection. Can you also add a trace printing the result of
clk_get_rate(r->clock[0]) here [1]?

[1] https://elixir.bootlin.com/linux/v5.3-rc1/source/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c#L711
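One way the effective rate can differ from the requested one is integer division: a clock derived from a fixed parent through integer dividers can only hit rates of the form parent / n. The sketch below is an assumption-laden illustration, not the i.MX clock driver. `best_divided_rate` is a hypothetical helper, the "pick the highest rate not above the target" policy is assumed, and the divider limits are passed in as parameters because the real limits are hardware specific. The 396 MHz parent used in the usage note is the pll2_pfd2_396m rate reported in the clk_summary dump later in this thread.

```c
/*
 * Highest rate not exceeding `target` that can be reached from `parent`
 * through two cascaded integer dividers (pred, then podf).
 * Returns 0 if no divider combination gets at or below the target.
 */
unsigned long best_divided_rate(unsigned long parent, unsigned long target,
				unsigned int max_pred, unsigned int max_podf)
{
	unsigned long best = 0;

	for (unsigned int pred = 1; pred <= max_pred; pred++) {
		for (unsigned int podf = 1; podf <= max_podf; podf++) {
			unsigned long rate = parent / (pred * podf);

			if (rate <= target && rate > best)
				best = rate;
		}
	}
	return best;
}
```

With a 396 MHz parent and assumed limits of 8 and 64, a 100 MHz request yields 99 MHz (396 / 4), while a 22 MHz request is met exactly (396 / 18). Printing clk_get_rate() after the rate is set is what would reveal such a gap on real hardware.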
Hi Miquel, Boris,

On 30/7/19 6:38 pm, Miquel Raynal wrote:
> Greg Ungerer <gerg@kernel.org> wrote on Tue, 30 Jul 2019 16:06:55 +1000:
>> On 30/7/19 10:41 am, Greg Ungerer wrote:
>>> On 30/7/19 10:28 am, Greg Ungerer wrote:
>>>> On 29/7/19 10:47 pm, Miquel Raynal wrote:
>>>>> Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 22:33:56 +1000:
>>>>>> On 29/7/19 6:36 pm, Miquel Raynal wrote:
>>>>>>> Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 16:41:51 +1000:
>>> [snip]
>> Not sure if this is a useful data point... But I modified that
>> nand_init_data_interface() loop to start checking from data mode 4.
>> So now on every boot it defaults to mode 4. That has been running
>> most of the day, up to 900 boot cycles now, no failures.
>
> Ok so after having chatted quite a bit with Boris, it is very likely
> that, for these chips, the timings in mode 5 are too tight. It could
> fail the GET_FEATURES once in mode 5. Can you please dump every single
> intermediate value in gpmi_nfc_compute_timings() (period, *_cycles,
> use of half periods, tRP, sample delay, etc) as well as the content
> of /sys/kernel/debug/clk/clk_summary (you'll need debugfs support
> enabled and mounted).
>
> Also, can you be sure that the NAND chip is powered with 3.3V?

Yes, 3.3V NAND chip.

Using the attached patch I get the following trace:

...
drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(426): gpmi_nfc_compute_timings() sdr->tBERS_max=65535000000 sdr->tCCS_min=500000000 sdr->tPROG_max=65535000000 sdr->tR_max=200000000000000 sdr->tALH_min=20000 sdr->tADL_min=400000 sdr->tALS_min=50000 sdr->tAR_min=25000 sdr->tCEA_max=100000 sdr->tCEH_min=20000 sdr->tCH_min=20000 sdr->tCHZ_max=100000 sdr->tCLH_min=20000 sdr->tCLR_min=20000 sdr->tCLS_min=50000 sdr->tCOH_min=0 sdr->tCS_min=70000 sdr->tDH_min=20000 sdr->tDS_min=40000 sdr->tFEAT_max=1000000 sdr->tIR_min=10000 sdr->tITC_max=1000000 sdr->tRC_min=100000 sdr->tREA_max=40000 sdr->tREH_min=30000 sdr->tRHOH_min=0 sdr->tRHW_min=200000 sdr->tRHZ_max=200000 sdr->tRLOH_min=0 sdr->tRP_min=50000 sdr->tRR_min=40000 sdr->tRST_max=250000000000 sdr->tWB_max=200000 sdr->tWC_min=100000 sdr->tWH_min=30000 sdr->tWHR_min=120000 sdr->tWP_min=50000 sdr->tWW_min=100000 hw->clk_rate=22000000 wrn_dly_sel=0 period_ps=45454 addr_setup_cycles=2 data_setup_cycles=1 data_hold_cycles=1 busy_timeout_cycles=31302 hw->timing0=0x00020101 hw->timing1=0x60000000 dll_threshold_ps=12000 use_half_period=1 reference_period_ps=22727 tRP_ps=45454 sample_delay_ps=4294955664 sample_delay_factor=0 hw->ctrl1n=0x00000000 hw->ctrl1n=0x00000000 drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(547): gpmi_nfc_apply_timings() hw>clk_rate=22000000 clk_set_rate(r->clock[0], hw->clk_rate)=0 clk_get_rate(r->clock[0])=22000000 random: fast init done nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda nand: Micron MT29F2G08ABAEAWP nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64 drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(426): gpmi_nfc_compute_timings() sdr->tBERS_max=3000000000 sdr->tCCS_min=100000 sdr->tPROG_max=600000000 sdr->tR_max=25000000 sdr->tALH_min=20000 sdr->tADL_min=400000 sdr->tALS_min=50000 sdr->tAR_min=25000 sdr->tCEA_max=100000 sdr->tCEH_min=20000 sdr->tCH_min=20000 sdr->tCHZ_max=100000 sdr->tCLH_min=20000 sdr->tCLR_min=20000 sdr->tCLS_min=50000 sdr->tCOH_min=0 sdr->tCS_min=70000 
sdr->tDH_min=20000 sdr->tDS_min=40000 sdr->tFEAT_max=1000000 sdr->tIR_min=10000 sdr->tITC_max=1000000 sdr->tRC_min=100000 sdr->tREA_max=40000 sdr->tREH_min=30000 sdr->tRHOH_min=0 sdr->tRHW_min=200000 sdr->tRHZ_max=200000 sdr->tRLOH_min=0 sdr->tRP_min=50000 sdr->tRR_min=40000 sdr->tRST_max=250000000000 sdr->tWB_max=200000 sdr->tWC_min=100000 sdr->tWH_min=30000 sdr->tWHR_min=120000 sdr->tWP_min=50000 sdr->tWW_min=100000 hw->clk_rate=22000000 wrn_dly_sel=0 period_ps=45454 addr_setup_cycles=2 data_setup_cycles=1 data_hold_cycles=1 busy_timeout_cycles=555 hw->timing0=0x00020101 hw->timing1=0xb0000000 dll_threshold_ps=12000 use_half_period=1 reference_period_ps=22727 tRP_ps=45454 sample_delay_ps=4294955664 sample_delay_factor=0 hw->ctrl1n=0x00000000 hw->ctrl1n=0x00000000 drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(547): gpmi_nfc_apply_timings() hw>clk_rate=22000000 clk_set_rate(r->clock[0], hw->clk_rate)=0 clk_get_rate(r->clock[0])=22000000 gpmi-nand 1806000.gpmi-nand: use legacy bch geometry drivers/mtd/nand/raw/nand_base.c(913): checking mode=5 drivers/mtd/nand/raw/nand_base.c(927): BREAKING AT mode=5 drivers/mtd/nand/raw/nand_base.c(932): chip->onfi_timing_mode_default=5 drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(426): gpmi_nfc_compute_timings() sdr->tBERS_max=3000000000 sdr->tCCS_min=100000 sdr->tPROG_max=600000000 sdr->tR_max=25000000 sdr->tALH_min=5000 sdr->tADL_min=400000 sdr->tALS_min=10000 sdr->tAR_min=10000 sdr->tCEA_max=25000 sdr->tCEH_min=20000 sdr->tCH_min=5000 sdr->tCHZ_max=30000 sdr->tCLH_min=5000 sdr->tCLR_min=10000 sdr->tCLS_min=10000 sdr->tCOH_min=15000 sdr->tCS_min=15000 sdr->tDH_min=5000 sdr->tDS_min=7000 sdr->tFEAT_max=1000000 sdr->tIR_min=0 sdr->tITC_max=1000000 sdr->tRC_min=20000 sdr->tREA_max=16000 sdr->tREH_min=7000 sdr->tRHOH_min=15000 sdr->tRHW_min=100000 sdr->tRHZ_max=100000 sdr->tRLOH_min=5000 sdr->tRP_min=10000 sdr->tRR_min=20000 sdr->tRST_max=500000000 sdr->tWB_max=100000 sdr->tWC_min=20000 sdr->tWH_min=7000 sdr->tWHR_min=80000 
sdr->tWP_min=10000 sdr->tWW_min=100000 hw->clk_rate=100000000 wrn_dly_sel=3 period_ps=10000 addr_setup_cycles=1 data_setup_cycles=1 data_hold_cycles=1 busy_timeout_cycles=2510 hw->timing0=0x00010101 hw->timing1=0xe0000000 dll_threshold_ps=12000 use_half_period=0 reference_period_ps=10000 tRP_ps=10000 sample_delay_ps=80000 sample_delay_factor=8 hw->ctrl1n=0x00c00000 hw->ctrl1n=0x00c28000 drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(547): gpmi_nfc_apply_timings() hw>clk_rate=100000000 clk_set_rate(r->clock[0], hw->clk_rate)=0 clk_get_rate(r->clock[0])=99000000 Scanning device for bad blocks 5 fixed-partitions partitions found on MTD device gpmi-nand Creating 5 MTD partitions on "gpmi-nand": 0x000000000000-0x000000500000 : "u-boot" 0x000000500000-0x000000600000 : "u-boot-env" 0x000000600000-0x000000800000 : "log" 0x000000800000-0x000010000000 : "flash" 0x000000000000-0x000010000000 : "all" gpmi-nand 1806000.gpmi-nand: driver registered. ... And "cat /sys/kernel/debug/clk/clk_summary" gives: enable prepare protect duty clock count count count rate accuracy phase cycle --------------------------------------------------------------------------------------------- dummy 2 2 0 0 0 0 50000 cko2_sel 0 0 0 0 0 0 50000 cko2_podf 0 0 0 0 0 0 50000 cko2 0 0 0 0 0 0 50000 cko1_sel 0 0 0 0 0 0 50000 cko1_podf 0 0 0 0 0 0 50000 cko1 0 0 0 0 0 0 50000 cko 0 0 0 0 0 0 50000 usbphy2_gate 1 1 0 0 0 0 50000 usbphy1_gate 1 1 0 0 0 0 50000 ipp_di1 0 0 0 0 0 0 50000 ipp_di0 0 0 0 0 0 0 50000 osc 6 6 0 24000000 0 0 50000 perclk_sel 1 1 0 24000000 0 0 50000 perclk 3 3 0 24000000 0 0 50000 pwm7 0 0 0 24000000 0 0 50000 pwm6 0 0 0 24000000 0 0 50000 pwm5 0 0 0 24000000 0 0 50000 i2c4 0 0 0 24000000 0 0 50000 pwm8 0 0 0 24000000 0 0 50000 pwm4 0 0 0 24000000 0 0 50000 pwm3 0 0 0 24000000 0 0 50000 pwm2 0 0 0 24000000 0 0 50000 pwm1 0 0 0 24000000 0 0 50000 i2c3 0 0 0 24000000 0 0 50000 i2c2 1 1 0 24000000 0 0 50000 i2c1 0 0 0 24000000 0 0 50000 gpt1_serial 1 1 0 24000000 0 0 50000 gpt1_bus 1 1 0 
24000000 0 0 50000 epit2 0 0 0 24000000 0 0 50000 epit1 0 0 0 24000000 0 0 50000 gpt2_serial 0 0 0 24000000 0 0 50000 gpt2_bus 0 0 0 24000000 0 0 50000 periph_clk2_sel 0 0 0 24000000 0 0 50000 periph_clk2 0 0 0 24000000 0 0 50000 gpt_3m 0 0 0 3000000 0 0 50000 csi_sel 0 0 0 24000000 0 0 50000 csi_podf 0 0 0 24000000 0 0 50000 csi 0 0 0 24000000 0 0 50000 pll7 1 1 0 480000000 0 0 50000 pll7_bypass 1 1 0 480000000 0 0 50000 pll7_usb_host 1 1 0 480000000 0 0 50000 usbphy2 1 1 0 480000000 0 0 50000 pll6 1 1 0 500000000 0 0 50000 pll6_bypass 1 1 0 500000000 0 0 50000 pll6_enet 2 2 0 500000000 0 0 50000 enet_ptp_ref 1 1 0 25000000 0 0 50000 enet_ptp 1 1 0 25000000 0 0 50000 enet2_ref 0 0 0 50000000 0 0 50000 enet_ref_125m 0 0 0 50000000 0 0 50000 enet_ref 2 2 0 50000000 0 0 50000 pll5 0 0 0 296600000 0 0 50000 pll5_bypass 0 0 0 296600000 0 0 50000 pll5_video 0 0 0 296600000 0 0 50000 pll5_post_div 0 0 0 74150000 0 0 50000 pll5_video_div 0 0 0 74150000 0 0 50000 pll4 0 0 0 147456000 0 0 50000 pll4_bypass 0 0 0 147456000 0 0 50000 pll4_audio 0 0 0 147456000 0 0 50000 pll4_post_div 0 0 0 36864000 0 0 50000 pll4_audio_div 0 0 0 36864000 0 0 50000 pll3 1 1 0 480000000 0 0 50000 pll3_bypass 1 1 0 480000000 0 0 50000 pll3_usb_otg 2 2 0 480000000 0 0 50000 spdif_sel 0 0 0 480000000 0 0 50000 spdif_pred 0 0 0 240000000 0 0 50000 spdif_podf 0 0 0 30000000 0 0 50000 spdif 0 0 0 30000000 0 0 50000 esai_sel 0 0 0 480000000 0 0 50000 esai_pred 0 0 0 240000000 0 0 50000 esai_podf 0 0 0 30000000 0 0 50000 esai_extal 0 0 0 30000000 0 0 50000 qspi1_sel 0 0 0 480000000 0 0 50000 qspi1_podf 0 0 0 240000000 0 0 50000 qspi1 0 0 0 240000000 0 0 50000 ldb_di1_div_7 0 0 0 68571428 0 0 50000 ldb_di1 0 0 0 68571428 0 0 50000 ldb_di1_div_3_5 0 0 0 137142857 0 0 50000 periph2_clk2_sel 0 0 0 480000000 0 0 50000 periph2_clk2 0 0 0 480000000 0 0 50000 pll3_60m 0 0 0 60000000 0 0 50000 can_sel 0 0 0 60000000 0 0 50000 can_podf 0 0 0 30000000 0 0 50000 can2_serial 0 0 0 30000000 0 0 50000 can1_serial 0 0 
0 30000000 0 0 50000 ecspi_sel 0 0 0 60000000 0 0 50000 ecspi_podf 0 0 0 60000000 0 0 50000 ecspi4 0 0 0 60000000 0 0 50000 ecspi3 0 0 0 60000000 0 0 50000 ecspi2 0 0 0 60000000 0 0 50000 ecspi1 0 0 0 60000000 0 0 50000 pll3_80m 1 1 0 80000000 0 0 50000 uart_sel 1 1 0 80000000 0 0 50000 uart_podf 1 1 0 80000000 0 0 50000 uart8_serial 0 0 0 80000000 0 0 50000 uart7_serial 0 0 0 80000000 0 0 50000 uart1_serial 1 2 0 80000000 0 0 50000 uart6_serial 0 0 0 80000000 0 0 50000 uart5_serial 0 0 0 80000000 0 0 50000 uart4_serial 0 0 0 80000000 0 0 50000 uart3_serial 0 0 0 80000000 0 0 50000 uart2_serial 0 0 0 80000000 0 0 50000 pll3_pfd3_454m 0 0 0 454736842 0 0 50000 pll3_pfd2_508m 0 0 0 508235294 0 0 50000 epdc_pre_sel 0 0 0 508235294 0 0 50000 epdc_podf 0 0 0 254117647 0 0 50000 epdc_pix 0 0 0 254117647 0 0 50000 epdc_sel 0 0 0 254117647 0 0 50000 sai1_sel 0 0 0 508235294 0 0 50000 sai1_pred 0 0 0 127058824 0 0 50000 sai1_podf 0 0 0 63529412 0 0 50000 sai1 0 0 0 63529412 0 0 50000 sai2_sel 0 0 0 508235294 0 0 50000 sai2_pred 0 0 0 127058824 0 0 50000 sai2_podf 0 0 0 63529412 0 0 50000 sai2 0 0 0 63529412 0 0 50000 sai3_sel 0 0 0 508235294 0 0 50000 sai3_pred 0 0 0 127058824 0 0 50000 sai3_podf 0 0 0 63529412 0 0 50000 sai3 0 0 0 63529412 0 0 50000 pll3_pfd1_540m 0 0 0 540000000 0 0 50000 lcdif_pre_sel 0 0 0 540000000 0 0 50000 lcdif_pred 0 0 0 270000000 0 0 50000 lcdif_podf 0 0 0 135000000 0 0 50000 lcdif_pix 0 0 0 135000000 0 0 50000 iomuxc 0 0 0 135000000 0 0 50000 lcdif_sel 0 0 0 135000000 0 0 50000 pll3_pfd0_720m 0 0 0 720000000 0 0 50000 usbphy1 1 1 0 480000000 0 0 50000 pll2 1 1 0 528000000 0 0 50000 pll2_bypass 1 1 0 528000000 0 0 50000 pll2_bus 2 2 0 528000000 0 0 50000 ca7_secondary_sel 0 0 0 528000000 0 0 50000 step 0 0 0 528000000 0 0 50000 periph_pre 1 1 0 528000000 0 0 50000 periph 3 3 0 528000000 0 0 50000 ahb 7 7 0 132000000 0 0 50000 sdma 0 0 0 132000000 0 0 50000 rom 1 1 0 132000000 0 0 50000 esai_mem 0 0 0 132000000 0 0 50000 esai_ipg 0 0 0 132000000 0 
0 50000 aips_tz3 1 1 0 132000000 0 0 50000 enet_ahb 2 2 0 132000000 0 0 50000 dcp 0 0 0 132000000 0 0 50000 asrc_mem 0 0 0 132000000 0 0 50000 asrc_ipg 0 0 0 132000000 0 0 50000 aips_tz2 1 1 0 132000000 0 0 50000 aips_tz1 1 1 0 132000000 0 0 50000 ipg 10 10 0 66000000 0 0 50000 wdog3 0 0 0 66000000 0 0 50000 uart8_ipg 0 0 0 66000000 0 0 50000 usboh3 2 2 0 66000000 0 0 50000 sai2_ipg 0 0 0 66000000 0 0 50000 sai1_ipg 0 0 0 66000000 0 0 50000 uart7_ipg 0 0 0 66000000 0 0 50000 uart1_ipg 1 2 0 66000000 0 0 50000 sai3_ipg 0 0 0 66000000 0 0 50000 spdif_gclk 0 0 0 66000000 0 0 50000 spba 0 0 0 66000000 0 0 50000 wdog2 0 0 0 66000000 0 0 50000 kpp 0 0 0 66000000 0 0 50000 mmdc_p1_ipg 0 0 0 66000000 0 0 50000 mmdc_p0_ipg 2 2 0 66000000 0 0 50000 wdog1 1 1 0 66000000 0 0 50000 gpio4 1 1 0 66000000 0 0 50000 uart6_ipg 0 0 0 66000000 0 0 50000 uart5_ipg 0 0 0 66000000 0 0 50000 gpio3 1 1 0 66000000 0 0 50000 ocotp 0 0 0 66000000 0 0 50000 gpio5 1 1 0 66000000 0 0 50000 gpio1 1 1 0 66000000 0 0 50000 uart4_ipg 0 0 0 66000000 0 0 50000 adc1 0 0 0 66000000 0 0 50000 uart3_ipg 0 0 0 66000000 0 0 50000 adc2 0 0 0 66000000 0 0 50000 gpio2 1 1 0 66000000 0 0 50000 uart2_ipg 0 0 0 66000000 0 0 50000 can2_ipg 0 0 0 66000000 0 0 50000 can1_ipg 0 0 0 66000000 0 0 50000 enet 2 2 0 66000000 0 0 50000 axi_sel 1 1 0 528000000 0 0 50000 axi_podf 2 2 0 264000000 0 0 50000 axi 1 1 0 264000000 0 0 50000 eim_slow_sel 0 0 0 264000000 0 0 50000 eim_slow_podf 0 0 0 132000000 0 0 50000 eim 0 0 0 132000000 0 0 50000 lcdif_apb 0 0 0 264000000 0 0 50000 pxp 0 0 0 264000000 0 0 50000 epdc_aclk 0 0 0 264000000 0 0 50000 pll2_pfd3_594m 0 0 0 594000000 0 0 50000 ldb_di0_sel 0 0 0 594000000 0 0 50000 ldb_di0_div_7 0 0 0 84857142 0 0 50000 ldb_di0 0 0 0 84857142 0 0 50000 ldb_di0_div_3_5 0 0 0 169714285 0 0 50000 pll2_pfd2_396m 2 2 0 396000000 0 0 50000 enfc_sel 0 0 0 396000000 0 0 50000 enfc_pred 0 0 0 99000000 0 0 50000 enfc_podf 0 0 0 99000000 0 0 50000 gpmi_io 0 0 0 99000000 0 0 50000 usdhc1_sel 0 0 0 
396000000 0 0 50000
usdhc1_podf 0 0 0 198000000 0 0 50000
usdhc1 0 0 0 198000000 0 0 50000
usdhc2_sel 0 0 0 396000000 0 0 50000
usdhc2_podf 0 0 0 198000000 0 0 50000
usdhc2 0 0 0 198000000 0 0 50000
bch_sel 1 1 0 396000000 0 0 50000
bch_podf 1 1 0 99000000 0 0 50000
gpmi_apb 0 0 0 99000000 0 0 50000
gpmi_bch_apb 0 0 0 99000000 0 0 50000
per_bch 0 0 0 99000000 0 0 50000
apbh_dma 1 1 0 99000000 0 0 50000
gpmi_sel 0 0 0 396000000 0 0 50000
gpmi_podf 0 0 0 99000000 0 0 50000
gpmi_bch 0 0 0 99000000 0 0 50000
periph2_pre 1 1 0 396000000 0 0 50000
periph2 2 2 0 396000000 0 0 50000
mmdc_podf 2 2 0 396000000 0 0 50000
mmdc_p0_fast 1 1 0 396000000 0 0 50000
axi_alt_sel 0 0 0 396000000 0 0 50000
pll2_198m 0 0 0 198000000 0 0 50000
pll2_pfd1_594m 0 0 0 594000000 0 0 50000
pll2_pfd0_352m 0 0 0 352000000 0 0 50000
pll1 1 1 0 900000000 0 0 50000
pll1_bypass 1 1 0 900000000 0 0 50000
pll1_sys 1 1 0 900000000 0 0 50000
pll1_sw 1 1 0 900000000 0 0 50000
arm 1 1 0 900000000 0 0 50000
pll7_bypass_src 0 0 0 24000000 0 0 50000
pll6_bypass_src 0 0 0 24000000 0 0 50000
pll5_bypass_src 0 0 0 24000000 0 0 50000
pll4_bypass_src 0 0 0 24000000 0 0 50000
pll3_bypass_src 0 0 0 24000000 0 0 50000
pll2_bypass_src 0 0 0 24000000 0 0 50000
pll1_bypass_src 0 0 0 24000000 0 0 50000
ckil 0 0 0 32768 0 0 50000

Note that this was generated on a normal boot up (not failure).
Running boot testing now, waiting for a failure.

Regards
Greg
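For anyone cross-checking the gpmi_nfc_compute_timings() numbers in this thread, here is a minimal standalone sketch that recomputes a few of the printed values. The formulas are assumptions inferred from the logged numbers (period = 10^12 / clk_rate, cycles rounded up, sample delay = (tREA + 4000 - tRP) * 8); they are not copied from the driver. The names to_cycles(), struct timing_check and recompute() are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Hypothetical helper (not a driver function): round ps up to whole cycles. */
static uint32_t to_cycles(uint32_t ps, uint32_t period_ps)
{
	return (ps + period_ps - 1) / period_ps;
}

/* Recomputed values; field names mirror the labels printed in the trace. */
struct timing_check {
	uint32_t period_ps;
	uint32_t addr_setup_cycles;
	uint32_t data_setup_cycles;
	uint32_t data_hold_cycles;
	int use_half_period;
	uint32_t reference_period_ps;
	uint32_t tRP_ps;
	int32_t sample_delay_ps;	/* signed on purpose, see note below */
};

/* All timing inputs are in picoseconds, clk_rate is in Hz. */
static struct timing_check recompute(uint32_t clk_rate, uint32_t tALS_min,
				     uint32_t tDS_min, uint32_t tDH_min,
				     uint32_t tREA_max,
				     uint32_t dll_threshold_ps)
{
	struct timing_check t;

	/* 10^12 ps per second divided by the clock rate. */
	t.period_ps = (uint32_t)(1000000000000ULL / clk_rate);

	t.addr_setup_cycles = to_cycles(tALS_min, t.period_ps);
	t.data_setup_cycles = to_cycles(tDS_min, t.period_ps);
	t.data_hold_cycles = to_cycles(tDH_min, t.period_ps);

	/* Assumed rule: use half the period as reference above the threshold. */
	t.use_half_period = t.period_ps > dll_threshold_ps;
	t.reference_period_ps = t.use_half_period ? t.period_ps / 2
						  : t.period_ps;

	t.tRP_ps = t.data_setup_cycles * t.period_ps;

	/* Assumed formula; goes negative when tRP exceeds tREA + 4000. */
	t.sample_delay_ps = ((int32_t)tREA_max + 4000 - (int32_t)t.tRP_ps) * 8;

	return t;
}
```

Feeding in the 22 MHz inputs from the trace (tALS_min=50000, tDS_min=40000, tDH_min=20000, tREA_max=40000, dll_threshold_ps=12000) gives period_ps=45454 and a signed sample delay of -11632, which is 4294955664 when read as an unsigned 32-bit number. So the large sample_delay_ps in the dump appears to be a negative value printed as unsigned, which would be consistent with sample_delay_factor=0. The mode 5 inputs (100 MHz, tREA_max=16000) give period_ps=10000 and sample_delay_ps=80000, matching the trace.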
On Wed, 31 Jul 2019 12:05:44 +1000
Greg Ungerer <gerg@kernel.org> wrote:

> Hi Miquel, Boris,
>
> On 30/7/19 6:38 pm, Miquel Raynal wrote:
> > Greg Ungerer <gerg@kernel.org> wrote on Tue, 30 Jul 2019 16:06:55 +1000:
> >> On 30/7/19 10:41 am, Greg Ungerer wrote:
> >>> On 30/7/19 10:28 am, Greg Ungerer wrote:
> >>>> On 29/7/19 10:47 pm, Miquel Raynal wrote:
> >>>>> Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 22:33:56 +1000:
> >>>>>> On 29/7/19 6:36 pm, Miquel Raynal wrote:
> >>>>>>> Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 16:41:51 +1000:
> >>> [snip]
> >> Not sure if this is a useful data point... But I modified that
> >> nand_init_data_interface() loop to start checking from data mode 4.
> >> So now on every boot it defaults to mode 4. That has been running
> >> most of the day, up to 900 boot cycles now, no failures.
> >
> > Ok so after having chatted quite a bit with Boris, it is very likely
> > that, for these chips, the timings in mode 5 are too tight. It could
> > fail the GET_FEATURES once in mode 5. Can you please dump every single
> > intermediate value in gpmi_nfc_compute_timings() (period, *_cycles,
> > use of half periods, tRP, sample delay, etc) as well as the content
> > of /sys/kernel/debug/clk/clk_summary (you'll need debugfs support
> > enabled and mounted).
> >
> > Also, can you be sure that the NAND chip is powered with 3.3V?
>
> Yes, 3.3V NAND chip.
>
> Using the attached patch I get the following trace:
>
> ...
> drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(426): gpmi_nfc_compute_timings() > sdr->tBERS_max=65535000000 > sdr->tCCS_min=500000000 > sdr->tPROG_max=65535000000 > sdr->tR_max=200000000000000 > sdr->tALH_min=20000 > sdr->tADL_min=400000 > sdr->tALS_min=50000 > sdr->tAR_min=25000 > sdr->tCEA_max=100000 > sdr->tCEH_min=20000 > sdr->tCH_min=20000 > sdr->tCHZ_max=100000 > sdr->tCLH_min=20000 > sdr->tCLR_min=20000 > sdr->tCLS_min=50000 > sdr->tCOH_min=0 > sdr->tCS_min=70000 > sdr->tDH_min=20000 > sdr->tDS_min=40000 > sdr->tFEAT_max=1000000 > sdr->tIR_min=10000 > sdr->tITC_max=1000000 > sdr->tRC_min=100000 > sdr->tREA_max=40000 > sdr->tREH_min=30000 > sdr->tRHOH_min=0 > sdr->tRHW_min=200000 > sdr->tRHZ_max=200000 > sdr->tRLOH_min=0 > sdr->tRP_min=50000 > sdr->tRR_min=40000 > sdr->tRST_max=250000000000 > sdr->tWB_max=200000 > sdr->tWC_min=100000 > sdr->tWH_min=30000 > sdr->tWHR_min=120000 > sdr->tWP_min=50000 > sdr->tWW_min=100000 > hw->clk_rate=22000000 > wrn_dly_sel=0 > period_ps=45454 > addr_setup_cycles=2 > data_setup_cycles=1 > data_hold_cycles=1 > busy_timeout_cycles=31302 > hw->timing0=0x00020101 > hw->timing1=0x60000000 > dll_threshold_ps=12000 > use_half_period=1 > reference_period_ps=22727 > tRP_ps=45454 > sample_delay_ps=4294955664 > sample_delay_factor=0 > hw->ctrl1n=0x00000000 > hw->ctrl1n=0x00000000 > drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(547): gpmi_nfc_apply_timings() > hw>clk_rate=22000000 > clk_set_rate(r->clock[0], hw->clk_rate)=0 > clk_get_rate(r->clock[0])=22000000 > random: fast init done > nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda > nand: Micron MT29F2G08ABAEAWP > nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64 > drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(426): gpmi_nfc_compute_timings() > sdr->tBERS_max=3000000000 > sdr->tCCS_min=100000 > sdr->tPROG_max=600000000 > sdr->tR_max=25000000 > sdr->tALH_min=20000 > sdr->tADL_min=400000 > sdr->tALS_min=50000 > sdr->tAR_min=25000 > sdr->tCEA_max=100000 > 
sdr->tCEH_min=20000 > sdr->tCH_min=20000 > sdr->tCHZ_max=100000 > sdr->tCLH_min=20000 > sdr->tCLR_min=20000 > sdr->tCLS_min=50000 > sdr->tCOH_min=0 > sdr->tCS_min=70000 > sdr->tDH_min=20000 > sdr->tDS_min=40000 > sdr->tFEAT_max=1000000 > sdr->tIR_min=10000 > sdr->tITC_max=1000000 > sdr->tRC_min=100000 > sdr->tREA_max=40000 > sdr->tREH_min=30000 > sdr->tRHOH_min=0 > sdr->tRHW_min=200000 > sdr->tRHZ_max=200000 > sdr->tRLOH_min=0 > sdr->tRP_min=50000 > sdr->tRR_min=40000 > sdr->tRST_max=250000000000 > sdr->tWB_max=200000 > sdr->tWC_min=100000 > sdr->tWH_min=30000 > sdr->tWHR_min=120000 > sdr->tWP_min=50000 > sdr->tWW_min=100000 > hw->clk_rate=22000000 > wrn_dly_sel=0 > period_ps=45454 > addr_setup_cycles=2 > data_setup_cycles=1 > data_hold_cycles=1 > busy_timeout_cycles=555 > hw->timing0=0x00020101 > hw->timing1=0xb0000000 > dll_threshold_ps=12000 > use_half_period=1 > reference_period_ps=22727 > tRP_ps=45454 > sample_delay_ps=4294955664 > sample_delay_factor=0 > hw->ctrl1n=0x00000000 > hw->ctrl1n=0x00000000 > drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(547): gpmi_nfc_apply_timings() > hw>clk_rate=22000000 > clk_set_rate(r->clock[0], hw->clk_rate)=0 > clk_get_rate(r->clock[0])=22000000 > gpmi-nand 1806000.gpmi-nand: use legacy bch geometry > drivers/mtd/nand/raw/nand_base.c(913): checking mode=5 > drivers/mtd/nand/raw/nand_base.c(927): BREAKING AT mode=5 > drivers/mtd/nand/raw/nand_base.c(932): chip->onfi_timing_mode_default=5 > drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(426): gpmi_nfc_compute_timings() > sdr->tBERS_max=3000000000 > sdr->tCCS_min=100000 > sdr->tPROG_max=600000000 > sdr->tR_max=25000000 > sdr->tALH_min=5000 > sdr->tADL_min=400000 > sdr->tALS_min=10000 > sdr->tAR_min=10000 > sdr->tCEA_max=25000 > sdr->tCEH_min=20000 > sdr->tCH_min=5000 > sdr->tCHZ_max=30000 > sdr->tCLH_min=5000 > sdr->tCLR_min=10000 > sdr->tCLS_min=10000 > sdr->tCOH_min=15000 > sdr->tCS_min=15000 > sdr->tDH_min=5000 > sdr->tDS_min=7000 > sdr->tFEAT_max=1000000 > sdr->tIR_min=0 > 
sdr->tITC_max=1000000 > sdr->tRC_min=20000 > sdr->tREA_max=16000 > sdr->tREH_min=7000 > sdr->tRHOH_min=15000 > sdr->tRHW_min=100000 > sdr->tRHZ_max=100000 > sdr->tRLOH_min=5000 > sdr->tRP_min=10000 > sdr->tRR_min=20000 > sdr->tRST_max=500000000 > sdr->tWB_max=100000 > sdr->tWC_min=20000 > sdr->tWH_min=7000 > sdr->tWHR_min=80000 > sdr->tWP_min=10000 > sdr->tWW_min=100000 > hw->clk_rate=100000000 > wrn_dly_sel=3 > period_ps=10000 > addr_setup_cycles=1 > data_setup_cycles=1 > data_hold_cycles=1 > busy_timeout_cycles=2510 > hw->timing0=0x00010101 > hw->timing1=0xe0000000 > dll_threshold_ps=12000 > use_half_period=0 > reference_period_ps=10000 > tRP_ps=10000 > sample_delay_ps=80000 > sample_delay_factor=8 > hw->ctrl1n=0x00c00000 > hw->ctrl1n=0x00c28000 > drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(547): gpmi_nfc_apply_timings() > hw>clk_rate=100000000 > clk_set_rate(r->clock[0], hw->clk_rate)=0 > clk_get_rate(r->clock[0])=99000000 > Scanning device for bad blocks > 5 fixed-partitions partitions found on MTD device gpmi-nand > Creating 5 MTD partitions on "gpmi-nand": > 0x000000000000-0x000000500000 : "u-boot" > 0x000000500000-0x000000600000 : "u-boot-env" > 0x000000600000-0x000000800000 : "log" > 0x000000800000-0x000010000000 : "flash" > 0x000000000000-0x000010000000 : "all" > gpmi-nand 1806000.gpmi-nand: driver registered. > ... 
> > > And "cat /sys/kernel/debug/clk/clk_summary" gives: > > enable prepare protect duty > clock count count count rate accuracy phase cycle > --------------------------------------------------------------------------------------------- > dummy 2 2 0 0 0 0 50000 > cko2_sel 0 0 0 0 0 0 50000 > cko2_podf 0 0 0 0 0 0 50000 > cko2 0 0 0 0 0 0 50000 > cko1_sel 0 0 0 0 0 0 50000 > cko1_podf 0 0 0 0 0 0 50000 > cko1 0 0 0 0 0 0 50000 > cko 0 0 0 0 0 0 50000 > usbphy2_gate 1 1 0 0 0 0 50000 > usbphy1_gate 1 1 0 0 0 0 50000 > ipp_di1 0 0 0 0 0 0 50000 > ipp_di0 0 0 0 0 0 0 50000 > osc 6 6 0 24000000 0 0 50000 > perclk_sel 1 1 0 24000000 0 0 50000 > perclk 3 3 0 24000000 0 0 50000 > pwm7 0 0 0 24000000 0 0 50000 > pwm6 0 0 0 24000000 0 0 50000 > pwm5 0 0 0 24000000 0 0 50000 > i2c4 0 0 0 24000000 0 0 50000 > pwm8 0 0 0 24000000 0 0 50000 > pwm4 0 0 0 24000000 0 0 50000 > pwm3 0 0 0 24000000 0 0 50000 > pwm2 0 0 0 24000000 0 0 50000 > pwm1 0 0 0 24000000 0 0 50000 > i2c3 0 0 0 24000000 0 0 50000 > i2c2 1 1 0 24000000 0 0 50000 > i2c1 0 0 0 24000000 0 0 50000 > gpt1_serial 1 1 0 24000000 0 0 50000 > gpt1_bus 1 1 0 24000000 0 0 50000 > epit2 0 0 0 24000000 0 0 50000 > epit1 0 0 0 24000000 0 0 50000 > gpt2_serial 0 0 0 24000000 0 0 50000 > gpt2_bus 0 0 0 24000000 0 0 50000 > periph_clk2_sel 0 0 0 24000000 0 0 50000 > periph_clk2 0 0 0 24000000 0 0 50000 > gpt_3m 0 0 0 3000000 0 0 50000 > csi_sel 0 0 0 24000000 0 0 50000 > csi_podf 0 0 0 24000000 0 0 50000 > csi 0 0 0 24000000 0 0 50000 > pll7 1 1 0 480000000 0 0 50000 > pll7_bypass 1 1 0 480000000 0 0 50000 > pll7_usb_host 1 1 0 480000000 0 0 50000 > usbphy2 1 1 0 480000000 0 0 50000 > pll6 1 1 0 500000000 0 0 50000 > pll6_bypass 1 1 0 500000000 0 0 50000 > pll6_enet 2 2 0 500000000 0 0 50000 > enet_ptp_ref 1 1 0 25000000 0 0 50000 > enet_ptp 1 1 0 25000000 0 0 50000 > enet2_ref 0 0 0 50000000 0 0 50000 > enet_ref_125m 0 0 0 50000000 0 0 50000 > enet_ref 2 2 0 50000000 0 0 50000 > pll5 0 0 0 296600000 0 0 50000 > pll5_bypass 0 0 
0 296600000 0 0 50000 > pll5_video 0 0 0 296600000 0 0 50000 > pll5_post_div 0 0 0 74150000 0 0 50000 > pll5_video_div 0 0 0 74150000 0 0 50000 > pll4 0 0 0 147456000 0 0 50000 > pll4_bypass 0 0 0 147456000 0 0 50000 > pll4_audio 0 0 0 147456000 0 0 50000 > pll4_post_div 0 0 0 36864000 0 0 50000 > pll4_audio_div 0 0 0 36864000 0 0 50000 > pll3 1 1 0 480000000 0 0 50000 > pll3_bypass 1 1 0 480000000 0 0 50000 > pll3_usb_otg 2 2 0 480000000 0 0 50000 > spdif_sel 0 0 0 480000000 0 0 50000 > spdif_pred 0 0 0 240000000 0 0 50000 > spdif_podf 0 0 0 30000000 0 0 50000 > spdif 0 0 0 30000000 0 0 50000 > esai_sel 0 0 0 480000000 0 0 50000 > esai_pred 0 0 0 240000000 0 0 50000 > esai_podf 0 0 0 30000000 0 0 50000 > esai_extal 0 0 0 30000000 0 0 50000 > qspi1_sel 0 0 0 480000000 0 0 50000 > qspi1_podf 0 0 0 240000000 0 0 50000 > qspi1 0 0 0 240000000 0 0 50000 > ldb_di1_div_7 0 0 0 68571428 0 0 50000 > ldb_di1 0 0 0 68571428 0 0 50000 > ldb_di1_div_3_5 0 0 0 137142857 0 0 50000 > periph2_clk2_sel 0 0 0 480000000 0 0 50000 > periph2_clk2 0 0 0 480000000 0 0 50000 > pll3_60m 0 0 0 60000000 0 0 50000 > can_sel 0 0 0 60000000 0 0 50000 > can_podf 0 0 0 30000000 0 0 50000 > can2_serial 0 0 0 30000000 0 0 50000 > can1_serial 0 0 0 30000000 0 0 50000 > ecspi_sel 0 0 0 60000000 0 0 50000 > ecspi_podf 0 0 0 60000000 0 0 50000 > ecspi4 0 0 0 60000000 0 0 50000 > ecspi3 0 0 0 60000000 0 0 50000 > ecspi2 0 0 0 60000000 0 0 50000 > ecspi1 0 0 0 60000000 0 0 50000 > pll3_80m 1 1 0 80000000 0 0 50000 > uart_sel 1 1 0 80000000 0 0 50000 > uart_podf 1 1 0 80000000 0 0 50000 > uart8_serial 0 0 0 80000000 0 0 50000 > uart7_serial 0 0 0 80000000 0 0 50000 > uart1_serial 1 2 0 80000000 0 0 50000 > uart6_serial 0 0 0 80000000 0 0 50000 > uart5_serial 0 0 0 80000000 0 0 50000 > uart4_serial 0 0 0 80000000 0 0 50000 > uart3_serial 0 0 0 80000000 0 0 50000 > uart2_serial 0 0 0 80000000 0 0 50000 > pll3_pfd3_454m 0 0 0 454736842 0 0 50000 > pll3_pfd2_508m 0 0 0 508235294 0 0 50000 > epdc_pre_sel 0 0 0 
508235294 0 0 50000 > epdc_podf 0 0 0 254117647 0 0 50000 > epdc_pix 0 0 0 254117647 0 0 50000 > epdc_sel 0 0 0 254117647 0 0 50000 > sai1_sel 0 0 0 508235294 0 0 50000 > sai1_pred 0 0 0 127058824 0 0 50000 > sai1_podf 0 0 0 63529412 0 0 50000 > sai1 0 0 0 63529412 0 0 50000 > sai2_sel 0 0 0 508235294 0 0 50000 > sai2_pred 0 0 0 127058824 0 0 50000 > sai2_podf 0 0 0 63529412 0 0 50000 > sai2 0 0 0 63529412 0 0 50000 > sai3_sel 0 0 0 508235294 0 0 50000 > sai3_pred 0 0 0 127058824 0 0 50000 > sai3_podf 0 0 0 63529412 0 0 50000 > sai3 0 0 0 63529412 0 0 50000 > pll3_pfd1_540m 0 0 0 540000000 0 0 50000 > lcdif_pre_sel 0 0 0 540000000 0 0 50000 > lcdif_pred 0 0 0 270000000 0 0 50000 > lcdif_podf 0 0 0 135000000 0 0 50000 > lcdif_pix 0 0 0 135000000 0 0 50000 > iomuxc 0 0 0 135000000 0 0 50000 > lcdif_sel 0 0 0 135000000 0 0 50000 > pll3_pfd0_720m 0 0 0 720000000 0 0 50000 > usbphy1 1 1 0 480000000 0 0 50000 > pll2 1 1 0 528000000 0 0 50000 > pll2_bypass 1 1 0 528000000 0 0 50000 > pll2_bus 2 2 0 528000000 0 0 50000 > ca7_secondary_sel 0 0 0 528000000 0 0 50000 > step 0 0 0 528000000 0 0 50000 > periph_pre 1 1 0 528000000 0 0 50000 > periph 3 3 0 528000000 0 0 50000 > ahb 7 7 0 132000000 0 0 50000 > sdma 0 0 0 132000000 0 0 50000 > rom 1 1 0 132000000 0 0 50000 > esai_mem 0 0 0 132000000 0 0 50000 > esai_ipg 0 0 0 132000000 0 0 50000 > aips_tz3 1 1 0 132000000 0 0 50000 > enet_ahb 2 2 0 132000000 0 0 50000 > dcp 0 0 0 132000000 0 0 50000 > asrc_mem 0 0 0 132000000 0 0 50000 > asrc_ipg 0 0 0 132000000 0 0 50000 > aips_tz2 1 1 0 132000000 0 0 50000 > aips_tz1 1 1 0 132000000 0 0 50000 > ipg 10 10 0 66000000 0 0 50000 > wdog3 0 0 0 66000000 0 0 50000 > uart8_ipg 0 0 0 66000000 0 0 50000 > usboh3 2 2 0 66000000 0 0 50000 > sai2_ipg 0 0 0 66000000 0 0 50000 > sai1_ipg 0 0 0 66000000 0 0 50000 > uart7_ipg 0 0 0 66000000 0 0 50000 > uart1_ipg 1 2 0 66000000 0 0 50000 > sai3_ipg 0 0 0 66000000 0 0 50000 > spdif_gclk 0 0 0 66000000 0 0 50000 > spba 0 0 0 66000000 0 0 50000 > 
wdog2 0 0 0 66000000 0 0 50000 > kpp 0 0 0 66000000 0 0 50000 > mmdc_p1_ipg 0 0 0 66000000 0 0 50000 > mmdc_p0_ipg 2 2 0 66000000 0 0 50000 > wdog1 1 1 0 66000000 0 0 50000 > gpio4 1 1 0 66000000 0 0 50000 > uart6_ipg 0 0 0 66000000 0 0 50000 > uart5_ipg 0 0 0 66000000 0 0 50000 > gpio3 1 1 0 66000000 0 0 50000 > ocotp 0 0 0 66000000 0 0 50000 > gpio5 1 1 0 66000000 0 0 50000 > gpio1 1 1 0 66000000 0 0 50000 > uart4_ipg 0 0 0 66000000 0 0 50000 > adc1 0 0 0 66000000 0 0 50000 > uart3_ipg 0 0 0 66000000 0 0 50000 > adc2 0 0 0 66000000 0 0 50000 > gpio2 1 1 0 66000000 0 0 50000 > uart2_ipg 0 0 0 66000000 0 0 50000 > can2_ipg 0 0 0 66000000 0 0 50000 > can1_ipg 0 0 0 66000000 0 0 50000 > enet 2 2 0 66000000 0 0 50000 > axi_sel 1 1 0 528000000 0 0 50000 > axi_podf 2 2 0 264000000 0 0 50000 > axi 1 1 0 264000000 0 0 50000 > eim_slow_sel 0 0 0 264000000 0 0 50000 > eim_slow_podf 0 0 0 132000000 0 0 50000 > eim 0 0 0 132000000 0 0 50000 > lcdif_apb 0 0 0 264000000 0 0 50000 > pxp 0 0 0 264000000 0 0 50000 > epdc_aclk 0 0 0 264000000 0 0 50000 > pll2_pfd3_594m 0 0 0 594000000 0 0 50000 > ldb_di0_sel 0 0 0 594000000 0 0 50000 > ldb_di0_div_7 0 0 0 84857142 0 0 50000 > ldb_di0 0 0 0 84857142 0 0 50000 > ldb_di0_div_3_5 0 0 0 169714285 0 0 50000 > pll2_pfd2_396m 2 2 0 396000000 0 0 50000 > enfc_sel 0 0 0 396000000 0 0 50000 > enfc_pred 0 0 0 99000000 0 0 50000 > enfc_podf 0 0 0 99000000 0 0 50000 > gpmi_io 0 0 0 99000000 0 0 50000 > usdhc1_sel 0 0 0 396000000 0 0 50000 > usdhc1_podf 0 0 0 198000000 0 0 50000 > usdhc1 0 0 0 198000000 0 0 50000 > usdhc2_sel 0 0 0 396000000 0 0 50000 > usdhc2_podf 0 0 0 198000000 0 0 50000 > usdhc2 0 0 0 198000000 0 0 50000 > bch_sel 1 1 0 396000000 0 0 50000 > bch_podf 1 1 0 99000000 0 0 50000 > gpmi_apb 0 0 0 99000000 0 0 50000 > gpmi_bch_apb 0 0 0 99000000 0 0 50000 > per_bch 0 0 0 99000000 0 0 50000 > apbh_dma 1 1 0 99000000 0 0 50000 > gpmi_sel 0 0 0 396000000 0 0 50000 > gpmi_podf 0 0 0 99000000 0 0 50000 > gpmi_bch 0 0 0 99000000 0 0 
50000
> periph2_pre 1 1 0 396000000 0 0 50000
> periph2 2 2 0 396000000 0 0 50000
> mmdc_podf 2 2 0 396000000 0 0 50000
> mmdc_p0_fast 1 1 0 396000000 0 0 50000
> axi_alt_sel 0 0 0 396000000 0 0 50000
> pll2_198m 0 0 0 198000000 0 0 50000
> pll2_pfd1_594m 0 0 0 594000000 0 0 50000
> pll2_pfd0_352m 0 0 0 352000000 0 0 50000
> pll1 1 1 0 900000000 0 0 50000
> pll1_bypass 1 1 0 900000000 0 0 50000
> pll1_sys 1 1 0 900000000 0 0 50000
> pll1_sw 1 1 0 900000000 0 0 50000
> arm 1 1 0 900000000 0 0 50000
> pll7_bypass_src 0 0 0 24000000 0 0 50000
> pll6_bypass_src 0 0 0 24000000 0 0 50000
> pll5_bypass_src 0 0 0 24000000 0 0 50000
> pll4_bypass_src 0 0 0 24000000 0 0 50000
> pll3_bypass_src 0 0 0 24000000 0 0 50000
> pll2_bypass_src 0 0 0 24000000 0 0 50000
> pll1_bypass_src 0 0 0 24000000 0 0 50000
> ckil 0 0 0 32768 0 0 50000
>
>
> Note that this was generated on a normal boot up (not failure).

The values look good. Can you try with the below diff applied?

--->8---
diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
index 334fe3130285..9771f6a82abe 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
@@ -721,12 +721,10 @@ static void gpmi_nfc_apply_timings(struct gpmi_nand_data *this)
 	writel(hw->ctrl1n, gpmi_regs + HW_GPMI_CTRL1_SET);
 
 	/* Wait 64 clock cycles before using the GPMI after enabling the DLL */
-	dll_wait_time_us = USEC_PER_SEC / hw->clk_rate * 64;
-	if (!dll_wait_time_us)
-		dll_wait_time_us = 1;
+	dll_wait_time_us = DIV_ROUND_UP(USEC_PER_SEC * 64, hw->clk_rate);
 
 	/* Wait for the DLL to settle. */
-	udelay(dll_wait_time_us);
+	usleep_range(dll_wait_time_us, dll_wait_time_us * 10);
 }
 
 static int gpmi_setup_data_interface(struct nand_chip *chip, int chipnr,
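As a side note on the arithmetic in the diff: the sketch below compares the old and new dll_wait_time_us expressions in plain userspace C. The macros are local stand-ins for the kernel's USEC_PER_SEC and DIV_ROUND_UP, and dll_wait_old() / dll_wait_new() are hypothetical names used only here; nothing in it touches hardware or says anything about whether the change cures the timeout.

```c
/* Local stand-ins for the kernel macros, so this builds in userspace. */
#define USEC_PER_SEC		1000000UL
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

/* Expression removed by the diff: integer division first, then * 64. */
static unsigned long dll_wait_old(unsigned long clk_rate)
{
	unsigned long us = USEC_PER_SEC / clk_rate * 64;

	/* Any clock faster than 1 MHz truncates to 0 and lands here. */
	if (!us)
		us = 1;
	return us;
}

/* Expression added by the diff: multiply first, then round up. */
static unsigned long dll_wait_new(unsigned long clk_rate)
{
	return DIV_ROUND_UP(USEC_PER_SEC * 64, clk_rate);
}
```

For the two clock rates in the trace, the old expression gives 1 us for both, while the new one gives 3 us at 22000000 Hz (64 cycles take about 2.9 us there) and still 1 us at 100000000 Hz. So in mode 5 the computed value itself does not change; what changes there is the switch from udelay() to usleep_range() with an upper bound of ten times the computed value.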
Hi Boris, On 31/7/19 4:28 pm, Boris Brezillon wrote: > On Wed, 31 Jul 2019 12:05:44 +1000 > Greg Ungerer <gerg@kernel.org> wrote: > >> Hi Miquel, Boris, >> >> On 30/7/19 6:38 pm, Miquel Raynal wrote: >>> Greg Ungerer <gerg@kernel.org> wrote on Tue, 30 Jul 2019 16:06:55 +1000: >>>> On 30/7/19 10:41 am, Greg Ungerer wrote: >>>>> On 30/7/19 10:28 am, Greg Ungerer wrote: >>>>>> On 29/7/19 10:47 pm, Miquel Raynal wrote: >>>>>>> Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 22:33:56 +1000: >>>>>>>> On 29/7/19 6:36 pm, Miquel Raynal wrote: >>>>>>>>> Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 16:41:51 +1000: >>>>> [snip] >>>> Not sure if this is a useful data point... But I modified that >>>> nand_init_data_interface() loop to start checking from data mode 4. >>>> So now on every boot it defaults to mode 4. That has been running >>>> most of the day, up to 900 boot cycles now, no failures. >>> >>> Ok so after having chatted quite a bit with Boris, it is very likely >>> that, for these chips, the timings in mode 5 are too tight. It could >>> fail the GET_FEATURES once in mode 5. Can you please dump every single >>> intermediate value in gpmi_nfc_compute_timings() (period, *_cycles, >>> use of half pêriods, tRP, sample delay, etc) as well as the content >>> of /sys/kernel/debug/clk/clk_summary (you'll need debugfs support >>> enabled and mounted). >>> >>> Also, can you be sure that the NAND chip is powered with 3.3V? >> >> Yes, 3.3V NAND chip. >> >> Using the attached patch I get the following trace: >> >> ... 
>> drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(426): gpmi_nfc_compute_timings() >> sdr->tBERS_max=65535000000 >> sdr->tCCS_min=500000000 >> sdr->tPROG_max=65535000000 >> sdr->tR_max=200000000000000 >> sdr->tALH_min=20000 >> sdr->tADL_min=400000 >> sdr->tALS_min=50000 >> sdr->tAR_min=25000 >> sdr->tCEA_max=100000 >> sdr->tCEH_min=20000 >> sdr->tCH_min=20000 >> sdr->tCHZ_max=100000 >> sdr->tCLH_min=20000 >> sdr->tCLR_min=20000 >> sdr->tCLS_min=50000 >> sdr->tCOH_min=0 >> sdr->tCS_min=70000 >> sdr->tDH_min=20000 >> sdr->tDS_min=40000 >> sdr->tFEAT_max=1000000 >> sdr->tIR_min=10000 >> sdr->tITC_max=1000000 >> sdr->tRC_min=100000 >> sdr->tREA_max=40000 >> sdr->tREH_min=30000 >> sdr->tRHOH_min=0 >> sdr->tRHW_min=200000 >> sdr->tRHZ_max=200000 >> sdr->tRLOH_min=0 >> sdr->tRP_min=50000 >> sdr->tRR_min=40000 >> sdr->tRST_max=250000000000 >> sdr->tWB_max=200000 >> sdr->tWC_min=100000 >> sdr->tWH_min=30000 >> sdr->tWHR_min=120000 >> sdr->tWP_min=50000 >> sdr->tWW_min=100000 >> hw->clk_rate=22000000 >> wrn_dly_sel=0 >> period_ps=45454 >> addr_setup_cycles=2 >> data_setup_cycles=1 >> data_hold_cycles=1 >> busy_timeout_cycles=31302 >> hw->timing0=0x00020101 >> hw->timing1=0x60000000 >> dll_threshold_ps=12000 >> use_half_period=1 >> reference_period_ps=22727 >> tRP_ps=45454 >> sample_delay_ps=4294955664 >> sample_delay_factor=0 >> hw->ctrl1n=0x00000000 >> hw->ctrl1n=0x00000000 >> drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(547): gpmi_nfc_apply_timings() >> hw>clk_rate=22000000 >> clk_set_rate(r->clock[0], hw->clk_rate)=0 >> clk_get_rate(r->clock[0])=22000000 >> random: fast init done >> nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda >> nand: Micron MT29F2G08ABAEAWP >> nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64 >> drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(426): gpmi_nfc_compute_timings() >> sdr->tBERS_max=3000000000 >> sdr->tCCS_min=100000 >> sdr->tPROG_max=600000000 >> sdr->tR_max=25000000 >> sdr->tALH_min=20000 >> sdr->tADL_min=400000 >> 
sdr->tALS_min=50000 >> sdr->tAR_min=25000 >> sdr->tCEA_max=100000 >> sdr->tCEH_min=20000 >> sdr->tCH_min=20000 >> sdr->tCHZ_max=100000 >> sdr->tCLH_min=20000 >> sdr->tCLR_min=20000 >> sdr->tCLS_min=50000 >> sdr->tCOH_min=0 >> sdr->tCS_min=70000 >> sdr->tDH_min=20000 >> sdr->tDS_min=40000 >> sdr->tFEAT_max=1000000 >> sdr->tIR_min=10000 >> sdr->tITC_max=1000000 >> sdr->tRC_min=100000 >> sdr->tREA_max=40000 >> sdr->tREH_min=30000 >> sdr->tRHOH_min=0 >> sdr->tRHW_min=200000 >> sdr->tRHZ_max=200000 >> sdr->tRLOH_min=0 >> sdr->tRP_min=50000 >> sdr->tRR_min=40000 >> sdr->tRST_max=250000000000 >> sdr->tWB_max=200000 >> sdr->tWC_min=100000 >> sdr->tWH_min=30000 >> sdr->tWHR_min=120000 >> sdr->tWP_min=50000 >> sdr->tWW_min=100000 >> hw->clk_rate=22000000 >> wrn_dly_sel=0 >> period_ps=45454 >> addr_setup_cycles=2 >> data_setup_cycles=1 >> data_hold_cycles=1 >> busy_timeout_cycles=555 >> hw->timing0=0x00020101 >> hw->timing1=0xb0000000 >> dll_threshold_ps=12000 >> use_half_period=1 >> reference_period_ps=22727 >> tRP_ps=45454 >> sample_delay_ps=4294955664 >> sample_delay_factor=0 >> hw->ctrl1n=0x00000000 >> hw->ctrl1n=0x00000000 >> drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(547): gpmi_nfc_apply_timings() >> hw>clk_rate=22000000 >> clk_set_rate(r->clock[0], hw->clk_rate)=0 >> clk_get_rate(r->clock[0])=22000000 >> gpmi-nand 1806000.gpmi-nand: use legacy bch geometry >> drivers/mtd/nand/raw/nand_base.c(913): checking mode=5 >> drivers/mtd/nand/raw/nand_base.c(927): BREAKING AT mode=5 >> drivers/mtd/nand/raw/nand_base.c(932): chip->onfi_timing_mode_default=5 >> drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(426): gpmi_nfc_compute_timings() >> sdr->tBERS_max=3000000000 >> sdr->tCCS_min=100000 >> sdr->tPROG_max=600000000 >> sdr->tR_max=25000000 >> sdr->tALH_min=5000 >> sdr->tADL_min=400000 >> sdr->tALS_min=10000 >> sdr->tAR_min=10000 >> sdr->tCEA_max=25000 >> sdr->tCEH_min=20000 >> sdr->tCH_min=5000 >> sdr->tCHZ_max=30000 >> sdr->tCLH_min=5000 >> sdr->tCLR_min=10000 >> sdr->tCLS_min=10000 
>> sdr->tCOH_min=15000 >> sdr->tCS_min=15000 >> sdr->tDH_min=5000 >> sdr->tDS_min=7000 >> sdr->tFEAT_max=1000000 >> sdr->tIR_min=0 >> sdr->tITC_max=1000000 >> sdr->tRC_min=20000 >> sdr->tREA_max=16000 >> sdr->tREH_min=7000 >> sdr->tRHOH_min=15000 >> sdr->tRHW_min=100000 >> sdr->tRHZ_max=100000 >> sdr->tRLOH_min=5000 >> sdr->tRP_min=10000 >> sdr->tRR_min=20000 >> sdr->tRST_max=500000000 >> sdr->tWB_max=100000 >> sdr->tWC_min=20000 >> sdr->tWH_min=7000 >> sdr->tWHR_min=80000 >> sdr->tWP_min=10000 >> sdr->tWW_min=100000 >> hw->clk_rate=100000000 >> wrn_dly_sel=3 >> period_ps=10000 >> addr_setup_cycles=1 >> data_setup_cycles=1 >> data_hold_cycles=1 >> busy_timeout_cycles=2510 >> hw->timing0=0x00010101 >> hw->timing1=0xe0000000 >> dll_threshold_ps=12000 >> use_half_period=0 >> reference_period_ps=10000 >> tRP_ps=10000 >> sample_delay_ps=80000 >> sample_delay_factor=8 >> hw->ctrl1n=0x00c00000 >> hw->ctrl1n=0x00c28000 >> drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(547): gpmi_nfc_apply_timings() >> hw>clk_rate=100000000 >> clk_set_rate(r->clock[0], hw->clk_rate)=0 >> clk_get_rate(r->clock[0])=99000000 >> Scanning device for bad blocks >> 5 fixed-partitions partitions found on MTD device gpmi-nand >> Creating 5 MTD partitions on "gpmi-nand": >> 0x000000000000-0x000000500000 : "u-boot" >> 0x000000500000-0x000000600000 : "u-boot-env" >> 0x000000600000-0x000000800000 : "log" >> 0x000000800000-0x000010000000 : "flash" >> 0x000000000000-0x000010000000 : "all" >> gpmi-nand 1806000.gpmi-nand: driver registered. >> ... 
>> >> >> And "cat /sys/kernel/debug/clk/clk_summary" gives: >> >> enable prepare protect duty >> clock count count count rate accuracy phase cycle >> --------------------------------------------------------------------------------------------- >> dummy 2 2 0 0 0 0 50000 >> cko2_sel 0 0 0 0 0 0 50000 >> cko2_podf 0 0 0 0 0 0 50000 >> cko2 0 0 0 0 0 0 50000 >> cko1_sel 0 0 0 0 0 0 50000 >> cko1_podf 0 0 0 0 0 0 50000 >> cko1 0 0 0 0 0 0 50000 >> cko 0 0 0 0 0 0 50000 >> usbphy2_gate 1 1 0 0 0 0 50000 >> usbphy1_gate 1 1 0 0 0 0 50000 >> ipp_di1 0 0 0 0 0 0 50000 >> ipp_di0 0 0 0 0 0 0 50000 >> osc 6 6 0 24000000 0 0 50000 >> perclk_sel 1 1 0 24000000 0 0 50000 >> perclk 3 3 0 24000000 0 0 50000 >> pwm7 0 0 0 24000000 0 0 50000 >> pwm6 0 0 0 24000000 0 0 50000 >> pwm5 0 0 0 24000000 0 0 50000 >> i2c4 0 0 0 24000000 0 0 50000 >> pwm8 0 0 0 24000000 0 0 50000 >> pwm4 0 0 0 24000000 0 0 50000 >> pwm3 0 0 0 24000000 0 0 50000 >> pwm2 0 0 0 24000000 0 0 50000 >> pwm1 0 0 0 24000000 0 0 50000 >> i2c3 0 0 0 24000000 0 0 50000 >> i2c2 1 1 0 24000000 0 0 50000 >> i2c1 0 0 0 24000000 0 0 50000 >> gpt1_serial 1 1 0 24000000 0 0 50000 >> gpt1_bus 1 1 0 24000000 0 0 50000 >> epit2 0 0 0 24000000 0 0 50000 >> epit1 0 0 0 24000000 0 0 50000 >> gpt2_serial 0 0 0 24000000 0 0 50000 >> gpt2_bus 0 0 0 24000000 0 0 50000 >> periph_clk2_sel 0 0 0 24000000 0 0 50000 >> periph_clk2 0 0 0 24000000 0 0 50000 >> gpt_3m 0 0 0 3000000 0 0 50000 >> csi_sel 0 0 0 24000000 0 0 50000 >> csi_podf 0 0 0 24000000 0 0 50000 >> csi 0 0 0 24000000 0 0 50000 >> pll7 1 1 0 480000000 0 0 50000 >> pll7_bypass 1 1 0 480000000 0 0 50000 >> pll7_usb_host 1 1 0 480000000 0 0 50000 >> usbphy2 1 1 0 480000000 0 0 50000 >> pll6 1 1 0 500000000 0 0 50000 >> pll6_bypass 1 1 0 500000000 0 0 50000 >> pll6_enet 2 2 0 500000000 0 0 50000 >> enet_ptp_ref 1 1 0 25000000 0 0 50000 >> enet_ptp 1 1 0 25000000 0 0 50000 >> enet2_ref 0 0 0 50000000 0 0 50000 >> enet_ref_125m 0 0 0 50000000 0 0 50000 >> enet_ref 2 2 0 50000000 0 
0 50000 >> pll5 0 0 0 296600000 0 0 50000 >> pll5_bypass 0 0 0 296600000 0 0 50000 >> pll5_video 0 0 0 296600000 0 0 50000 >> pll5_post_div 0 0 0 74150000 0 0 50000 >> pll5_video_div 0 0 0 74150000 0 0 50000 >> pll4 0 0 0 147456000 0 0 50000 >> pll4_bypass 0 0 0 147456000 0 0 50000 >> pll4_audio 0 0 0 147456000 0 0 50000 >> pll4_post_div 0 0 0 36864000 0 0 50000 >> pll4_audio_div 0 0 0 36864000 0 0 50000 >> pll3 1 1 0 480000000 0 0 50000 >> pll3_bypass 1 1 0 480000000 0 0 50000 >> pll3_usb_otg 2 2 0 480000000 0 0 50000 >> spdif_sel 0 0 0 480000000 0 0 50000 >> spdif_pred 0 0 0 240000000 0 0 50000 >> spdif_podf 0 0 0 30000000 0 0 50000 >> spdif 0 0 0 30000000 0 0 50000 >> esai_sel 0 0 0 480000000 0 0 50000 >> esai_pred 0 0 0 240000000 0 0 50000 >> esai_podf 0 0 0 30000000 0 0 50000 >> esai_extal 0 0 0 30000000 0 0 50000 >> qspi1_sel 0 0 0 480000000 0 0 50000 >> qspi1_podf 0 0 0 240000000 0 0 50000 >> qspi1 0 0 0 240000000 0 0 50000 >> ldb_di1_div_7 0 0 0 68571428 0 0 50000 >> ldb_di1 0 0 0 68571428 0 0 50000 >> ldb_di1_div_3_5 0 0 0 137142857 0 0 50000 >> periph2_clk2_sel 0 0 0 480000000 0 0 50000 >> periph2_clk2 0 0 0 480000000 0 0 50000 >> pll3_60m 0 0 0 60000000 0 0 50000 >> can_sel 0 0 0 60000000 0 0 50000 >> can_podf 0 0 0 30000000 0 0 50000 >> can2_serial 0 0 0 30000000 0 0 50000 >> can1_serial 0 0 0 30000000 0 0 50000 >> ecspi_sel 0 0 0 60000000 0 0 50000 >> ecspi_podf 0 0 0 60000000 0 0 50000 >> ecspi4 0 0 0 60000000 0 0 50000 >> ecspi3 0 0 0 60000000 0 0 50000 >> ecspi2 0 0 0 60000000 0 0 50000 >> ecspi1 0 0 0 60000000 0 0 50000 >> pll3_80m 1 1 0 80000000 0 0 50000 >> uart_sel 1 1 0 80000000 0 0 50000 >> uart_podf 1 1 0 80000000 0 0 50000 >> uart8_serial 0 0 0 80000000 0 0 50000 >> uart7_serial 0 0 0 80000000 0 0 50000 >> uart1_serial 1 2 0 80000000 0 0 50000 >> uart6_serial 0 0 0 80000000 0 0 50000 >> uart5_serial 0 0 0 80000000 0 0 50000 >> uart4_serial 0 0 0 80000000 0 0 50000 >> uart3_serial 0 0 0 80000000 0 0 50000 >> uart2_serial 0 0 0 80000000 0 0 
50000 >> pll3_pfd3_454m 0 0 0 454736842 0 0 50000 >> pll3_pfd2_508m 0 0 0 508235294 0 0 50000 >> epdc_pre_sel 0 0 0 508235294 0 0 50000 >> epdc_podf 0 0 0 254117647 0 0 50000 >> epdc_pix 0 0 0 254117647 0 0 50000 >> epdc_sel 0 0 0 254117647 0 0 50000 >> sai1_sel 0 0 0 508235294 0 0 50000 >> sai1_pred 0 0 0 127058824 0 0 50000 >> sai1_podf 0 0 0 63529412 0 0 50000 >> sai1 0 0 0 63529412 0 0 50000 >> sai2_sel 0 0 0 508235294 0 0 50000 >> sai2_pred 0 0 0 127058824 0 0 50000 >> sai2_podf 0 0 0 63529412 0 0 50000 >> sai2 0 0 0 63529412 0 0 50000 >> sai3_sel 0 0 0 508235294 0 0 50000 >> sai3_pred 0 0 0 127058824 0 0 50000 >> sai3_podf 0 0 0 63529412 0 0 50000 >> sai3 0 0 0 63529412 0 0 50000 >> pll3_pfd1_540m 0 0 0 540000000 0 0 50000 >> lcdif_pre_sel 0 0 0 540000000 0 0 50000 >> lcdif_pred 0 0 0 270000000 0 0 50000 >> lcdif_podf 0 0 0 135000000 0 0 50000 >> lcdif_pix 0 0 0 135000000 0 0 50000 >> iomuxc 0 0 0 135000000 0 0 50000 >> lcdif_sel 0 0 0 135000000 0 0 50000 >> pll3_pfd0_720m 0 0 0 720000000 0 0 50000 >> usbphy1 1 1 0 480000000 0 0 50000 >> pll2 1 1 0 528000000 0 0 50000 >> pll2_bypass 1 1 0 528000000 0 0 50000 >> pll2_bus 2 2 0 528000000 0 0 50000 >> ca7_secondary_sel 0 0 0 528000000 0 0 50000 >> step 0 0 0 528000000 0 0 50000 >> periph_pre 1 1 0 528000000 0 0 50000 >> periph 3 3 0 528000000 0 0 50000 >> ahb 7 7 0 132000000 0 0 50000 >> sdma 0 0 0 132000000 0 0 50000 >> rom 1 1 0 132000000 0 0 50000 >> esai_mem 0 0 0 132000000 0 0 50000 >> esai_ipg 0 0 0 132000000 0 0 50000 >> aips_tz3 1 1 0 132000000 0 0 50000 >> enet_ahb 2 2 0 132000000 0 0 50000 >> dcp 0 0 0 132000000 0 0 50000 >> asrc_mem 0 0 0 132000000 0 0 50000 >> asrc_ipg 0 0 0 132000000 0 0 50000 >> aips_tz2 1 1 0 132000000 0 0 50000 >> aips_tz1 1 1 0 132000000 0 0 50000 >> ipg 10 10 0 66000000 0 0 50000 >> wdog3 0 0 0 66000000 0 0 50000 >> uart8_ipg 0 0 0 66000000 0 0 50000 >> usboh3 2 2 0 66000000 0 0 50000 >> sai2_ipg 0 0 0 66000000 0 0 50000 >> sai1_ipg 0 0 0 66000000 0 0 50000 >> uart7_ipg 0 0 0 
66000000 0 0 50000 >> uart1_ipg 1 2 0 66000000 0 0 50000 >> sai3_ipg 0 0 0 66000000 0 0 50000 >> spdif_gclk 0 0 0 66000000 0 0 50000 >> spba 0 0 0 66000000 0 0 50000 >> wdog2 0 0 0 66000000 0 0 50000 >> kpp 0 0 0 66000000 0 0 50000 >> mmdc_p1_ipg 0 0 0 66000000 0 0 50000 >> mmdc_p0_ipg 2 2 0 66000000 0 0 50000 >> wdog1 1 1 0 66000000 0 0 50000 >> gpio4 1 1 0 66000000 0 0 50000 >> uart6_ipg 0 0 0 66000000 0 0 50000 >> uart5_ipg 0 0 0 66000000 0 0 50000 >> gpio3 1 1 0 66000000 0 0 50000 >> ocotp 0 0 0 66000000 0 0 50000 >> gpio5 1 1 0 66000000 0 0 50000 >> gpio1 1 1 0 66000000 0 0 50000 >> uart4_ipg 0 0 0 66000000 0 0 50000 >> adc1 0 0 0 66000000 0 0 50000 >> uart3_ipg 0 0 0 66000000 0 0 50000 >> adc2 0 0 0 66000000 0 0 50000 >> gpio2 1 1 0 66000000 0 0 50000 >> uart2_ipg 0 0 0 66000000 0 0 50000 >> can2_ipg 0 0 0 66000000 0 0 50000 >> can1_ipg 0 0 0 66000000 0 0 50000 >> enet 2 2 0 66000000 0 0 50000 >> axi_sel 1 1 0 528000000 0 0 50000 >> axi_podf 2 2 0 264000000 0 0 50000 >> axi 1 1 0 264000000 0 0 50000 >> eim_slow_sel 0 0 0 264000000 0 0 50000 >> eim_slow_podf 0 0 0 132000000 0 0 50000 >> eim 0 0 0 132000000 0 0 50000 >> lcdif_apb 0 0 0 264000000 0 0 50000 >> pxp 0 0 0 264000000 0 0 50000 >> epdc_aclk 0 0 0 264000000 0 0 50000 >> pll2_pfd3_594m 0 0 0 594000000 0 0 50000 >> ldb_di0_sel 0 0 0 594000000 0 0 50000 >> ldb_di0_div_7 0 0 0 84857142 0 0 50000 >> ldb_di0 0 0 0 84857142 0 0 50000 >> ldb_di0_div_3_5 0 0 0 169714285 0 0 50000 >> pll2_pfd2_396m 2 2 0 396000000 0 0 50000 >> enfc_sel 0 0 0 396000000 0 0 50000 >> enfc_pred 0 0 0 99000000 0 0 50000 >> enfc_podf 0 0 0 99000000 0 0 50000 >> gpmi_io 0 0 0 99000000 0 0 50000 >> usdhc1_sel 0 0 0 396000000 0 0 50000 >> usdhc1_podf 0 0 0 198000000 0 0 50000 >> usdhc1 0 0 0 198000000 0 0 50000 >> usdhc2_sel 0 0 0 396000000 0 0 50000 >> usdhc2_podf 0 0 0 198000000 0 0 50000 >> usdhc2 0 0 0 198000000 0 0 50000 >> bch_sel 1 1 0 396000000 0 0 50000 >> bch_podf 1 1 0 99000000 0 0 50000 >> gpmi_apb 0 0 0 99000000 0 0 50000 >> 
gpmi_bch_apb 0 0 0 99000000 0 0 50000 >> per_bch 0 0 0 99000000 0 0 50000 >> apbh_dma 1 1 0 99000000 0 0 50000 >> gpmi_sel 0 0 0 396000000 0 0 50000 >> gpmi_podf 0 0 0 99000000 0 0 50000 >> gpmi_bch 0 0 0 99000000 0 0 50000 >> periph2_pre 1 1 0 396000000 0 0 50000 >> periph2 2 2 0 396000000 0 0 50000 >> mmdc_podf 2 2 0 396000000 0 0 50000 >> mmdc_p0_fast 1 1 0 396000000 0 0 50000 >> axi_alt_sel 0 0 0 396000000 0 0 50000 >> pll2_198m 0 0 0 198000000 0 0 50000 >> pll2_pfd1_594m 0 0 0 594000000 0 0 50000 >> pll2_pfd0_352m 0 0 0 352000000 0 0 50000 >> pll1 1 1 0 900000000 0 0 50000 >> pll1_bypass 1 1 0 900000000 0 0 50000 >> pll1_sys 1 1 0 900000000 0 0 50000 >> pll1_sw 1 1 0 900000000 0 0 50000 >> arm 1 1 0 900000000 0 0 50000 >> pll7_bypass_src 0 0 0 24000000 0 0 50000 >> pll6_bypass_src 0 0 0 24000000 0 0 50000 >> pll5_bypass_src 0 0 0 24000000 0 0 50000 >> pll4_bypass_src 0 0 0 24000000 0 0 50000 >> pll3_bypass_src 0 0 0 24000000 0 0 50000 >> pll2_bypass_src 0 0 0 24000000 0 0 50000 >> pll1_bypass_src 0 0 0 24000000 0 0 50000 >> ckil 0 0 0 32768 0 0 50000
>>
>>
>> Note that this was generated on a normal boot up (not failure).
>
> The values look good. Can you try with the below diff applied?
>
> --->8---
> diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
> index 334fe3130285..9771f6a82abe 100644
> --- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
> +++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
> @@ -721,12 +721,10 @@ static void gpmi_nfc_apply_timings(struct gpmi_nand_data *this)
>  	writel(hw->ctrl1n, gpmi_regs + HW_GPMI_CTRL1_SET);
>
>  	/* Wait 64 clock cycles before using the GPMI after enabling the DLL */
> -	dll_wait_time_us = USEC_PER_SEC / hw->clk_rate * 64;
> -	if (!dll_wait_time_us)
> -		dll_wait_time_us = 1;
> +	dll_wait_time_us = DIV_ROUND_UP(USEC_PER_SEC * 64, hw->clk_rate);
>
>  	/* Wait for the DLL to settle. */
> -	udelay(dll_wait_time_us);
> +	usleep_range(dll_wait_time_us, dll_wait_time_us * 10);
>  }
>
>  static int gpmi_setup_data_interface(struct nand_chip *chip, int chipnr,

Running a boot test with this now. Has run for a couple of hours,
but I want to give it a really good run over the weekend.
Will report results on Monday.

Thanks
Greg
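To see what the quoted diff changes numerically, here is a minimal standalone sketch of the two wait-time computations. It is not the driver code: `dll_wait_old_us` and `dll_wait_new_us` are hypothetical helper names, and `USEC_PER_SEC` and `DIV_ROUND_UP` are redefined locally on the assumption that they behave like the kernel macros of the same name.

```c
#include <stdint.h>

/* Local stand-ins for the kernel macros used in the diff (assumption:
 * same semantics as the kernel's USEC_PER_SEC and DIV_ROUND_UP). */
#define USEC_PER_SEC 1000000ULL
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Old computation: the integer division happens first, so any clock
 * faster than 1 MHz yields 0 and the result is clamped to 1 us. */
static uint64_t dll_wait_old_us(uint64_t clk_rate_hz)
{
	uint64_t us = USEC_PER_SEC / clk_rate_hz * 64;

	return us ? us : 1;
}

/* New computation: multiply first, then round up, so the result covers
 * at least 64 clock cycles. */
static uint64_t dll_wait_new_us(uint64_t clk_rate_hz)
{
	return DIV_ROUND_UP(USEC_PER_SEC * 64, clk_rate_hz);
}
```

With the clock rates that appear in the quoted trace, a 22 MHz clock gives 1 us from the old expression and 3 us from the new one (64 cycles at 22 MHz take roughly 2.9 us), while at 100 MHz both give 1 us. Whether that difference has anything to do with the DMA timeout is what the boot test is meant to show.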
Hi Boris,

On 31/7/19 4:28 pm, Boris Brezillon wrote:
> On Wed, 31 Jul 2019 12:05:44 +1000
> Greg Ungerer <gerg@kernel.org> wrote:
>
>> Hi Miquel, Boris,
>>
>> On 30/7/19 6:38 pm, Miquel Raynal wrote:
>>> Greg Ungerer <gerg@kernel.org> wrote on Tue, 30 Jul 2019 16:06:55 +1000:
>>>> On 30/7/19 10:41 am, Greg Ungerer wrote:
>>>>> On 30/7/19 10:28 am, Greg Ungerer wrote:
>>>>>> On 29/7/19 10:47 pm, Miquel Raynal wrote:
>>>>>>> Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 22:33:56 +1000:
>>>>>>>> On 29/7/19 6:36 pm, Miquel Raynal wrote:
>>>>>>>>> Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 16:41:51 +1000:
>>>>> [snip]
>>>> Not sure if this is a useful data point... But I modified that
>>>> nand_init_data_interface() loop to start checking from data mode 4.
>>>> So now on every boot it defaults to mode 4. That has been running
>>>> most of the day, up to 900 boot cycles now, no failures.
>>>
>>> Ok so after having chatted quite a bit with Boris, it is very likely
>>> that, for these chips, the timings in mode 5 are too tight. It could
>>> fail the GET_FEATURES once in mode 5. Can you please dump every single
>>> intermediate value in gpmi_nfc_compute_timings() (period, *_cycles,
>>> use of half periods, tRP, sample delay, etc) as well as the content
>>> of /sys/kernel/debug/clk/clk_summary (you'll need debugfs support
>>> enabled and mounted).
>>>
>>> Also, can you be sure that the NAND chip is powered with 3.3V?
>>
>> Yes, 3.3V NAND chip.
>>
>> Using the attached patch I get the following trace:
>>
>> ...
>> [snip]
>>
>>
>> And "cat /sys/kernel/debug/clk/clk_summary" gives:
>>
>> [snip]
>>
>> Note that this was generated on a normal boot up (not failure).
>
> The values look good. Can you try with the below diff applied?
>
> --->8---
> diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
> index 334fe3130285..9771f6a82abe 100644
> --- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
> +++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
> @@ -721,12 +721,10 @@ static void gpmi_nfc_apply_timings(struct gpmi_nand_data *this)
>  	writel(hw->ctrl1n, gpmi_regs + HW_GPMI_CTRL1_SET);
>
>  	/* Wait 64 clock cycles before using the GPMI after enabling the DLL */
> -	dll_wait_time_us = USEC_PER_SEC / hw->clk_rate * 64;
> -	if (!dll_wait_time_us)
> -		dll_wait_time_us = 1;
> +	dll_wait_time_us = DIV_ROUND_UP(USEC_PER_SEC * 64, hw->clk_rate);
>
>  	/* Wait for the DLL to settle. */
> -	udelay(dll_wait_time_us);
> +	usleep_range(dll_wait_time_us, dll_wait_time_us * 10);
>  }
>
>  static int gpmi_setup_data_interface(struct nand_chip *chip, int chipnr,

Eventually it failed, in the same way with the same errors.
Took quite a while, over 600 boot cycles.

Note also that I had to hand merge the changes, since in 5.1.14
that gpmi_nfc_apply_timings() is in gpmi-lib.c. But it was trivial
to do.

Regards
Greg
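The intermediate values in the debug trace quoted in this thread can be reproduced with a few lines of integer arithmetic, which also accounts for the odd-looking `sample_delay_ps=4294955664`. This is a sketch under assumptions, not the driver source: the helper names are hypothetical, and the formula `(tREA_max + 4000 - tRP_ps) * 8` is inferred from the printed numbers (it reproduces both results in the trace), so treat the 4000 ps margin and the factor of 8 as assumptions.

```c
#include <stdint.h>

/* Clock period in picoseconds, truncated by integer division. */
static uint32_t period_ps(uint32_t clk_rate_hz)
{
	return (uint32_t)(1000000000000ULL / clk_rate_hz);
}

/* Assumed formula, inferred from the trace: a signed sample delay. */
static int32_t sample_delay_ps(int32_t trea_max_ps, int32_t trp_ps)
{
	return (trea_max_ps + 4000 - trp_ps) * 8;
}

/* The same value as it looks when printed as an unsigned 32-bit number. */
static uint32_t sample_delay_as_unsigned(int32_t trea_max_ps, int32_t trp_ps)
{
	return (uint32_t)sample_delay_ps(trea_max_ps, trp_ps);
}
```

At 22 MHz, `period_ps` gives 45454. With tREA_max=40000 and tRP_ps=45454 the delay comes out at -11632 ps, which reads as 4294955664 when viewed as an unsigned 32-bit value, matching the trace; the accompanying `sample_delay_factor=0` suggests the negative case ends up as zero. For the mode 5 values (tREA_max=16000, tRP_ps=10000) the result is 80000, which also matches.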
On Fri, 2 Aug 2019 22:34:57 +1000
Greg Ungerer <gerg@kernel.org> wrote:

> Hi Boris,
>
> On 31/7/19 4:28 pm, Boris Brezillon wrote:
> > On Wed, 31 Jul 2019 12:05:44 +1000
> > Greg Ungerer <gerg@kernel.org> wrote:
> >
> >> Hi Miquel, Boris,
> >>
> >> On 30/7/19 6:38 pm, Miquel Raynal wrote:
> >>> Greg Ungerer <gerg@kernel.org> wrote on Tue, 30 Jul 2019 16:06:55 +1000:
> >>>> On 30/7/19 10:41 am, Greg Ungerer wrote:
> >>>>> On 30/7/19 10:28 am, Greg Ungerer wrote:
> >>>>>> On 29/7/19 10:47 pm, Miquel Raynal wrote:
> >>>>>>> Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 22:33:56 +1000:
> >>>>>>>> On 29/7/19 6:36 pm, Miquel Raynal wrote:
> >>>>>>>>> Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 16:41:51 +1000:
> >>>>> [snip]
> >>>> Not sure if this is a useful data point... But I modified that
> >>>> nand_init_data_interface() loop to start checking from data mode 4.
> >>>> So now on every boot it defaults to mode 4. That has been running
> >>>> most of the day, up to 900 boot cycles now, no failures.
> >>>
> >>> Ok so after having chatted quite a bit with Boris, it is very likely
> >>> that, for these chips, the timings in mode 5 are too tight. It could
> >>> fail the GET_FEATURES once in mode 5. Can you please dump every single
> >>> intermediate value in gpmi_nfc_compute_timings() (period, *_cycles,
> >>> use of half periods, tRP, sample delay, etc) as well as the content
> >>> of /sys/kernel/debug/clk/clk_summary (you'll need debugfs support
> >>> enabled and mounted).
> >>>
> >>> Also, can you be sure that the NAND chip is powered with 3.3V?
> >>
> >> Yes, 3.3V NAND chip.
> >>
> >> Using the attached patch I get the following trace:
> >>
> >> ...
> >> [snip]
> >> > >> > >> And "cat /sys/kernel/debug/clk/clk_summary" gives: > >> > >> enable prepare protect duty > >> clock count count count rate accuracy phase cycle > >> --------------------------------------------------------------------------------------------- > >> dummy 2 2 0 0 0 0 50000 > >> cko2_sel 0 0 0 0 0 0 50000 > >> cko2_podf 0 0 0 0 0 0 50000 > >> cko2 0 0 0 0 0 0 50000 > >> cko1_sel 0 0 0 0 0 0 50000 > >> cko1_podf 0 0 0 0 0 0 50000 > >> cko1 0 0 0 0 0 0 50000 > >> cko 0 0 0 0 0 0 50000 > >> usbphy2_gate 1 1 0 0 0 0 50000 > >> usbphy1_gate 1 1 0 0 0 0 50000 > >> ipp_di1 0 0 0 0 0 0 50000 > >> ipp_di0 0 0 0 0 0 0 50000 > >> osc 6 6 0 24000000 0 0 50000 > >> perclk_sel 1 1 0 24000000 0 0 50000 > >> perclk 3 3 0 24000000 0 0 50000 > >> pwm7 0 0 0 24000000 0 0 50000 > >> pwm6 0 0 0 24000000 0 0 50000 > >> pwm5 0 0 0 24000000 0 0 50000 > >> i2c4 0 0 0 24000000 0 0 50000 > >> pwm8 0 0 0 24000000 0 0 50000 > >> pwm4 0 0 0 24000000 0 0 50000 > >> pwm3 0 0 0 24000000 0 0 50000 > >> pwm2 0 0 0 24000000 0 0 50000 > >> pwm1 0 0 0 24000000 0 0 50000 > >> i2c3 0 0 0 24000000 0 0 50000 > >> i2c2 1 1 0 24000000 0 0 50000 > >> i2c1 0 0 0 24000000 0 0 50000 > >> gpt1_serial 1 1 0 24000000 0 0 50000 > >> gpt1_bus 1 1 0 24000000 0 0 50000 > >> epit2 0 0 0 24000000 0 0 50000 > >> epit1 0 0 0 24000000 0 0 50000 > >> gpt2_serial 0 0 0 24000000 0 0 50000 > >> gpt2_bus 0 0 0 24000000 0 0 50000 > >> periph_clk2_sel 0 0 0 24000000 0 0 50000 > >> periph_clk2 0 0 0 24000000 0 0 50000 > >> gpt_3m 0 0 0 3000000 0 0 50000 > >> csi_sel 0 0 0 24000000 0 0 50000 > >> csi_podf 0 0 0 24000000 0 0 50000 > >> csi 0 0 0 24000000 0 0 50000 > >> pll7 1 1 0 480000000 0 0 50000 > >> pll7_bypass 1 1 0 480000000 0 0 50000 > >> pll7_usb_host 1 1 0 480000000 0 0 50000 > >> usbphy2 1 1 0 480000000 0 0 50000 > >> pll6 1 1 0 500000000 0 0 50000 > >> pll6_bypass 1 1 0 500000000 0 0 50000 > >> pll6_enet 2 2 0 500000000 0 0 50000 > >> enet_ptp_ref 1 1 0 25000000 0 0 50000 > >> enet_ptp 1 1 0 25000000 0 0 50000 
> >> enet2_ref 0 0 0 50000000 0 0 50000
> >> enet_ref_125m 0 0 0 50000000 0 0 50000
> >> enet_ref 2 2 0 50000000 0 0 50000
> >> pll5 0 0 0 296600000 0 0 50000
> >> pll5_bypass 0 0 0 296600000 0 0 50000
> >> pll5_video 0 0 0 296600000 0 0 50000
> >> pll5_post_div 0 0 0 74150000 0 0 50000
> >> pll5_video_div 0 0 0 74150000 0 0 50000
> >> pll4 0 0 0 147456000 0 0 50000
> >> pll4_bypass 0 0 0 147456000 0 0 50000
> >> pll4_audio 0 0 0 147456000 0 0 50000
> >> pll4_post_div 0 0 0 36864000 0 0 50000
> >> pll4_audio_div 0 0 0 36864000 0 0 50000
> >> pll3 1 1 0 480000000 0 0 50000
> >> pll3_bypass 1 1 0 480000000 0 0 50000
> >> pll3_usb_otg 2 2 0 480000000 0 0 50000
> >> spdif_sel 0 0 0 480000000 0 0 50000
> >> spdif_pred 0 0 0 240000000 0 0 50000
> >> spdif_podf 0 0 0 30000000 0 0 50000
> >> spdif 0 0 0 30000000 0 0 50000
> >> esai_sel 0 0 0 480000000 0 0 50000
> >> esai_pred 0 0 0 240000000 0 0 50000
> >> esai_podf 0 0 0 30000000 0 0 50000
> >> esai_extal 0 0 0 30000000 0 0 50000
> >> qspi1_sel 0 0 0 480000000 0 0 50000
> >> qspi1_podf 0 0 0 240000000 0 0 50000
> >> qspi1 0 0 0 240000000 0 0 50000
> >> ldb_di1_div_7 0 0 0 68571428 0 0 50000
> >> ldb_di1 0 0 0 68571428 0 0 50000
> >> ldb_di1_div_3_5 0 0 0 137142857 0 0 50000
> >> periph2_clk2_sel 0 0 0 480000000 0 0 50000
> >> periph2_clk2 0 0 0 480000000 0 0 50000
> >> pll3_60m 0 0 0 60000000 0 0 50000
> >> can_sel 0 0 0 60000000 0 0 50000
> >> can_podf 0 0 0 30000000 0 0 50000
> >> can2_serial 0 0 0 30000000 0 0 50000
> >> can1_serial 0 0 0 30000000 0 0 50000
> >> ecspi_sel 0 0 0 60000000 0 0 50000
> >> ecspi_podf 0 0 0 60000000 0 0 50000
> >> ecspi4 0 0 0 60000000 0 0 50000
> >> ecspi3 0 0 0 60000000 0 0 50000
> >> ecspi2 0 0 0 60000000 0 0 50000
> >> ecspi1 0 0 0 60000000 0 0 50000
> >> pll3_80m 1 1 0 80000000 0 0 50000
> >> uart_sel 1 1 0 80000000 0 0 50000
> >> uart_podf 1 1 0 80000000 0 0 50000
> >> uart8_serial 0 0 0 80000000 0 0 50000
> >> uart7_serial 0 0 0 80000000 0 0 50000
> >> uart1_serial 1 2 0 80000000 0 0 50000
> >> uart6_serial 0 0 0 80000000 0 0 50000
> >> uart5_serial 0 0 0 80000000 0 0 50000
> >> uart4_serial 0 0 0 80000000 0 0 50000
> >> uart3_serial 0 0 0 80000000 0 0 50000
> >> uart2_serial 0 0 0 80000000 0 0 50000
> >> pll3_pfd3_454m 0 0 0 454736842 0 0 50000
> >> pll3_pfd2_508m 0 0 0 508235294 0 0 50000
> >> epdc_pre_sel 0 0 0 508235294 0 0 50000
> >> epdc_podf 0 0 0 254117647 0 0 50000
> >> epdc_pix 0 0 0 254117647 0 0 50000
> >> epdc_sel 0 0 0 254117647 0 0 50000
> >> sai1_sel 0 0 0 508235294 0 0 50000
> >> sai1_pred 0 0 0 127058824 0 0 50000
> >> sai1_podf 0 0 0 63529412 0 0 50000
> >> sai1 0 0 0 63529412 0 0 50000
> >> sai2_sel 0 0 0 508235294 0 0 50000
> >> sai2_pred 0 0 0 127058824 0 0 50000
> >> sai2_podf 0 0 0 63529412 0 0 50000
> >> sai2 0 0 0 63529412 0 0 50000
> >> sai3_sel 0 0 0 508235294 0 0 50000
> >> sai3_pred 0 0 0 127058824 0 0 50000
> >> sai3_podf 0 0 0 63529412 0 0 50000
> >> sai3 0 0 0 63529412 0 0 50000
> >> pll3_pfd1_540m 0 0 0 540000000 0 0 50000
> >> lcdif_pre_sel 0 0 0 540000000 0 0 50000
> >> lcdif_pred 0 0 0 270000000 0 0 50000
> >> lcdif_podf 0 0 0 135000000 0 0 50000
> >> lcdif_pix 0 0 0 135000000 0 0 50000
> >> iomuxc 0 0 0 135000000 0 0 50000
> >> lcdif_sel 0 0 0 135000000 0 0 50000
> >> pll3_pfd0_720m 0 0 0 720000000 0 0 50000
> >> usbphy1 1 1 0 480000000 0 0 50000
> >> pll2 1 1 0 528000000 0 0 50000
> >> pll2_bypass 1 1 0 528000000 0 0 50000
> >> pll2_bus 2 2 0 528000000 0 0 50000
> >> ca7_secondary_sel 0 0 0 528000000 0 0 50000
> >> step 0 0 0 528000000 0 0 50000
> >> periph_pre 1 1 0 528000000 0 0 50000
> >> periph 3 3 0 528000000 0 0 50000
> >> ahb 7 7 0 132000000 0 0 50000
> >> sdma 0 0 0 132000000 0 0 50000
> >> rom 1 1 0 132000000 0 0 50000
> >> esai_mem 0 0 0 132000000 0 0 50000
> >> esai_ipg 0 0 0 132000000 0 0 50000
> >> aips_tz3 1 1 0 132000000 0 0 50000
> >> enet_ahb 2 2 0 132000000 0 0 50000
> >> dcp 0 0 0 132000000 0 0 50000
> >> asrc_mem 0 0 0 132000000 0 0 50000
> >> asrc_ipg 0 0 0 132000000 0 0 50000
> >> aips_tz2 1 1 0 132000000 0 0 50000
> >> aips_tz1 1 1 0 132000000 0 0 50000
> >> ipg 10 10 0 66000000 0 0 50000
> >> wdog3 0 0 0 66000000 0 0 50000
> >> uart8_ipg 0 0 0 66000000 0 0 50000
> >> usboh3 2 2 0 66000000 0 0 50000
> >> sai2_ipg 0 0 0 66000000 0 0 50000
> >> sai1_ipg 0 0 0 66000000 0 0 50000
> >> uart7_ipg 0 0 0 66000000 0 0 50000
> >> uart1_ipg 1 2 0 66000000 0 0 50000
> >> sai3_ipg 0 0 0 66000000 0 0 50000
> >> spdif_gclk 0 0 0 66000000 0 0 50000
> >> spba 0 0 0 66000000 0 0 50000
> >> wdog2 0 0 0 66000000 0 0 50000
> >> kpp 0 0 0 66000000 0 0 50000
> >> mmdc_p1_ipg 0 0 0 66000000 0 0 50000
> >> mmdc_p0_ipg 2 2 0 66000000 0 0 50000
> >> wdog1 1 1 0 66000000 0 0 50000
> >> gpio4 1 1 0 66000000 0 0 50000
> >> uart6_ipg 0 0 0 66000000 0 0 50000
> >> uart5_ipg 0 0 0 66000000 0 0 50000
> >> gpio3 1 1 0 66000000 0 0 50000
> >> ocotp 0 0 0 66000000 0 0 50000
> >> gpio5 1 1 0 66000000 0 0 50000
> >> gpio1 1 1 0 66000000 0 0 50000
> >> uart4_ipg 0 0 0 66000000 0 0 50000
> >> adc1 0 0 0 66000000 0 0 50000
> >> uart3_ipg 0 0 0 66000000 0 0 50000
> >> adc2 0 0 0 66000000 0 0 50000
> >> gpio2 1 1 0 66000000 0 0 50000
> >> uart2_ipg 0 0 0 66000000 0 0 50000
> >> can2_ipg 0 0 0 66000000 0 0 50000
> >> can1_ipg 0 0 0 66000000 0 0 50000
> >> enet 2 2 0 66000000 0 0 50000
> >> axi_sel 1 1 0 528000000 0 0 50000
> >> axi_podf 2 2 0 264000000 0 0 50000
> >> axi 1 1 0 264000000 0 0 50000
> >> eim_slow_sel 0 0 0 264000000 0 0 50000
> >> eim_slow_podf 0 0 0 132000000 0 0 50000
> >> eim 0 0 0 132000000 0 0 50000
> >> lcdif_apb 0 0 0 264000000 0 0 50000
> >> pxp 0 0 0 264000000 0 0 50000
> >> epdc_aclk 0 0 0 264000000 0 0 50000
> >> pll2_pfd3_594m 0 0 0 594000000 0 0 50000
> >> ldb_di0_sel 0 0 0 594000000 0 0 50000
> >> ldb_di0_div_7 0 0 0 84857142 0 0 50000
> >> ldb_di0 0 0 0 84857142 0 0 50000
> >> ldb_di0_div_3_5 0 0 0 169714285 0 0 50000
> >> pll2_pfd2_396m 2 2 0 396000000 0 0 50000
> >> enfc_sel 0 0 0 396000000 0 0 50000
> >> enfc_pred 0 0 0 99000000 0 0 50000
> >> enfc_podf 0 0 0 99000000 0 0 50000
> >> gpmi_io 0 0 0 99000000 0 0 50000
> >> usdhc1_sel 0 0 0 396000000 0 0 50000
> >> usdhc1_podf 0 0 0 198000000 0 0 50000
> >> usdhc1 0 0 0 198000000 0 0 50000
> >> usdhc2_sel 0 0 0 396000000 0 0 50000
> >> usdhc2_podf 0 0 0 198000000 0 0 50000
> >> usdhc2 0 0 0 198000000 0 0 50000
> >> bch_sel 1 1 0 396000000 0 0 50000
> >> bch_podf 1 1 0 99000000 0 0 50000
> >> gpmi_apb 0 0 0 99000000 0 0 50000
> >> gpmi_bch_apb 0 0 0 99000000 0 0 50000
> >> per_bch 0 0 0 99000000 0 0 50000
> >> apbh_dma 1 1 0 99000000 0 0 50000
> >> gpmi_sel 0 0 0 396000000 0 0 50000
> >> gpmi_podf 0 0 0 99000000 0 0 50000
> >> gpmi_bch 0 0 0 99000000 0 0 50000
> >> periph2_pre 1 1 0 396000000 0 0 50000
> >> periph2 2 2 0 396000000 0 0 50000
> >> mmdc_podf 2 2 0 396000000 0 0 50000
> >> mmdc_p0_fast 1 1 0 396000000 0 0 50000
> >> axi_alt_sel 0 0 0 396000000 0 0 50000
> >> pll2_198m 0 0 0 198000000 0 0 50000
> >> pll2_pfd1_594m 0 0 0 594000000 0 0 50000
> >> pll2_pfd0_352m 0 0 0 352000000 0 0 50000
> >> pll1 1 1 0 900000000 0 0 50000
> >> pll1_bypass 1 1 0 900000000 0 0 50000
> >> pll1_sys 1 1 0 900000000 0 0 50000
> >> pll1_sw 1 1 0 900000000 0 0 50000
> >> arm 1 1 0 900000000 0 0 50000
> >> pll7_bypass_src 0 0 0 24000000 0 0 50000
> >> pll6_bypass_src 0 0 0 24000000 0 0 50000
> >> pll5_bypass_src 0 0 0 24000000 0 0 50000
> >> pll4_bypass_src 0 0 0 24000000 0 0 50000
> >> pll3_bypass_src 0 0 0 24000000 0 0 50000
> >> pll2_bypass_src 0 0 0 24000000 0 0 50000
> >> pll1_bypass_src 0 0 0 24000000 0 0 50000
> >> ckil 0 0 0 32768 0 0 50000
> >>
> >>
> >> Note that this was generated on a normal boot up (not failure).
> >
> > The values look good. Can you try with the below diff applied?
> > --->8---
> > diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
> > index 334fe3130285..9771f6a82abe 100644
> > --- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
> > +++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
> > @@ -721,12 +721,10 @@ static void gpmi_nfc_apply_timings(struct gpmi_nand_data *this)
> >  	writel(hw->ctrl1n, gpmi_regs + HW_GPMI_CTRL1_SET);
> >
> >  	/* Wait 64 clock cycles before using the GPMI after enabling the DLL */
> > -	dll_wait_time_us = USEC_PER_SEC / hw->clk_rate * 64;
> > -	if (!dll_wait_time_us)
> > -		dll_wait_time_us = 1;
> > +	dll_wait_time_us = DIV_ROUND_UP(USEC_PER_SEC * 64, hw->clk_rate);
> >
> >  	/* Wait for the DLL to settle. */
> > -	udelay(dll_wait_time_us);
> > +	usleep_range(dll_wait_time_us, dll_wait_time_us * 10);
> >  }
> >
> >  static int gpmi_setup_data_interface(struct nand_chip *chip, int chipnr,
>
> Eventually it failed, in the same way with the same errors.
> Took quite a while, over 600 boot cycles.
>
> Note also that I had to hand merge the changes, since in 5.1.14 that
> gpmi_nfc_apply_timings() is in gpmi-lib.c. But it was trivial to do.

Oh well. I guess the next thing to do would be to dump the timing regs
and clk rate that are set by the bootloader (before the driver overrides
them) or those applied by an older kernel (one that didn't have that
issue).
Hi Boris,

On 2/8/19 10:51 pm, Boris Brezillon wrote:
> On Fri, 2 Aug 2019 22:34:57 +1000
> Greg Ungerer <gerg@kernel.org> wrote:
>> On 31/7/19 4:28 pm, Boris Brezillon wrote:
>>> [snip]
>>> The values look good. Can you try with the below diff applied?
>>> [snip]
>>
>> Eventually it failed, in the same way with the same errors.
>> Took quite a while, over 600 boot cycles.
>>
>> Note also that I had to hand merge the changes, since in 5.1.14 that
>> gpmi_nfc_apply_timings() is in gpmi-lib.c. But it was trivial to do.
>
> Oh well. I guess the next thing to do would be to dump the timing regs
> and clk rate that are set by the bootloader (before the driver overrides
> them) or those applied by an older kernel (one that didn't have that
> issue).

Is this useful?

With attached patch, I get the following dump of the timing
settings in use:

...
drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(490): gpmi_nfc_apply_timings()
HW_GPMI_TIMING0=0x00010203 (calculated=0x00020101)
HW_GPMI_TIMING1=0x00000000 (calculated=0x60000000)
HW_GPMI_CTRL1_SET=0x01c4000c (calculated=0x00000000)
r->clock[0]=22000000 (calculated=22000000)
random: fast init done
nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
nand: Micron MT29F2G08ABAEAWP
nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(490): gpmi_nfc_apply_timings()
HW_GPMI_TIMING0=0x00010203 (calculated=0x00020101)
HW_GPMI_TIMING1=0x00000000 (calculated=0xb0000000)
HW_GPMI_CTRL1_SET=0x01c4000c (calculated=0x00000000)
r->clock[0]=22000000 (calculated=22000000)
drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(490): gpmi_nfc_apply_timings()
HW_GPMI_TIMING0=0x00010203 (calculated=0x00010101)
HW_GPMI_TIMING1=0x00000000 (calculated=0xe0000000)
HW_GPMI_CTRL1_SET=0x01c4000c (calculated=0x00c28000)
r->clock[0]=22000000 (calculated=100000000)
Scanning device for bad blocks
5 fixed-partitions partitions found on MTD device gpmi-nand
Creating 5 MTD partitions on "gpmi-nand":
0x000000000000-0x000000500000 : "u-boot"
0x000000500000-0x000000600000 : "u-boot-env"
0x000000600000-0x000000800000 : "log"
0x000000800000-0x000010000000 : "flash"
0x000000000000-0x000010000000 : "all"
gpmi-nand 1806000.gpmi-nand: driver registered.
...

Regards
Greg
Hi Greg,

Greg Ungerer <gerg@kernel.org> wrote on Mon, 5 Aug 2019 15:51:05 +1000:

> Hi Boris,
>
> On 2/8/19 10:51 pm, Boris Brezillon wrote:
> > [snip]
> > Oh well. I guess the next thing to do would be to dump the timing regs
> > and clk rate that are set by the bootloader (before the driver overrides
> > them) or those applied by an older kernel (one that didn't have that
> > issue).
>
> Is this useful?
>
> With attached patch, I get the following dump of the timing
> settings in use:
>
> ...
> drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(490): gpmi_nfc_apply_timings()
> HW_GPMI_TIMING0=0x00010203 (calculated=0x00020101)
> HW_GPMI_TIMING1=0x00000000 (calculated=0x60000000)
> HW_GPMI_CTRL1_SET=0x01c4000c (calculated=0x00000000)
> r->clock[0]=22000000 (calculated=22000000)
> random: fast init done
> nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
> nand: Micron MT29F2G08ABAEAWP
> nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
> drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(490): gpmi_nfc_apply_timings()
> HW_GPMI_TIMING0=0x00010203 (calculated=0x00020101)
> HW_GPMI_TIMING1=0x00000000 (calculated=0xb0000000)
> HW_GPMI_CTRL1_SET=0x01c4000c (calculated=0x00000000)
> r->clock[0]=22000000 (calculated=22000000)
> drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(490): gpmi_nfc_apply_timings()
> HW_GPMI_TIMING0=0x00010203 (calculated=0x00010101)
> HW_GPMI_TIMING1=0x00000000 (calculated=0xe0000000)
> HW_GPMI_CTRL1_SET=0x01c4000c (calculated=0x00c28000)
> r->clock[0]=22000000 (calculated=100000000)

Why are the registers not updated? Is it the same situation when we
get all the failures?
Hi Miquel,

On 8/8/19 2:05 am, Miquel Raynal wrote:
> Greg Ungerer <gerg@kernel.org> wrote on Mon, 5 Aug 2019 15:51:05 +1000:
>> [snip]
>> With attached patch, I get the following dump of the timing
>> settings in use:
>> [snip]
>
> Why are the registers not updated? Is it the same situation when we
> get all the failures?

As per the patch that was attached to that email, the setting of the
registers and clock was "#if 0"-ed out so you could see what the
power-up/boot-loader settings are.

Those settings work reliably with no NAND failures. In between running
various tests I have left my hardware boot cycle testing with those
settings. I don't have an exact number but it has probably run at least
100 hours and tens of thousands of boots with no problem using those.

Or am I misunderstanding your question?

Regards
Greg
On Mon, 5 Aug 2019 15:51:05 +1000
Greg Ungerer <gerg@kernel.org> wrote:

> Hi Boris,
>
> On 2/8/19 10:51 pm, Boris Brezillon wrote:
> > [snip]
> > Oh well. I guess the next thing to do would be to dump the timing regs
> > and clk rate that are set by the bootloader (before the driver overrides
> > them) or those applied by an older kernel (one that didn't have that
> > issue).
>
> Is this useful?

Hm, looks like it's configured in mode 0, so no, it's not super useful.
Can you try booting an older kernel (one that didn't have the
->setup_data_interface() hook implemented)?

> With attached patch, I get the following dump of the timing
> settings in use:
>
> [snip]
Hi Boris,

On 9/8/19 2:36 am, Boris Brezillon wrote:
> On Mon, 5 Aug 2019 15:51:05 +1000
> Greg Ungerer <gerg@kernel.org> wrote:
>> [snip]
>> Is this useful?
>
> Hm, looks like it's configured in mode 0, so no, it's not super useful.
> Can you try booting an older kernel (one that didn't have the
> ->setup_data_interface() hook implemented)?

Ok. I went back from 5.1 and the first kernel I could find that
returned no grep hits for "setup_data_interface" was 4.16.

So I built for my target with that and added similar trace to dump
the hardware register settings for that. Debug output looks like
this now for it:

...
drivers/mtd/nand/gpmi-nand/gpmi-nand.c(807): gpmi_get_clks()
clk_get_rate(r->clock[0])=22000000
drivers/mtd/nand/gpmi-nand/gpmi-lib.c(1054): gpmi_begin()
HW_GPMI_TIMING0=0x00010203
HW_GPMI_TIMING1=0x05000000
nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
nand: Micron MT29F2G08ABAEAWP
nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
drivers/mtd/nand/gpmi-nand/gpmi-lib.c(966): enable_edo_mode()
clk_get_rate(r->clock[0])=99000000
gpmi-nand 1806000.gpmi-nand: enable the asynchronous EDO mode 5
drivers/mtd/nand/gpmi-nand/gpmi-lib.c(1054): gpmi_begin()
HW_GPMI_TIMING0=0x00010101
HW_GPMI_TIMING1=0x90000000
Scanning device for bad blocks
5 ofpart partitions found on MTD device gpmi-nand
Creating 5 MTD partitions on "gpmi-nand":
0x000000000000-0x000000500000 : "u-boot"
0x000000500000-0x000000600000 : "u-boot-env"
0x000000600000-0x000000800000 : "log"
0x000000800000-0x000010000000 : "flash"
0x000000000000-0x000010000000 : "all"
gpmi-nand 1806000.gpmi-nand: driver registered.
...

I am running boot tests now on this to confirm it actually runs reliably.

Regards
Greg

>> With attached patch, I get the following dump of the timing
>> settings in use:
>>
>> [snip]
On Fri, 9 Aug 2019 15:20:52 +1000
Greg Ungerer <gerg@kernel.org> wrote:

> Hi Boris,
>
> On 9/8/19 2:36 am, Boris Brezillon wrote:
> > [snip]
> > Can you try booting an older kernel (one that didn't have the
> > ->setup_data_interface() hook implemented)?
>
> Ok. I went back from 5.1 and the first kernel I could find that
> returned no grep hits for "setup_data_interface" was 4.16.
>
> So I built for my target with that and added similar trace to dump
> the hardware register settings for that. Debug output looks like
> this now for it:
>
> ...
> drivers/mtd/nand/gpmi-nand/gpmi-nand.c(807): gpmi_get_clks()
> clk_get_rate(r->clock[0])=22000000
> drivers/mtd/nand/gpmi-nand/gpmi-lib.c(1054): gpmi_begin()
> HW_GPMI_TIMING0=0x00010203
> HW_GPMI_TIMING1=0x05000000
> nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
> nand: Micron MT29F2G08ABAEAWP
> nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
> drivers/mtd/nand/gpmi-nand/gpmi-lib.c(966): enable_edo_mode()
> clk_get_rate(r->clock[0])=99000000
> gpmi-nand 1806000.gpmi-nand: enable the asynchronous EDO mode 5
> drivers/mtd/nand/gpmi-nand/gpmi-lib.c(1054): gpmi_begin()
> HW_GPMI_TIMING0=0x00010101

TIMING0 matches the one you have with 5.1 kernels.

> HW_GPMI_TIMING1=0x90000000

And we even have a bigger timeout value in 5.1 (0xe0000000), so we
should be all safe WRT timings in TIMING{0,1}.

Can you dump CTRL1?

> Scanning device for bad blocks
> 5 ofpart partitions found on MTD device gpmi-nand
> Creating 5 MTD partitions on "gpmi-nand":
> 0x000000000000-0x000000500000 : "u-boot"
> 0x000000500000-0x000000600000 : "u-boot-env"
> 0x000000600000-0x000000800000 : "log"
> 0x000000800000-0x000010000000 : "flash"
> 0x000000000000-0x000010000000 : "all"
> gpmi-nand 1806000.gpmi-nand: driver registered.
> ...
On 9/8/19 4:23 pm, Boris Brezillon wrote: > On Fri, 9 Aug 2019 15:20:52 +1000 > Greg Ungerer <gerg@kernel.org> wrote: >> On 9/8/19 2:36 am, Boris Brezillon wrote: >>> On Mon, 5 Aug 2019 15:51:05 +1000 >>> Greg Ungerer <gerg@kernel.org> wrote: >>>> On 2/8/19 10:51 pm, Boris Brezillon wrote: >>>>> On Fri, 2 Aug 2019 22:34:57 +1000 >>>>> Greg Ungerer <gerg@kernel.org> wrote: >>>>>> On 31/7/19 4:28 pm, Boris Brezillon wrote: >>>>>>> On Wed, 31 Jul 2019 12:05:44 +1000 >>>>>>> Greg Ungerer <gerg@kernel.org> wrote: >>>>>>>> On 30/7/19 6:38 pm, Miquel Raynal wrote: >>>>>>>>> Greg Ungerer <gerg@kernel.org> wrote on Tue, 30 Jul 2019 16:06:55 +1000: >>>>>>>>>> On 30/7/19 10:41 am, Greg Ungerer wrote: >>>>>>>>>>> On 30/7/19 10:28 am, Greg Ungerer wrote: >>>>>>>>>>>> On 29/7/19 10:47 pm, Miquel Raynal wrote: >>>>>>>>>>>>> Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 22:33:56 +1000: >>>>>>>>>>>>>> On 29/7/19 6:36 pm, Miquel Raynal wrote: >>>>>>>>>>>>>>> Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 16:41:51 +1000: >>>>>>>>>>> [snip] >>>>>>>> Note that this was generated on a normal boot up (not failure). >>>>>>> >>>>>>> The values looks good. Can you try with the below diff applied? >>>>>>> --->8--- >>>>>>> diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c >>>>>>> index 334fe3130285..9771f6a82abe 100644 >>>>>>> --- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c >>>>>>> +++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c >>>>>>> @@ -721,12 +721,10 @@ static void gpmi_nfc_apply_timings(struct gpmi_nand_data *this) >>>>>>> writel(hw->ctrl1n, gpmi_regs + HW_GPMI_CTRL1_SET); >>>>>>> >>>>>>> /* Wait 64 clock cycles before using the GPMI after enabling the DLL */ >>>>>>> - dll_wait_time_us = USEC_PER_SEC / hw->clk_rate * 64; >>>>>>> - if (!dll_wait_time_us) >>>>>>> - dll_wait_time_us = 1; >>>>>>> + dll_wait_time_us = DIV_ROUND_UP(USEC_PER_SEC * 64, hw->clk_rate); >>>>>>> >>>>>>> /* Wait for the DLL to settle. 
*/ >>>>>>> - udelay(dll_wait_time_us); >>>>>>> + usleep_range(dll_wait_time_us, dll_wait_time_us * 10); >>>>>>> } >>>>>>> >>>>>>> static int gpmi_setup_data_interface(struct nand_chip *chip, int chipnr, >>>>>> >>>>>> Eventually it failed, in the same way with with same errors. >>>>>> Took quite a while, over 600 boot cycles. >>>>>> >>>>>> Note also that I had to hand merge the changes, since in 5.1.14 that >>>>>> gpmi_nfc_apply_timings() is in gpmi-lib.c. But it was trivial to do. >>>>> >>>>> Oh well. I guess the next thing to do would be to dump the timing regs >>>>> and clk rate that are set by the bootloader (before the driver override >>>>> them) or those applied by an older kernel (one that didn't have that >>>>> issue). >>>> >>>> Is this useful? >>> >>> Hm, looks like it's configured in mode 0, so no, it's not super useful. >>> Can you try booting an older kernel (one that didn't have the >>> ->setup_data_interface() hook implemented). >> >> Ok. I went back from 5.1 and the first kernel I could find that >> returned no grep hits for "setup_data_interface" was 4.16. >> >> So I built for my target with that and added similar trace to dump >> the hardware register settings for that. Debug output looks like >> this now for it: >> >> ... >> drivers/mtd/nand/gpmi-nand/gpmi-nand.c(807): gpmi_get_clks() >> clk_get_rate(r->clock[0])=22000000 >> drivers/mtd/nand/gpmi-nand/gpmi-lib.c(1054): gpmi_begin() >> HW_GPMI_TIMING0=0x00010203 >> HW_GPMI_TIMING1=0x05000000 >> nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda >> nand: Micron MT29F2G08ABAEAWP >> nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64 >> drivers/mtd/nand/gpmi-nand/gpmi-lib.c(966): enable_edo_mode() >> clk_get_rate(r->clock[0])=99000000 >> gpmi-nand 1806000.gpmi-nand: enable the asynchronous EDO mode 5 >> drivers/mtd/nand/gpmi-nand/gpmi-lib.c(1054): gpmi_begin() >> HW_GPMI_TIMING0=0x00010101 > > TIMING0 match the one you have with 5.1 kernels. 
> >> HW_GPMI_TIMING1=0x90000000 > > And we even have a bigger timeout value in 5.1 (0xe0000000), so we > should be all safe WRT to timings in TIMING{0,1}. > > Can you dump CTRL1? drivers/mtd/nand/gpmi-nand/gpmi-lib.c(1054): gpmi_begin() HW_GPMI_TIMING0=0x00010101 HW_GPMI_TIMING1=0x90000000 HW_GPMI_CTRL1_SET=0x01c4800c Scanning device for bad blocks 5 ofpart partitions found on MTD device gpmi-nand Creating 5 MTD partitions on "gpmi-nand": 0x000000000000-0x000000500000 : "u-boot" 0x000000500000-0x000000600000 : "u-boot-env" 0x000000600000-0x000000800000 : "log" 0x000000800000-0x000010000000 : "flash" 0x000000000000-0x000010000000 : "all" gpmi-nand 1806000.gpmi-nand: driver registered. Regards Greg >> Scanning device for bad blocks >> 5 ofpart partitions found on MTD device gpmi-nand >> Creating 5 MTD partitions on "gpmi-nand": >> 0x000000000000-0x000000500000 : "u-boot" >> 0x000000500000-0x000000600000 : "u-boot-env" >> 0x000000600000-0x000000800000 : "log" >> 0x000000800000-0x000010000000 : "flash" >> 0x000000000000-0x000010000000 : "all" >> gpmi-nand 1806000.gpmi-nand: driver registered. >> ... >> > >
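For reference, the DLL settle-time arithmetic changed by the diff quoted above can be checked outside the kernel. This is only a sketch: the helper names dll_wait_us_old() and dll_wait_us_new() are made up for illustration, and the rounding expression is written out by hand on the assumption that it matches what DIV_ROUND_UP() does. It is arithmetic only, not a diagnosis of the timeout.

```c
#include <stdint.h>

#define USEC_PER_SEC 1000000ULL

/*
 * Pre-diff computation: the division runs first, so any clock above
 * 1 MHz truncates to 0 and the result is clamped to 1 us.
 * clk_rate must be non-zero.
 */
static uint64_t dll_wait_us_old(uint64_t clk_rate)
{
	uint64_t us = USEC_PER_SEC / clk_rate * 64;

	if (!us)
		us = 1;
	return us;
}

/*
 * Post-diff computation: multiply first, then round up (hand-written
 * equivalent of DIV_ROUND_UP(USEC_PER_SEC * 64, clk_rate)).
 */
static uint64_t dll_wait_us_new(uint64_t clk_rate)
{
	return (USEC_PER_SEC * 64 + clk_rate - 1) / clk_rate;
}
```

With the two clock rates seen in the trace (22000000 and 99000000 Hz), the old formula yields 1 us in both cases, while the new one yields 3 us and 1 us. 64 cycles take roughly 2.9 us at 22 MHz and roughly 0.65 us at 99 MHz, so the two formulas only disagree at the lower rate.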
On Fri, 9 Aug 2019 16:55:22 +1000 Greg Ungerer <gerg@kernel.org> wrote: > On 9/8/19 4:23 pm, Boris Brezillon wrote: > > On Fri, 9 Aug 2019 15:20:52 +1000 > > Greg Ungerer <gerg@kernel.org> wrote: > >> On 9/8/19 2:36 am, Boris Brezillon wrote: > >>> On Mon, 5 Aug 2019 15:51:05 +1000 > >>> Greg Ungerer <gerg@kernel.org> wrote: > >>>> On 2/8/19 10:51 pm, Boris Brezillon wrote: > >>>>> On Fri, 2 Aug 2019 22:34:57 +1000 > >>>>> Greg Ungerer <gerg@kernel.org> wrote: > >>>>>> On 31/7/19 4:28 pm, Boris Brezillon wrote: > >>>>>>> On Wed, 31 Jul 2019 12:05:44 +1000 > >>>>>>> Greg Ungerer <gerg@kernel.org> wrote: > >>>>>>>> On 30/7/19 6:38 pm, Miquel Raynal wrote: > >>>>>>>>> Greg Ungerer <gerg@kernel.org> wrote on Tue, 30 Jul 2019 16:06:55 +1000: > >>>>>>>>>> On 30/7/19 10:41 am, Greg Ungerer wrote: > >>>>>>>>>>> On 30/7/19 10:28 am, Greg Ungerer wrote: > >>>>>>>>>>>> On 29/7/19 10:47 pm, Miquel Raynal wrote: > >>>>>>>>>>>>> Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 22:33:56 +1000: > >>>>>>>>>>>>>> On 29/7/19 6:36 pm, Miquel Raynal wrote: > >>>>>>>>>>>>>>> Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 16:41:51 +1000: > >>>>>>>>>>> [snip] > >>>>>>>> Note that this was generated on a normal boot up (not failure). > >>>>>>> > >>>>>>> The values looks good. Can you try with the below diff applied? 
> >>>>>>> --->8--- > >>>>>>> diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c > >>>>>>> index 334fe3130285..9771f6a82abe 100644 > >>>>>>> --- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c > >>>>>>> +++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c > >>>>>>> @@ -721,12 +721,10 @@ static void gpmi_nfc_apply_timings(struct gpmi_nand_data *this) > >>>>>>> writel(hw->ctrl1n, gpmi_regs + HW_GPMI_CTRL1_SET); > >>>>>>> > >>>>>>> /* Wait 64 clock cycles before using the GPMI after enabling the DLL */ > >>>>>>> - dll_wait_time_us = USEC_PER_SEC / hw->clk_rate * 64; > >>>>>>> - if (!dll_wait_time_us) > >>>>>>> - dll_wait_time_us = 1; > >>>>>>> + dll_wait_time_us = DIV_ROUND_UP(USEC_PER_SEC * 64, hw->clk_rate); > >>>>>>> > >>>>>>> /* Wait for the DLL to settle. */ > >>>>>>> - udelay(dll_wait_time_us); > >>>>>>> + usleep_range(dll_wait_time_us, dll_wait_time_us * 10); > >>>>>>> } > >>>>>>> > >>>>>>> static int gpmi_setup_data_interface(struct nand_chip *chip, int chipnr, > >>>>>> > >>>>>> Eventually it failed, in the same way with with same errors. > >>>>>> Took quite a while, over 600 boot cycles. > >>>>>> > >>>>>> Note also that I had to hand merge the changes, since in 5.1.14 that > >>>>>> gpmi_nfc_apply_timings() is in gpmi-lib.c. But it was trivial to do. > >>>>> > >>>>> Oh well. I guess the next thing to do would be to dump the timing regs > >>>>> and clk rate that are set by the bootloader (before the driver override > >>>>> them) or those applied by an older kernel (one that didn't have that > >>>>> issue). > >>>> > >>>> Is this useful? > >>> > >>> Hm, looks like it's configured in mode 0, so no, it's not super useful. > >>> Can you try booting an older kernel (one that didn't have the > >>> ->setup_data_interface() hook implemented). > >> > >> Ok. I went back from 5.1 and the first kernel I could find that > >> returned no grep hits for "setup_data_interface" was 4.16. 
> >>
> >> So I built for my target with that and added similar trace to dump
> >> the hardware register settings for that. Debug output looks like
> >> this now for it:
> >>
> >> ...
> >> drivers/mtd/nand/gpmi-nand/gpmi-nand.c(807): gpmi_get_clks()
> >> clk_get_rate(r->clock[0])=22000000
> >> drivers/mtd/nand/gpmi-nand/gpmi-lib.c(1054): gpmi_begin()
> >> HW_GPMI_TIMING0=0x00010203
> >> HW_GPMI_TIMING1=0x05000000
> >> nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
> >> nand: Micron MT29F2G08ABAEAWP
> >> nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
> >> drivers/mtd/nand/gpmi-nand/gpmi-lib.c(966): enable_edo_mode()
> >> clk_get_rate(r->clock[0])=99000000
> >> gpmi-nand 1806000.gpmi-nand: enable the asynchronous EDO mode 5
> >> drivers/mtd/nand/gpmi-nand/gpmi-lib.c(1054): gpmi_begin()
> >> HW_GPMI_TIMING0=0x00010101
> >
> > TIMING0 match the one you have with 5.1 kernels.
> >
> >> HW_GPMI_TIMING1=0x90000000
> >
> > And we even have a bigger timeout value in 5.1 (0xe0000000), so we
> > should be all safe WRT to timings in TIMING{0,1}.
> >
> > Can you dump CTRL1?
>
> drivers/mtd/nand/gpmi-nand/gpmi-lib.c(1054): gpmi_begin()
> HW_GPMI_TIMING0=0x00010101
> HW_GPMI_TIMING1=0x90000000
> HW_GPMI_CTRL1_SET=0x01c4800c

The read/write delay fields seem to match, but there are a few more
fields set in this version:
- DECOUPLE_CS
- BCH_MODE
- DEV_RESET
- CTRL1_ATA_IRQRDY_POLARITY__ACTIVEHIGH

Looks like those fields are not explicitly set in the gpmi_begin()
patch, but maybe you dumped CTRL1. Would you mind sharing your patch?

If I'm right and you indeed dumped CTRL1, it might be worth doing the
same in your 5.1 kernel so we can more easily compare those dumps.
While at it, can you dump CTRL1 before and after applying the changes?
> Scanning device for bad blocks > 5 ofpart partitions found on MTD device gpmi-nand > Creating 5 MTD partitions on "gpmi-nand": > 0x000000000000-0x000000500000 : "u-boot" > 0x000000500000-0x000000600000 : "u-boot-env" > 0x000000600000-0x000000800000 : "log" > 0x000000800000-0x000010000000 : "flash" > 0x000000000000-0x000010000000 : "all" > gpmi-nand 1806000.gpmi-nand: driver registered. > > Regards > Greg > > > >> Scanning device for bad blocks > >> 5 ofpart partitions found on MTD device gpmi-nand > >> Creating 5 MTD partitions on "gpmi-nand": > >> 0x000000000000-0x000000500000 : "u-boot" > >> 0x000000500000-0x000000600000 : "u-boot-env" > >> 0x000000600000-0x000000800000 : "log" > >> 0x000000800000-0x000010000000 : "flash" > >> 0x000000000000-0x000010000000 : "all" > >> gpmi-nand 1806000.gpmi-nand: driver registered. > >> ... > >> > > > >
Hi Boris, On 9/8/19 5:32 pm, Boris Brezillon wrote: > On Fri, 9 Aug 2019 16:55:22 +1000 > Greg Ungerer <gerg@kernel.org> wrote: >> On 9/8/19 4:23 pm, Boris Brezillon wrote: >>> On Fri, 9 Aug 2019 15:20:52 +1000 >>> Greg Ungerer <gerg@kernel.org> wrote: >>>> On 9/8/19 2:36 am, Boris Brezillon wrote: >>>>> On Mon, 5 Aug 2019 15:51:05 +1000 >>>>> Greg Ungerer <gerg@kernel.org> wrote: >>>>>> On 2/8/19 10:51 pm, Boris Brezillon wrote: >>>>>>> On Fri, 2 Aug 2019 22:34:57 +1000 >>>>>>> Greg Ungerer <gerg@kernel.org> wrote: >>>>>>>> On 31/7/19 4:28 pm, Boris Brezillon wrote: >>>>>>>>> On Wed, 31 Jul 2019 12:05:44 +1000 >>>>>>>>> Greg Ungerer <gerg@kernel.org> wrote: >>>>>>>>>> On 30/7/19 6:38 pm, Miquel Raynal wrote: >>>>>>>>>>> Greg Ungerer <gerg@kernel.org> wrote on Tue, 30 Jul 2019 16:06:55 +1000: >>>>>>>>>>>> On 30/7/19 10:41 am, Greg Ungerer wrote: >>>>>>>>>>>>> On 30/7/19 10:28 am, Greg Ungerer wrote: >>>>>>>>>>>>>> On 29/7/19 10:47 pm, Miquel Raynal wrote: >>>>>>>>>>>>>>> Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 22:33:56 +1000: >>>>>>>>>>>>>>>> On 29/7/19 6:36 pm, Miquel Raynal wrote: >>>>>>>>>>>>>>>>> Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 16:41:51 +1000: >>>>>>>>>>>>> [snip] >>>>>>>>>> Note that this was generated on a normal boot up (not failure). >>>>>>>>> >>>>>>>>> The values looks good. Can you try with the below diff applied? 
>>>>>>>>> --->8--- >>>>>>>>> diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c >>>>>>>>> index 334fe3130285..9771f6a82abe 100644 >>>>>>>>> --- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c >>>>>>>>> +++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c >>>>>>>>> @@ -721,12 +721,10 @@ static void gpmi_nfc_apply_timings(struct gpmi_nand_data *this) >>>>>>>>> writel(hw->ctrl1n, gpmi_regs + HW_GPMI_CTRL1_SET); >>>>>>>>> >>>>>>>>> /* Wait 64 clock cycles before using the GPMI after enabling the DLL */ >>>>>>>>> - dll_wait_time_us = USEC_PER_SEC / hw->clk_rate * 64; >>>>>>>>> - if (!dll_wait_time_us) >>>>>>>>> - dll_wait_time_us = 1; >>>>>>>>> + dll_wait_time_us = DIV_ROUND_UP(USEC_PER_SEC * 64, hw->clk_rate); >>>>>>>>> >>>>>>>>> /* Wait for the DLL to settle. */ >>>>>>>>> - udelay(dll_wait_time_us); >>>>>>>>> + usleep_range(dll_wait_time_us, dll_wait_time_us * 10); >>>>>>>>> } >>>>>>>>> >>>>>>>>> static int gpmi_setup_data_interface(struct nand_chip *chip, int chipnr, >>>>>>>> >>>>>>>> Eventually it failed, in the same way with with same errors. >>>>>>>> Took quite a while, over 600 boot cycles. >>>>>>>> >>>>>>>> Note also that I had to hand merge the changes, since in 5.1.14 that >>>>>>>> gpmi_nfc_apply_timings() is in gpmi-lib.c. But it was trivial to do. >>>>>>> >>>>>>> Oh well. I guess the next thing to do would be to dump the timing regs >>>>>>> and clk rate that are set by the bootloader (before the driver override >>>>>>> them) or those applied by an older kernel (one that didn't have that >>>>>>> issue). >>>>>> >>>>>> Is this useful? >>>>> >>>>> Hm, looks like it's configured in mode 0, so no, it's not super useful. >>>>> Can you try booting an older kernel (one that didn't have the >>>>> ->setup_data_interface() hook implemented). >>>> >>>> Ok. I went back from 5.1 and the first kernel I could find that >>>> returned no grep hits for "setup_data_interface" was 4.16. 
>>>>
>>>> So I built for my target with that and added similar trace to dump
>>>> the hardware register settings for that. Debug output looks like
>>>> this now for it:
>>>>
>>>> ...
>>>> drivers/mtd/nand/gpmi-nand/gpmi-nand.c(807): gpmi_get_clks()
>>>> clk_get_rate(r->clock[0])=22000000
>>>> drivers/mtd/nand/gpmi-nand/gpmi-lib.c(1054): gpmi_begin()
>>>> HW_GPMI_TIMING0=0x00010203
>>>> HW_GPMI_TIMING1=0x05000000
>>>> nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
>>>> nand: Micron MT29F2G08ABAEAWP
>>>> nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
>>>> drivers/mtd/nand/gpmi-nand/gpmi-lib.c(966): enable_edo_mode()
>>>> clk_get_rate(r->clock[0])=99000000
>>>> gpmi-nand 1806000.gpmi-nand: enable the asynchronous EDO mode 5
>>>> drivers/mtd/nand/gpmi-nand/gpmi-lib.c(1054): gpmi_begin()
>>>> HW_GPMI_TIMING0=0x00010101
>>>
>>> TIMING0 match the one you have with 5.1 kernels.
>>>
>>>> HW_GPMI_TIMING1=0x90000000
>>>
>>> And we even have a bigger timeout value in 5.1 (0xe0000000), so we
>>> should be all safe WRT to timings in TIMING{0,1}.
>>>
>>> Can you dump CTRL1?
>>
>> drivers/mtd/nand/gpmi-nand/gpmi-lib.c(1054): gpmi_begin()
>> HW_GPMI_TIMING0=0x00010101
>> HW_GPMI_TIMING1=0x90000000
>> HW_GPMI_CTRL1_SET=0x01c4800c
>
> The read/write delay fields seem to match, but there are a few more
> fields set in this version:
> - DECOUPLE_CS
> - BCH_MODE
> - DEV_RESET
> - CTRL1_ATA_IRQRDY_POLARITY__ACTIVEHIGH
>
> Looks like those fields are not explicitly set in the gpmi_begin()
> patch, but maybe you dumped CTRL1. Would you mind sharing your patch?

Attached.

> If I'm right and you indeed dumped CTRL1, it might be worth doing the
> same in your 5.1 kernel so we can more easily compare those dumps.
> While at it, can you dump CTRL1 before and after applying the changes?

Will do. I am out of my lab for the weekend, but I'll get those numbers
first thing Monday morning.

Regards
Greg
On Fri, 9 Aug 2019 23:57:08 +1000 Greg Ungerer <gerg@kernel.org> wrote: > Hi Boris, > > On 9/8/19 5:32 pm, Boris Brezillon wrote: > > On Fri, 9 Aug 2019 16:55:22 +1000 > > Greg Ungerer <gerg@kernel.org> wrote: > >> On 9/8/19 4:23 pm, Boris Brezillon wrote: > >>> On Fri, 9 Aug 2019 15:20:52 +1000 > >>> Greg Ungerer <gerg@kernel.org> wrote: > >>>> On 9/8/19 2:36 am, Boris Brezillon wrote: > >>>>> On Mon, 5 Aug 2019 15:51:05 +1000 > >>>>> Greg Ungerer <gerg@kernel.org> wrote: > >>>>>> On 2/8/19 10:51 pm, Boris Brezillon wrote: > >>>>>>> On Fri, 2 Aug 2019 22:34:57 +1000 > >>>>>>> Greg Ungerer <gerg@kernel.org> wrote: > >>>>>>>> On 31/7/19 4:28 pm, Boris Brezillon wrote: > >>>>>>>>> On Wed, 31 Jul 2019 12:05:44 +1000 > >>>>>>>>> Greg Ungerer <gerg@kernel.org> wrote: > >>>>>>>>>> On 30/7/19 6:38 pm, Miquel Raynal wrote: > >>>>>>>>>>> Greg Ungerer <gerg@kernel.org> wrote on Tue, 30 Jul 2019 16:06:55 +1000: > >>>>>>>>>>>> On 30/7/19 10:41 am, Greg Ungerer wrote: > >>>>>>>>>>>>> On 30/7/19 10:28 am, Greg Ungerer wrote: > >>>>>>>>>>>>>> On 29/7/19 10:47 pm, Miquel Raynal wrote: > >>>>>>>>>>>>>>> Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 22:33:56 +1000: > >>>>>>>>>>>>>>>> On 29/7/19 6:36 pm, Miquel Raynal wrote: > >>>>>>>>>>>>>>>>> Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 16:41:51 +1000: > >>>>>>>>>>>>> [snip] > >>>>>>>>>> Note that this was generated on a normal boot up (not failure). > >>>>>>>>> > >>>>>>>>> The values looks good. Can you try with the below diff applied? 
> >>>>>>>>> --->8--- > >>>>>>>>> diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c > >>>>>>>>> index 334fe3130285..9771f6a82abe 100644 > >>>>>>>>> --- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c > >>>>>>>>> +++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c > >>>>>>>>> @@ -721,12 +721,10 @@ static void gpmi_nfc_apply_timings(struct gpmi_nand_data *this) > >>>>>>>>> writel(hw->ctrl1n, gpmi_regs + HW_GPMI_CTRL1_SET); > >>>>>>>>> > >>>>>>>>> /* Wait 64 clock cycles before using the GPMI after enabling the DLL */ > >>>>>>>>> - dll_wait_time_us = USEC_PER_SEC / hw->clk_rate * 64; > >>>>>>>>> - if (!dll_wait_time_us) > >>>>>>>>> - dll_wait_time_us = 1; > >>>>>>>>> + dll_wait_time_us = DIV_ROUND_UP(USEC_PER_SEC * 64, hw->clk_rate); > >>>>>>>>> > >>>>>>>>> /* Wait for the DLL to settle. */ > >>>>>>>>> - udelay(dll_wait_time_us); > >>>>>>>>> + usleep_range(dll_wait_time_us, dll_wait_time_us * 10); > >>>>>>>>> } > >>>>>>>>> > >>>>>>>>> static int gpmi_setup_data_interface(struct nand_chip *chip, int chipnr, > >>>>>>>> > >>>>>>>> Eventually it failed, in the same way with with same errors. > >>>>>>>> Took quite a while, over 600 boot cycles. > >>>>>>>> > >>>>>>>> Note also that I had to hand merge the changes, since in 5.1.14 that > >>>>>>>> gpmi_nfc_apply_timings() is in gpmi-lib.c. But it was trivial to do. > >>>>>>> > >>>>>>> Oh well. I guess the next thing to do would be to dump the timing regs > >>>>>>> and clk rate that are set by the bootloader (before the driver override > >>>>>>> them) or those applied by an older kernel (one that didn't have that > >>>>>>> issue). > >>>>>> > >>>>>> Is this useful? > >>>>> > >>>>> Hm, looks like it's configured in mode 0, so no, it's not super useful. > >>>>> Can you try booting an older kernel (one that didn't have the > >>>>> ->setup_data_interface() hook implemented). > >>>> > >>>> Ok. 
I went back from 5.1 and the first kernel I could find that
> >>>> returned no grep hits for "setup_data_interface" was 4.16.
> >>>>
> >>>> So I built for my target with that and added similar trace to dump
> >>>> the hardware register settings for that. Debug output looks like
> >>>> this now for it:
> >>>>
> >>>> ...
> >>>> drivers/mtd/nand/gpmi-nand/gpmi-nand.c(807): gpmi_get_clks()
> >>>> clk_get_rate(r->clock[0])=22000000
> >>>> drivers/mtd/nand/gpmi-nand/gpmi-lib.c(1054): gpmi_begin()
> >>>> HW_GPMI_TIMING0=0x00010203
> >>>> HW_GPMI_TIMING1=0x05000000
> >>>> nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
> >>>> nand: Micron MT29F2G08ABAEAWP
> >>>> nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
> >>>> drivers/mtd/nand/gpmi-nand/gpmi-lib.c(966): enable_edo_mode()
> >>>> clk_get_rate(r->clock[0])=99000000
> >>>> gpmi-nand 1806000.gpmi-nand: enable the asynchronous EDO mode 5
> >>>> drivers/mtd/nand/gpmi-nand/gpmi-lib.c(1054): gpmi_begin()
> >>>> HW_GPMI_TIMING0=0x00010101
> >>>
> >>> TIMING0 match the one you have with 5.1 kernels.
> >>>
> >>>> HW_GPMI_TIMING1=0x90000000
> >>>
> >>> And we even have a bigger timeout value in 5.1 (0xe0000000), so we
> >>> should be all safe WRT to timings in TIMING{0,1}.
> >>>
> >>> Can you dump CTRL1?
> >>
> >> drivers/mtd/nand/gpmi-nand/gpmi-lib.c(1054): gpmi_begin()
> >> HW_GPMI_TIMING0=0x00010101
> >> HW_GPMI_TIMING1=0x90000000
> >> HW_GPMI_CTRL1_SET=0x01c4800c
> >
> > The read/write delay fields seem to match, but there are a few more
> > fields set in this version:
> > - DECOUPLE_CS
> > - BCH_MODE
> > - DEV_RESET
> > - CTRL1_ATA_IRQRDY_POLARITY__ACTIVEHIGH
> >
> > Looks like those fields are not explicitly set in the gpmi_begin()
> > patch, but maybe you dumped CTRL1. Would you mind sharing your patch?
>
> Attached.

Hm, you should read CTRL1 instead of CTRL1_SET, which I guess is write-only (WO).
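To illustrate why the base register is the one to read back: the quoted diff writes hw->ctrl1n to HW_GPMI_CTRL1_SET, and on this register family a write to the SET alias ORs bits into the base register while a write to the CLR alias clears them. The sketch below is a plain software model of that behaviour, not driver code. The sim_* names are hypothetical, and the bit positions and the composition of the clear mask are assumptions recalled from the driver's register header, so check them against gpmi-regs.h before relying on them.

```c
#include <stdint.h>

/* Assumed CTRL1 bit layout (verify against the driver's gpmi-regs.h). */
#define CTRL1_RDN_DELAY(x)    (((uint32_t)(x) & 0xfu) << 12)
#define CTRL1_RDN_DELAY_MASK  (0xfu << 12)
#define CTRL1_HALF_PERIOD     (1u << 16)
#define CTRL1_DLL_ENABLE      (1u << 17)
#define CTRL1_WRN_DLY_SEL(x)  (((uint32_t)(x) & 0x3u) << 22)
#define CTRL1_WRN_DLY_MASK    (0x3u << 22)

/* Assumed set of timing-related bits that get cleared before an update. */
#define CTRL1_CLEAR_MASK \
	(CTRL1_WRN_DLY_MASK | CTRL1_DLL_ENABLE | \
	 CTRL1_RDN_DELAY_MASK | CTRL1_HALF_PERIOD)

/* Hypothetical model: one stored value, reached through three addresses. */
struct sim_reg {
	uint32_t value;
};

/* Write to the SET alias: OR the mask into the base register. */
static void sim_write_set(struct sim_reg *r, uint32_t mask)
{
	r->value |= mask;
}

/* Write to the CLR alias: clear the masked bits in the base register. */
static void sim_write_clr(struct sim_reg *r, uint32_t mask)
{
	r->value &= ~mask;
}

/* Read of the base register: the only read modelled here. */
static uint32_t sim_read(const struct sim_reg *r)
{
	return r->value;
}

/* Clear-then-set sequence, mirroring the order of the two writes. */
static uint32_t sim_apply_ctrl1(struct sim_reg *r, uint32_t ctrl1n)
{
	sim_write_clr(r, CTRL1_CLEAR_MASK);
	sim_write_set(r, ctrl1n);
	return sim_read(r);
}
```

Under these assumptions, starting the model at 0x0104000c and applying WRN_DLY_SEL=3, DLL_ENABLE and RDN_DELAY=8 gives 0x01c6800c. What a read of the SET alias returns on real hardware is deliberately not modelled.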
Hi Boris, On 9/8/19 11:59 pm, Boris Brezillon wrote: > On Fri, 9 Aug 2019 23:57:08 +1000 > Greg Ungerer <gerg@kernel.org> wrote: >> On 9/8/19 5:32 pm, Boris Brezillon wrote: >>> On Fri, 9 Aug 2019 16:55:22 +1000 >>> Greg Ungerer <gerg@kernel.org> wrote: >>>> On 9/8/19 4:23 pm, Boris Brezillon wrote: >>>>> On Fri, 9 Aug 2019 15:20:52 +1000 >>>>> Greg Ungerer <gerg@kernel.org> wrote: >>>>>> On 9/8/19 2:36 am, Boris Brezillon wrote: >>>>>>> On Mon, 5 Aug 2019 15:51:05 +1000 >>>>>>> Greg Ungerer <gerg@kernel.org> wrote: >>>>>>>> On 2/8/19 10:51 pm, Boris Brezillon wrote: >>>>>>>>> On Fri, 2 Aug 2019 22:34:57 +1000 >>>>>>>>> Greg Ungerer <gerg@kernel.org> wrote: >>>>>>>>>> On 31/7/19 4:28 pm, Boris Brezillon wrote: >>>>>>>>>>> On Wed, 31 Jul 2019 12:05:44 +1000 >>>>>>>>>>> Greg Ungerer <gerg@kernel.org> wrote: >>>>>>>>>>>> On 30/7/19 6:38 pm, Miquel Raynal wrote: >>>>>>>>>>>>> Greg Ungerer <gerg@kernel.org> wrote on Tue, 30 Jul 2019 16:06:55 +1000: >>>>>>>>>>>>>> On 30/7/19 10:41 am, Greg Ungerer wrote: >>>>>>>>>>>>>>> On 30/7/19 10:28 am, Greg Ungerer wrote: >>>>>>>>>>>>>>>> On 29/7/19 10:47 pm, Miquel Raynal wrote: >>>>>>>>>>>>>>>>> Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 22:33:56 +1000: >>>>>>>>>>>>>>>>>> On 29/7/19 6:36 pm, Miquel Raynal wrote: >>>>>>>>>>>>>>>>>>> Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 16:41:51 +1000: >>>>>>>>>>>>>>> [snip] >>>>>>>>>>>> Note that this was generated on a normal boot up (not failure). >>>>>>>>>>> >>>>>>>>>>> The values looks good. Can you try with the below diff applied? 
>>>>>>>>>>> --->8--- >>>>>>>>>>> diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c >>>>>>>>>>> index 334fe3130285..9771f6a82abe 100644 >>>>>>>>>>> --- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c >>>>>>>>>>> +++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c >>>>>>>>>>> @@ -721,12 +721,10 @@ static void gpmi_nfc_apply_timings(struct gpmi_nand_data *this) >>>>>>>>>>> writel(hw->ctrl1n, gpmi_regs + HW_GPMI_CTRL1_SET); >>>>>>>>>>> >>>>>>>>>>> /* Wait 64 clock cycles before using the GPMI after enabling the DLL */ >>>>>>>>>>> - dll_wait_time_us = USEC_PER_SEC / hw->clk_rate * 64; >>>>>>>>>>> - if (!dll_wait_time_us) >>>>>>>>>>> - dll_wait_time_us = 1; >>>>>>>>>>> + dll_wait_time_us = DIV_ROUND_UP(USEC_PER_SEC * 64, hw->clk_rate); >>>>>>>>>>> >>>>>>>>>>> /* Wait for the DLL to settle. */ >>>>>>>>>>> - udelay(dll_wait_time_us); >>>>>>>>>>> + usleep_range(dll_wait_time_us, dll_wait_time_us * 10); >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> static int gpmi_setup_data_interface(struct nand_chip *chip, int chipnr, >>>>>>>>>> >>>>>>>>>> Eventually it failed, in the same way with with same errors. >>>>>>>>>> Took quite a while, over 600 boot cycles. >>>>>>>>>> >>>>>>>>>> Note also that I had to hand merge the changes, since in 5.1.14 that >>>>>>>>>> gpmi_nfc_apply_timings() is in gpmi-lib.c. But it was trivial to do. >>>>>>>>> >>>>>>>>> Oh well. I guess the next thing to do would be to dump the timing regs >>>>>>>>> and clk rate that are set by the bootloader (before the driver override >>>>>>>>> them) or those applied by an older kernel (one that didn't have that >>>>>>>>> issue). >>>>>>>> >>>>>>>> Is this useful? >>>>>>> >>>>>>> Hm, looks like it's configured in mode 0, so no, it's not super useful. >>>>>>> Can you try booting an older kernel (one that didn't have the >>>>>>> ->setup_data_interface() hook implemented). >>>>>> >>>>>> Ok. 
I went back from 5.1 and the first kernel I could find that >>>>>> returned no grep hits for "setup_data_interface" was 4.16. >>>>>> >>>>>> So I built for my target with that and added similar trace to dump >>>>>> the hardware register settings for that. Debug output looks like >>>>>> this now for it: >>>>>> >>>>>> ... >>>>>> drivers/mtd/nand/gpmi-nand/gpmi-nand.c(807): gpmi_get_clks() >>>>>> clk_get_rate(r->clock[0])=22000000 >>>>>> drivers/mtd/nand/gpmi-nand/gpmi-lib.c(1054): gpmi_begin() >>>>>> HW_GPMI_TIMING0=0x00010203 >>>>>> HW_GPMI_TIMING1=0x05000000 >>>>>> nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda >>>>>> nand: Micron MT29F2G08ABAEAWP >>>>>> nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64 >>>>>> drivers/mtd/nand/gpmi-nand/gpmi-lib.c(966): enable_edo_mode() >>>>>> clk_get_rate(r->clock[0])=99000000 >>>>>> gpmi-nand 1806000.gpmi-nand: enable the asynchronous EDO mode 5 >>>>>> drivers/mtd/nand/gpmi-nand/gpmi-lib.c(1054): gpmi_begin() >>>>>> HW_GPMI_TIMING0=0x00010101 >>>>> >>>>> TIMING0 match the one you have with 5.1 kernels. >>>>> >>>>>> HW_GPMI_TIMING1=0x90000000 >>>>> >>>>> And we even have a bigger timeout value in 5.1 (0xe0000000), so we >>>>> should be all safe WRT to timings in TIMING{0,1}. >>>>> >>>>> Can you dump CTRL1? >>>> >>>> drivers/mtd/nand/gpmi-nand/gpmi-lib.c(1054): gpmi_begin() >>>> HW_GPMI_TIMING0=0x00010101 >>>> HW_GPMI_TIMING1=0x90000000 >>>> HW_GPMI_CTRL1_SET=0x01c4800c >>> >>> The read/write delay fields seem to match, but there are a few more >>> fields set in this version: >>> - DECOUPLE_CS >>> - BCH_MODE >>> - DEV_RESET >>> - CTRL1_ATA_IRQRDY_POLARITY__ACTIVEHIGH >>> >>> Looks like those fields are not explicitly set in the gpmi_begin() >>> patch, but maybe you dumped CTRL1. Would you mind sharing your patch? >> >> Attached. > > Hm, you should read CTRL1 instead of CTRL1_SET which I guess is WO. Here is 2 sets of trace dumping the same set of registers. 
This first is on the linux-4.16 kernel:

Linux version 4.16.0 (gerg@goober) (gcc version 4.8.3 (GCC)) #9 Mon Aug 12 10:46:25 AEST 2019
...
nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
nand: Micron MT29F2G08ABAEAWP
nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
gpmi-nand 1806000.gpmi-nand: use legacy bch geometry
gpmi-nand 1806000.gpmi-nand: enable the asynchronous EDO mode 5
drivers/mtd/nand/gpmi-nand/gpmi-lib.c(1110): gpmi_begin()
HW_GPMI_TIMING0=0x00010101
HW_GPMI_TIMING1=0x90000000
HW_GPMI_CTRL1=0x01c6800c
r->clock[0]=99000000
Scanning device for bad blocks
5 ofpart partitions found on MTD device gpmi-nand
Creating 5 MTD partitions on "gpmi-nand":
0x000000000000-0x000000500000 : "u-boot"
0x000000500000-0x000000600000 : "u-boot-env"
0x000000600000-0x000000800000 : "log"
0x000000800000-0x000010000000 : "flash"
0x000000000000-0x000010000000 : "all"
gpmi-nand 1806000.gpmi-nand: driver registered.
...

And then this is from the 5.1.14 kernel:

Linux version 5.1.14 (gerg@goober) (gcc version 4.8.3 (GCC)) #25 Mon Aug 12 10:49:21 AEST 2019
...
nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
nand: Micron MT29F2G08ABAEAWP
nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(510): gpmi_nfc_apply_timings()
HW_GPMI_TIMING0=0x00020101
HW_GPMI_TIMING1=0xb0000000
HW_GPMI_CTRL1=0x0104000c
r->clock[0]=22000000
drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(510): gpmi_nfc_apply_timings()
HW_GPMI_TIMING0=0x00010101
HW_GPMI_TIMING1=0xe0000000
HW_GPMI_CTRL1=0x01c6800c
r->clock[0]=99000000
Scanning device for bad blocks
5 fixed-partitions partitions found on MTD device gpmi-nand
Creating 5 MTD partitions on "gpmi-nand":
0x000000000000-0x000000500000 : "u-boot"
0x000000500000-0x000000600000 : "u-boot-env"
0x000000600000-0x000000800000 : "log"
0x000000800000-0x000010000000 : "flash"
0x000000000000-0x000010000000 : "all"
gpmi-nand 1806000.gpmi-nand: driver registered.
Register settings read back from the registers themselves at the end
of the respective setting routines (so gpmi_begin() for 4.16 and
gpmi_nfc_apply_timings() for 5.1.14).

So something I notice here is that gpmi_nfc_apply_timings() is
being run multiple times. When I look back to the original
failure dumps the first error ("DMA timeout, last DMA") occurred
after the device type messages ("nand: 256 MiB, SLC,..."). Is it
happening with that higher clock rate still set?

Regards
Greg
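The dumped values can be split into fields to make the comparison less error-prone. The decoder below is a sketch under assumptions: the function and struct names are hypothetical, and the bit positions are recalled from the driver's register header rather than taken from this thread, so they should be checked against gpmi-regs.h and the reference manual.

```c
#include <stdint.h>

/* Assumed HW_GPMI_CTRL1 layout (verify against gpmi-regs.h). */
#define CTRL1_ATA_IRQRDY_POLARITY  (1u << 2)
#define CTRL1_DEV_RESET            (1u << 3)
#define CTRL1_RDN_DELAY_SHIFT      12
#define CTRL1_HALF_PERIOD          (1u << 16)
#define CTRL1_DLL_ENABLE           (1u << 17)
#define CTRL1_BCH_MODE             (1u << 18)
#define CTRL1_WRN_DLY_SEL_SHIFT    22
#define CTRL1_DECOUPLE_CS          (1u << 24)

struct gpmi_ctrl1_fields {
	unsigned int decouple_cs;
	unsigned int wrn_dly_sel;
	unsigned int bch_mode;
	unsigned int dll_enable;
	unsigned int half_period;
	unsigned int rdn_delay;
	unsigned int dev_reset;
	unsigned int irqrdy_active_high;
};

/* Split a raw CTRL1 value into the fields discussed in the thread. */
static struct gpmi_ctrl1_fields gpmi_decode_ctrl1(uint32_t v)
{
	struct gpmi_ctrl1_fields f;

	f.decouple_cs = !!(v & CTRL1_DECOUPLE_CS);
	f.wrn_dly_sel = (v >> CTRL1_WRN_DLY_SEL_SHIFT) & 0x3u;
	f.bch_mode = !!(v & CTRL1_BCH_MODE);
	f.dll_enable = !!(v & CTRL1_DLL_ENABLE);
	f.half_period = !!(v & CTRL1_HALF_PERIOD);
	f.rdn_delay = (v >> CTRL1_RDN_DELAY_SHIFT) & 0xfu;
	f.dev_reset = !!(v & CTRL1_DEV_RESET);
	f.irqrdy_active_high = !!(v & CTRL1_ATA_IRQRDY_POLARITY);
	return f;
}

/* Bits that differ between two dumps of the same register. */
static uint32_t gpmi_reg_diff(uint32_t a, uint32_t b)
{
	return a ^ b;
}

/* Assumed: the busy-timeout field occupies the top 16 bits of TIMING1. */
static uint32_t gpmi_busy_timeout(uint32_t timing1)
{
	return timing1 >> 16;
}
```

Under this layout the final CTRL1 value is the same on both kernels (0x01c6800c), and the only difference between the earlier CTRL1_SET readback (0x01c4800c) and the CTRL1 readback is bit 17, which the assumed layout names DLL_ENABLE. The TIMING1 values differ only in the timeout field, 0x9000 on 4.16 against 0xe000 on 5.1.14.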
On 12/8/19 12:50 pm, Greg Ungerer wrote: > On 9/8/19 11:59 pm, Boris Brezillon wrote: >> On Fri, 9 Aug 2019 23:57:08 +1000 >> Greg Ungerer <gerg@kernel.org> wrote: >>> On 9/8/19 5:32 pm, Boris Brezillon wrote: >>>> On Fri, 9 Aug 2019 16:55:22 +1000 >>>> Greg Ungerer <gerg@kernel.org> wrote: >>>>> On 9/8/19 4:23 pm, Boris Brezillon wrote: >>>>>> On Fri, 9 Aug 2019 15:20:52 +1000 >>>>>> Greg Ungerer <gerg@kernel.org> wrote: >>>>>>> On 9/8/19 2:36 am, Boris Brezillon wrote: >>>>>>>> On Mon, 5 Aug 2019 15:51:05 +1000 >>>>>>>> Greg Ungerer <gerg@kernel.org> wrote: >>>>>>>>> On 2/8/19 10:51 pm, Boris Brezillon wrote: >>>>>>>>>> On Fri, 2 Aug 2019 22:34:57 +1000 >>>>>>>>>> Greg Ungerer <gerg@kernel.org> wrote: >>>>>>>>>>> On 31/7/19 4:28 pm, Boris Brezillon wrote: >>>>>>>>>>>> On Wed, 31 Jul 2019 12:05:44 +1000 >>>>>>>>>>>> Greg Ungerer <gerg@kernel.org> wrote: >>>>>>>>>>>>> On 30/7/19 6:38 pm, Miquel Raynal wrote: >>>>>>>>>>>>>> Greg Ungerer <gerg@kernel.org> wrote on Tue, 30 Jul 2019 16:06:55 +1000: >>>>>>>>>>>>>>> On 30/7/19 10:41 am, Greg Ungerer wrote: >>>>>>>>>>>>>>>> On 30/7/19 10:28 am, Greg Ungerer wrote: >>>>>>>>>>>>>>>>> On 29/7/19 10:47 pm, Miquel Raynal wrote: >>>>>>>>>>>>>>>>>> Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 22:33:56 +1000: >>>>>>>>>>>>>>>>>>> On 29/7/19 6:36 pm, Miquel Raynal wrote: >>>>>>>>>>>>>>>>>>>> Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 16:41:51 +1000: >>>>>>>>>>>>>>>> [snip] >>>>>>>>>>>>> Note that this was generated on a normal boot up (not failure). >>>>>>>>>>>> >>>>>>>>>>>> The values looks good. Can you try with the below diff applied? 
>>>>>>>>>>>> --->8--- >>>>>>>>>>>> diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c >>>>>>>>>>>> index 334fe3130285..9771f6a82abe 100644 >>>>>>>>>>>> --- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c >>>>>>>>>>>> +++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c >>>>>>>>>>>> @@ -721,12 +721,10 @@ static void gpmi_nfc_apply_timings(struct gpmi_nand_data *this) >>>>>>>>>>>> writel(hw->ctrl1n, gpmi_regs + HW_GPMI_CTRL1_SET); >>>>>>>>>>>> /* Wait 64 clock cycles before using the GPMI after enabling the DLL */ >>>>>>>>>>>> - dll_wait_time_us = USEC_PER_SEC / hw->clk_rate * 64; >>>>>>>>>>>> - if (!dll_wait_time_us) >>>>>>>>>>>> - dll_wait_time_us = 1; >>>>>>>>>>>> + dll_wait_time_us = DIV_ROUND_UP(USEC_PER_SEC * 64, hw->clk_rate); >>>>>>>>>>>> /* Wait for the DLL to settle. */ >>>>>>>>>>>> - udelay(dll_wait_time_us); >>>>>>>>>>>> + usleep_range(dll_wait_time_us, dll_wait_time_us * 10); >>>>>>>>>>>> } >>>>>>>>>>>> static int gpmi_setup_data_interface(struct nand_chip *chip, int chipnr, >>>>>>>>>>> >>>>>>>>>>> Eventually it failed, in the same way with with same errors. >>>>>>>>>>> Took quite a while, over 600 boot cycles. >>>>>>>>>>> >>>>>>>>>>> Note also that I had to hand merge the changes, since in 5.1.14 that >>>>>>>>>>> gpmi_nfc_apply_timings() is in gpmi-lib.c. But it was trivial to do. >>>>>>>>>> >>>>>>>>>> Oh well. I guess the next thing to do would be to dump the timing regs >>>>>>>>>> and clk rate that are set by the bootloader (before the driver override >>>>>>>>>> them) or those applied by an older kernel (one that didn't have that >>>>>>>>>> issue). >>>>>>>>> >>>>>>>>> Is this useful? >>>>>>>> >>>>>>>> Hm, looks like it's configured in mode 0, so no, it's not super useful. >>>>>>>> Can you try booting an older kernel (one that didn't have the >>>>>>>> ->setup_data_interface() hook implemented). >>>>>>> >>>>>>> Ok. 
I went back from 5.1 and the first kernel I could find that >>>>>>> returned no grep hits for "setup_data_interface" was 4.16. >>>>>>> >>>>>>> So I built for my target with that and added similar trace to dump >>>>>>> the hardware register settings for that. Debug output looks like >>>>>>> this now for it: >>>>>>> >>>>>>> ... >>>>>>> drivers/mtd/nand/gpmi-nand/gpmi-nand.c(807): gpmi_get_clks() >>>>>>> clk_get_rate(r->clock[0])=22000000 >>>>>>> drivers/mtd/nand/gpmi-nand/gpmi-lib.c(1054): gpmi_begin() >>>>>>> HW_GPMI_TIMING0=0x00010203 >>>>>>> HW_GPMI_TIMING1=0x05000000 >>>>>>> nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda >>>>>>> nand: Micron MT29F2G08ABAEAWP >>>>>>> nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64 >>>>>>> drivers/mtd/nand/gpmi-nand/gpmi-lib.c(966): enable_edo_mode() >>>>>>> clk_get_rate(r->clock[0])=99000000 >>>>>>> gpmi-nand 1806000.gpmi-nand: enable the asynchronous EDO mode 5 >>>>>>> drivers/mtd/nand/gpmi-nand/gpmi-lib.c(1054): gpmi_begin() >>>>>>> HW_GPMI_TIMING0=0x00010101 >>>>>> >>>>>> TIMING0 match the one you have with 5.1 kernels. >>>>>>> HW_GPMI_TIMING1=0x90000000 >>>>>> >>>>>> And we even have a bigger timeout value in 5.1 (0xe0000000), so we >>>>>> should be all safe WRT to timings in TIMING{0,1}. >>>>>> >>>>>> Can you dump CTRL1? >>>>> >>>>> drivers/mtd/nand/gpmi-nand/gpmi-lib.c(1054): gpmi_begin() >>>>> HW_GPMI_TIMING0=0x00010101 >>>>> HW_GPMI_TIMING1=0x90000000 >>>>> HW_GPMI_CTRL1_SET=0x01c4800c >>>> >>>> The read/write delay fields seem to match, but there are a few more >>>> fields set in this version: >>>> - DECOUPLE_CS >>>> - BCH_MODE >>>> - DEV_RESET >>>> - CTRL1_ATA_IRQRDY_POLARITY__ACTIVEHIGH >>>> >>>> Looks like those fields are not explicitly set in the gpmi_begin() >>>> patch, but maybe you dumped CTRL1. Would you mind sharing your patch? >>> >>> Attached. >> >> Hm, you should read CTRL1 instead of CTRL1_SET which I guess is WO. 
> > > Here is 2 sets of trace dumping the same set of registers. > This first is on the linux-4.16 kernel: > > Linux version 4.16.0 (gerg@goober) (gcc version 4.8.3 (GCC)) #9 Mon Aug 12 10:46:25 AEST 2019 > ... > nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda > nand: Micron MT29F2G08ABAEAWP > nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64 > gpmi-nand 1806000.gpmi-nand: use legacy bch geometry > gpmi-nand 1806000.gpmi-nand: enable the asynchronous EDO mode 5 > drivers/mtd/nand/gpmi-nand/gpmi-lib.c(1110): gpmi_begin() > HW_GPMI_TIMING0=0x00010101 > HW_GPMI_TIMING1=0x90000000 > HW_GPMI_CTRL1=0x01c6800c > r->clock[0]=99000000 > Scanning device for bad blocks > 5 ofpart partitions found on MTD device gpmi-nand > Creating 5 MTD partitions on "gpmi-nand": > 0x000000000000-0x000000500000 : "u-boot" > 0x000000500000-0x000000600000 : "u-boot-env" > 0x000000600000-0x000000800000 : "log" > 0x000000800000-0x000010000000 : "flash" > 0x000000000000-0x000010000000 : "all" > gpmi-nand 1806000.gpmi-nand: driver registered. > ... > > > And then this is from the 5.1.14 kernel: > > Linux version 5.1.14 (gerg@goober) (gcc version 4.8.3 (GCC)) #25 Mon Aug 12 10:49:21 AEST 2019 > ... 
> nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda > nand: Micron MT29F2G08ABAEAWP > nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64 > drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(510): gpmi_nfc_apply_timings() > HW_GPMI_TIMING0=0x00020101 > HW_GPMI_TIMING1=0xb0000000 > HW_GPMI_CTRL1=0x0104000c > r->clock[0]=22000000 > drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(510): gpmi_nfc_apply_timings() > HW_GPMI_TIMING0=0x00010101 > HW_GPMI_TIMING1=0xe0000000 > HW_GPMI_CTRL1=0x01c6800c > r->clock[0]=99000000 > Scanning device for bad blocks > 5 fixed-partitions partitions found on MTD device gpmi-nand > Creating 5 MTD partitions on "gpmi-nand": > 0x000000000000-0x000000500000 : "u-boot" > 0x000000500000-0x000000600000 : "u-boot-env" > 0x000000600000-0x000000800000 : "log" > 0x000000800000-0x000010000000 : "flash" > 0x000000000000-0x000010000000 : "all" > gpmi-nand 1806000.gpmi-nand: driver registered. > > > Register settings read back from the registers themselves at the end > of the respective setting routines (so gpmi_begin() for 4.16 and > gpmi_nfc_apply_timings() for 5.1.14) > > So something I notice here is that gpmi_nfc_apply_timings() is > being run multiple times. When I look back to the original > failure dumps the first error ("DMA timeout, last DMA") occurred > after the device type messages ("nand: 256 MiB, SLC,..."). Is it > happening with that higher clock rate still set? Looks like that is not the case... ... 
nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
nand: Micron MT29F2G08ABAEAWP
nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(510): gpmi_nfc_apply_timings()
HW_GPMI_TIMING0=0x00020101
HW_GPMI_TIMING1=0xb0000000
HW_GPMI_CTRL1=0x0104000c
r->clock[0]=22000000
drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(510): gpmi_nfc_apply_timings()
HW_GPMI_TIMING0=0x00010101
HW_GPMI_TIMING1=0xe0000000
HW_GPMI_CTRL1=0x01c6800c
r->clock[0]=99000000
gpmi-nand 1806000.gpmi-nand: DMA timeout, last DMA
gpmi-nand 1806000.gpmi-nand: Show GPMI registers :
gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x20830002
gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x01c6800c
gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00010101
gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0xe0000000
gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x23023336
gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x000001ee
gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0xff000001
gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x00000100
gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x05020000
gpmi-nand 1806000.gpmi-nand: Show BCH registers :
gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x00000100
gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000010
gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0x030a2080
gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x083e2080
gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x070a4080
gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0x10da4080
gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x070a4080
gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x10da4080
gpmi-nand 1806000.gpmi-nand: offset 0x0e0 : 0x070a4080
gpmi-nand 1806000.gpmi-nand: offset 0x0f0 : 0x10da4080
gpmi-nand 1806000.gpmi-nand: offset 0x100 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x110 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x120 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x130 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x140 : 0x00000000
gpmi-nand 1806000.gpmi-nand: offset 0x150 : 0x20484342
gpmi-nand 1806000.gpmi-nand: offset 0x160 : 0x01000000
gpmi-nand 1806000.gpmi-nand: offset 0x170 : 0x00000000

Regards
Greg
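For reference, the DLL settle-time change in the diff quoted earlier in this thread comes down to integer arithmetic. The following is a minimal sketch, assuming USEC_PER_SEC is 1000000 and using a local stand-in for the kernel's DIV_ROUND_UP macro; the two function names are made up for illustration and do not exist in the driver.

```c
#define USEC_PER_SEC 1000000UL

/* Local stand-in for the kernel macro (assumption: same rounding). */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Old computation from the diff: divide first, then multiply.
 * For any clock above 1 MHz the division truncates to 0, so the
 * result always ends up clamped to 1us.
 */
static unsigned long dll_wait_old_us(unsigned long clk_rate)
{
	unsigned long us = USEC_PER_SEC / clk_rate * 64;

	if (!us)
		us = 1;
	return us;
}

/* New computation from the diff: multiply first, then round up. */
static unsigned long dll_wait_new_us(unsigned long clk_rate)
{
	return DIV_ROUND_UP(USEC_PER_SEC * 64, clk_rate);
}
```

With the clock rates seen in the traces, dll_wait_old_us() returns 1 for both 22000000 and 99000000 Hz, while dll_wait_new_us() returns 3 and 1 respectively. So at 22 MHz the old code asked for about 1us where 64 clock cycles take about 2.9us.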
On Mon, 12 Aug 2019 12:50:36 +1000
Greg Ungerer <gerg@kernel.org> wrote:

[snip]
> Register settings read back from the registers themselves at the end
> of the respective setting routines (so gpmi_begin() for 4.16 and
> gpmi_nfc_apply_timings() for 5.1.14)

Hm, CTRL1 is identical. Can you dump all regs at the beginning and at
the end of those funcs?
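To compare the CTRL1 values in this thread field by field rather than by eye, a small decoder can help. This is only a sketch: the bit positions are my reading of the mainline driver's gpmi-regs.h and should be checked against the i.MX6ULL reference manual, and the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Assumed HW_GPMI_CTRL1 layout; verify against gpmi-regs.h and the RM. */
struct ctrl1_fields {
	unsigned int ata_irqrdy_polarity;	/* bit 2 */
	unsigned int dev_reset;			/* bit 3 */
	unsigned int rdn_delay;			/* bits 15:12 */
	unsigned int half_period;		/* bit 16 */
	unsigned int dll_enable;		/* bit 17 */
	unsigned int bch_mode;			/* bit 18 */
	unsigned int wrn_dly_sel;		/* bits 23:22 */
	unsigned int decouple_cs;		/* bit 24 */
};

/* Split a raw CTRL1 value into the fields listed above. */
static struct ctrl1_fields ctrl1_decode(uint32_t v)
{
	struct ctrl1_fields f;

	f.ata_irqrdy_polarity = (v >> 2) & 0x1;
	f.dev_reset = (v >> 3) & 0x1;
	f.rdn_delay = (v >> 12) & 0xf;
	f.half_period = (v >> 16) & 0x1;
	f.dll_enable = (v >> 17) & 0x1;
	f.bch_mode = (v >> 18) & 0x1;
	f.wrn_dly_sel = (v >> 22) & 0x3;
	f.decouple_cs = (v >> 24) & 0x1;
	return f;
}
```

Under these assumptions, 0x01c6800c decodes to DECOUPLE_CS, BCH_MODE, DLL_ENABLE, DEV_RESET and ATA_IRQRDY_POLARITY set, with RDN_DELAY=8 and WRN_DLY_SEL=3. The earlier 0x0104000c value has the DLL disabled and both delay fields at 0.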
Hi Boris,

On 12/8/19 5:31 pm, Boris Brezillon wrote:
> On Mon, 12 Aug 2019 12:50:36 +1000
[snip]
> Hm, CTRL1 is identical. Can you dump all regs at the beginning and at
> the end of those funcs?

Here is a more complete dump of registers. Trace points are at
entry and exit of the respective functions in the different
kernel versions. Register dumping code is identical for both.

Linux version 4.16.0 (gerg@goober) (gcc version 4.8.3 (GCC)) #10 Tue Aug 13 10:24:28 AEST 2019
...
drivers/mtd/nand/gpmi-nand/gpmi-lib.c(1073): gpmi_begin(): ENTRY
HW_GPMI_CTRL0=0x00000000
HW_GPMI_CTRL1=0x01c4000c
HW_GPMI_COMPARE=0x00000000
HW_GPMI_ECCCTRL=0x00000400
HW_GPMI_ECCCOUNT=0x00000000
HW_GPMI_PAYLOAD=0x00000000
HW_GPMI_AUXILIARY=0x00000000
HW_GPMI_TIMING0=0x00010203
HW_GPMI_TIMING1=0x00000000
HW_GPMI_TIMING2=0x23023336
HW_GPMI_DATA=0x00000000
HW_GPMI_STAT=0xff000005
HW_GPMI_DEBUG=0x00000000
r->clock[0]=22000000
nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
nand: Micron MT29F2G08ABAEAWP
nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
gpmi-nand 1806000.gpmi-nand: enable the asynchronous EDO mode 5
drivers/mtd/nand/gpmi-nand/gpmi-lib.c(1073): gpmi_begin(): ENTRY
HW_GPMI_CTRL0=0x01800001
HW_GPMI_CTRL1=0x0104000c
HW_GPMI_COMPARE=0x00000000
HW_GPMI_ECCCTRL=0x00000000
HW_GPMI_ECCCOUNT=0x00000000
HW_GPMI_PAYLOAD=0x00000000
HW_GPMI_AUXILIARY=0x00000000
HW_GPMI_TIMING0=0x00010203
HW_GPMI_TIMING1=0x05000000
HW_GPMI_TIMING2=0x23023336
HW_GPMI_DATA=0x00000000
HW_GPMI_STAT=0xff000005
HW_GPMI_DEBUG=0x00000101
r->clock[0]=99000000
drivers/mtd/nand/gpmi-nand/gpmi-lib.c(1136): gpmi_begin(): EXIT
HW_GPMI_CTRL0=0x01800001
HW_GPMI_CTRL1=0x01c6800c
HW_GPMI_COMPARE=0x00000000
HW_GPMI_ECCCTRL=0x00000000
HW_GPMI_ECCCOUNT=0x00000000
HW_GPMI_PAYLOAD=0x00000000
HW_GPMI_AUXILIARY=0x00000000
HW_GPMI_TIMING0=0x00010101
HW_GPMI_TIMING1=0x90000000
HW_GPMI_TIMING2=0x23023336
HW_GPMI_DATA=0x00000000
HW_GPMI_STAT=0xff000005
HW_GPMI_DEBUG=0x00000101
r->clock[0]=99000000
Scanning device for bad blocks
5 ofpart partitions found on MTD device gpmi-nand
Creating 5 MTD partitions on "gpmi-nand":
0x000000000000-0x000000500000 : "u-boot"
0x000000500000-0x000000600000 : "u-boot-env"
0x000000600000-0x000000800000 : "log"
0x000000800000-0x000010000000 : "flash"
0x000000000000-0x000010000000 : "all"
gpmi-nand 1806000.gpmi-nand: driver registered.
...

Note that the first ENTRY dump has no matching EXIT dump. From the
code I assume it is returning from gpmi_begin() at the
"if (!hw.sample_delay_factor)" check.

And for the 5.1.14 kernel:

Linux version 5.1.14 (gerg@goober) (gcc version 4.8.3 (GCC)) #27 Tue Aug 13 10:20:32 AEST 2019
...
drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(512): gpmi_nfc_apply_timings(): ENTRY
HW_GPMI_CTRL0=0x00000000
HW_GPMI_CTRL1=0x01c4000c
HW_GPMI_COMPARE=0x00000000
HW_GPMI_ECCCTRL=0x00000400
HW_GPMI_ECCCOUNT=0x00000000
HW_GPMI_PAYLOAD=0x00000000
HW_GPMI_AUXILIARY=0x00000000
HW_GPMI_TIMING0=0x00010203
HW_GPMI_TIMING1=0x00000000
HW_GPMI_TIMING2=0x23023336
HW_GPMI_DATA=0x00000000
HW_GPMI_STAT=0xff000005
HW_GPMI_DEBUG=0x00000000
r->clock[0]=22000000
drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(536): gpmi_nfc_apply_timings(): EXIT
HW_GPMI_CTRL0=0x00000000
HW_GPMI_CTRL1=0x0104000c
HW_GPMI_COMPARE=0x00000000
HW_GPMI_ECCCTRL=0x00000400
HW_GPMI_ECCCOUNT=0x00000000
HW_GPMI_PAYLOAD=0x00000000
HW_GPMI_AUXILIARY=0x00000000
HW_GPMI_TIMING0=0x00020101
HW_GPMI_TIMING1=0x60000000
HW_GPMI_TIMING2=0x23023336
HW_GPMI_DATA=0x00000000
HW_GPMI_STAT=0xff000005
HW_GPMI_DEBUG=0x00000000
r->clock[0]=22000000
random: fast init done
nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
nand: Micron MT29F2G08ABAEAWP
nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(512): gpmi_nfc_apply_timings(): ENTRY
HW_GPMI_CTRL0=0x01800001
HW_GPMI_CTRL1=0x0104000c
HW_GPMI_COMPARE=0x00000000
HW_GPMI_ECCCTRL=0x00000000
HW_GPMI_ECCCOUNT=0x00000000
HW_GPMI_PAYLOAD=0x00000000
HW_GPMI_AUXILIARY=0x00000000
HW_GPMI_TIMING0=0x00020101
HW_GPMI_TIMING1=0x60000000
HW_GPMI_TIMING2=0x23023336
HW_GPMI_DATA=0x0000003f
HW_GPMI_STAT=0xff000005
HW_GPMI_DEBUG=0x00000101
r->clock[0]=22000000
drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(536): gpmi_nfc_apply_timings(): EXIT
HW_GPMI_CTRL0=0x01800001
HW_GPMI_CTRL1=0x0104000c
HW_GPMI_COMPARE=0x00000000
HW_GPMI_ECCCTRL=0x00000000
HW_GPMI_ECCCOUNT=0x00000000
HW_GPMI_PAYLOAD=0x00000000
HW_GPMI_AUXILIARY=0x00000000
HW_GPMI_TIMING0=0x00020101
HW_GPMI_TIMING1=0xb0000000
HW_GPMI_TIMING2=0x23023336
HW_GPMI_DATA=0x0000003f
HW_GPMI_STAT=0xff000005
HW_GPMI_DEBUG=0x00000101
r->clock[0]=22000000
drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(512): gpmi_nfc_apply_timings(): ENTRY
HW_GPMI_CTRL0=0x01800001
HW_GPMI_CTRL1=0x0104000c
HW_GPMI_COMPARE=0x00000000
HW_GPMI_ECCCTRL=0x00000000
HW_GPMI_ECCCOUNT=0x00000000
HW_GPMI_PAYLOAD=0x00000000
HW_GPMI_AUXILIARY=0x00000000
HW_GPMI_TIMING0=0x00020101
HW_GPMI_TIMING1=0xb0000000
HW_GPMI_TIMING2=0x23023336
HW_GPMI_DATA=0x000000e0
HW_GPMI_STAT=0xff000005
HW_GPMI_DEBUG=0x00000000
r->clock[0]=22000000
drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(536): gpmi_nfc_apply_timings(): EXIT
HW_GPMI_CTRL0=0x01800001
HW_GPMI_CTRL1=0x01c6800c
HW_GPMI_COMPARE=0x00000000
HW_GPMI_ECCCTRL=0x00000000
HW_GPMI_ECCCOUNT=0x00000000
HW_GPMI_PAYLOAD=0x00000000
HW_GPMI_AUXILIARY=0x00000000
HW_GPMI_TIMING0=0x00010101
HW_GPMI_TIMING1=0xe0000000
HW_GPMI_TIMING2=0x23023336
HW_GPMI_DATA=0x000000e0
HW_GPMI_STAT=0xff000005
HW_GPMI_DEBUG=0x00000000
r->clock[0]=99000000
Scanning device for bad blocks
5 fixed-partitions partitions found on MTD device gpmi-nand
Creating 5 MTD partitions on "gpmi-nand":
0x000000000000-0x000000500000 : "u-boot"
0x000000500000-0x000000600000 : "u-boot-env"
0x000000600000-0x000000800000 : "log"
0x000000800000-0x000010000000 : "flash"
0x000000000000-0x000010000000 : "all"
gpmi-nand 1806000.gpmi-nand: driver registered.
...

Regards
Greg
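The TIMING0 and TIMING1 values in these dumps can be split into their fields in the same way. This is a sketch under stated assumptions: I am assuming TIMING0 holds ADDRESS_SETUP in bits 23:16, DATA_HOLD in bits 15:8 and DATA_SETUP in bits 7:0, and that TIMING1 holds the device busy timeout in bits 31:16 (as in the driver's gpmi-regs.h, to be verified against the reference manual). The helper names are hypothetical.

```c
#include <stdint.h>

/* Assumed HW_GPMI_TIMING0 layout, all values in GPMI clock cycles. */
struct gpmi_timing0 {
	unsigned int address_setup;	/* bits 23:16 */
	unsigned int data_hold;		/* bits 15:8 */
	unsigned int data_setup;	/* bits 7:0 */
};

static struct gpmi_timing0 timing0_decode(uint32_t v)
{
	struct gpmi_timing0 t;

	t.address_setup = (v >> 16) & 0xff;
	t.data_hold = (v >> 8) & 0xff;
	t.data_setup = v & 0xff;
	return t;
}

/* Assumed HW_GPMI_TIMING1 layout: busy timeout field in bits 31:16. */
static unsigned int timing1_busy_timeout(uint32_t v)
{
	return (v >> 16) & 0xffff;
}

/* Convert a cycle count to nanoseconds (rounded down) for a clock rate. */
static uint64_t cycles_to_ns(unsigned int cycles, unsigned long clk_rate_hz)
{
	return (uint64_t)cycles * 1000000000ULL / clk_rate_hz;
}
```

Under these assumptions, TIMING0=0x00010101 at 99000000 Hz is one cycle (about 10 ns) for each of the three phases on both kernels, and the busy timeout field is 0x9000 on 4.16 against 0xe000 on 5.1.14, which matches the earlier remark that 5.1 has the bigger timeout.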
Hi Greg

On Tue, Aug 13, 2019 at 2:50 AM Greg Ungerer <gerg@kernel.org> wrote:
>
> Hi Boris,
>
> On 12/8/19 5:31 pm, Boris Brezillon wrote:
> > On Mon, 12 Aug 2019 12:50:36 +1000
> [snip]
> > Hm, CTRL1 is identical. Can you dump all regs at the beginning and at
> > the end of those funcs?
>
> Here is a more complete dump of registers. Trace points are at
> entry and exit of the respective functions in the different
> kernel versions. Register dumping code is identical for both.
>
> Linux version 4.16.0 (gerg@goober) (gcc version 4.8.3 (GCC)) #10 Tue Aug 13 10:24:28 AEST 2019
> ...

I ran an overnight reboot process on linux-4.19.y tag: v4.19.169 and
I'm not able to reproduce up to now.

Do you have an update from your side?

Michael

[snip]
Hi Michael,

Michael Nazzareno Trimarchi <michael@amarulasolutions.com> wrote on
Thu, 28 Jan 2021 10:45:29 +0100:

> Hi Greg
>
> On Tue, Aug 13, 2019 at 2:50 AM Greg Ungerer <gerg@kernel.org> wrote:
> >
> > Hi Boris,
> >
> > On 12/8/19 5:31 pm, Boris Brezillon wrote:
> > > On Mon, 12 Aug 2019 12:50:36 +1000
> > [snip]
> > > Hm, CTRL1 is identical. Can you dump all regs at the beginning and at
> > > the end of those funcs?
> >
> > Here is a more complete dump of registers. Trace points are at
> > entry and exit of the respective functions in the different
> > kernel versions. Register dumping code is identical for both.
> >
> > Linux version 4.16.0 (gerg@goober) (gcc version 4.8.3 (GCC)) #10 Tue Aug 13 10:24:28 AEST 2019
> > ...
>
> I ran an overnight reboot process on linux-4.19.y tag: v4.19.169 and
> I'm not able to reproduce up to now.
>
> Do you have an update from your side?

Perhaps this patch [1] was backported to v4.19.169 (I didn't check) but
it might have fixed the DMA issue you were seeing, at least in the
recent kernel versions.

[1] mtd: rawnand: gpmi: Fix the random DMA timeout issue

Thanks,
Miquèl
Hi

On Thu, Jan 28, 2021 at 11:26 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
>
> Hi Michael,
>
> Michael Nazzareno Trimarchi <michael@amarulasolutions.com> wrote on
> Thu, 28 Jan 2021 10:45:29 +0100:
>
> > Hi Greg
> >
> > On Tue, Aug 13, 2019 at 2:50 AM Greg Ungerer <gerg@kernel.org> wrote:
> > >
> > > Hi Boris,
> > >
> > > On 12/8/19 5:31 pm, Boris Brezillon wrote:
> > > > On Mon, 12 Aug 2019 12:50:36 +1000
> > > [snip]
> > > > Hm, CTRL1 is identical. Can you dump all regs at the beginning and at
> > > > the end of those funcs?
> > >
> > > Here is a more complete dump of registers. Trace points are at
> > > entry and exit of the respective functions in the different
> > > kernel versions. Register dumping code is identical for both.
> > >
> > > Linux version 4.16.0 (gerg@goober) (gcc version 4.8.3 (GCC)) #10 Tue Aug 13 10:24:28 AEST 2019
> > > ...
> >
> > I ran an overnight reboot process on linux-4.19.y tag: v4.19.169 and
> > I'm not able to reproduce up to now.
> >
> > Do you have an update from your side?
>
> Perhaps this patch [1] was backported to v4.19.169 (I didn't check) but
> it might have fixed the DMA issue you were seeing, at least in the
> recent kernel versions.

It's not there, and it should not be backported on its own, because
this patch fixes another regression. Applying it means applying both
anyway.

I will decode the registers in this thread and check the differences.

Michael

>
> [1] mtd: rawnand: gpmi: Fix the random DMA timeout issue
>
> Thanks,
> Miquèl
Hi

I have a couple of questions

On Thu, Jan 28, 2021 at 11:35 AM Michael Nazzareno Trimarchi
<michael@amarulasolutions.com> wrote:
[snip]

maxchips is 2 for IMX6 and I think that corresponds to NAND_CE0_B and
NAND_CE1_B. When we probe, I think both are probed. Is that right?

Michael
Hi Michael, On 28/1/21 7:45 pm, Michael Nazzareno Trimarchi wrote: > Hi Greg > > On Tue, Aug 13, 2019 at 2:50 AM Greg Ungerer <gerg@kernel.org> wrote: >> >> Hi Boris, >> >> On 12/8/19 5:31 pm, Boris Brezillon wrote: >>> On Mon, 12 Aug 2019 12:50:36 +1000 >> [snip] >>> Hm, CTRL1 is identical. Can you dump all regs at the beginning and at >>> the end of those funcs? >> >> Here is a more complete dump of registers. Trace points are at >> entry and exit of the respective functions in the different >> kernel versions. Register dumping code is identical for both. >> >> >> Linux version 4.16.0 (gerg@goober) (gcc version 4.8.3 (GCC)) #10 Tue Aug 13 10:24:28 AEST 2019 >> ... > > I ran an overnight reboot process on linux-4.19.y tag: v4.19.169 and > I'm not able to reproduce up to now. > > Do you have an update from your side? I haven't seen the problem for a while now. I pretty much only run modern kernels, I am currently using 5.10. So I am guessing it was resolved somewhere at some point, but I don't know exactly what change resolved it. 
Regards Greg >> drivers/mtd/nand/gpmi-nand/gpmi-lib.c(1073): gpmi_begin(): ENTRY >> HW_GPMI_CTRL0=0x00000000 >> HW_GPMI_CTRL1=0x01c4000c >> HW_GPMI_COMPARE=0x00000000 >> HW_GPMI_ECCCTRL=0x00000400 >> HW_GPMI_ECCCOUNT=0x00000000 >> HW_GPMI_PAYLOAD=0x00000000 >> HW_GPMI_AUXILIARY=0x00000000 >> HW_GPMI_TIMING0=0x00010203 >> HW_GPMI_TIMING1=0x00000000 >> HW_GPMI_TIMING2=0x23023336 >> HW_GPMI_DATA=0x00000000 >> HW_GPMI_STAT=0xff000005 >> HW_GPMI_DEBUG=0x00000000 >> r->clock[0]=22000000 >> nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda >> nand: Micron MT29F2G08ABAEAWP >> nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64 >> gpmi-nand 1806000.gpmi-nand: enable the asynchronous EDO mode 5 >> drivers/mtd/nand/gpmi-nand/gpmi-lib.c(1073): gpmi_begin(): ENTRY >> HW_GPMI_CTRL0=0x01800001 >> HW_GPMI_CTRL1=0x0104000c >> HW_GPMI_COMPARE=0x00000000 >> HW_GPMI_ECCCTRL=0x00000000 >> HW_GPMI_ECCCOUNT=0x00000000 >> HW_GPMI_PAYLOAD=0x00000000 >> HW_GPMI_AUXILIARY=0x00000000 >> HW_GPMI_TIMING0=0x00010203 >> HW_GPMI_TIMING1=0x05000000 >> HW_GPMI_TIMING2=0x23023336 >> HW_GPMI_DATA=0x00000000 >> HW_GPMI_STAT=0xff000005 >> HW_GPMI_DEBUG=0x00000101 >> r->clock[0]=99000000 >> drivers/mtd/nand/gpmi-nand/gpmi-lib.c(1136): gpmi_begin(): EXIT >> HW_GPMI_CTRL0=0x01800001 >> HW_GPMI_CTRL1=0x01c6800c >> HW_GPMI_COMPARE=0x00000000 >> HW_GPMI_ECCCTRL=0x00000000 >> HW_GPMI_ECCCOUNT=0x00000000 >> HW_GPMI_PAYLOAD=0x00000000 >> HW_GPMI_AUXILIARY=0x00000000 >> HW_GPMI_TIMING0=0x00010101 >> HW_GPMI_TIMING1=0x90000000 >> HW_GPMI_TIMING2=0x23023336 >> HW_GPMI_DATA=0x00000000 >> HW_GPMI_STAT=0xff000005 >> HW_GPMI_DEBUG=0x00000101 >> r->clock[0]=99000000 >> Scanning device for bad blocks >> 5 ofpart partitions found on MTD device gpmi-nand >> Creating 5 MTD partitions on "gpmi-nand": >> 0x000000000000-0x000000500000 : "u-boot" >> 0x000000500000-0x000000600000 : "u-boot-env" >> 0x000000600000-0x000000800000 : "log" >> 0x000000800000-0x000010000000 : "flash" >> 
0x000000000000-0x000010000000 : "all" >> gpmi-nand 1806000.gpmi-nand: driver registered. >> ... >> >> Note that the first ENTRY dump has no matching EXIT dump. From the >> code I assume it is returning from gpmi_begin() at the >> "if (!hw.sample_delay_factor)" check. >> >> >> And for the 5.1.14 kernel: >> >> Linux version 5.1.14 (gerg@goober) (gcc version 4.8.3 (GCC)) #27 Tue Aug 13 10:20:32 AEST 2019 >> ... >> drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(512): gpmi_nfc_apply_timings(): ENTRY >> HW_GPMI_CTRL0=0x00000000 >> HW_GPMI_CTRL1=0x01c4000c >> HW_GPMI_COMPARE=0x00000000 >> HW_GPMI_ECCCTRL=0x00000400 >> HW_GPMI_ECCCOUNT=0x00000000 >> HW_GPMI_PAYLOAD=0x00000000 >> HW_GPMI_AUXILIARY=0x00000000 >> HW_GPMI_TIMING0=0x00010203 >> HW_GPMI_TIMING1=0x00000000 >> HW_GPMI_TIMING2=0x23023336 >> HW_GPMI_DATA=0x00000000 >> HW_GPMI_STAT=0xff000005 >> HW_GPMI_DEBUG=0x00000000 >> r->clock[0]=22000000 >> drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(536): gpmi_nfc_apply_timings(): EXIT >> HW_GPMI_CTRL0=0x00000000 >> HW_GPMI_CTRL1=0x0104000c >> HW_GPMI_COMPARE=0x00000000 >> HW_GPMI_ECCCTRL=0x00000400 >> HW_GPMI_ECCCOUNT=0x00000000 >> HW_GPMI_PAYLOAD=0x00000000 >> HW_GPMI_AUXILIARY=0x00000000 >> HW_GPMI_TIMING0=0x00020101 >> HW_GPMI_TIMING1=0x60000000 >> HW_GPMI_TIMING2=0x23023336 >> HW_GPMI_DATA=0x00000000 >> HW_GPMI_STAT=0xff000005 >> HW_GPMI_DEBUG=0x00000000 >> r->clock[0]=22000000 >> random: fast init done >> nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda >> nand: Micron MT29F2G08ABAEAWP >> nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64 >> drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(512): gpmi_nfc_apply_timings(): ENTRY >> HW_GPMI_CTRL0=0x01800001 >> HW_GPMI_CTRL1=0x0104000c >> HW_GPMI_COMPARE=0x00000000 >> HW_GPMI_ECCCTRL=0x00000000 >> HW_GPMI_ECCCOUNT=0x00000000 >> HW_GPMI_PAYLOAD=0x00000000 >> HW_GPMI_AUXILIARY=0x00000000 >> HW_GPMI_TIMING0=0x00020101 >> HW_GPMI_TIMING1=0x60000000 >> HW_GPMI_TIMING2=0x23023336 >> HW_GPMI_DATA=0x0000003f >> 
HW_GPMI_STAT=0xff000005 >> HW_GPMI_DEBUG=0x00000101 >> r->clock[0]=22000000 >> drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(536): gpmi_nfc_apply_timings(): EXIT >> HW_GPMI_CTRL0=0x01800001 >> HW_GPMI_CTRL1=0x0104000c >> HW_GPMI_COMPARE=0x00000000 >> HW_GPMI_ECCCTRL=0x00000000 >> HW_GPMI_ECCCOUNT=0x00000000 >> HW_GPMI_PAYLOAD=0x00000000 >> HW_GPMI_AUXILIARY=0x00000000 >> HW_GPMI_TIMING0=0x00020101 >> HW_GPMI_TIMING1=0xb0000000 >> HW_GPMI_TIMING2=0x23023336 >> HW_GPMI_DATA=0x0000003f >> HW_GPMI_STAT=0xff000005 >> HW_GPMI_DEBUG=0x00000101 >> r->clock[0]=22000000 >> drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(512): gpmi_nfc_apply_timings(): ENTRY >> HW_GPMI_CTRL0=0x01800001 >> HW_GPMI_CTRL1=0x0104000c >> HW_GPMI_COMPARE=0x00000000 >> HW_GPMI_ECCCTRL=0x00000000 >> HW_GPMI_ECCCOUNT=0x00000000 >> HW_GPMI_PAYLOAD=0x00000000 >> HW_GPMI_AUXILIARY=0x00000000 >> HW_GPMI_TIMING0=0x00020101 >> HW_GPMI_TIMING1=0xb0000000 >> HW_GPMI_TIMING2=0x23023336 >> HW_GPMI_DATA=0x000000e0 >> HW_GPMI_STAT=0xff000005 >> HW_GPMI_DEBUG=0x00000000 >> r->clock[0]=22000000 >> drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c(536): gpmi_nfc_apply_timings(): EXIT >> HW_GPMI_CTRL0=0x01800001 >> HW_GPMI_CTRL1=0x01c6800c >> HW_GPMI_COMPARE=0x00000000 >> HW_GPMI_ECCCTRL=0x00000000 >> HW_GPMI_ECCCOUNT=0x00000000 >> HW_GPMI_PAYLOAD=0x00000000 >> HW_GPMI_AUXILIARY=0x00000000 >> HW_GPMI_TIMING0=0x00010101 >> HW_GPMI_TIMING1=0xe0000000 >> HW_GPMI_TIMING2=0x23023336 >> HW_GPMI_DATA=0x000000e0 >> HW_GPMI_STAT=0xff000005 >> HW_GPMI_DEBUG=0x00000000 >> r->clock[0]=99000000 >> Scanning device for bad blocks >> 5 fixed-partitions partitions found on MTD device gpmi-nand >> Creating 5 MTD partitions on "gpmi-nand": >> 0x000000000000-0x000000500000 : "u-boot" >> 0x000000500000-0x000000600000 : "u-boot-env" >> 0x000000600000-0x000000800000 : "log" >> 0x000000800000-0x000010000000 : "flash" >> 0x000000000000-0x000010000000 : "all" >> gpmi-nand 1806000.gpmi-nand: driver registered. >> ... >> >> >> Regards >> Greg >> >> > >
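For readers comparing the dumps above, the TIMING0 and CTRL1 values can be decoded by hand. The sketch below assumes the bit layout that the mainline gpmi-nand register header appears to use (TIMING0: ADDRESS_SETUP in bits 23:16, DATA_HOLD in 15:8, DATA_SETUP in 7:0; CTRL1: RDN_DELAY in bits 15:12, HALF_PERIOD in bit 16, DLL_ENABLE in bit 17). The struct and function names are hypothetical, and the layout should be checked against the reference manual before relying on it.

```c
#include <stdint.h>

/* Hypothetical helper types; not part of the gpmi-nand driver. */
struct gpmi_timing0 {
	unsigned int address_setup; /* cycles, assumed bits 23:16 */
	unsigned int data_hold;     /* cycles, assumed bits 15:8 */
	unsigned int data_setup;    /* cycles, assumed bits 7:0 */
};

struct gpmi_ctrl1_dll {
	unsigned int rdn_delay;   /* assumed bits 15:12 */
	unsigned int half_period; /* assumed bit 16 */
	unsigned int dll_enable;  /* assumed bit 17 */
};

/* Split a raw HW_GPMI_TIMING0 value into its three cycle counts. */
static struct gpmi_timing0 decode_timing0(uint32_t reg)
{
	struct gpmi_timing0 t;

	t.address_setup = (reg >> 16) & 0xff;
	t.data_hold = (reg >> 8) & 0xff;
	t.data_setup = reg & 0xff;
	return t;
}

/* Extract the DLL-related fields from a raw HW_GPMI_CTRL1 value. */
static struct gpmi_ctrl1_dll decode_ctrl1_dll(uint32_t reg)
{
	struct gpmi_ctrl1_dll c;

	c.rdn_delay = (reg >> 12) & 0xf;
	c.half_period = (reg >> 16) & 0x1;
	c.dll_enable = (reg >> 17) & 0x1;
	return c;
}

/* Clock period in picoseconds for a rate in Hz (integer division). */
static uint64_t period_ps(uint64_t rate_hz)
{
	return 1000000000000ULL / rate_hz;
}
```

Under these assumptions, 0x00010203 at 22 MHz decodes to 1/2/3 cycles (address setup, data hold, data setup) of roughly 45.5 ns each, while 0x00010101 at 99 MHz decodes to 1/1/1 cycles of roughly 10.1 ns each. 0x01c6800c has the DLL enabled with RDN_DELAY = 8, whereas 0x0104000c has it disabled.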
Hi Miquel

commit f8e6ad14388067f91b26d044185d95623fbc9535
Author: Michael Trimarchi <michael@amarulasolutions.com>
Date:   Fri Jan 29 08:46:53 2021 +0100

    mtd: nand: Calculate the clock before enable it

    Signed-off-by: Michael Trimarchi <michael@amarulasolutions.com>
    Change-Id: I79b0da39de0a9b32ea0b002fa200d7f44d4f8ce7

diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
index 322a008290e5..0bca52b3bc8f 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
@@ -377,6 +377,7 @@ static void gpmi_nfc_compute_timings(struct gpmi_nand_data *this,
 				      const struct nand_sdr_timings *sdr)
 {
 	struct gpmi_nfc_hardware_timing *hw = &this->hw;
+	struct resources *r = &this->resources;
 	unsigned int dll_threshold_ps = this->devdata->max_chain_delay;
 	unsigned int period_ps, reference_period_ps;
 	unsigned int data_setup_cycles, data_hold_cycles, addr_setup_cycles;
@@ -440,6 +441,8 @@ static void gpmi_nfc_compute_timings(struct gpmi_nand_data *this,
 	hw->ctrl1n |= BF_GPMI_CTRL1_RDN_DELAY(sample_delay_factor) |
 		      BM_GPMI_CTRL1_DLL_ENABLE |
 		      (use_half_period ? BM_GPMI_CTRL1_HALF_PERIOD : 0);
+
+	clk_set_rate(r->clock[0], hw->clk_rate);
 }
 
 void gpmi_nfc_apply_timings(struct gpmi_nand_data *this)
@@ -449,8 +452,6 @@ void gpmi_nfc_apply_timings(struct gpmi_nand_data *this)
 	void __iomem *gpmi_regs = r->gpmi_regs;
 	unsigned int dll_wait_time_us;
 
-	clk_set_rate(r->clock[0], hw->clk_rate);
-
 	writel(hw->timing0, gpmi_regs + HW_GPMI_TIMING0);
 	writel(hw->timing1, gpmi_regs + HW_GPMI_TIMING1);

Right now I have this change applied and it seems fine. That is the only
difference I have. The clock rate is applied a bit earlier, before the
clock is enabled.
Michael
Hi Michael,

Michael Nazzareno Trimarchi <michael@amarulasolutions.com> wrote on
Sat, 30 Jan 2021 10:41:29 +0100:

> Hi Miquel
>
> [snip]
>
> Right now I have this change applied and seems fine. That is the only
> difference I get. Clock is apply a bit earlier that when is enabled
> it.

This is very interesting. So this would mean the issue you are
experiencing comes from the clock driver which kind of returns too
early from clk_set_rate()? Could you report this to the clk ML/NXP clk
maintainers and keep us in copy? If it is as global as it sounds, we
might not be the only ones affected.

Thanks,
Miquèl
Hi

On Mon, Feb 1, 2021 at 3:13 PM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
>
> Hi Michael,
>
> Michael Nazzareno Trimarchi <michael@amarulasolutions.com> wrote on
> Sat, 30 Jan 2021 10:41:29 +0100:
>
> > [snip]
> >
> > Right now I have this change applied and seems fine. That is the only
> > difference I get. Clock is apply a bit earlier that when is enabled
> > it.
>
> This is very interesting. So this would mean the issue you are
> experiencing comes from the clock driver which kind of returns too
> early from clk_set_rate()? Could you report this to the clk ML/NXP clk
> maintainers and keep us in copy? If it is as global as it sounds, we
> might not be the only ones affected.
>

The imx28 is broken too, so it's a general problem. I need to trace it down.
I have a revert for LTS, but it's not the way to go.

Michael

> Thanks,
> Miquèl
Hi

On Mon, Feb 1, 2021 at 3:32 PM Michael Nazzareno Trimarchi
<michael@amarulasolutions.com> wrote:
>
> [snip]
>
> The imx28 is broken too, so it's a general problem. I need to trace it down
> I have a reverting for lts but it\s not the way to go
>

For imx28 you ask to set the rate to 22 MHz but you don't check the clock
that you get back. You get back 12 MHz because the base clock is 24 MHz and
it seems it cannot reach the requested rate. You need to check whether the
requested clock is in range, or ask for set_rate_clk_min to avoid getting
something lower. Then for imx6ull, because it is sporadic, I think it is
more connected to clk_set_rate and to when you change the register. Could it
be a settling time?

Michael
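The imx28 observation can be reproduced with plain arithmetic. The sketch below models a clock fed from a 24 MHz parent through an integer divider that never exceeds the requested rate. This divider model and every name in it are assumptions made for illustration; it is not the actual i.MX28 clock tree and it does not use the kernel clk API. It shows why a 22 MHz request can land on 12 MHz, and how a caller could reject a result that falls below a minimum.

```c
#include <errno.h>

/*
 * Hypothetical model: an integer divider that picks the highest
 * rate that does not exceed the requested one. Not the real
 * i.MX28 clock driver.
 */
static unsigned long divider_round_rate(unsigned long parent_hz,
					unsigned long requested_hz)
{
	unsigned long div;

	/* A request at or above the parent rate gets the parent rate. */
	if (requested_hz == 0 || requested_hz >= parent_hz)
		return parent_hz;

	/* Round the divider up so the result never exceeds the request. */
	div = (parent_hz + requested_hz - 1) / requested_hz;
	return parent_hz / div;
}

/*
 * Returns the achieved rate, or -ERANGE when the closest rate the
 * divider can offer is below min_hz.
 */
static long checked_set_rate(unsigned long parent_hz,
			     unsigned long requested_hz,
			     unsigned long min_hz)
{
	unsigned long got = divider_round_rate(parent_hz, requested_hz);

	if (got < min_hz)
		return -ERANGE;
	return (long)got;
}
```

With a 24 MHz parent, a 22 MHz request needs a divider of 2 and therefore yields 12 MHz, which a 20 MHz minimum would reject. In the kernel, a comparable check could presumably be made with clk_round_rate() before, or clk_get_rate() after, the clk_set_rate() call.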
Hi Michael,

Michael Nazzareno Trimarchi <michael@amarulasolutions.com> wrote on
Mon, 1 Feb 2021 16:08:23 +0100:

> [snip]
>
> For imx28 you ask to set the rate to 22Mhz but you don't care about the clock
> that you get back. You get back 12Mhz because the base clock is 24 Mhz and seems
> that it can not get the point. You need to check if the clock
> requested is in range or ask
> for set_rate_clk_min to avoid to have somenthing lower. Then for
> imx6ull because is sporadic
> I think that is more connected to the clk_set_rate and when you change
> the register. Can not be a
> setting time?

So, if I understand correctly, we face two different problems:
- imx6*: seems like a clock issue regarding the clock settlement
- imx28: actual NAND driver issue (does not check the validity of the
  new frequency). This should be handled properly in
  ->setup_interface().

Thanks,
Miquèl
Hi

On Mon, Feb 1, 2021 at 4:14 PM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
>
> [snip]
>
> So, if I understand correctly, we face two different problems:
> - imx6*: seems like a clock issue regarding the clock settlement
> - imx28: actual NAND driver issue (does not check the validity of the
>   new frequency). This should be handled properly in
>   ->setup_interface().

I'm planning to work on it, but first I need to deliver field updates on
the new LTS release. Everything you said is correct. I will check and try
to fix the issues properly after that.

Michael

>
> Thanks,
> Miquèl
On Monday, 29 July 2019, 08:41:51 CEST, Greg Ungerer wrote: > Hi Miquel, > > I am experiencing a problem with NAND flash DMA timeouts on > iMX6ull based boards. The problem is very similar to that > described in: > > https://linux-mtd.infradead.narkive.com/JIUulfFB/gpmi-imx6ull-timeout-on-dma > > That didn't come to any specific resolution that I could see > in that thread. Hi all, I am joining this thread because I am also affected by this problem. I use kernel 5.10.65-rt53 but I have seen this issue on many previous versions. In the past I only recognized this on my development setup but now this has been found by our testing team. In our test setup we simply perform a reboot every 30s. After 5 to 200 cycles the test stops due to this error. The kernel version I use already includes: > Han Xu <han.xu@nxp.com> > mtd: rawnand: gpmi: Fix the random DMA timeout issue Additionally I tried ... > Michael Trimarchi <michael@amarulasolutions.com> > mtd: nand: Calculate the clock before enable it ... but the problem still persists. 
In my case, some registers show different values (annotated below):

> > The boot trace on the console for me looks like this:
>
> nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xdc
> nand: Micron MT29F2G08ABAEAWP
nand: Micron MT29F4G08ABADAH4
> nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
> gpmi-nand 1806000.gpmi-nand: DMA timeout, last DMA
> gpmi-nand 1806000.gpmi-nand: Show GPMI registers :
> gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x20830002
> gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x01c6800c
> gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00010101
> gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0xe0000000
> gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x23023336
> gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x000001ee
> gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0xff000001
> gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x00000001
gpmi-nand 1806000.nand-controller: offset 0x0c0 : 0x00000202
> gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x05020000
> gpmi-nand 1806000.gpmi-nand: Show BCH registers :
> gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x00000100
gpmi-nand 1806000.nand-controller: offset 0x000 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000010
> gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0x030a2080
gpmi-nand 1806000.nand-controller: offset 0x080 : 0x070a4080
> gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x083e2080
gpmi-nand 1806000.nand-controller: offset 0x090 : 0x10da4080
> gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x070a4080
> gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0x10da4080
> gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x070a4080
> gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x10da4080
> gpmi-nand 1806000.gpmi-nand: offset 0x0e0 : 0x070a4080
> gpmi-nand 1806000.gpmi-nand: offset 0x0f0 : 0x10da4080
> gpmi-nand 1806000.gpmi-nand: offset 0x100 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x110 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x120 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x130 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x140 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: offset 0x150 : 0x20484342
> gpmi-nand 1806000.gpmi-nand: offset 0x160 : 0x01000000
> gpmi-nand 1806000.gpmi-nand: offset 0x170 : 0x00000000
> gpmi-nand 1806000.gpmi-nand: BCH Geometry :
> GF length : 13
> ECC Strength : 8
> Page Size in Bytes : 2110
> Metadata Size in Bytes : 10
> ECC Chunk0 Size in Bytes: 512
> ECC Chunkn Size in Bytes: 512
ECC Chunk Size in Bytes: 512
> ECC Chunk Count : 4
> Payload Size in Bytes : 2048
> Auxiliary Size in Bytes: 16
> Auxiliary Status Offset: 12
> Block Mark Byte Offset : 1999
> Block Mark Bit Offset : 0

Please let me know if further information is required.

regards
Christian
Hi Christian On Mon, Oct 4, 2021 at 7:54 AM Christian Eggers <ceggers@arri.de> wrote: > > On Monday, 29 July 2019, 08:41:51 CEST, Greg Ungerer wrote: > > Hi Miquel, > > > > I am experiencing a problem with NAND flash DMA timeouts on > > iMX6ull based boards. The problem is very similar to that > > described in: > > > > https://linux-mtd.infradead.narkive.com/JIUulfFB/gpmi-imx6ull-timeout-on-dma > > > > That didn't come to any specific resolution that I could see > > in that thread. > > Hi all, > > I am joining this thread because I am also affected by this problem. I use > kernel 5.10.65-rt53 but I have seen this issue on many previous versions. In the > past I only recognized this on my development setup but now this has been found > by our testing team. > > In our test setup we simply perform a reboot every 30s. After 5 to 200 cycles > the test stops due to this error. > > The kernel version I use already includes: > > > Han Xu <han.xu@nxp.com> > > mtd: rawnand: gpmi: Fix the random DMA timeout issue > > Additionally I tried ... > > > Michael Trimarchi <michael@amarulasolutions.com> > > mtd: nand: Calculate the clock before enable it > > ... but the problem still persists. 
> > In my case, some registers show different values (annotated below): > > > > > The boot trace on the console for me looks like this: > > > > nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda > nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xdc > > nand: Micron MT29F2G08ABAEAWP > nand: Micron MT29F4G08ABADAH4 > > nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64 > > gpmi-nand 1806000.gpmi-nand: DMA timeout, last DMA > > gpmi-nand 1806000.gpmi-nand: Show GPMI registers : > > gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x20830002 > > gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000000 > > gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000 > > gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000 > > gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000 > > gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000 > > gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x01c6800c > > gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00010101 > > gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0xe0000000 > > gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x23023336 > > gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x000001ee > > gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0xff000001 > > gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x00000001 > gpmi-nand 1806000.nand-controller: offset 0x0c0 : 0x00000202 > > gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x05020000 > > gpmi-nand 1806000.gpmi-nand: Show BCH registers : > > gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x00000100 > gpmi-nand 1806000.nand-controller: offset 0x000 : 0x00000000 > > gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000010 > > gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000 > > gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000 > > gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000 > > gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000 > > gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x00000000 > > gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00000000 > > 
gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0x030a2080 > gpmi-nand 1806000.nand-controller: offset 0x080 : 0x070a4080 > > gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x083e2080 > gpmi-nand 1806000.nand-controller: offset 0x090 : 0x10da4080 > > gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x070a4080 > > gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0x10da4080 > > gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x070a4080 > > gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x10da4080 > > gpmi-nand 1806000.gpmi-nand: offset 0x0e0 : 0x070a4080 > > gpmi-nand 1806000.gpmi-nand: offset 0x0f0 : 0x10da4080 > > gpmi-nand 1806000.gpmi-nand: offset 0x100 : 0x00000000 > > gpmi-nand 1806000.gpmi-nand: offset 0x110 : 0x00000000 > > gpmi-nand 1806000.gpmi-nand: offset 0x120 : 0x00000000 > > gpmi-nand 1806000.gpmi-nand: offset 0x130 : 0x00000000 > > gpmi-nand 1806000.gpmi-nand: offset 0x140 : 0x00000000 > > gpmi-nand 1806000.gpmi-nand: offset 0x150 : 0x20484342 > > gpmi-nand 1806000.gpmi-nand: offset 0x160 : 0x01000000 > > gpmi-nand 1806000.gpmi-nand: offset 0x170 : 0x00000000 > > gpmi-nand 1806000.gpmi-nand: BCH Geometry : > > GF length : 13 > > ECC Strength : 8 > > Page Size in Bytes : 2110 > > Metadata Size in Bytes : 10 > > ECC Chunk0 Size in Bytes: 512 > > ECC Chunkn Size in Bytes: 512 > ECC Chunk Size in Bytes: 512 > > ECC Chunk Count : 4 > > Payload Size in Bytes : 2048 > > Auxiliary Size in Bytes: 16 > > Auxiliary Status Offset: 12 > > Block Mark Byte Offset : 1999 > > Block Mark Bit Offset : 0 > > Please let me know if further information is required. I need to continue on it, during the following days. I have stopped moving to LTS 4.19.y and with my partial revert. The problem as usual was to go to production on some devices. Anyway I have the device that has this problem. I can restart next weekend. 
One thing I noticed that breaks imx28 is:

	if (sdr->tRC_min >= 30000) {
		/* ONFI non-EDO modes [0-3] */
		hw->clk_rate = 22000000;
		wrn_dly_sel = BV_GPMI_CTRL1_WRN_DLY_SEL_4_TO_8NS;
	} else if (sdr->tRC_min >= 25000) {
		/* ONFI EDO mode 4 */
		hw->clk_rate = 80000000;
		wrn_dly_sel = BV_GPMI_CTRL1_WRN_DLY_SEL_NO_DELAY;
	} else {
		/* ONFI EDO mode 5 */
		hw->clk_rate = 100000000;
		wrn_dly_sel = BV_GPMI_CTRL1_WRN_DLY_SEL_NO_DELAY;
	}

This code assumes that clk_rate can actually be set to the requested rate, but on imx28 the parent clock of the NAND clock cannot reach those speeds. Requesting them leaves the clock at the wrong rate, so imx28 was totally broken. The other computation was not based on a fixed clock rate; I think it even used clk_get_rate.

Michael

> regards
> Christian
Hi Michael, Christian, michael@amarulasolutions.com wrote on Mon, 4 Oct 2021 08:27:54 +0200: > Hi Christian > > On Mon, Oct 4, 2021 at 7:54 AM Christian Eggers <ceggers@arri.de> wrote: > > > > On Monday, 29 July 2019, 08:41:51 CEST, Greg Ungerer wrote: > > > Hi Miquel, > > > > > > I am experiencing a problem with NAND flash DMA timeouts on > > > iMX6ull based boards. The problem is very similar to that > > > described in: > > > > > > https://linux-mtd.infradead.narkive.com/JIUulfFB/gpmi-imx6ull-timeout-on-dma > > > > > > That didn't come to any specific resolution that I could see > > > in that thread. > > > > Hi all, > > > > I am joining this thread because I am also affected by this problem. I use > > kernel 5.10.65-rt53 but I have seen this issue on many previous versions. In the > > past I only recognized this on my development setup but now this has been found > > by our testing team. > > > > In our test setup we simply perform a reboot every 30s. After 5 to 200 cycles > > the test stops due to this error. > > > > The kernel version I use already includes: > > > > > Han Xu <han.xu@nxp.com> > > > mtd: rawnand: gpmi: Fix the random DMA timeout issue > > > > Additionally I tried ... > > > > > Michael Trimarchi <michael@amarulasolutions.com> > > > mtd: nand: Calculate the clock before enable it > > > > ... but the problem still persists. 
> > > > In my case, some registers show different values (annotated below): > > > > > > > > The boot trace on the console for me looks like this: > > > > > > nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda > > nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xdc > > > nand: Micron MT29F2G08ABAEAWP > > nand: Micron MT29F4G08ABADAH4 > > > nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64 > > > gpmi-nand 1806000.gpmi-nand: DMA timeout, last DMA > > > gpmi-nand 1806000.gpmi-nand: Show GPMI registers : > > > gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x20830002 > > > gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000000 > > > gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000 > > > gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000 > > > gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000 > > > gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000 > > > gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x01c6800c > > > gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00010101 > > > gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0xe0000000 > > > gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x23023336 > > > gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x000001ee > > > gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0xff000001 > > > gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x00000001 > > gpmi-nand 1806000.nand-controller: offset 0x0c0 : 0x00000202 > > > gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x05020000 > > > gpmi-nand 1806000.gpmi-nand: Show BCH registers : > > > gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x00000100 > > gpmi-nand 1806000.nand-controller: offset 0x000 : 0x00000000 > > > gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000010 > > > gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000 > > > gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000 > > > gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000 > > > gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000 > > > gpmi-nand 1806000.gpmi-nand: offset 0x060 : 
0x00000000 > > > gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00000000 > > > gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0x030a2080 > > gpmi-nand 1806000.nand-controller: offset 0x080 : 0x070a4080 > > > gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x083e2080 > > gpmi-nand 1806000.nand-controller: offset 0x090 : 0x10da4080 > > > gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x070a4080 > > > gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0x10da4080 > > > gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x070a4080 > > > gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x10da4080 > > > gpmi-nand 1806000.gpmi-nand: offset 0x0e0 : 0x070a4080 > > > gpmi-nand 1806000.gpmi-nand: offset 0x0f0 : 0x10da4080 > > > gpmi-nand 1806000.gpmi-nand: offset 0x100 : 0x00000000 > > > gpmi-nand 1806000.gpmi-nand: offset 0x110 : 0x00000000 > > > gpmi-nand 1806000.gpmi-nand: offset 0x120 : 0x00000000 > > > gpmi-nand 1806000.gpmi-nand: offset 0x130 : 0x00000000 > > > gpmi-nand 1806000.gpmi-nand: offset 0x140 : 0x00000000 > > > gpmi-nand 1806000.gpmi-nand: offset 0x150 : 0x20484342 > > > gpmi-nand 1806000.gpmi-nand: offset 0x160 : 0x01000000 > > > gpmi-nand 1806000.gpmi-nand: offset 0x170 : 0x00000000 > > > gpmi-nand 1806000.gpmi-nand: BCH Geometry : > > > GF length : 13 > > > ECC Strength : 8 > > > Page Size in Bytes : 2110 > > > Metadata Size in Bytes : 10 > > > ECC Chunk0 Size in Bytes: 512 > > > ECC Chunkn Size in Bytes: 512 > > ECC Chunk Size in Bytes: 512 > > > ECC Chunk Count : 4 > > > Payload Size in Bytes : 2048 > > > Auxiliary Size in Bytes: 16 > > > Auxiliary Status Offset: 12 > > > Block Mark Byte Offset : 1999 > > > Block Mark Bit Offset : 0 > > > > Please let me know if further information is required. > > I need to continue on it, during the following days. I have stopped > moving to LTS 4.19.y and with my partial revert. > The problem as usual was to go to production on some devices. Anyway I > have the device that has this problem. I can > restart next weekend. 
One of the thing I notice that make not work on imx28 is:
>
> if (sdr->tRC_min >= 30000) {
> 	/* ONFI non-EDO modes [0-3] */
> 	hw->clk_rate = 22000000;
> 	wrn_dly_sel = BV_GPMI_CTRL1_WRN_DLY_SEL_4_TO_8NS;
> } else if (sdr->tRC_min >= 25000) {
> 	/* ONFI EDO mode 4 */
> 	hw->clk_rate = 80000000;
> 	wrn_dly_sel = BV_GPMI_CTRL1_WRN_DLY_SEL_NO_DELAY;
> } else {
> 	/* ONFI EDO mode 5 */
> 	hw->clk_rate = 100000000;
> 	wrn_dly_sel = BV_GPMI_CTRL1_WRN_DLY_SEL_NO_DELAY;
> }
>
> Here there is an assumption that your clk_rate can be set to that rate
> but on imx28, the parent clock of the NAND one can not
> let it go to those speed. Changing it let it really set to the wrong
> value, so imx28 was totally broken. The other computation was based
> not on fixed clock rate but I think even on clk_get_rate

Interesting finding. I guess we should try to apply the desired clock rate, and if the rate we actually get is too far from the requested one, we should refuse the requested configuration. The core will then automatically try the slower, but perhaps working, modes.

Thanks,
Miquèl
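The check suggested in the reply above could be sketched roughly as follows. This is a standalone illustration, not driver code: `rate_is_acceptable()` and its percentage tolerance are hypothetical, and in the kernel the achieved rate would presumably be obtained from `clk_round_rate()` or `clk_get_rate()` instead of being passed in by the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: decide whether the rate the clock tree can actually
 * deliver is close enough to the rate the timings were computed for.
 * tolerance_percent is an assumed policy knob, not a value from the driver.
 */
static bool rate_is_acceptable(uint64_t requested_hz, uint64_t achieved_hz,
			       unsigned int tolerance_percent)
{
	uint64_t diff;

	if (requested_hz == 0)
		return false;

	diff = requested_hz > achieved_hz ? requested_hz - achieved_hz
					  : achieved_hz - requested_hz;

	/* Compare diff/requested against tolerance/100 without dividing. */
	return diff * 100 <= requested_hz * tolerance_percent;
}
```

A caller would reject the timing mode when this returns false, which should let the core fall back to a slower mode, as described above.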
On 21/10/04 05:33PM, Miquel Raynal wrote: > Hi Michael, Christian, > > michael@amarulasolutions.com wrote on Mon, 4 Oct 2021 08:27:54 +0200: > > > Hi Christian > > > > On Mon, Oct 4, 2021 at 7:54 AM Christian Eggers <ceggers@arri.de> wrote: > > > > > > On Monday, 29 July 2019, 08:41:51 CEST, Greg Ungerer wrote: > > > > Hi Miquel, > > > > > > > > I am experiencing a problem with NAND flash DMA timeouts on > > > > iMX6ull based boards. The problem is very similar to that > > > > described in: > > > > > > > > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flinux-mtd.infradead.narkive.com%2FJIUulfFB%2Fgpmi-imx6ull-timeout-on-dma&data=04%7C01%7Chan.xu%40nxp.com%7C278d7b93edbb4b72923408d9874c5ffe%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C637689584362563293%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=uSkVTsEF9yhHt5ZstJMbPIjUQbHzjhiHMjO9eDgFSg0%3D&reserved=0 > > > > > > > > That didn't come to any specific resolution that I could see > > > > in that thread. > > > > > > Hi all, > > > > > > I am joining this thread because I am also affected by this problem. I use > > > kernel 5.10.65-rt53 but I have seen this issue on many previous versions. In the > > > past I only recognized this on my development setup but now this has been found > > > by our testing team. > > > > > > In our test setup we simply perform a reboot every 30s. After 5 to 200 cycles > > > the test stops due to this error. > > > > > > The kernel version I use already includes: > > > > > > > Han Xu <han.xu@nxp.com> > > > > mtd: rawnand: gpmi: Fix the random DMA timeout issue > > > > > > Additionally I tried ... > > > > > > > Michael Trimarchi <michael@amarulasolutions.com> > > > > mtd: nand: Calculate the clock before enable it > > > > > > ... but the problem still persists. 
> > > > > > In my case, some registers show different values (annotated below): > > > > > > > > > > > The boot trace on the console for me looks like this: > > > > > > > > nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda > > > nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xdc > > > > nand: Micron MT29F2G08ABAEAWP > > > nand: Micron MT29F4G08ABADAH4 > > > > nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64 > > > > gpmi-nand 1806000.gpmi-nand: DMA timeout, last DMA > > > > gpmi-nand 1806000.gpmi-nand: Show GPMI registers : > > > > gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x20830002 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000000 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x01c6800c > > > > gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00010101 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0xe0000000 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x23023336 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x000001ee > > > > gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0xff000001 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x00000001 > > > gpmi-nand 1806000.nand-controller: offset 0x0c0 : 0x00000202 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x05020000 > > > > gpmi-nand 1806000.gpmi-nand: Show BCH registers : > > > > gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x00000100 > > > gpmi-nand 1806000.nand-controller: offset 0x000 : 0x00000000 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000010 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000 > > > > gpmi-nand 1806000.gpmi-nand: offset 
0x050 : 0x00000000 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x00000000 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00000000 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0x030a2080 > > > gpmi-nand 1806000.nand-controller: offset 0x080 : 0x070a4080 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x083e2080 > > > gpmi-nand 1806000.nand-controller: offset 0x090 : 0x10da4080 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x070a4080 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0x10da4080 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x070a4080 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x10da4080 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x0e0 : 0x070a4080 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x0f0 : 0x10da4080 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x100 : 0x00000000 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x110 : 0x00000000 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x120 : 0x00000000 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x130 : 0x00000000 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x140 : 0x00000000 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x150 : 0x20484342 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x160 : 0x01000000 > > > > gpmi-nand 1806000.gpmi-nand: offset 0x170 : 0x00000000 > > > > gpmi-nand 1806000.gpmi-nand: BCH Geometry : > > > > GF length : 13 > > > > ECC Strength : 8 > > > > Page Size in Bytes : 2110 > > > > Metadata Size in Bytes : 10 > > > > ECC Chunk0 Size in Bytes: 512 > > > > ECC Chunkn Size in Bytes: 512 > > > ECC Chunk Size in Bytes: 512 > > > > ECC Chunk Count : 4 > > > > Payload Size in Bytes : 2048 > > > > Auxiliary Size in Bytes: 16 > > > > Auxiliary Status Offset: 12 > > > > Block Mark Byte Offset : 1999 > > > > Block Mark Bit Offset : 0 > > > > > > Please let me know if further information is required. Could you please try to add clock dis/enable when setting clock rate, in case clock glitches. 
	clk_disable_unprepare(r->clock[0]);
	clk_set_rate(r->clock[0], hw->clk_rate);
	clk_prepare_enable(r->clock[0]);

> >
> > I need to continue on it, during the following days. I have stopped
> > moving to LTS 4.19.y and with my partial revert.
> > The problem as usual was to go to production on some devices. Anyway I
> > have the device that has this problem. I can
> > restart next weekend. One of the thing I notice that make not work on imx28 is:
> >
> > if (sdr->tRC_min >= 30000) {
> > 	/* ONFI non-EDO modes [0-3] */
> > 	hw->clk_rate = 22000000;
> > 	wrn_dly_sel = BV_GPMI_CTRL1_WRN_DLY_SEL_4_TO_8NS;
> > } else if (sdr->tRC_min >= 25000) {
> > 	/* ONFI EDO mode 4 */
> > 	hw->clk_rate = 80000000;
> > 	wrn_dly_sel = BV_GPMI_CTRL1_WRN_DLY_SEL_NO_DELAY;
> > } else {
> > 	/* ONFI EDO mode 5 */
> > 	hw->clk_rate = 100000000;
> > 	wrn_dly_sel = BV_GPMI_CTRL1_WRN_DLY_SEL_NO_DELAY;
> > }
> >
> > Here there is an assumption that your clk_rate can be set to that rate
> > but on imx28, the parent clock of the NAND one can not
> > let it go to those speed. Changing it let it really set to the wrong
> > value, so imx28 was totally broken. The other computation was based
> > not on fixed clock rate but I think even on clk_get_rate
>
> Interesting finding. I guess we should try to apply the desired block
> rate and if the final clock rate is too far from what is achievable and
> works we should refuse the requested configuration. The core will
> automatically try the slowest -but perhaps working- modes.
>
> Thanks,
> Miquèl
On Monday, 4 October 2021, 18:06:20 CEST, Han Xu wrote:
>
> Could you please try to add clock dis/enable when setting clock rate, in case
> clock glitches.
>
> clk_disable_unprepare(r->clock[0]);
> clk_set_rate(r->clock[0], hw->clk_rate);
> clk_prepare_enable(r->clock[0]);
>

This looks promising! Last night I had a successful test run with more than 600 reboots! The test only stopped because my network was disturbed by the daily backup run...

I'll do further tests within the next 24 hours.

regards
Christian
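To make the intent of the three quoted calls concrete, here is a toy model in plain C. It does not use the kernel clk API: `struct mock_clk` and all `mock_clk_*` functions are invented for illustration, and the model only counts how often a rate is changed while the gate is open, which is the situation the workaround tries to avoid.

```c
#include <stdbool.h>

/* Toy stand-in for a gated clock; this is not the kernel's struct clk. */
struct mock_clk {
	unsigned long rate_hz;
	bool enabled;
	unsigned int ungated_rate_changes; /* rate changes made while running */
};

static void mock_clk_disable(struct mock_clk *c) { c->enabled = false; }
static void mock_clk_enable(struct mock_clk *c)  { c->enabled = true; }

static void mock_clk_set_rate(struct mock_clk *c, unsigned long rate_hz)
{
	/* Count every real change that is applied while the gate is open. */
	if (c->enabled && rate_hz != c->rate_hz)
		c->ungated_rate_changes++;
	c->rate_hz = rate_hz;
}

/* Same ordering as the three calls quoted above: gate, change, ungate. */
static void mock_clk_set_rate_gated(struct mock_clk *c, unsigned long rate_hz)
{
	mock_clk_disable(c);
	mock_clk_set_rate(c, rate_hz);
	mock_clk_enable(c);
}
```

In this model, calling `mock_clk_set_rate()` on a running clock increments the counter, while `mock_clk_set_rate_gated()` leaves it untouched and returns with the clock enabled again.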
+ set PHYTEC developers on (B)CC

On Monday, 4 October 2021, 18:06:20 CEST, Han Xu wrote:
>
> Could you please try to add clock dis/enable when setting clock rate, in case
> clock glitches.
>
> clk_disable_unprepare(r->clock[0]);
> clk_set_rate(r->clock[0], hw->clk_rate);
> clk_prepare_enable(r->clock[0]);
>

With this change, we made over 2000 successful reboots without any GPMI DMA timeout problems!

From PHYTEC (our BSP supplier), I got some possible background for this problem. For older revisions of IMX6DQ there was an erratum (ERR007117, [1]) in the ROM bootloader which triggers similar, possibly the same, behavior:

> For raw NAND boot, ROM switches the source of enfc_clk_root from PLL2_PFD2
> to PLL3. The root clock is required to be gated before switching the source
> clock. If the root clock is not gated, clock glitches might be passed to
> the divider that follows the clock mux, and the divider might behave
> unpredictably.
> ...
> This problem can also occur elsewhere in application code if the root clock
> is not properly gated when the clock configuration is changed.

In my case (Linux boot on i.MX6ULL), I noticed that on the 3rd call of gpmi_nfc_apply_timings(), the rate of r->clock[0] is changed

- from: 22 MHz (CS2CDR::ENFC_CLK_PRED=6 and CS2CDR::ENFC_CLK_PODF=3)
- to: 100 MHz (CS2CDR::ENFC_CLK_PRED=4 and CS2CDR::ENFC_CLK_PODF=1)

The proposal from Han Xu

> clk_disable_unprepare(r->clock[0]);
> clk_set_rate(r->clock[0], hw->clk_rate);
> clk_prepare_enable(r->clock[0]);

disables only CCGR4::CG14 [RAWNAND_U_GPMI_BCH_INPUT_GPMI_IO_CLK_ENABLE] during the rate change. While this works fine on my system, it probably doesn't fulfill the requirements in the erratum description:

> For other occurrences in application code, the following procedure should be
> followed to change the clock configuration for the enfc_clk_root:
> 1) Gate (disable) the GPMI/BCH clocks in register CCM_CCGR4.
> 2) Gate (disable) the enfc_clk_root before changing the enfc_clk_root source
>    or dividers by clearing CCM_CCGR2[CG7] to 2’b00. This disables the
>    iomux_ipt_clk_io_clk.
> 3) Configure CCM_CS2CDR for the new clock source configuration.
> 4) Enable enfc_clk_root by setting CCM_CCGR2[CG7] to 2’b11. This enables the
>    iomux_ipt_clk_io_clk.
> 5) Enable the GPMI/BCH clocks in register CCM_CCGR4

I got another solution from PHYTEC ([2], not yet tested at length), which disables all GPMI/BCH clocks on CCGR4 (verified with a JTAG debugger):
- CCGR4::CG15 [RAWNAND_U_GPMI_INPUT_APB_CLK_ENABLE]
- CCGR4::CG14 [RAWNAND_U_GPMI_BCH_INPUT_GPMI_IO_CLK_ENABLE]
- CCGR4::CG13 [RAWNAND_U_GPMI_BCH_INPUT_BCH_CLK_ENABLE]
- CCGR4::CG12 [RAWNAND_U_BCH_INPUT_APB_CLK_ENABLE]
- CCGR4::CG6 [PL301_MX6QPER1_BCHCLK_ENABLE]

CCM_CCGR2[CG7] is used for IOMUX_IPT_CLK_IO_ENABLE on i.MX6ULL, so step 2 does not seem to apply here. Actually I don't know how to gate ENFC_CLK_ROOT.

I will run cyclic tests of the solution from PHYTEC over the weekend.

@Han Xu: Should I prefer the solution from PHYTEC?
@Stefan Riedmueller: Are you willing to submit this upstream?

regards
Christian

[1] https://www.nxp.com/docs/en/errata/IMX6DQCE.pdf
[2] https://git.phytec.de/linux-mainline/commit/?h=v5.10.48-phy&id=866939ea8110764d9c12af960d746e2f7f5debe3
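The five-step procedure quoted from the erratum can be modelled as a sequence of register writes. The sketch below is only a simulation under stated assumptions: the registers live in an ordinary struct instead of the CCM block, `ccm_write()`, `enfc_change_clock()` and the write log are invented for illustration, the set of CCGR4 gates is passed in by the caller, and the new CCM_CS2CDR value is treated as opaque. The only hardware detail relied on is that each CGn field in a CCM_CCGRx register is two bits wide.

```c
#include <stdint.h>

enum ccm_reg { CCM_CCGR2, CCM_CCGR4, CCM_CS2CDR, CCM_NREGS };

/* Simulated CCM registers plus a log of the order they were written in. */
struct ccm_model {
	uint32_t reg[CCM_NREGS];
	enum ccm_reg log[8];
	unsigned int nlog;
};

/* Two-bit gate field CGn inside a CCM_CCGRx register. */
#define CG_MASK(n)	(UINT32_C(3) << (2 * (n)))

static void ccm_write(struct ccm_model *ccm, enum ccm_reg r, uint32_t val)
{
	ccm->reg[r] = val;
	if (ccm->nlog < 8)
		ccm->log[ccm->nlog++] = r;
}

/* Hypothetical helper following steps 1) to 5) of the quoted procedure. */
static void enfc_change_clock(struct ccm_model *ccm, uint32_t gpmi_bch_gates,
			      uint32_t new_cs2cdr)
{
	/* 1) gate the GPMI/BCH clocks in CCGR4 */
	ccm_write(ccm, CCM_CCGR4, ccm->reg[CCM_CCGR4] & ~gpmi_bch_gates);
	/* 2) clear CCGR2[CG7] to 2'b00 */
	ccm_write(ccm, CCM_CCGR2, ccm->reg[CCM_CCGR2] & ~CG_MASK(7));
	/* 3) program the new source/divider configuration */
	ccm_write(ccm, CCM_CS2CDR, new_cs2cdr);
	/* 4) set CCGR2[CG7] back to 2'b11 */
	ccm_write(ccm, CCM_CCGR2, ccm->reg[CCM_CCGR2] | CG_MASK(7));
	/* 5) ungate the GPMI/BCH clocks in CCGR4 */
	ccm_write(ccm, CCM_CCGR4, ccm->reg[CCM_CCGR4] | gpmi_bch_gates);
}
```

For the CCGR4 gates listed in the mail, the mask would be `CG_MASK(15) | CG_MASK(14) | CG_MASK(13) | CG_MASK(12) | CG_MASK(6)`. Whether step 2 is meaningful on i.MX6ULL is exactly the open question raised above, so the model follows the erratum text for IMX6DQ as written.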
Hi Christian, On Fri, 2021-10-08 at 11:55 +0200, Christian Eggers wrote: > + set PHYTEC developers on (B)CC > > On Monday, 4 October 2021, 18:06:20 CEST, Han Xu wrote: > > Could you please try to add clock dis/enable when setting clock rate, in > > case > > clock glitches. > > > > clk_disable_unprepare(r->clock[0]); > > clk_set_rate(r->clock[0], hw->clk_rate); > > clk_prepare_enable(r->clock[0]); > > > > With this change, we made over 2000 successful reboots without any GPMI DMA > timeout problems! > > From PHYTEC (our BSP supplier), I got some possible background for this > problem. For older revisions of IMX6DQ there was an errata (ERR007117, [1]) > in the ROM bootloader which triggers a similar / the same behavior: > > > For raw NAND boot, ROM switches the source of enfc_clk_root from PLL2_PFD2 > > to PLL3. The root clock is required to be gated before switching the > > source > > clock. If the root clock is not gated, clock glitches might be passed to > > the divider that follows the clock mux, and the divider might behave > > unpredictably. > > ... > > This problem can also occur elsewhere in application code if the root > > clock > > is not properly gated when the clock configuration is changed. > > In my case (Linux boot on i.MX6ULL), I recognized that on the 3rd call of > gpmi_nfc_apply_timings(), the rate of r->clock[0] is changed > > - from: 22 MHz (CS2CDR::ENFC_CLK_PRED=6 and CS2CDR::ENFC_CLK_PODF=3) > - to: 100 MHz (CS2CDR::ENFC_CLK_PRED=4 and CS2CDR::ENFC_CLK_PODF=1) > > The proposal from from Han Xu > > clk_disable_unprepare(r->clock[0]); > > clk_set_rate(r->clock[0], hw->clk_rate); > > clk_prepare_enable(r->clock[0]); > disables only CCGR4::CG14 [RAWNAND_U_GPMI_BCH_INPUT_GPMI_IO_CLK_ENABLE] > during > the rate change. 
While this works very fine on my system, this probably > doesn't > fulfill the requirements in the errata description: > > > For other occurrences in application code, the following procedure should > > be > > followed to change the clock configuration for the enfc_clk_root: > > 1) Gate (disable) the GPMI/BCH clocks in register CCM_CCGR4. > > 2) Gate (disable) the enfc_clk_root before changing the enfc_clk_root > > source > > or dividers by clearing CCM_CCGR2[CG7] to 2’b00. This disables the > > iomux_ipt_clk_io_clk. > > 3) Configure CCM_CS2CDR for the new clock source configuration. > > 4) Enable enfc_clk_root by setting CCM_CCGR2[CG7] to 2’b11. This enables > > the > > iomux_ipt_clk_io_clk. > > 5) Enable the GPMI/BCH clocks in register CCM_CCGR4 > > I got another solution from PHYTEC ([2], not lengthy tested yet), which > disables all GPMI/BCH clocks on CCGR4 (verified with a JTAG debugger): > - CCGR4::CG15 [RAWNAND_U_GPMI_INPUT_APB_CLK_ENABLE] > - CCGR4::CG14 [RAWNAND_U_GPMI_BCH_INPUT_GPMI_IO_CLK_ENABLE] > - CCGR4::CG13 [RAWNAND_U_GPMI_BCH_INPUT_BCH_CLK_ENABLE] > - CCGR4::CG12 [RAWNAND_U_BCH_INPUT_APB_CLK_ENABLE] > - CCGR4::CG6 [PL301_MX6QPER1_BCHCLK_ENABLE] > > CCM_CCGR2[CG7] is used for IOMUX_IPT_CLK_IO_ENABLE on i.MX6ULL, so step 2. > seems not to apply here. Actually I don't know how to gate ENFC_CLK_ROOT. > > I will cyclic test the solution from PHYTEC over the weekend. > > @Han Xu: Should I prefer the solution from PHYTEC? > @Stefan Riedmueller: Are you willing to commit this upstream? Yes sure, I can prepare a patch beginning of next week. BTW, we have seen these DMA timeout issues on the i.MX6 SOCs as well. So this fix is not only for the i.MX 6ULL. Regards, Stefan > > regards > Christian > > > [1] https://www.nxp.com/docs/en/errata/IMX6DQCE.pdf > [2] > https://git.phytec.de/linux-mainline/commit/?h=v5.10.48-phy&id=866939ea8110764d9c12af960d746e2f7f5debe3 > > >
Hello,

S.Riedmueller@phytec.de wrote on Fri, 8 Oct 2021 12:08:01 +0000:

> Hi Christian,
>
> On Fri, 2021-10-08 at 11:55 +0200, Christian Eggers wrote:
> > + set PHYTEC developers on (B)CC
> >
> > On Monday, 4 October 2021, 18:06:20 CEST, Han Xu wrote:
> > > Could you please try to add clock dis/enable when setting clock rate,
> > > in case clock glitches.
> > >
> > > clk_disable_unprepare(r->clock[0]);
> > > clk_set_rate(r->clock[0], hw->clk_rate);
> > > clk_prepare_enable(r->clock[0]);
> > >
> >
> > With this change, we made over 2000 successful reboots without any GPMI DMA
> > timeout problems!
> >
> > From PHYTEC (our BSP supplier), I got some possible background for this
> > problem. For older revisions of IMX6DQ there was an errata (ERR007117, [1])
> > in the ROM bootloader which triggers a similar / the same behavior:
> >
> > > For raw NAND boot, ROM switches the source of enfc_clk_root from PLL2_PFD2
> > > to PLL3. The root clock is required to be gated before switching the
> > > source clock. If the root clock is not gated, clock glitches might be
> > > passed to the divider that follows the clock mux, and the divider might
> > > behave unpredictably.
> > > ...
> > > This problem can also occur elsewhere in application code if the root
> > > clock is not properly gated when the clock configuration is changed.
> >
> > In my case (Linux boot on i.MX6ULL), I recognized that on the 3rd call of
> > gpmi_nfc_apply_timings(), the rate of r->clock[0] is changed
> >
> > - from: 22 MHz (CS2CDR::ENFC_CLK_PRED=6 and CS2CDR::ENFC_CLK_PODF=3)
> > - to: 100 MHz (CS2CDR::ENFC_CLK_PRED=4 and CS2CDR::ENFC_CLK_PODF=1)
> >
> > The proposal from Han Xu
> > > clk_disable_unprepare(r->clock[0]);
> > > clk_set_rate(r->clock[0], hw->clk_rate);
> > > clk_prepare_enable(r->clock[0]);
> > disables only CCGR4::CG14 [RAWNAND_U_GPMI_BCH_INPUT_GPMI_IO_CLK_ENABLE]
> > during the rate change. While this works very fine on my system, this
> > probably doesn't fulfill the requirements in the errata description:
> >
> > > For other occurrences in application code, the following procedure
> > > should be followed to change the clock configuration for the
> > > enfc_clk_root:
> > > 1) Gate (disable) the GPMI/BCH clocks in register CCM_CCGR4.
> > > 2) Gate (disable) the enfc_clk_root before changing the enfc_clk_root
> > >    source or dividers by clearing CCM_CCGR2[CG7] to 2’b00. This disables
> > >    the iomux_ipt_clk_io_clk.
> > > 3) Configure CCM_CS2CDR for the new clock source configuration.
> > > 4) Enable enfc_clk_root by setting CCM_CCGR2[CG7] to 2’b11. This enables
> > >    the iomux_ipt_clk_io_clk.
> > > 5) Enable the GPMI/BCH clocks in register CCM_CCGR4
> >
> > I got another solution from PHYTEC ([2], not lengthy tested yet), which
> > disables all GPMI/BCH clocks on CCGR4 (verified with a JTAG debugger):
> > - CCGR4::CG15 [RAWNAND_U_GPMI_INPUT_APB_CLK_ENABLE]
> > - CCGR4::CG14 [RAWNAND_U_GPMI_BCH_INPUT_GPMI_IO_CLK_ENABLE]
> > - CCGR4::CG13 [RAWNAND_U_GPMI_BCH_INPUT_BCH_CLK_ENABLE]
> > - CCGR4::CG12 [RAWNAND_U_BCH_INPUT_APB_CLK_ENABLE]
> > - CCGR4::CG6 [PL301_MX6QPER1_BCHCLK_ENABLE]
> >
> > CCM_CCGR2[CG7] is used for IOMUX_IPT_CLK_IO_ENABLE on i.MX6ULL, so step 2.
> > seems not to apply here. Actually I don't know how to gate ENFC_CLK_ROOT.
> >
> > I will cyclic test the solution from PHYTEC over the weekend.
> >
> > @Han Xu: Should I prefer the solution from PHYTEC?
> > @Stefan Riedmueller: Are you willing to commit this upstream?
>
> Yes sure, I can prepare a patch beginning of next week.
> BTW, we have seen these DMA timeout issues on the i.MX6 SOCs as well. So this
> fix is not only for the i.MX 6ULL.

Could it be possible to quantify the extra time spent in
disabling/re-enabling clocks just for the record? So far this driver
only supports a single chip and thus frequency changes do not happen
frequently (a couple times at boot) but if someday we introduce support
for several chips it might become very impacting.

Thanks,
Miquèl
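Editor's sketch: the five-step ERR007117 procedure quoted above, written against a mock register file so that the ordering can be checked in user space. The register and gate names come from the quoted errata text and from Christian's list. The two-bits-per-gate field layout, the `ccm_write()` helper and the `enfc_reconfigure_gated()` name are assumptions made for illustration only. This is not driver code, and Christian notes above that step 2 may not apply on i.MX6ULL.

```c
#include <assert.h>
#include <stdint.h>

/* Mock CCM register file. A real driver would go through the clock framework. */
enum ccm_reg { CCM_CCGR2, CCM_CCGR4, CCM_CS2CDR, CCM_NREGS };

static uint32_t ccm[CCM_NREGS];
static enum ccm_reg write_log[8];	/* order of register writes, for checking */
static unsigned int n_writes;

static void ccm_write(enum ccm_reg reg, uint32_t val)
{
	ccm[reg] = val;
	if (n_writes < 8)
		write_log[n_writes++] = reg;
}

/* Assumption: each clock gate CGn is a 2-bit field at bits [2n+1:2n]. */
#define CG(n)			(UINT32_C(3) << (2 * (n)))
#define CCGR4_GPMI_BCH_GATES	(CG(15) | CG(14) | CG(13) | CG(12) | CG(6))
#define CCGR2_CG7		CG(7)

/* Hypothetical helper: change CCM_CS2CDR only while the gates are closed. */
static void enfc_reconfigure_gated(uint32_t new_cs2cdr)
{
	/* 1) Gate the GPMI/BCH clocks in CCM_CCGR4. */
	ccm_write(CCM_CCGR4, ccm[CCM_CCGR4] & ~CCGR4_GPMI_BCH_GATES);
	/* 2) Gate enfc_clk_root by clearing CCM_CCGR2[CG7] to 2'b00. */
	ccm_write(CCM_CCGR2, ccm[CCM_CCGR2] & ~CCGR2_CG7);
	/* 3) Configure CCM_CS2CDR for the new source/dividers. */
	ccm_write(CCM_CS2CDR, new_cs2cdr);
	/* 4) Re-enable enfc_clk_root by setting CCM_CCGR2[CG7] to 2'b11. */
	ccm_write(CCM_CCGR2, ccm[CCM_CCGR2] | CCGR2_CG7);
	/* 5) Re-enable the GPMI/BCH clocks in CCM_CCGR4. */
	ccm_write(CCM_CCGR4, ccm[CCM_CCGR4] | CCGR4_GPMI_BCH_GATES);
}
```

The point of the mock is only the write order: the divider register is touched strictly between the two gate-off writes and the two gate-on writes.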
On Friday, 8 October 2021, 11:55:56 CEST, Christian Eggers wrote:
> I got another solution from PHYTEC ([2], not lengthy tested yet), which
> disables all GPMI/BCH clocks on CCGR4

> -static void gpmi_nfc_apply_timings(struct gpmi_nand_data *this)
> +static int gpmi_nfc_apply_timings(struct gpmi_nand_data *this)
>  {
>  	struct gpmi_nfc_hardware_timing *hw = &this->hw;
>  	struct resources *r = &this->resources;
>  	void __iomem *gpmi_regs = r->gpmi_regs;
>  	unsigned int dll_wait_time_us;
> +	int ret;
>
> +	ret = __gpmi_enable_clk(this, false);
> +	if (ret)
> +		return ret;
>
>  	clk_set_rate(r->clock[0], hw->clk_rate);
>
> +	ret = __gpmi_enable_clk(this, true);
> +	if (ret)
> +		return ret;
>
>  	writel(hw->timing0, gpmi_regs + HW_GPMI_TIMING0);
>  	writel(hw->timing1, gpmi_regs + HW_GPMI_TIMING1);

I think that this is also required for the first call to clk_set_rate() in
gpmi_get_clks(). From the kernel's point of view, the clocks are not enabled
yet, so no "guard" is required. Putting the same guard here will actually give
a warning at runtime that I am trying to disable a clock which is not enabled.
But on my system, all clock gates are enabled by the bootloader (barebox)
and therefore the glitch could also happen here.

If setting the clock to a "default" rate is really required (on my system, the
clock is switched from 99 MHz to 22 MHz, and back to 99 MHz on a later call...),
I suggest moving this call to gpmi_nand_probe() (below __gpmi_enable_clock())
and guard it. The result would look like:

	...
	ret = acquire_resources(this);
	if (ret)
		goto exit_acquire_resources;

	ret = __gpmi_enable_clk(this, true);
	if (ret)
		goto exit_acquire_resources;

	if (GPMI_IS_MX6(this)) {
		/*
		 * Set the default value for the gpmi clock.
		 *
		 * If you want to use the ONFI nand which is in the
		 * Synchronous Mode, you should change the clock as you need.
		 */
		__gpmi_enable_clk(this, false);
		clk_set_rate(this->resources.clock[0], 22000000);
		__gpmi_enable_clk(this, true);
	}

	pm_runtime_set_autosuspend_delay(&pdev->dev, 500);
	...

It looks a little bit useless to enable the clocks and immediately disable
them. But probably this is the only way to make sure that clocks enabled by
the bootloader are certainly off.

regards
Christian
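Editor's sketch: a small user-space model of why the enable-then-disable sequence above is needed when the bootloader has left a gate open. The `mock_clk` structure and its functions are hypothetical and only imitate two properties described in the message: the kernel's enable count starts at zero even though the hardware gate is on, and disabling a clock whose count is zero produces a warning without touching the gate. It is a simplified model, not the common clock framework.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of one gated clock; all names here are illustrative. */
struct mock_clk {
	bool hw_gate_on;			/* real gate state (bootloader may set it) */
	unsigned int enable_count;		/* what the kernel believes */
	unsigned long rate;
	unsigned int unbalanced_disables;	/* "disable of a disabled clock" warnings */
	unsigned int ungated_rate_changes;	/* rate changed while the gate was open */
};

static void mock_clk_enable(struct mock_clk *c)
{
	c->enable_count++;
	c->hw_gate_on = true;
}

static void mock_clk_disable(struct mock_clk *c)
{
	if (c->enable_count == 0) {
		/* Kernel thinks the clock is already off: warn, leave hardware alone. */
		c->unbalanced_disables++;
		return;
	}
	if (--c->enable_count == 0)
		c->hw_gate_on = false;
}

static void mock_clk_set_rate(struct mock_clk *c, unsigned long rate)
{
	if (c->hw_gate_on)
		c->ungated_rate_changes++;	/* potential glitch window */
	c->rate = rate;
}

/* The probe-time order suggested above: enable, then gate around the rate change. */
static void set_default_rate_gated(struct mock_clk *c, unsigned long rate)
{
	mock_clk_enable(c);		/* stands in for __gpmi_enable_clk(this, true) */
	mock_clk_disable(c);		/* now the gate is really closed */
	mock_clk_set_rate(c, rate);
	mock_clk_enable(c);
}
```

In this model, guarding the very first rate change with a bare disable both warns and leaves the gate open, whereas the enable-first sequence changes the rate with the gate closed.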
On Friday, 8 October 2021, 14:27:24 CEST, Miquel Raynal wrote:
> Hello,
>
> Could it be possible to quantify the extra time spent in
> disabling/re-enabling clocks just for the record? So far this driver
> only supports a single chip and thus frequency changes do not happen
> frequently (a couple times at boot) but if someday we introduce support
> for several chips it might become very impacting.
>
> Thanks,
> Miquèl
>

	ktime_t start = ktime_get(), duration;

	__gpmi_enable_clk(this, false);
	clk_set_rate(r->clock[0], hw->clk_rate);
	__gpmi_enable_clk(this, true);
	duration = ktime_get() - start;
	printk(KERN_ERR "%s() duration = %lld ns\n", __func__, ktime_to_ns(duration));

-->

gpmi_nfc_apply_timings() duration = 39250 ns
...
nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xdc
nand: Micron MT29F4G08ABADAH4
nand: 512 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
gpmi_nfc_apply_timings() duration = 36750 ns
gpmi_nfc_apply_timings() duration = 170750 ns

On i.MX6ULL @ 792MHz this takes less than 250 us.

regards
Christian
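Editor's sketch: the arithmetic behind the figures above. The message does not say whether "less than 250 us" refers to the slowest single call or to all three calls together, so this computes both from the three logged durations. The helper names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The three gpmi_nfc_apply_timings() durations logged above, in nanoseconds. */
static const int64_t apply_timings_ns[] = { 39250, 36750, 170750 };
#define N_SAMPLES (sizeof(apply_timings_ns) / sizeof(apply_timings_ns[0]))

/* Sum of all samples. */
static int64_t total_ns(const int64_t *d, size_t n)
{
	int64_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += d[i];
	return sum;
}

/* Largest single sample. */
static int64_t worst_ns(const int64_t *d, size_t n)
{
	int64_t max = 0;

	for (size_t i = 0; i < n; i++)
		if (d[i] > max)
			max = d[i];
	return max;
}
```

The slowest call is 170750 ns (about 171 us) and the three calls add up to 246750 ns (about 247 us), so either reading is consistent with the stated bound.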
Hi Christian, ceggers@arri.de wrote on Fri, 8 Oct 2021 15:11:59 +0200: > On Friday, 8 October 2021, 11:55:56 CEST, Christian Eggers wrote: > > > I got another solution from PHYTEC ([2], not lengthy tested yet), which > > disables all GPMI/BCH clocks on CCGR4 > > > -static void gpmi_nfc_apply_timings(struct gpmi_nand_data *this) > > +static int gpmi_nfc_apply_timings(struct gpmi_nand_data *this) > > > > { > > > > struct gpmi_nfc_hardware_timing *hw = &this->hw; > > struct resources *r = &this->resources; > > void __iomem *gpmi_regs = r->gpmi_regs; > > unsigned int dll_wait_time_us; > > > > + int ret; > > > > + ret = __gpmi_enable_clk(this, false); > > + if (ret) > > + return ret; > > > > clk_set_rate(r->clock[0], hw->clk_rate); > > > > + ret = __gpmi_enable_clk(this, true); > > + if (ret) > > + return ret; > > > > writel(hw->timing0, gpmi_regs + HW_GPMI_TIMING0); > > writel(hw->timing1, gpmi_regs + HW_GPMI_TIMING1); > > > I think that this is also required for the first call to clk_set_rate() in > gpmi_get_clks(). From the kernel's point of view, the clocks are not enabled > yet, so no "guard" is required. Putting the same guard here will actually give > a warning at runtime that I am trying to disable a clock which is not enabled. > But on my system, all clock gates are enabled by the bootloader (barebox) > and therefore the glitch could also happen here. > > If setting the clock to a "default" rate is really required (on my system, the > clock is switched from 99 MHz to 22 MHz, and back to 99 MHz on a later call...), > I suggest moving this call to gpmi_nand_probe() (below __gpmi_enable_clock()) > and guard it. The result would look like: > > ... > ret = acquire_resources(this); > if (ret) > goto exit_acquire_resources; > > ret = __gpmi_enable_clk(this, true); > if (ret) > goto exit_acquire_resources; > > if (GPMI_IS_MX6(this)) { > /* > * Set the default value for the gpmi clock. 
> * > * If you want to use the ONFI nand which is in the > * Synchronous Mode, you should change the clock as you need. > */ > __gpmi_enable_clk(this, false); > clk_set_rate(this->resources.clock[0], 22000000); > __gpmi_enable_clk(this, true); > } > > pm_runtime_set_autosuspend_delay(&pdev->dev, 500); If this clock (as I understand) does not prevent us to access the registers but only feeds the external NAND bus part, then there is no need to enable it in the probe, just acquiring it will be enough. Then, the first call for an IO operation with ->must_apply_timings should: if (imx6) disable_clk(); clk_set_rate(); if (imx6) enable_clk(); I believe this should cover it all. > ... > > It looks a little bit useless to enable the clocks and immediately disable > them. But probably this is the only way to make sure that clocks enabled by > the bootloader are certainly off. > > regards > Christian > > > Thanks, Miquèl
Hi Christian, ceggers@arri.de wrote on Fri, 8 Oct 2021 15:13:17 +0200: > On Friday, 8 October 2021, 14:27:24 CEST, Miquel Raynal wrote: > > Hello, > > > > Could it be possible to quantify the extra time spent in > > disabling/re-enabling clocks just for the record? So far this driver > > only supports a single chip and thus frequency changes do not happen > > frequently (a couple times at boot) but if someday we introduce support > > for several chips it might become very impacting. > > > > Thanks, > > Miquèl > > > > > ktime_t start = ktime_get(), duration; > __gpmi_enable_clk(this, false); > clk_set_rate(r->clock[0], hw->clk_rate); > __gpmi_enable_clk(this, true); > duration = ktime_get() - start; > printk(KERN_ERR "%s() duration = %lld ns\n", __func__, ktime_to_ns(duration)); > > --> > > gpmi_nfc_apply_timings() duration = 39250 ns > ... > nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xdc > nand: Micron MT29F4G08ABADAH4 > nand: 512 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64 > gpmi_nfc_apply_timings() duration = 36750 ns > gpmi_nfc_apply_timings() duration = 170750 ns > > On i.MX6ULL @ 792MHz this takes less than 250 us. Ok thanks for the feedback. This is not negligible if we start switching between chips with different frequencies quite regularly. Thanks, Miquèl
miquel.raynal@bootlin.com wrote on Fri, 8 Oct 2021 15:29:05 +0200: > Hi Christian, > > ceggers@arri.de wrote on Fri, 8 Oct 2021 15:11:59 +0200: > > > On Friday, 8 October 2021, 11:55:56 CEST, Christian Eggers wrote: > > > > > I got another solution from PHYTEC ([2], not lengthy tested yet), which > > > disables all GPMI/BCH clocks on CCGR4 > > > > > -static void gpmi_nfc_apply_timings(struct gpmi_nand_data *this) > > > +static int gpmi_nfc_apply_timings(struct gpmi_nand_data *this) > > > > > > { > > > > > > struct gpmi_nfc_hardware_timing *hw = &this->hw; > > > struct resources *r = &this->resources; > > > void __iomem *gpmi_regs = r->gpmi_regs; > > > unsigned int dll_wait_time_us; > > > > > > + int ret; > > > > > > + ret = __gpmi_enable_clk(this, false); > > > + if (ret) > > > + return ret; > > > > > > clk_set_rate(r->clock[0], hw->clk_rate); > > > > > > + ret = __gpmi_enable_clk(this, true); > > > + if (ret) > > > + return ret; > > > > > > writel(hw->timing0, gpmi_regs + HW_GPMI_TIMING0); > > > writel(hw->timing1, gpmi_regs + HW_GPMI_TIMING1); > > > > > > I think that this is also required for the first call to clk_set_rate() in > > gpmi_get_clks(). From the kernel's point of view, the clocks are not enabled > > yet, so no "guard" is required. Putting the same guard here will actually give > > a warning at runtime that I am trying to disable a clock which is not enabled. > > But on my system, all clock gates are enabled by the bootloader (barebox) > > and therefore the glitch could also happen here. > > > > If setting the clock to a "default" rate is really required (on my system, the > > clock is switched from 99 MHz to 22 MHz, and back to 99 MHz on a later call...), > > I suggest moving this call to gpmi_nand_probe() (below __gpmi_enable_clock()) > > and guard it. The result would look like: > > > > ... 
> > ret = acquire_resources(this); > > if (ret) > > goto exit_acquire_resources; > > > > ret = __gpmi_enable_clk(this, true); > > if (ret) > > goto exit_acquire_resources; > > > > if (GPMI_IS_MX6(this)) { > > /* > > * Set the default value for the gpmi clock. > > * > > * If you want to use the ONFI nand which is in the > > * Synchronous Mode, you should change the clock as you need. > > */ > > __gpmi_enable_clk(this, false); > > clk_set_rate(this->resources.clock[0], 22000000); > > __gpmi_enable_clk(this, true); > > } > > > > pm_runtime_set_autosuspend_delay(&pdev->dev, 500); > > If this clock (as I understand) does not prevent us to access the > registers but only feeds the external NAND bus part, then there is no > need to enable it in the probe, just acquiring it will be enough. > Then, the first call for an IO operation with ->must_apply_timings > should: > > if (imx6) > disable_clk(); > > clk_set_rate(); > > if (imx6) > enable_clk(); Actually we should ensure clks are enabled in the !imx6 case anyway, but this is needed only once so either we keep enabling the clock in the probe or we check here if the clk has already been enabled or not. Cheers, Miquèl
On Friday, 8 October 2021, 15:36:31 CEST, Miquel Raynal wrote:
>
> miquel.raynal@bootlin.com wrote on Fri, 8 Oct 2021 15:29:05 +0200:
>
> >
> > If this clock (as I understand) does not prevent us to access the
> > registers but only feeds the external NAND bus part, then there is no
> > need to enable it in the probe, just acquiring it will be enough.

clocks[0] is "gpmi_io" which sounds more like i/o than registers. So lets
try to remove the initial call to clk_set_rate().

> > Then, the first call for an IO operation with ->must_apply_timings
> > should:
> >
> > 	if (imx6)
> > 		disable_clk();
> >
> > 	clk_set_rate();
> >
> > 	if (imx6)
> > 		enable_clk();

Do you think that the need for avoiding clock glitches is i.MX6 specific?
The errata I mentioned is specific for the bootloader software, but (I think)
the requirement for switching off the clocks gates prior changing the dividers
may apply also for other series.

> Actually we should ensure clks are enabled in the !imx6 case anyway,
> but this is needed only once so either we keep enabling the clock in
> the probe or we check here if the clk has already been enabled or not.

The clocks are already enabled (and kept on) in probe. The initial call to
clk_set_rate() is just above this (but the clocks are not disabled at this
stage as all gates have been enabled by the boot loader).

regards
Christian
Hi Christian, ceggers@arri.de wrote on Fri, 8 Oct 2021 15:49:21 +0200: > On Friday, 8 October 2021, 15:36:31 CEST, Miquel Raynal wrote: > > > > miquel.raynal@bootlin.com wrote on Fri, 8 Oct 2021 15:29:05 +0200: > > > > > > > > If this clock (as I understand) does not prevent us to access the > > > registers but only feeds the external NAND bus part, then there is no > > > need to enable it in the probe, just acquiring it will be enough. > > clocks[0] is "gpmi_io" which sounds more like i/o than registers. So lets > try to remove the initial call to clk_set_rate(). > > > > Then, the first call for an IO operation with ->must_apply_timings > > > should: > > > > > > if (imx6) > > > disable_clk(); > > > > > > clk_set_rate(); > > > > > > if (imx6) > > > enable_clk(); > > Do you think that the need for avoiding clock glitches is i.MX6 specific? > The errata I mentioned is specific for the bootloader software, but (I think) > the requirement for switching off the clocks gates prior changing the dividers > may apply also for other series. I honestly don't know, perhaps Han have more details about it. If you think it's a wider issue, then we can just do the disable/enable step without any further checks. > > Actually we should ensure clks are enabled in the !imx6 case anyway, > > but this is needed only once so either we keep enabling the clock in > > the probe or we check here if the clk has already been enabled or not. > The clocks are already enabled (and kept on) in probe. The initial call to > clk_set_rate() is just above this (but the clocks are not disabled at this > stage as all gates have been enabled by the boot loader). The IO clock should be enabled and set to a particular rate the first time the die is selected to perform a NAND operation, or when we switch from one device to the other (this does not apply to the GPMI driver for now). So we can drop the enable/set_rate call in the probe if the assumption that this clock only feeds the external bus is right. 
Thanks, Miquèl
Hi On Fri, Oct 8, 2021 at 6:07 PM Miquel Raynal <miquel.raynal@bootlin.com> wrote: > > Hi Christian, > > ceggers@arri.de wrote on Fri, 8 Oct 2021 15:49:21 +0200: > > > On Friday, 8 October 2021, 15:36:31 CEST, Miquel Raynal wrote: > > > > > > miquel.raynal@bootlin.com wrote on Fri, 8 Oct 2021 15:29:05 +0200: > > > > > > > > > > > If this clock (as I understand) does not prevent us to access the > > > > registers but only feeds the external NAND bus part, then there is no > > > > need to enable it in the probe, just acquiring it will be enough. > > > > clocks[0] is "gpmi_io" which sounds more like i/o than registers. So lets > > try to remove the initial call to clk_set_rate(). > > > > > > Then, the first call for an IO operation with ->must_apply_timings > > > > should: > > > > > > > > if (imx6) > > > > disable_clk(); > > > > > > > > clk_set_rate(); > > > > > > > > if (imx6) > > > > enable_clk(); > > > > Do you think that the need for avoiding clock glitches is i.MX6 specific? > > The errata I mentioned is specific for the bootloader software, but (I think) > > the requirement for switching off the clocks gates prior changing the dividers > > may apply also for other series. > > I honestly don't know, perhaps Han have more details about it. If you > think it's a wider issue, then we can just do the disable/enable step > without any further checks. > Still don't explain why it was working on the old driver. The glitch was already there, so just a delay can do the trick. For imx28 we need to reparent to a different clock the nand driver in order to get the frequency we want in EDO mode. I'm still thinking that set the frequency without without get back and understand if in that edo mode is valid is still a bug. Michael > > > Actually we should ensure clks are enabled in the !imx6 case anyway, > > > but this is needed only once so either we keep enabling the clock in > > > the probe or we check here if the clk has already been enabled or not. 
> > The clocks are already enabled (and kept on) in probe. The initial call to > > clk_set_rate() is just above this (but the clocks are not disabled at this > > stage as all gates have been enabled by the boot loader). > > The IO clock should be enabled and set to a particular rate the first > time the die is selected to perform a NAND operation, or when we switch > from one device to the other (this does not apply to the GPMI driver > for now). So we can drop the enable/set_rate call in the probe if the > assumption that this clock only feeds the external bus is right. > > Thanks, > Miquèl
On Friday, 8 October 2021, 18:07:52 CEST, Miquel Raynal wrote: > Hi Christian, > > ceggers@arri.de wrote on Fri, 8 Oct 2021 15:49:21 +0200: > > > On Friday, 8 October 2021, 15:36:31 CEST, Miquel Raynal wrote: > > > > > > miquel.raynal@bootlin.com wrote on Fri, 8 Oct 2021 15:29:05 +0200: > > > > > > > > > > > If this clock (as I understand) does not prevent us to access the > > > > registers but only feeds the external NAND bus part, then there is no > > > > need to enable it in the probe, just acquiring it will be enough. > > > > clocks[0] is "gpmi_io" which sounds more like i/o than registers. So lets > > try to remove the initial call to clk_set_rate(). From the GPMI description (i.MX6 ULL): > [GPMI] Registers are clocked on the HCLK domain. The I/O and pin timing are > clocked on a dedicated GPMICLK domain. GPMICLK can be set to maximize I/O > performance. Additionally, figure 17-1 in IMX6ULLRM.pdf shows that both (BCH and GPMI) registers are connected to APBH. I checked this with the debugger: For accessing the BCH and GPMI registers, only CCGR0::CG2 [APBHDMA_HCLK_ENABLE] is required. This bit is enabled in mxs_dma_alloc_chan_resources(): -000|mxs_dma_alloc_chan_resources(chan = 0xC2090154) -001|__refcount_inc_not_zero(inline) -001|refcount_inc_not_zero(inline) -001|kref_get_unless_zero(inline) -001|dma_chan_get(:chan = 0xC2090154) -002|find_candidate(device = 0xC2090050, :mask = 0xC209BD38, :fn = 0xC0254679, :fn_param = 0xC209BD3C) -003|__dma_request_channel(:mask = 0xC209BD38, :fn = 0xC0254679, :fn_param = 0xC209BD3C, np = 0xC7EEB0B0) -004|mxs_dma_xlate(:dma_spec = 0xC209BD60, :ofdma = 0xC2264840) -005|of_dma_request_slave_channel(np = 0xC7EEB2B4, name = 0xC04F923A -> "rx-tx") -006|dma_request_chan(dev = 0xC2182410, :name = 0xC04F923A -> "rx-tx") -007|acquire_dma_channels(inline) -007|acquire_resources(:this = 0xC200F840) -008|gpmi_nand_probe(:pdev = 0xC2182400) The root clock is BCH_CLK_ROOT. 
It doesn't depend on ENFC_PRED or ENFC_PODF (the dividers which are actually set in clk_set_rate(r->clock[0], 22000000)). --> clk_set_rate(r->clock[0], 22000000) is not required for accessing the registers. I removed the call entirely and everything works fine. > > > > > > Then, the first call for an IO operation with ->must_apply_timings > > > > should: > > > > > > > > if (imx6) > > > > disable_clk(); > > > > > > > > clk_set_rate(); > > > > > > > > if (imx6) > > > > enable_clk(); > > > > Do you think that the need for avoiding clock glitches is i.MX6 specific? > > The errata I mentioned is specific for the bootloader software, but (I think) > > the requirement for switching off the clocks gates prior changing the dividers > > may apply also for other series. > > I honestly don't know, perhaps Han have more details about it. If you > think it's a wider issue, then we can just do the disable/enable step > without any further checks. I also don't know. I can not find the required sequence in the reference manual (only in the errata sheet), so I cannot compare with other series. For best performance we can start with checking for GPMI_IS_MX6Q(x) and extend it later if this issue comes up on other devices. I sent a question for this on NXP community: https://community.nxp.com/t5/i-MX-Processors/ERR007117-Which-i-MX-devices-require-gating-the-clocks-when/m-p/1353018 > > > Actually we should ensure clks are enabled in the !imx6 case anyway, > > > but this is needed only once so either we keep enabling the clock in > > > the probe or we check here if the clk has already been enabled or not. > > The clocks are already enabled (and kept on) in probe. The initial call to > > clk_set_rate() is just above this (but the clocks are not disabled at this > > stage as all gates have been enabled by the boot loader). 
> > The IO clock should be enabled and set to a particular rate the first > time the die is selected to perform a NAND operation, or when we switch > from one device to the other (this does not apply to the GPMI driver > for now). So we can drop the enable/set_rate call in the probe if the > assumption that this clock only feeds the external bus is right. I think that this assumption is right. regards Christian
On Friday, 8 October 2021, 14:08:01 CEST, Stefan Riedmüller wrote:
> On Fri, 2021-10-08 at 11:55 +0200, Christian Eggers wrote:
> > @Stefan Riedmueller: Are you willing to commit this upstream?
>
> Yes sure, I can prepare a patch beginning of next week.
> BTW, we have seen these DMA timeout issues on the i.MX6 SOCs as well. So this
> fix is not only for the i.MX 6ULL.

Current status:

- I have entirely removed the following part:

	if (GPMI_IS_MX6(this))
		/*
		 * Set the default value for the gpmi clock.
		 *
		 * If you want to use the ONFI nand which is in the
		 * Synchronous Mode, you should change the clock as you need.
		 */
		clk_set_rate(r->clock[0], 22000000);

- I applied your patch:
  https://git.phytec.de/linux-mainline/commit/?h=v5.10.48-phy&id=866939ea8110764d9c12af960d746e2f7f5debe3

Last night I ran a cycle test with a phyCORE i.MX6ULL. Over 1700 cycles were
successful!

regards
Christian
Hi Michael, michael@amarulasolutions.com wrote on Sat, 9 Oct 2021 07:53:26 +0200: > Hi > > On Fri, Oct 8, 2021 at 6:07 PM Miquel Raynal <miquel.raynal@bootlin.com> wrote: > > > > Hi Christian, > > > > ceggers@arri.de wrote on Fri, 8 Oct 2021 15:49:21 +0200: > > > > > On Friday, 8 October 2021, 15:36:31 CEST, Miquel Raynal wrote: > > > > > > > > miquel.raynal@bootlin.com wrote on Fri, 8 Oct 2021 15:29:05 +0200: > > > > > > > > > > > > > > If this clock (as I understand) does not prevent us to access the > > > > > registers but only feeds the external NAND bus part, then there is no > > > > > need to enable it in the probe, just acquiring it will be enough. > > > > > > clocks[0] is "gpmi_io" which sounds more like i/o than registers. So lets > > > try to remove the initial call to clk_set_rate(). > > > > > > > > Then, the first call for an IO operation with ->must_apply_timings > > > > > should: > > > > > > > > > > if (imx6) > > > > > disable_clk(); > > > > > > > > > > clk_set_rate(); > > > > > > > > > > if (imx6) > > > > > enable_clk(); > > > > > > Do you think that the need for avoiding clock glitches is i.MX6 specific? > > > The errata I mentioned is specific for the bootloader software, but (I think) > > > the requirement for switching off the clocks gates prior changing the dividers > > > may apply also for other series. > > > > I honestly don't know, perhaps Han have more details about it. If you > > think it's a wider issue, then we can just do the disable/enable step > > without any further checks. > > > > Still don't explain why it was working on the old driver. The glitch > was already there, > so just a delay can do the trick. For imx28 we need to reparent to a > different clock the > nand driver in order to get the frequency we want in EDO mode. I'm > still thinking that > set the frequency without without get back and understand if in that > edo mode is valid > is still a bug. 
There are possibly two bugs here. I am also in favor of checking that the
frequency we actually get matches what we requested, so please provide a
patch that covers that situation as well.

Thanks,
Miquèl
On Saturday, 9 October 2021, 08:26:36 CEST, Christian Eggers wrote:
> > > Do you think that the need for avoiding clock glitches is i.MX6 specific?
> > > The errata I mentioned is specific for the bootloader software, but (I think)
> > > the requirement for switching off the clocks gates prior changing the dividers
> > > may apply also for other series.
> >
> > I honestly don't know, perhaps Han have more details about it. If you
> > think it's a wider issue, then we can just do the disable/enable step
> > without any further checks.
> I also don't know. I can not find the required sequence in the reference manual
> (only in the errata sheet), so I cannot compare with other series. For best
> performance we can start with checking for GPMI_IS_MX6Q(x) and extend it later
> if this issue comes up on other devices.
>
> I sent a question for this on NXP community:
> https://community.nxp.com/t5/i-MX-Processors/ERR007117-Which-i-MX-devices-require-gating-the-clocks-when/m-p/1353018
>
> 1. Which i.MX models / series require this sequence?
> 2. Where can I find this sequence in the reference manuals (e.g. for i.MX6 ULL)?
> 3. How is CCM_CCGR2[CG7] (iomux_ipt_clk_io_clk) related to "gating enfc_clk_root"?

@Han Xu: Can you provide further information about this? Do you have contact to
the hardware developers?

regards
Christian
Hi On Mon, Feb 01, 2021 at 04:14:33PM +0100, Miquel Raynal wrote: > Hi Michael, > > Michael Nazzareno Trimarchi <michael@amarulasolutions.com> wrote on > Mon, 1 Feb 2021 16:08:23 +0100: > > > Hi > > > > On Mon, Feb 1, 2021 at 3:32 PM Michael Nazzareno Trimarchi > > <michael@amarulasolutions.com> wrote: > > > > > > Hi > > > > > > On Mon, Feb 1, 2021 at 3:13 PM Miquel Raynal <miquel.raynal@bootlin.com> wrote: > > > > > > > > Hi Michael, > > > > > > > > Michael Nazzareno Trimarchi <michael@amarulasolutions.com> wrote on > > > > Sat, 30 Jan 2021 10:41:29 +0100: > > > > > > > > > Hi Miquel > > > > > > > > > > commit f8e6ad14388067f91b26d044185d95623fbc9535 > > > > > Author: Michael Trimarchi <michael@amarulasolutions.com> > > > > > Date: Fri Jan 29 08:46:53 2021 +0100 > > > > > > > > > > mtd: nand: Calculate the clock before enable it > > > > > > > > > > Signed-off-by: Michael Trimarchi <michael@amarulasolutions.com> > > > > > Change-Id: I79b0da39de0a9b32ea0b002fa200d7f44d4f8ce7 > > > > > > > > > > diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c > > > > > b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c > > > > > index 322a008290e5..0bca52b3bc8f 100644 > > > > > --- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c > > > > > +++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c > > > > > @@ -377,6 +377,7 @@ static void gpmi_nfc_compute_timings(struct > > > > > gpmi_nand_data *this, > > > > > const struct nand_sdr_timings *sdr) > > > > > { > > > > > struct gpmi_nfc_hardware_timing *hw = &this->hw; > > > > > + struct resources *r = &this->resources; > > > > > unsigned int dll_threshold_ps = this->devdata->max_chain_delay; > > > > > unsigned int period_ps, reference_period_ps; > > > > > unsigned int data_setup_cycles, data_hold_cycles, addr_setup_cycles; > > > > > @@ -440,6 +441,8 @@ static void gpmi_nfc_compute_timings(struct > > > > > gpmi_nand_data *this, > > > > > hw->ctrl1n |= BF_GPMI_CTRL1_RDN_DELAY(sample_delay_factor) | > > > > > BM_GPMI_CTRL1_DLL_ENABLE | > > > > > 
(use_half_period ? BM_GPMI_CTRL1_HALF_PERIOD : 0); > > > > > + > > > > > + clk_set_rate(r->clock[0], hw->clk_rate); > > > > > } > > > > > > > > > > void gpmi_nfc_apply_timings(struct gpmi_nand_data *this) > > > > > @@ -449,8 +452,6 @@ void gpmi_nfc_apply_timings(struct gpmi_nand_data *this) > > > > > void __iomem *gpmi_regs = r->gpmi_regs; > > > > > unsigned int dll_wait_time_us; > > > > > > > > > > - clk_set_rate(r->clock[0], hw->clk_rate); > > > > > - > > > > > writel(hw->timing0, gpmi_regs + HW_GPMI_TIMING0); > > > > > writel(hw->timing1, gpmi_regs + HW_GPMI_TIMING1); > > > > > > > > > > Right now I have this change applied and seems fine. That is the only > > > > > difference I get. Clock is apply a bit earlier that when is enabled > > > > > it. > > > > > > > > This is very interesting. So this would mean the issue you are > > > > experiencing comes from the clock driver which kind of returns too > > > > early from clk_set_rate()? Could you report this to the clk ML/NXP clk > > > > maintainers and keep us in copy? If it is as global as it sounds, we > > > > might not be the only ones affected. > > > > > > > > > > The imx28 is broken too, so it's a general problem. I need to trace it down > > > I have a reverting for lts but it\s not the way to go > > > > > > > For imx28 you ask to set the rate to 22Mhz but you don't care about the clock > > that you get back. You get back 12Mhz because the base clock is 24 Mhz and seems > > that it can not get the point. You need to check if the clock > > requested is in range or ask > > for set_rate_clk_min to avoid to have somenthing lower. Then for > > imx6ull because is sporadic > > I think that is more connected to the clk_set_rate and when you change > > the register. Can not be a > > setting time? > > So, if I understand correctly, we face two different problems: > - imx6*: seems like a clock issue regarding the clock settlement > - imx28: actual NAND driver issue (does not check the validity of the > new frequency). 
This should be handled properly in
> ->setup_interface().
>

Somenthing like this? Not compile/tested

diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
index 4d08e4ab5c1b..cc8146ab1b78 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
@@ -644,7 +644,7 @@ static int bch_set_geometry(struct gpmi_nand_data *this)
  *         RDN_DELAY = -----------------------     {3}
  *                               RP
  */
-static void gpmi_nfc_compute_timings(struct gpmi_nand_data *this,
+static int gpmi_nfc_compute_timings(struct gpmi_nand_data *this,
 				     const struct nand_sdr_timings *sdr)
 {
 	struct gpmi_nfc_hardware_timing *hw = &this->hw;
@@ -656,6 +656,7 @@ static void gpmi_nfc_compute_timings(struct gpmi_nand_data *this,
 	int sample_delay_ps, sample_delay_factor;
 	u16 busy_timeout_cycles;
 	u8 wrn_dly_sel;
+	long clk_rate;

 	if (sdr->tRC_min >= 30000) {
 		/* ONFI non-EDO modes [0-3] */
@@ -671,6 +672,10 @@ static void gpmi_nfc_compute_timings(struct gpmi_nand_data *this,
 		wrn_dly_sel = BV_GPMI_CTRL1_WRN_DLY_SEL_NO_DELAY;
 	}

+	clk_rate = clk_round_rate(r->clock[0], hw->clk_rate);
+	if (clk_rate < hw->clk_rate || clk_rate <= 0)
+		return -ENOTSUPP;
+
 	/* SDR core timings are given in picoseconds */
 	period_ps = div_u64((u64)NSEC_PER_SEC * 1000, hw->clk_rate);

@@ -746,6 +751,7 @@ static int gpmi_setup_interface(struct nand_chip *chip, int chipnr,
 {
 	struct gpmi_nand_data *this = nand_get_controller_data(chip);
 	const struct nand_sdr_timings *sdr;
+	int ret = 0;

 	/* Retrieve required NAND timings */
 	sdr = nand_get_sdr_timings(conf);
@@ -761,11 +767,11 @@ static int gpmi_setup_interface(struct nand_chip *chip, int chipnr,
 		return 0;

 	/* Do the actual derivation of the controller timings */
-	gpmi_nfc_compute_timings(this, sdr);
-
-	this->hw.must_apply_timings = true;
+	ret = gpmi_nfc_compute_timings(this, sdr);
+	if (!ret)
+		this->hw.must_apply_timings = true;

-	return 0;
+	return ret;
 }

 /* Clears a BCH interrupt. */

> Thanks,
> Miquèl
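Editor's sketch: the round-then-validate idea from the diff above, modelled in user space for the i.MX28 case described in the quoted text (22 MHz requested, 12 MHz obtained from a 24 MHz base clock). `mock_round_rate()` is a hypothetical stand-in for `clk_round_rate()` that assumes a plain integer divider which never exceeds the request; the real clock tree may round differently. `check_rate()` and `period_ps()` are likewise illustrative names.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for clk_round_rate(): integer divider from a fixed
 * parent, choosing the fastest rate that does not exceed the request.
 * Returns a negative value on invalid input.
 */
static long mock_round_rate(unsigned long parent_hz, unsigned long req_hz)
{
	unsigned long div;

	if (parent_hz == 0 || req_hz == 0)
		return -1;
	div = (parent_hz + req_hz - 1) / req_hz;	/* ceil(parent / req), >= 1 */
	return (long)(parent_hz / div);
}

/* Reject the request if the achievable rate is an error or below the request. */
static int check_rate(unsigned long parent_hz, unsigned long req_hz,
		      unsigned long *actual_hz)
{
	long rounded = mock_round_rate(parent_hz, req_hz);

	if (rounded <= 0)			/* test the error case first */
		return -1;
	if ((unsigned long)rounded < req_hz)
		return -1;
	*actual_hz = (unsigned long)rounded;
	return 0;
}

/* Clock period in picoseconds, as in the driver's period_ps computation. */
static uint64_t period_ps(unsigned long rate_hz)
{
	return UINT64_C(1000000000000) / rate_hz;
}
```

With these assumptions the 22 MHz request is rejected because only 12 MHz is achievable. The two rates give periods of 45454 ps and 83333 ps respectively, which is why the timing computation is sensitive to which rate it is fed.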
Hi On Fri, Oct 15, 2021 at 10:05 PM Michael Trimarchi <michael@amarulasolutions.com> wrote: > > Hi > > On Mon, Feb 01, 2021 at 04:14:33PM +0100, Miquel Raynal wrote: > > Hi Michael, > > > > Michael Nazzareno Trimarchi <michael@amarulasolutions.com> wrote on > > Mon, 1 Feb 2021 16:08:23 +0100: > > > > > Hi > > > > > > On Mon, Feb 1, 2021 at 3:32 PM Michael Nazzareno Trimarchi > > > <michael@amarulasolutions.com> wrote: > > > > > > > > Hi > > > > > > > > On Mon, Feb 1, 2021 at 3:13 PM Miquel Raynal <miquel.raynal@bootlin.com> wrote: > > > > > > > > > > Hi Michael, > > > > > > > > > > Michael Nazzareno Trimarchi <michael@amarulasolutions.com> wrote on > > > > > Sat, 30 Jan 2021 10:41:29 +0100: > > > > > > > > > > > Hi Miquel > > > > > > > > > > > > commit f8e6ad14388067f91b26d044185d95623fbc9535 > > > > > > Author: Michael Trimarchi <michael@amarulasolutions.com> > > > > > > Date: Fri Jan 29 08:46:53 2021 +0100 > > > > > > > > > > > > mtd: nand: Calculate the clock before enable it > > > > > > > > > > > > Signed-off-by: Michael Trimarchi <michael@amarulasolutions.com> > > > > > > Change-Id: I79b0da39de0a9b32ea0b002fa200d7f44d4f8ce7 > > > > > > > > > > > > diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c > > > > > > b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c > > > > > > index 322a008290e5..0bca52b3bc8f 100644 > > > > > > --- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c > > > > > > +++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c > > > > > > @@ -377,6 +377,7 @@ static void gpmi_nfc_compute_timings(struct > > > > > > gpmi_nand_data *this, > > > > > > const struct nand_sdr_timings *sdr) > > > > > > { > > > > > > struct gpmi_nfc_hardware_timing *hw = &this->hw; > > > > > > + struct resources *r = &this->resources; > > > > > > unsigned int dll_threshold_ps = this->devdata->max_chain_delay; > > > > > > unsigned int period_ps, reference_period_ps; > > > > > > unsigned int data_setup_cycles, data_hold_cycles, addr_setup_cycles; > > > > > > @@ -440,6 +441,8 @@ static void 
gpmi_nfc_compute_timings(struct > > > > > > gpmi_nand_data *this, > > > > > > hw->ctrl1n |= BF_GPMI_CTRL1_RDN_DELAY(sample_delay_factor) | > > > > > > BM_GPMI_CTRL1_DLL_ENABLE | > > > > > > (use_half_period ? BM_GPMI_CTRL1_HALF_PERIOD : 0); > > > > > > + > > > > > > + clk_set_rate(r->clock[0], hw->clk_rate); > > > > > > } > > > > > > > > > > > > void gpmi_nfc_apply_timings(struct gpmi_nand_data *this) > > > > > > @@ -449,8 +452,6 @@ void gpmi_nfc_apply_timings(struct gpmi_nand_data *this) > > > > > > void __iomem *gpmi_regs = r->gpmi_regs; > > > > > > unsigned int dll_wait_time_us; > > > > > > > > > > > > - clk_set_rate(r->clock[0], hw->clk_rate); > > > > > > - > > > > > > writel(hw->timing0, gpmi_regs + HW_GPMI_TIMING0); > > > > > > writel(hw->timing1, gpmi_regs + HW_GPMI_TIMING1); > > > > > > > > > > > > Right now I have this change applied and seems fine. That is the only > > > > > > difference I get. Clock is apply a bit earlier that when is enabled > > > > > > it. > > > > > > > > > > This is very interesting. So this would mean the issue you are > > > > > experiencing comes from the clock driver which kind of returns too > > > > > early from clk_set_rate()? Could you report this to the clk ML/NXP clk > > > > > maintainers and keep us in copy? If it is as global as it sounds, we > > > > > might not be the only ones affected. > > > > > > > > > > > > > The imx28 is broken too, so it's a general problem. I need to trace it down > > > > I have a reverting for lts but it\s not the way to go > > > > > > > > > > For imx28 you ask to set the rate to 22Mhz but you don't care about the clock > > > that you get back. You get back 12Mhz because the base clock is 24 Mhz and seems > > > that it can not get the point. You need to check if the clock > > > requested is in range or ask > > > for set_rate_clk_min to avoid to have somenthing lower. 
Then for > > > imx6ull because is sporadic > > > I think that is more connected to the clk_set_rate and when you change > > > the register. Can not be a > > > setting time? > > > > So, if I understand correctly, we face two different problems: > > - imx6*: seems like a clock issue regarding the clock settlement > > - imx28: actual NAND driver issue (does not check the validity of the > > new frequency). This should be handled properly in > > ->setup_interface(). > > > > Somenthing like this? Not compile/tested > > diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c > index 4d08e4ab5c1b..cc8146ab1b78 100644 > --- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c > +++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c > @@ -644,7 +644,7 @@ static int bch_set_geometry(struct gpmi_nand_data *this) > * RDN_DELAY = ----------------------- {3} > * RP > */ > -static void gpmi_nfc_compute_timings(struct gpmi_nand_data *this, > +static int gpmi_nfc_compute_timings(struct gpmi_nand_data *this, > const struct nand_sdr_timings *sdr) > { > struct gpmi_nfc_hardware_timing *hw = &this->hw; > @@ -656,6 +656,7 @@ static void gpmi_nfc_compute_timings(struct gpmi_nand_data *this, > int sample_delay_ps, sample_delay_factor; > u16 busy_timeout_cycles; > u8 wrn_dly_sel; > + long clk_rate; > > if (sdr->tRC_min >= 30000) { > /* ONFI non-EDO modes [0-3] */ > @@ -671,6 +672,10 @@ static void gpmi_nfc_compute_timings(struct gpmi_nand_data *this, > wrn_dly_sel = BV_GPMI_CTRL1_WRN_DLY_SEL_NO_DELAY; > } > > + clk_rate = clk_round_rate(r->clock[0], hw->clk_rate); > + if (clk_rate < hw->clk_rate || clk_rate <= 0) > + return -ENOTSUPP; > + > /* SDR core timings are given in picoseconds */ > period_ps = div_u64((u64)NSEC_PER_SEC * 1000, hw->clk_rate); Not sure here or: period_ps = div_u64((u64)NSEC_PER_SEC * 1000, clk_rate); Michael > > @@ -746,6 +751,7 @@ static int gpmi_setup_interface(struct nand_chip *chip, int chipnr, > { > struct gpmi_nand_data *this = 
nand_get_controller_data(chip); > const struct nand_sdr_timings *sdr; > + int ret = 0; > > /* Retrieve required NAND timings */ > sdr = nand_get_sdr_timings(conf); > @@ -761,11 +767,11 @@ static int gpmi_setup_interface(struct nand_chip *chip, int chipnr, > return 0; > > /* Do the actual derivation of the controller timings */ > - gpmi_nfc_compute_timings(this, sdr); > - > - this->hw.must_apply_timings = true; > + ret = gpmi_nfc_compute_timings(this, sdr); > + if (!ret) > + this->hw.must_apply_timings = true; > > - return 0; > + return ret; > } > > /* Clears a BCH interrupt. */ > > Thanks, > > Miquèl
Hi Michael, michael@amarulasolutions.com wrote on Fri, 15 Oct 2021 22:05:41 +0200: > Hi > > On Mon, Feb 01, 2021 at 04:14:33PM +0100, Miquel Raynal wrote: > > Hi Michael, > > > > Michael Nazzareno Trimarchi <michael@amarulasolutions.com> wrote on > > Mon, 1 Feb 2021 16:08:23 +0100: > > > > > Hi > > > > > > On Mon, Feb 1, 2021 at 3:32 PM Michael Nazzareno Trimarchi > > > <michael@amarulasolutions.com> wrote: > > > > > > > > Hi > > > > > > > > On Mon, Feb 1, 2021 at 3:13 PM Miquel Raynal <miquel.raynal@bootlin.com> wrote: > > > > > > > > > > Hi Michael, > > > > > > > > > > Michael Nazzareno Trimarchi <michael@amarulasolutions.com> wrote on > > > > > Sat, 30 Jan 2021 10:41:29 +0100: > > > > > > > > > > > Hi Miquel > > > > > > > > > > > > commit f8e6ad14388067f91b26d044185d95623fbc9535 > > > > > > Author: Michael Trimarchi <michael@amarulasolutions.com> > > > > > > Date: Fri Jan 29 08:46:53 2021 +0100 > > > > > > > > > > > > mtd: nand: Calculate the clock before enable it > > > > > > > > > > > > Signed-off-by: Michael Trimarchi <michael@amarulasolutions.com> > > > > > > Change-Id: I79b0da39de0a9b32ea0b002fa200d7f44d4f8ce7 > > > > > > > > > > > > diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c > > > > > > b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c > > > > > > index 322a008290e5..0bca52b3bc8f 100644 > > > > > > --- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c > > > > > > +++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c > > > > > > @@ -377,6 +377,7 @@ static void gpmi_nfc_compute_timings(struct > > > > > > gpmi_nand_data *this, > > > > > > const struct nand_sdr_timings *sdr) > > > > > > { > > > > > > struct gpmi_nfc_hardware_timing *hw = &this->hw; > > > > > > + struct resources *r = &this->resources; > > > > > > unsigned int dll_threshold_ps = this->devdata->max_chain_delay; > > > > > > unsigned int period_ps, reference_period_ps; > > > > > > unsigned int data_setup_cycles, data_hold_cycles, addr_setup_cycles; > > > > > > @@ -440,6 +441,8 @@ static void 
gpmi_nfc_compute_timings(struct > > > > > > gpmi_nand_data *this, > > > > > > hw->ctrl1n |= BF_GPMI_CTRL1_RDN_DELAY(sample_delay_factor) | > > > > > > BM_GPMI_CTRL1_DLL_ENABLE | > > > > > > (use_half_period ? BM_GPMI_CTRL1_HALF_PERIOD : 0); > > > > > > + > > > > > > + clk_set_rate(r->clock[0], hw->clk_rate); > > > > > > } > > > > > > > > > > > > void gpmi_nfc_apply_timings(struct gpmi_nand_data *this) > > > > > > @@ -449,8 +452,6 @@ void gpmi_nfc_apply_timings(struct gpmi_nand_data *this) > > > > > > void __iomem *gpmi_regs = r->gpmi_regs; > > > > > > unsigned int dll_wait_time_us; > > > > > > > > > > > > - clk_set_rate(r->clock[0], hw->clk_rate); > > > > > > - > > > > > > writel(hw->timing0, gpmi_regs + HW_GPMI_TIMING0); > > > > > > writel(hw->timing1, gpmi_regs + HW_GPMI_TIMING1); > > > > > > > > > > > > Right now I have this change applied and seems fine. That is the only > > > > > > difference I get. Clock is apply a bit earlier that when is enabled > > > > > > it. > > > > > > > > > > This is very interesting. So this would mean the issue you are > > > > > experiencing comes from the clock driver which kind of returns too > > > > > early from clk_set_rate()? Could you report this to the clk ML/NXP clk > > > > > maintainers and keep us in copy? If it is as global as it sounds, we > > > > > might not be the only ones affected. > > > > > > > > > > > > > The imx28 is broken too, so it's a general problem. I need to trace it down > > > > I have a reverting for lts but it\s not the way to go > > > > > > > > > > For imx28 you ask to set the rate to 22Mhz but you don't care about the clock > > > that you get back. You get back 12Mhz because the base clock is 24 Mhz and seems > > > that it can not get the point. You need to check if the clock > > > requested is in range or ask > > > for set_rate_clk_min to avoid to have somenthing lower. 
Then for > > > imx6ull because is sporadic > > > I think that is more connected to the clk_set_rate and when you change > > > the register. Can not be a > > > setting time? > > > > So, if I understand correctly, we face two different problems: > > - imx6*: seems like a clock issue regarding the clock settlement > > - imx28: actual NAND driver issue (does not check the validity of the > > new frequency). This should be handled properly in > > ->setup_interface(). > > > > Somenthing like this? Not compile/tested > > diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c > index 4d08e4ab5c1b..cc8146ab1b78 100644 > --- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c > +++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c > @@ -644,7 +644,7 @@ static int bch_set_geometry(struct gpmi_nand_data *this) > * RDN_DELAY = ----------------------- {3} > * RP > */ > -static void gpmi_nfc_compute_timings(struct gpmi_nand_data *this, > +static int gpmi_nfc_compute_timings(struct gpmi_nand_data *this, > const struct nand_sdr_timings *sdr) > { > struct gpmi_nfc_hardware_timing *hw = &this->hw; > @@ -656,6 +656,7 @@ static void gpmi_nfc_compute_timings(struct gpmi_nand_data *this, > int sample_delay_ps, sample_delay_factor; > u16 busy_timeout_cycles; > u8 wrn_dly_sel; > + long clk_rate; > > if (sdr->tRC_min >= 30000) { > /* ONFI non-EDO modes [0-3] */ > @@ -671,6 +672,10 @@ static void gpmi_nfc_compute_timings(struct gpmi_nand_data *this, > wrn_dly_sel = BV_GPMI_CTRL1_WRN_DLY_SEL_NO_DELAY; > } > > + clk_rate = clk_round_rate(r->clock[0], hw->clk_rate); > + if (clk_rate < hw->clk_rate || clk_rate <= 0) > + return -ENOTSUPP; I believe clk_rate < hw->clk_rate will always match cases where clk_rate <= 0 ? The check looks very strict though. Will it even pass on i.MX6? Perhaps we could verify something like a 10% error which might grab all the erroneous situations? 
if (abs(clk_rate - hw->clk_rate) > (hw->clk_rate / 10)) return -ENOTSUPP; > + > /* SDR core timings are given in picoseconds */ > period_ps = div_u64((u64)NSEC_PER_SEC * 1000, hw->clk_rate); > > @@ -746,6 +751,7 @@ static int gpmi_setup_interface(struct nand_chip *chip, int chipnr, > { > struct gpmi_nand_data *this = nand_get_controller_data(chip); > const struct nand_sdr_timings *sdr; > + int ret = 0; > > /* Retrieve required NAND timings */ > sdr = nand_get_sdr_timings(conf); > @@ -761,11 +767,11 @@ static int gpmi_setup_interface(struct nand_chip *chip, int chipnr, > return 0; > > /* Do the actual derivation of the controller timings */ > - gpmi_nfc_compute_timings(this, sdr); > - > - this->hw.must_apply_timings = true; > + ret = gpmi_nfc_compute_timings(this, sdr); > + if (!ret) > + this->hw.must_apply_timings = true; > > - return 0; > + return ret; > } > > /* Clears a BCH interrupt. */ > > Thanks, > > Miquèl Otherwise looks good, thanks! Miquèl
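The two checks discussed above (strict "at least the requested rate" versus a percentage tolerance) can be compared in a minimal userspace sketch. This is not driver code. The helper names `rate_at_least` and `rate_within_tolerance` are hypothetical, `rounded` stands in for the `long` returned by `clk_round_rate()`, and `requested` stands in for `hw->clk_rate`, which is assumed here to be an `unsigned long`. Under that assumption the error case is tested first, because a negative `long` compared against an `unsigned long` is converted to a very large unsigned value.

```c
#include <stdbool.h>

/* Hypothetical helper: strict variant, accept only rates >= the request. */
static bool rate_at_least(long rounded, unsigned long requested)
{
	/*
	 * A value <= 0 means the rounding failed. Test it before the
	 * comparison: converted to unsigned long, a negative value
	 * would compare as a huge rate and slip through.
	 */
	if (rounded <= 0)
		return false;
	return (unsigned long)rounded >= requested;
}

/* Hypothetical helper: relaxed variant, accept rates within pct percent. */
static bool rate_within_tolerance(long rounded, unsigned long requested,
				  unsigned int pct)
{
	unsigned long actual, diff;

	if (rounded <= 0)
		return false;
	actual = (unsigned long)rounded;
	diff = actual > requested ? actual - requested : requested - actual;
	/* Compare in 64 bits so diff * 100 cannot overflow on 32-bit targets. */
	return (unsigned long long)diff * 100 <=
	       (unsigned long long)requested * pct;
}
```

With the figures quoted in this thread, a 22 MHz request that comes back as 12 MHz fails both variants, while a 100 MHz request that comes back as 93 MHz fails the strict variant but passes a 10% tolerance.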
Hi On Mon, Oct 18, 2021 at 9:19 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote: > > Hi Michael, > > michael@amarulasolutions.com wrote on Fri, 15 Oct 2021 22:05:41 +0200: > > > Hi > > > > On Mon, Feb 01, 2021 at 04:14:33PM +0100, Miquel Raynal wrote: > > > Hi Michael, > > > > > > Michael Nazzareno Trimarchi <michael@amarulasolutions.com> wrote on > > > Mon, 1 Feb 2021 16:08:23 +0100: > > > > > > > Hi > > > > > > > > On Mon, Feb 1, 2021 at 3:32 PM Michael Nazzareno Trimarchi > > > > <michael@amarulasolutions.com> wrote: > > > > > > > > > > Hi > > > > > > > > > > On Mon, Feb 1, 2021 at 3:13 PM Miquel Raynal <miquel.raynal@bootlin.com> wrote: > > > > > > > > > > > > Hi Michael, > > > > > > > > > > > > Michael Nazzareno Trimarchi <michael@amarulasolutions.com> wrote on > > > > > > Sat, 30 Jan 2021 10:41:29 +0100: > > > > > > > > > > > > > Hi Miquel > > > > > > > > > > > > > > commit f8e6ad14388067f91b26d044185d95623fbc9535 > > > > > > > Author: Michael Trimarchi <michael@amarulasolutions.com> > > > > > > > Date: Fri Jan 29 08:46:53 2021 +0100 > > > > > > > > > > > > > > mtd: nand: Calculate the clock before enable it > > > > > > > > > > > > > > Signed-off-by: Michael Trimarchi <michael@amarulasolutions.com> > > > > > > > Change-Id: I79b0da39de0a9b32ea0b002fa200d7f44d4f8ce7 > > > > > > > > > > > > > > diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c > > > > > > > b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c > > > > > > > index 322a008290e5..0bca52b3bc8f 100644 > > > > > > > --- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c > > > > > > > +++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c > > > > > > > @@ -377,6 +377,7 @@ static void gpmi_nfc_compute_timings(struct > > > > > > > gpmi_nand_data *this, > > > > > > > const struct nand_sdr_timings *sdr) > > > > > > > { > > > > > > > struct gpmi_nfc_hardware_timing *hw = &this->hw; > > > > > > > + struct resources *r = &this->resources; > > > > > > > unsigned int dll_threshold_ps = this->devdata->max_chain_delay; > > > > > 
> > unsigned int period_ps, reference_period_ps; > > > > > > > unsigned int data_setup_cycles, data_hold_cycles, addr_setup_cycles; > > > > > > > @@ -440,6 +441,8 @@ static void gpmi_nfc_compute_timings(struct > > > > > > > gpmi_nand_data *this, > > > > > > > hw->ctrl1n |= BF_GPMI_CTRL1_RDN_DELAY(sample_delay_factor) | > > > > > > > BM_GPMI_CTRL1_DLL_ENABLE | > > > > > > > (use_half_period ? BM_GPMI_CTRL1_HALF_PERIOD : 0); > > > > > > > + > > > > > > > + clk_set_rate(r->clock[0], hw->clk_rate); > > > > > > > } > > > > > > > > > > > > > > void gpmi_nfc_apply_timings(struct gpmi_nand_data *this) > > > > > > > @@ -449,8 +452,6 @@ void gpmi_nfc_apply_timings(struct gpmi_nand_data *this) > > > > > > > void __iomem *gpmi_regs = r->gpmi_regs; > > > > > > > unsigned int dll_wait_time_us; > > > > > > > > > > > > > > - clk_set_rate(r->clock[0], hw->clk_rate); > > > > > > > - > > > > > > > writel(hw->timing0, gpmi_regs + HW_GPMI_TIMING0); > > > > > > > writel(hw->timing1, gpmi_regs + HW_GPMI_TIMING1); > > > > > > > > > > > > > > Right now I have this change applied and seems fine. That is the only > > > > > > > difference I get. Clock is apply a bit earlier that when is enabled > > > > > > > it. > > > > > > > > > > > > This is very interesting. So this would mean the issue you are > > > > > > experiencing comes from the clock driver which kind of returns too > > > > > > early from clk_set_rate()? Could you report this to the clk ML/NXP clk > > > > > > maintainers and keep us in copy? If it is as global as it sounds, we > > > > > > might not be the only ones affected. > > > > > > > > > > > > > > > > The imx28 is broken too, so it's a general problem. I need to trace it down > > > > > I have a reverting for lts but it\s not the way to go > > > > > > > > > > > > > For imx28 you ask to set the rate to 22Mhz but you don't care about the clock > > > > that you get back. You get back 12Mhz because the base clock is 24 Mhz and seems > > > > that it can not get the point. 
You need to check if the clock > > > > requested is in range or ask > > > > for set_rate_clk_min to avoid to have somenthing lower. Then for > > > > imx6ull because is sporadic > > > > I think that is more connected to the clk_set_rate and when you change > > > > the register. Can not be a > > > > setting time? > > > > > > So, if I understand correctly, we face two different problems: > > > - imx6*: seems like a clock issue regarding the clock settlement > > > - imx28: actual NAND driver issue (does not check the validity of the > > > new frequency). This should be handled properly in > > > ->setup_interface(). > > > > > > > Somenthing like this? Not compile/tested > > > > diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c > > index 4d08e4ab5c1b..cc8146ab1b78 100644 > > --- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c > > +++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c > > @@ -644,7 +644,7 @@ static int bch_set_geometry(struct gpmi_nand_data *this) > > * RDN_DELAY = ----------------------- {3} > > * RP > > */ > > -static void gpmi_nfc_compute_timings(struct gpmi_nand_data *this, > > +static int gpmi_nfc_compute_timings(struct gpmi_nand_data *this, > > const struct nand_sdr_timings *sdr) > > { > > struct gpmi_nfc_hardware_timing *hw = &this->hw; > > @@ -656,6 +656,7 @@ static void gpmi_nfc_compute_timings(struct gpmi_nand_data *this, > > int sample_delay_ps, sample_delay_factor; > > u16 busy_timeout_cycles; > > u8 wrn_dly_sel; > > + long clk_rate; > > > > if (sdr->tRC_min >= 30000) { > > /* ONFI non-EDO modes [0-3] */ > > @@ -671,6 +672,10 @@ static void gpmi_nfc_compute_timings(struct gpmi_nand_data *this, > > wrn_dly_sel = BV_GPMI_CTRL1_WRN_DLY_SEL_NO_DELAY; > > } > > > > + clk_rate = clk_round_rate(r->clock[0], hw->clk_rate); > > + if (clk_rate < hw->clk_rate || clk_rate <= 0) > > + return -ENOTSUPP; > > I believe clk_rate < hw->clk_rate will always match cases where > clk_rate <= 0 ? 
> > The check looks very strict though. Will it even pass on i.MX6? Perhaps > we could verify something like a 10% error which might grab all the > erroneous situations? According to what I read the clk is the min that we can accept. So any clock from EDO4 to EDO5 should be ok. My concern is that calculation. I need to read it properly. I don't think that put 10% or any will help us, until we now that is possible or not. I will even anyway put a warning Michael > > if (abs(clk_rate - hw->clk_rate) > (hw->clk_rate / 10)) > return -ENOTSUPP; > > > + > > /* SDR core timings are given in picoseconds */ > > period_ps = div_u64((u64)NSEC_PER_SEC * 1000, hw->clk_rate); > > > > @@ -746,6 +751,7 @@ static int gpmi_setup_interface(struct nand_chip *chip, int chipnr, > > { > > struct gpmi_nand_data *this = nand_get_controller_data(chip); > > const struct nand_sdr_timings *sdr; > > + int ret = 0; > > > > /* Retrieve required NAND timings */ > > sdr = nand_get_sdr_timings(conf); > > @@ -761,11 +767,11 @@ static int gpmi_setup_interface(struct nand_chip *chip, int chipnr, > > return 0; > > > > /* Do the actual derivation of the controller timings */ > > - gpmi_nfc_compute_timings(this, sdr); > > - > > - this->hw.must_apply_timings = true; > > + ret = gpmi_nfc_compute_timings(this, sdr); > > + if (!ret) > > + this->hw.must_apply_timings = true; > > > > - return 0; > > + return ret; > > } > > > > /* Clears a BCH interrupt. */ > > > Thanks, > > > Miquèl > > Otherwise looks good, thanks! > > Miquèl -- Michael Nazzareno Trimarchi Co-Founder & Chief Executive Officer M. +39 347 913 2170 michael@amarulasolutions.com
Hi Michael, > > > > > > > > commit f8e6ad14388067f91b26d044185d95623fbc9535 > > > > > > > > Author: Michael Trimarchi <michael@amarulasolutions.com> > > > > > > > > Date: Fri Jan 29 08:46:53 2021 +0100 > > > > > > > > > > > > > > > > mtd: nand: Calculate the clock before enable it > > > > > > > > > > > > > > > > Signed-off-by: Michael Trimarchi <michael@amarulasolutions.com> > > > > > > > > Change-Id: I79b0da39de0a9b32ea0b002fa200d7f44d4f8ce7 > > > > > > > > > > > > > > > > diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c > > > > > > > > b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c > > > > > > > > index 322a008290e5..0bca52b3bc8f 100644 > > > > > > > > --- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c > > > > > > > > +++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c > > > > > > > > @@ -377,6 +377,7 @@ static void gpmi_nfc_compute_timings(struct > > > > > > > > gpmi_nand_data *this, > > > > > > > > const struct nand_sdr_timings *sdr) > > > > > > > > { > > > > > > > > struct gpmi_nfc_hardware_timing *hw = &this->hw; > > > > > > > > + struct resources *r = &this->resources; > > > > > > > > unsigned int dll_threshold_ps = this->devdata->max_chain_delay; > > > > > > > > unsigned int period_ps, reference_period_ps; > > > > > > > > unsigned int data_setup_cycles, data_hold_cycles, addr_setup_cycles; > > > > > > > > @@ -440,6 +441,8 @@ static void gpmi_nfc_compute_timings(struct > > > > > > > > gpmi_nand_data *this, > > > > > > > > hw->ctrl1n |= BF_GPMI_CTRL1_RDN_DELAY(sample_delay_factor) | > > > > > > > > BM_GPMI_CTRL1_DLL_ENABLE | > > > > > > > > (use_half_period ? 
BM_GPMI_CTRL1_HALF_PERIOD : 0); > > > > > > > > + > > > > > > > > + clk_set_rate(r->clock[0], hw->clk_rate); > > > > > > > > } > > > > > > > > > > > > > > > > void gpmi_nfc_apply_timings(struct gpmi_nand_data *this) > > > > > > > > @@ -449,8 +452,6 @@ void gpmi_nfc_apply_timings(struct gpmi_nand_data *this) > > > > > > > > void __iomem *gpmi_regs = r->gpmi_regs; > > > > > > > > unsigned int dll_wait_time_us; > > > > > > > > > > > > > > > > - clk_set_rate(r->clock[0], hw->clk_rate); > > > > > > > > - > > > > > > > > writel(hw->timing0, gpmi_regs + HW_GPMI_TIMING0); > > > > > > > > writel(hw->timing1, gpmi_regs + HW_GPMI_TIMING1); > > > > > > > > > > > > > > > > Right now I have this change applied and seems fine. That is the only > > > > > > > > difference I get. Clock is apply a bit earlier that when is enabled > > > > > > > > it. > > > > > > > > > > > > > > This is very interesting. So this would mean the issue you are > > > > > > > experiencing comes from the clock driver which kind of returns too > > > > > > > early from clk_set_rate()? Could you report this to the clk ML/NXP clk > > > > > > > maintainers and keep us in copy? If it is as global as it sounds, we > > > > > > > might not be the only ones affected. > > > > > > > > > > > > > > > > > > > The imx28 is broken too, so it's a general problem. I need to trace it down > > > > > > I have a reverting for lts but it\s not the way to go > > > > > > > > > > > > > > > > For imx28 you ask to set the rate to 22Mhz but you don't care about the clock > > > > > that you get back. You get back 12Mhz because the base clock is 24 Mhz and seems > > > > > that it can not get the point. You need to check if the clock > > > > > requested is in range or ask > > > > > for set_rate_clk_min to avoid to have somenthing lower. Then for > > > > > imx6ull because is sporadic > > > > > I think that is more connected to the clk_set_rate and when you change > > > > > the register. Can not be a > > > > > setting time? 
> > > > > > > > So, if I understand correctly, we face two different problems: > > > > - imx6*: seems like a clock issue regarding the clock settlement > > > > - imx28: actual NAND driver issue (does not check the validity of the > > > > new frequency). This should be handled properly in > > > > ->setup_interface(). > > > > > > > > > > Somenthing like this? Not compile/tested > > > > > > diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c > > > index 4d08e4ab5c1b..cc8146ab1b78 100644 > > > --- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c > > > +++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c > > > @@ -644,7 +644,7 @@ static int bch_set_geometry(struct gpmi_nand_data *this) > > > * RDN_DELAY = ----------------------- {3} > > > * RP > > > */ > > > -static void gpmi_nfc_compute_timings(struct gpmi_nand_data *this, > > > +static int gpmi_nfc_compute_timings(struct gpmi_nand_data *this, > > > const struct nand_sdr_timings *sdr) > > > { > > > struct gpmi_nfc_hardware_timing *hw = &this->hw; > > > @@ -656,6 +656,7 @@ static void gpmi_nfc_compute_timings(struct gpmi_nand_data *this, > > > int sample_delay_ps, sample_delay_factor; > > > u16 busy_timeout_cycles; > > > u8 wrn_dly_sel; > > > + long clk_rate; > > > > > > if (sdr->tRC_min >= 30000) { > > > /* ONFI non-EDO modes [0-3] */ > > > @@ -671,6 +672,10 @@ static void gpmi_nfc_compute_timings(struct gpmi_nand_data *this, > > > wrn_dly_sel = BV_GPMI_CTRL1_WRN_DLY_SEL_NO_DELAY; > > > } > > > > > > + clk_rate = clk_round_rate(r->clock[0], hw->clk_rate); > > > + if (clk_rate < hw->clk_rate || clk_rate <= 0) > > > + return -ENOTSUPP; > > > > I believe clk_rate < hw->clk_rate will always match cases where > > clk_rate <= 0 ? > > > > The check looks very strict though. Will it even pass on i.MX6? Perhaps > > we could verify something like a 10% error which might grab all the > > erroneous situations? > > According to what I read the clk is the min that we can accept. 
So any clock > from EDO4 to EDO5 should be ok. My concern is that calculation. I need to read > it properly. I don't think that put 10% or any will help us, until we > now that is possible or not. 10% was just an example. What I mean is that it is very common to request a clock to run at 100MHz and to read it back at e.g. 93MHz. Your check won't pass in this case, and we cannot get the necessary test coverage to ensure that we won't break working boards. The thing is, if the calculations are made using hw->clk_rate we will always get the register values wrong anyway, and in this case it's true that only experience will tell us if such a clock works or not. However, if the calculations are made with clk_rate instead, it is likely that we will get more accurate timings, which will very likely work. So perhaps the right solution would be to use the real clock rate rather than refusing clock rates which do not match our strict expectations? > > > > if (abs(clk_rate - hw->clk_rate) > (hw->clk_rate / 10)) > > return -ENOTSUPP; > > > > > + > > > /* SDR core timings are given in picoseconds */ > > > period_ps = div_u64((u64)NSEC_PER_SEC * 1000, hw->clk_rate); > > > > > > @@ -746,6 +751,7 @@ static int gpmi_setup_interface(struct nand_chip *chip, int chipnr, > > > { > > > struct gpmi_nand_data *this = nand_get_controller_data(chip); > > > const struct nand_sdr_timings *sdr; > > > + int ret = 0; > > > > > > /* Retrieve required NAND timings */ > > > sdr = nand_get_sdr_timings(conf); > > > @@ -761,11 +767,11 @@ static int gpmi_setup_interface(struct nand_chip *chip, int chipnr, > > > return 0; > > > > > > /* Do the actual derivation of the controller timings */ > > > - gpmi_nfc_compute_timings(this, sdr); > > > - > > > - this->hw.must_apply_timings = true; > > > + ret = gpmi_nfc_compute_timings(this, sdr); > > > + if (!ret) > > > + this->hw.must_apply_timings = true; > > > > > > - return 0; > > > + return ret; > > > } > > > > > > /* Clears a BCH interrupt.
*/ > > > > Thanks, > > > > Miquèl > > > > Otherwise looks good, thanks! > > > > Miquèl > Thanks, Miquèl
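The suggestion to base the timing calculation on the real clock rate can be illustrated with a small userspace sketch. It only mirrors the `period_ps` division from the quoted patch. The helper name `period_ps_from_rounded` and the `PSEC_PER_SEC` macro are hypothetical, and the argument is assumed to be the `long` that `clk_round_rate()` would return. This is not the change that was eventually applied to the driver, only a way to see how much the period moves when the clock comes back slower than requested.

```c
#include <stdint.h>

/* Picoseconds per second, i.e. NSEC_PER_SEC * 1000 in the quoted patch. */
#define PSEC_PER_SEC 1000000000000ULL

/*
 * Hypothetical helper: clock period in picoseconds derived from the
 * rate the clock framework reports it can actually provide, rather
 * than from the rate that was requested.
 */
static uint64_t period_ps_from_rounded(long rounded_hz)
{
	/* A value <= 0 signals a rounding failure; the caller should bail out. */
	if (rounded_hz <= 0)
		return 0;
	return PSEC_PER_SEC / (uint64_t)rounded_hz;
}
```

For the example above, a 100 MHz clock gives a period of 10000 ps while a 93 MHz clock gives 10752 ps, so every cycle count derived from the requested rate would be based on a period that is about 7% too short.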
--- a/linux/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
+++ b/linux/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
@@ -481,6 +481,7 @@ static void gpmi_nfc_compute_timings(struct gpmi_nand_data *this,
 void gpmi_nfc_apply_timings(struct gpmi_nand_data *this)
 {
+#if 0
 	struct gpmi_nfc_hardware_timing *hw = &this->hw;
 	struct resources *r = &this->resources;
 	void __iomem *gpmi_regs = r->gpmi_regs;
@@ -505,6 +512,7 @@ void gpmi_nfc_apply_timings(struct gpmi_nand_data *this)
 	/* Wait for the DLL to settle. */
 	udelay(dll_wait_time_us);
+#endif
 }
 int gpmi_setup_data_interface(struct nand_chip *chip, int chipnr,