diff mbox

[U-Boot] am33xx : bit-flip correction in oob

Message ID CAAfyv37_3P0StjdK1jHRCR-Xuv3oQC2iJ25HHSDbgdHx_+VVxA@mail.gmail.com
State Not Applicable
Delegated to: Tom Rini
Headers show

Commit Message

Belisko Marek April 25, 2014, 7:55 a.m. UTC
Hi Pekon,

On Thu, Apr 24, 2014 at 1:53 PM, Gupta, Pekon <pekon@ti.com> wrote:
> Hi Marek,
>
>>From: Belisko Marek [mailto:marek.belisko@gmail.com]
>>On Wed, Apr 23, 2014 at 8:04 PM, Gupta, Pekon <pekon@ti.com> wrote:
>>>>From: Belisko Marek [mailto:marek.belisko@gmail.com]
>>>>
>>>>CC-ing Pekon Gupta which add those changes in commit:
>>>>6e562b1106ea6afc78752f50925e87e9dd14f8b4
>>>>
>>>>On Tue, Apr 15, 2014 at 12:47 PM, Belisko Marek <marek.belisko@gmail.com> wrote:
>>>>> Hi,
>>>>>
>>>>> we're running 2014.04-rc3 on custom am335x board (same configuration as BBB).
>>>>> When spl is loading u-boot from nand flash we can see a lot of
>>>>> messages in console:
>>>>> nand: bit-flip corrected @oob=0
>>>>>
>>>>> It is always the same position (seems to be first byte in oob).
>>>>> Anybody experienced same problem?
>>>>> I've tested on 2 different flashes and result is same.
>>>>>
>>>>I was able to track down that read ecc from gpmc return always first
>>>>byte as 0xFF (in omap_calculate_ecc())
>>>
>>> This shouldn't be the case if the page is programmed.
>>> Following is the expected ECC layout of BCH8
>>Thanks for reply and sorry for wrong explanation. I mean read_ecc[0]
>>byte = 0xff.
>>What I found also is that data in read_ecc was completely different
>>from displayed when issue nand dump.
>>I found patch which fixes issue [1]. The u-boot doesn't anymore print
>>bit-flips in oob. I need to fix
>>also spl nand_read_page as when loading u-boot from nand in spl still
>>report issues.
>>
>>One more question. Shouldn;t code for bitflip in oob looks like this:
>>--- a/drivers/mtd/nand/omap_gpmc.c
>>+++ b/drivers/mtd/nand/omap_gpmc.c
>>@@ -403,7 +403,7 @@ static int omap_correct_data_bch(struct mtd_info
>>*mtd, uint8_t *dat,
>>                        dat[byte_pos] ^= 1 << bit_pos;
>>                        printf("nand: bit-flip corrected @data=%d\n", byte_pos);
>>                } else if (byte_pos < error_max) {
>>-                       read_ecc[byte_pos - SECTOR_BYTES] = 1 << bit_pos;
>>+                       read_ecc[byte_pos - SECTOR_BYTES] ^= 1 << bit_pos;
>>                        debug("nand: bit-flip corrected @oob=%d\n", byte_pos -
>>                                                                SECTOR_BYTES);
>>                } else {
>>
> Yes absolutely. Please send a patch for this.
> I think this got missed because read_ecc[] does not reach application layer.
> Above layers are only interested in data portion, so even if the read_ecc[]
> is uncorrected it doesn't matter to them. But yes this should be fixed.
> Thanks for catching this.
OK I'll do.
>
>
>>
>>[1] - http://e2e.ti.com/support/arm/sitara_arm/f/791/t/259699.aspx
>>
> Ahh.. Yes, I know of this issue..
> " The issue is mainly due to a NAND protocol violation in the omap driver since the Random Data Output command (05h-E0h) expects to see only the column address that should be addressed within the already loaded read page into the read buffer. Only 2 address cycles with ALE active should be provided between the 05h and E0h commands. The Page read command expects the full address footprint (2bytes for column address + 3bytes for row address), but once the page is loaded into the read buffer, Random Data Output should be used with only 2bytes for column address."
>
> But the mentioned fix is not at right place. This issue is there in
> nand_base.c: nand_command_lp()  which is part of generic NAND
> framework. So I suggest to use following patch instead.
With original patch (or with your) it fix only u-boot but it doesn't
fix loading u-boot from SPL as it using custom nand_read_page (in
am335x_spl_bch.c)
and not from nand_bases there must be other update to fix this issue.
Probably something like:
> --------------
> diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
> index 5d3232c..80256b5 100644
> --- a/drivers/mtd/nand/nand_base.c
> +++ b/drivers/mtd/nand/nand_base.c
> @@ -674,7 +674,7 @@ static void nand_command_lp(struct mtd_info *mtd, unsigned int command,
>                         ctrl &= ~NAND_CTRL_CHANGE;
>                         chip->cmd_ctrl(mtd, column >> 8, ctrl);
>                 }
> -               if (page_addr != -1) {
> +               if (page_addr != -11 && command != NAND_CMD_RNDOUT) {
                                            ^^^small typo ;)
>                         chip->cmd_ctrl(mtd, page_addr, ctrl);
>                         chip->cmd_ctrl(mtd, page_addr >> 8,
>                                        NAND_NCE | NAND_ALE);
> --------------
>
> I think this is issue with Micron NAND also, because even the Micron
> datasheet says that NAND_CMD_RNDOUT should take only 2cycles
> of column address. But probably Micron devices don't crib on extra
> address cycles, they just ignore it.
I've tested it also on Micron flash and it doesn't report any
bit-flips (without patch).
>
>
> with regards, pekon

BR,

marek

Comments

pekon gupta April 25, 2014, 8:19 a.m. UTC | #1
Hi Marek,

>From: Belisko Marek [mailto:marek.belisko@gmail.com]
[...]
>With original patch (or with your) it fix only u-boot but it doesn't
>fix loading u-boot from SPL as it using custom nand_read_page (in
>am335x_spl_bch.c)
>and not from nand_bases there must be other update to fix this issue.
>Probably something like:
>--- a/drivers/mtd/nand/am335x_spl_bch.c
>+++ b/drivers/mtd/nand/am335x_spl_bch.c
>@@ -64,14 +64,16 @@ static int nand_command(int block, int page, uint32_t offs,
>        NAND_CTRL_ALE | NAND_CTRL_CHANGE); /* A[7:0] */
>  hwctrl(&mtd, (offs >> 8) & 0xff, NAND_CTRL_ALE); /* A[11:9] */
>  /* Row address */
>- hwctrl(&mtd, (page_addr & 0xff), NAND_CTRL_ALE); /* A[19:12] */
>- hwctrl(&mtd, ((page_addr >> 8) & 0xff),
>-       NAND_CTRL_ALE); /* A[27:20] */
>-#ifdef CONFIG_SYS_NAND_5_ADDR_CYCLE
>- /* One more address cycle for devices > 128MiB */
>- hwctrl(&mtd, (page_addr >> 16) & 0x0f,
>-       NAND_CTRL_ALE); /* A[31:28] */
>-#endif
>+ if (cmd != NAND_CMD_RNDOUT) {
>+ hwctrl(&mtd, (page_addr & 0xff), NAND_CTRL_ALE); /* A[19:12] */
>+ hwctrl(&mtd, ((page_addr >> 8) & 0xff), NAND_CTRL_ALE); /* A[27:20] */
>+
>+ #ifdef CONFIG_SYS_NAND_5_ADDR_CYCLE
>+ /* One more address cycle for devices > 128MiB */
>+ hwctrl(&mtd, (page_addr >> 16) & 0x0f, NAND_CTRL_ALE); /* A[31:28] */
>+ #endif
>+ }
>+
>  hwctrl(&mtd, NAND_CMD_NONE, NAND_NCE | NAND_CTRL_CHANGE);
>

Yes, SPL needs to be handled separately in both spl_simple.c and am335x_spl_bch.c.
Please submit a formal patch for this, so I can test it with multiple devices.

Thanks for fixing this..

with regards, pekon
diff mbox

Patch

--- a/drivers/mtd/nand/am335x_spl_bch.c
+++ b/drivers/mtd/nand/am335x_spl_bch.c
@@ -64,14 +64,16 @@  static int nand_command(int block, int page, uint32_t offs,
        NAND_CTRL_ALE | NAND_CTRL_CHANGE); /* A[7:0] */
  hwctrl(&mtd, (offs >> 8) & 0xff, NAND_CTRL_ALE); /* A[11:9] */
  /* Row address */
- hwctrl(&mtd, (page_addr & 0xff), NAND_CTRL_ALE); /* A[19:12] */
- hwctrl(&mtd, ((page_addr >> 8) & 0xff),
-       NAND_CTRL_ALE); /* A[27:20] */
-#ifdef CONFIG_SYS_NAND_5_ADDR_CYCLE
- /* One more address cycle for devices > 128MiB */
- hwctrl(&mtd, (page_addr >> 16) & 0x0f,
-       NAND_CTRL_ALE); /* A[31:28] */
-#endif
+ if (cmd != NAND_CMD_RNDOUT) {
+ hwctrl(&mtd, (page_addr & 0xff), NAND_CTRL_ALE); /* A[19:12] */
+ hwctrl(&mtd, ((page_addr >> 8) & 0xff), NAND_CTRL_ALE); /* A[27:20] */
+
+ #ifdef CONFIG_SYS_NAND_5_ADDR_CYCLE
+ /* One more address cycle for devices > 128MiB */
+ hwctrl(&mtd, (page_addr >> 16) & 0x0f, NAND_CTRL_ALE); /* A[31:28] */
+ #endif
+ }
+
  hwctrl(&mtd, NAND_CMD_NONE, NAND_NCE | NAND_CTRL_CHANGE);

>