Patchwork [RESEND] mtd: nand: nand_bbt: scan for next free bbt block if writing bbt fails

login
register
mail settings
Submitter Jeff Westfahl
Date May 16, 2013, 1:35 p.m.
Message ID <OF488B7050.CCC82AE1-ON86257B6D.004A7953-86257B6D.004AAE09@ni.com>
Download mbox | patch
Permalink /patch/244324/
State New
Headers show

Comments

Jeff Westfahl - May 16, 2013, 1:35 p.m.
If erasing or writing the BBT fails, we should mark the current BBT block
as bad and use the BBT descriptor to scan for the next available unused
block in the BBT. We should only return a failure if there isn't any space
left.

Signed-off-by: Jeff Westfahl <jeff.westfahl@ni.com>
---
 drivers/mtd/nand/nand_bbt.c |   21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

                 * Automatic placement of the bad block table. Search 
direction
@@ -831,14 +832,26 @@ static int write_bbt(struct mtd_info *mtd, uint8_t 
*buf,
                einfo.addr = to;
                einfo.len = 1 << this->bbt_erase_shift;
                res = nand_erase_nand(mtd, &einfo, 1);
-               if (res < 0)
-                       goto outerr;
+               if (res < 0) {
+                       /* This block is bad. Mark it as such and see if 
there's
+                          another available in the BBT area. */
+                       int block = page >>
+                               (this->bbt_erase_shift - 
this->page_shift);
+                       this->bbt[block >> 2] |= 0x01 << ((block & 0x03) 
<< 1);
+                       goto next;
+               }
 
                res = scan_write_bbt(mtd, to, len, buf,
                                td->options & NAND_BBT_NO_OOB ? NULL :
                                &buf[len]);
-               if (res < 0)
-                       goto outerr;
+               if (res < 0) {
+                       /* This block is bad. Mark it as such and see if 
there's
+                          another available in the BBT area. */
+                       int block = page >>
+                               (this->bbt_erase_shift - 
this->page_shift);
+                       this->bbt[block >> 2] |= 0x01 << ((block & 0x03) 
<< 1);
+                       goto next;
+               }
 
                pr_info("Bad block table written to 0x%012llx, version 
0x%02X\n",
                         (unsigned long long)to, td->version[chip]);
Brian Norris - July 22, 2013, 8:30 a.m.
Hi Jeff,

Sorry this has sat a while. I have a few comments now, and I expect to
take an even closer look in the near future.

On Thu, May 16, 2013 at 6:35 AM, Jeff Westfahl <jeff.westfahl@ni.com> wrote:
> If erasing or writing the BBT fails, we should mark the current BBT block
> as bad and use the BBT descriptor to scan for the next available unused
> block in the BBT. We should only return a failure if there isn't any space
> left.

This idea looks good. We can of course run into a worn-out BBT block,
especially since we're pretty dumb at write-erase cycles (every new
bad block causes a full erasure/reprogram). So we should mark failing
BBT blocks as bad and move to the next one.

A general comment: it looks like you take care of updating the BBT
with the new bad block you're marking, but we expect that any
newly-marked bad block should be marked in both the BBT and in the OOB
area of the block itself (just in case we ever have to fall back to
using the bad block markers in each block again). See
nand_default_block_markbad() in nand_base.c.

> Signed-off-by: Jeff Westfahl <jeff.westfahl@ni.com>
> ---
>  drivers/mtd/nand/nand_bbt.c |   21 +++++++++++++++++----
>  1 file changed, 17 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/mtd/nand/nand_bbt.c b/drivers/mtd/nand/nand_bbt.c
> index 2672643..900bda1 100644
> --- a/drivers/mtd/nand/nand_bbt.c
> +++ b/drivers/mtd/nand/nand_bbt.c
> @@ -697,6 +697,7 @@ static int write_bbt(struct mtd_info *mtd, uint8_t
> *buf,
>                         page = td->pages[chip];
>                         goto write;
>                 }
> +       next:
>
>                 /*
>                  * Automatic placement of the bad block table. Search
> direction
> @@ -831,14 +832,26 @@ static int write_bbt(struct mtd_info *mtd, uint8_t
> *buf,
>                 einfo.addr = to;
>                 einfo.len = 1 << this->bbt_erase_shift;
>                 res = nand_erase_nand(mtd, &einfo, 1);
> -               if (res < 0)
> -                       goto outerr;
> +               if (res < 0) {
> +                       /* This block is bad. Mark it as such and see if
> there's

You probably don't want to turn *all* errors into new bad blocks. You
probably should only mark a block bad for some combination of -EIO (a
failed IO operation) and/or einfo.state == MTD_ERASE_FAILED, and not
for things like a read-only MTD (?), an out-of-bounds flash address
(shouldn't happen at this point?), or anything else that theoretically
could happen.

> +                          another available in the BBT area. */

You also may want to take a glance at Documentation/CodingStyle and/or
run scripts/checkpatch.pl on your patch. The comment closure doesn't
follow the CodingStyle.

> +                       int block = page >>
> +                               (this->bbt_erase_shift -
> this->page_shift);
> +                       this->bbt[block >> 2] |= 0x01 << ((block & 0x03)
> << 1);

It's been a while since I really looked at the nand_bbt.c coding
details... This is kinda ugly! (Note that this not particularly a
criticism of this part of your patch; this line follows the (IMO bad)
style of the rest of nand_bbt.c). I mean, we could at least use a nice
macro or something... I may look into whether this file deserves some
new cleanup macros, but that would be independent of this patch.

> +                       goto next;
> +               }
>
>                 res = scan_write_bbt(mtd, to, len, buf,
>                                 td->options & NAND_BBT_NO_OOB ? NULL :
>                                 &buf[len]);
> -               if (res < 0)
> -                       goto outerr;
> +               if (res < 0) {
> +                       /* This block is bad. Mark it as such and see if
> there's
> +                          another available in the BBT area. */

Similar comment here: you don't want to turn *all* errors into new bad
blocks, only -EIO.

> +                       int block = page >>
> +                               (this->bbt_erase_shift -
> this->page_shift);
> +                       this->bbt[block >> 2] |= 0x01 << ((block & 0x03)
> << 1);
> +                       goto next;
> +               }
>
>                 pr_info("Bad block table written to 0x%012llx, version
> 0x%02X\n",
>                          (unsigned long long)to, td->version[chip]);

(Comment to myself mostly:) write_bbt() is a monstrous function; 203
lines before this patch! And that's after cheating with lines like
this:

                case 1: sft = 3; sftmsk = 0x07; msk[0] = 0x00; msk[1] = 0x01;

But in light of this observation, it may be time to consider some refactoring...

Brian

[NOTE: I just crafted this reply and realized I actually received 3
copies of this patch via the mailing list, and I'm replying to the
only one that is line-wrapped. Sorry.]

Patch

diff --git a/drivers/mtd/nand/nand_bbt.c b/drivers/mtd/nand/nand_bbt.c
index 2672643..900bda1 100644
--- a/drivers/mtd/nand/nand_bbt.c
+++ b/drivers/mtd/nand/nand_bbt.c
@@ -697,6 +697,7 @@  static int write_bbt(struct mtd_info *mtd, uint8_t 
*buf,
                        page = td->pages[chip];
                        goto write;
                }
+       next:
 
                /*