Patchwork mtd: nand: omap: save Bad-Block-Table (BBT) on device

login
register
mail settings
Submitter pekon gupta
Date July 23, 2014, 11:54 a.m.
Message ID <1406116461-27089-1-git-send-email-pekon@ti.com>
Download mbox | patch
Permalink /patch/372875/
State New
Headers show

Comments

pekon gupta - July 23, 2014, 11:54 a.m.
This patch makes OMAP NAND driver to
- save Bad-Block-Table (BBT) on NAND Flash device
- scan on device BBT during probe

Signed-off-by: Pekon Gupta <pekon@ti.com>
---
 drivers/mtd/nand/omap2.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Ezequiel García - July 23, 2014, 12:05 p.m.
Hi Pekon,

On 23 Jul 05:24 PM, Pekon Gupta wrote:
> This patch makes OMAP NAND driver to
> - save Bad-Block-Table (BBT) on NAND Flash device
> - scan on device BBT during probe
> 

Doesn't this break backward compatibility? Please correct me if I'm wrong
here, but if I apply this patch, and boot with my current NAND flash
contents, the kernel won't have any place to store the BBT (assuming
there are no blocks reserved).
pekon gupta - July 23, 2014, 4:26 p.m.
Hi Ezequiel,

>From: Ezequiel Garcia [mailto:ezequiel@vanguardiasur.com.ar]
>>On 23 Jul 05:24 PM, Pekon Gupta wrote:
>> This patch makes OMAP NAND driver to
>> - save Bad-Block-Table (BBT) on NAND Flash device
>> - scan on device BBT during probe
>>
>
>Doesn't this break backward compatibility? Please correct me if I'm wrong
>here, but if I apply this patch, and boot with my current NAND flash
>contents, the kernel won't have any place to store the BBT (assuming
>there are no blocks reserved).

Good point. I'll rather like your opinion to judge my below understanding
as you were recently involved in some NAND BBT features additions..

When bbt_options "NAND_BBT_USE_FLASH" is enabled, default BBT
options will be used which are:

$KERNEL/drivers/mtd/nand_bbt.c
static struct nand_bbt_descr bbt_main_descr = {
	.options = NAND_BBT_LASTBLOCK | NAND_BBT_CREATE | NAND_BBT_WRITE
		| NAND_BBT_2BIT | NAND_BBT_VERSION | NAND_BBT_PERCHIP,

As default options have NAND_BBT_CREATE and NAND_BBT_WRITE,
so kernel will try to reserve erase-blocks for main-BBT and mirror-BBT
towards end of NAND device.

- Case-1: If there is un-allocated space (not allocated to any partition)
  at end of Flash device then it will be used. And there is no problem.

- Case-2: If there is no free space towards the end of flash, then ...
As per my understanding, the create_bbt() would try scanning whole
NAND to find empty blocks and then it creates BBT wherever it finds
empty blocks. And those blocks are marked as BAD with BBT signature
to prevent getting erased or over-written by any user-data.

Is my understanding correct?
If yes, then this will not break backward compatibility, as on upgrading
the kernel new BBT will be automatically written on first boot. And
it will be used in subsequent boots.
Need feedbacks ...

with regards, pekon
Ezequiel García - July 23, 2014, 8:27 p.m.
(Brian, correct me if I'm wrong here, you're the BBT expert).

On 23 Jul 04:26 PM, Gupta, Pekon wrote:
> >From: Ezequiel Garcia [mailto:ezequiel@vanguardiasur.com.ar]
> >>On 23 Jul 05:24 PM, Pekon Gupta wrote:
> >> This patch makes OMAP NAND driver to
> >> - save Bad-Block-Table (BBT) on NAND Flash device
> >> - scan on device BBT during probe
> >>
> >
> >Doesn't this break backward compatibility? Please correct me if I'm wrong
> >here, but if I apply this patch, and boot with my current NAND flash
> >contents, the kernel won't have any place to store the BBT (assuming
> >there are no blocks reserved).
> 
> Good point. I'll rather like your opinion to judge my below understanding
> as you were recently involved in some NAND BBT features additions..
> 
> When bbt_options "NAND_BBT_USE_FLASH" is enabled, default BBT
> options will be used which are:
> 
> $KERNEL/drivers/mtd/nand_bbt.c
> static struct nand_bbt_descr bbt_main_descr = {
> 	.options = NAND_BBT_LASTBLOCK | NAND_BBT_CREATE | NAND_BBT_WRITE
> 		| NAND_BBT_2BIT | NAND_BBT_VERSION | NAND_BBT_PERCHIP,
> 
> As default options have NAND_BBT_CREATE and NAND_BBT_WRITE,
> so kernel will try to reserve erase-blocks for main-BBT and mirror-BBT
> towards end of NAND device.
> 
> - Case-1: If there is un-allocated space (not allocated to any partition)
>   at end of Flash device then it will be used. And there is no problem.
> 

Yes, so far, so good.

> - Case-2: If there is no free space towards the end of flash, then ...
> As per my understanding, the create_bbt() would try scanning whole
> NAND to find empty blocks and then it creates BBT wherever it finds
> empty blocks. And those blocks are marked as BAD with BBT signature
> to prevent getting erased or over-written by any user-data.
> 

OK, I just went through the BBT code, trying to see if I was missing
something. I can't see any place where the kernel scans the device
and searches for empty space for the BBT.

Instead, what create_bbt() seem to do is scan the whole device, searching
for bad blocks to fill an in-memory BBT. I presume that if I have some
blocks marked as bad in the OOB region, they are identified as bad
and the BBT is filled.

I think that write_bbt() is where the (previously created, in-memory) BBT
is written to flash. It goes like this:

write_bbt()
{
[..]
                /*
                 * Automatic placement of the bad block table. Search direction
                 * top -> down?
                 */
                if (td->options & NAND_BBT_LASTBLOCK) {
                        startblock = numblocks * (chip + 1) - 1;
                        dir = -1;
                } else {
                        startblock = chip * numblocks;
                        dir = 1;
                }

                for (i = 0; i < td->maxblocks; i++) {
                        int block = startblock + dir * i;
                        /* Check, if the block is bad */
                        switch (bbt_get_entry(this, block)) {
                        case BBT_BLOCK_WORN:
                        case BBT_BLOCK_FACTORY_BAD:
                                continue;
                        }
                        page = block <<
                                (this->bbt_erase_shift - this->page_shift);
                        /* Check, if the block is used by the mirror table */
                        if (!md || md->pages[chip] != page)
                                goto write;
                }
                pr_err("No space left to write bad block table\n");
                return -ENOSPC;
		[..]
}

which fails if there's no room left for the BBT. This will happen on every
boot, so it's not an ideal situation.
 
> Is my understanding correct?
> If yes, then this will not break backward compatibility, as on upgrading
> the kernel new BBT will be automatically written on first boot. And
> it will be used in subsequent boots.
> Need feedbacks ...
> 

In fact, you do have a simple way to solve this. Just support BBT through
the "nand-on-flash-bbt" devicetree property, so a user can tell if his flash
has a BBT or not. See pxa3xx-nand.c, which should be correct.
Brian Norris - July 24, 2014, 1:54 a.m.
On Wed, Jul 23, 2014 at 05:27:46PM -0300, ezequiel@vanguardiasur.com.ar wrote:
> On 23 Jul 04:26 PM, Gupta, Pekon wrote:
> > >From: Ezequiel Garcia [mailto:ezequiel@vanguardiasur.com.ar]
> > >>On 23 Jul 05:24 PM, Pekon Gupta wrote:
> > >> This patch makes OMAP NAND driver to
> > >> - save Bad-Block-Table (BBT) on NAND Flash device
> > >> - scan on device BBT during probe
> > >>
> > >
> > >Doesn't this break backward compatibility? Please correct me if I'm wrong
> > >here, but if I apply this patch, and boot with my current NAND flash
> > >contents, the kernel won't have any place to store the BBT (assuming
> > >there are no blocks reserved).
> > 
> > Good point. I'll rather like your opinion to judge my below understanding
> > as you were recently involved in some NAND BBT features additions..
> > 
> > When bbt_options "NAND_BBT_USE_FLASH" is enabled, default BBT
> > options will be used which are:
> > 
> > $KERNEL/drivers/mtd/nand_bbt.c
> > static struct nand_bbt_descr bbt_main_descr = {
> > 	.options = NAND_BBT_LASTBLOCK | NAND_BBT_CREATE | NAND_BBT_WRITE
> > 		| NAND_BBT_2BIT | NAND_BBT_VERSION | NAND_BBT_PERCHIP,
> > 
> > As default options have NAND_BBT_CREATE and NAND_BBT_WRITE,
> > so kernel will try to reserve erase-blocks for main-BBT and mirror-BBT
> > towards end of NAND device.
> > 
> > - Case-1: If there is un-allocated space (not allocated to any partition)

Partitioning has no effect on the BBT process. BBT is run only for the
'master' MTD, and it has no regard for previous data on the flash. Given
that...

> >   at end of Flash device then it will be used. And there is no problem.
> > 
> 
> Yes, so far, so good.

Agreed. It is especially safe because that area was not partitioned, and
therefore should not contain important data.

> > - Case-2: If there is no free space towards the end of flash, then ...
> > As per my understanding, the create_bbt() would try scanning whole
> > NAND to find empty blocks and then it creates BBT wherever it finds
> > empty blocks. And those blocks are marked as BAD with BBT signature
> > to prevent getting erased or over-written by any user-data.
> > 
> 
> OK, I just went through the BBT code, trying to see if I was missing
> something. I can't see any place where the kernel scans the device
> and searches for empty space for the BBT.

Right, on-flash BBT has no notion of "empty space", really; it just
knows what region is reserved for its use (the last few blocks of the
device).

> Instead, what create_bbt() seem to do is scan the whole device, searching
> for bad blocks to fill an in-memory BBT. I presume that if I have some
> blocks marked as bad in the OOB region, they are identified as bad
> and the BBT is filled.
> 
> I think that write_bbt() is where the (previously created, in-memory) BBT
> is written to flash. It goes like this:
> 
> write_bbt()
> {
> [..]
>                 /*
>                  * Automatic placement of the bad block table. Search direction
>                  * top -> down?
>                  */
>                 if (td->options & NAND_BBT_LASTBLOCK) {
>                         startblock = numblocks * (chip + 1) - 1;
>                         dir = -1;
>                 } else {
>                         startblock = chip * numblocks;
>                         dir = 1;
>                 }
> 
>                 for (i = 0; i < td->maxblocks; i++) {
>                         int block = startblock + dir * i;
>                         /* Check, if the block is bad */
>                         switch (bbt_get_entry(this, block)) {
>                         case BBT_BLOCK_WORN:
>                         case BBT_BLOCK_FACTORY_BAD:
>                                 continue;
>                         }
>                         page = block <<
>                                 (this->bbt_erase_shift - this->page_shift);
>                         /* Check, if the block is used by the mirror table */
>                         if (!md || md->pages[chip] != page)
>                                 goto write;
>                 }
>                 pr_err("No space left to write bad block table\n");
>                 return -ENOSPC;
> 		[..]
> }
> 
> which fails if there's no room left for the BBT. This will happen on every
> boot, so it's not an ideal situation.

BTW, td->maxblocks==4 by default. There's no 'scanning the whole
device'; it only scans the last 4 blocks. But you're correct that it's
not ideal, since it will fail if there are 4 bad blocks at the end.

> > Is my understanding correct?
> > If yes, then this will not break backward compatibility, as on upgrading
> > the kernel new BBT will be automatically written on first boot. And
> > it will be used in subsequent boots.
> > Need feedbacks ...
> > 
> 
> In fact, you do have a simple way to solve this. Just support BBT through
> the "nand-on-flash-bbt" devicetree property, so a user can tell if his flash
> has a BBT or not. See pxa3xx-nand.c, which should be correct.

Yes, you can use the 'nand-on-flash-bbt' property, and that probably
makes the most sense.

The only real backwards-compatibility concern you'd have for
unconditionally enabling on-flash BBT (like in this patch) is if you had
previous file system data in the last 4 blocks; nand_bbt will just
clobber it, breaking your file system. For this reason, using DT is
probably a good idea -- you're opting in, rather than being forced in by
a kernel upgrade.

Beyond the backwards-compatibility concern, you still have other
concerns about on-flash BBT's robustness. Limiting yourself to a region
of 4 blocks is one potential issue. There are others (e.g., lack of CRC
protection), but none that have been real show-stoppers. I have a few
generations of products running it here.

BTW, there's a recent thread about GPMI and its "bad block mark
swapping", which has led to somebody wanting a new DT binding for
'on-flash-bbt-no-oob-bbm' or something like that, to mirror the
(excellently named -- by me) NAND_BBT_NO_OOB_BBM option. I'm not real
happy with adding a random assortment of configurable BBT flags into the
DT ABI, without fully describing and standardizing the BBT format. It's
kind of a vacuous target right now, which is just defined by the
70%-baked solution in our current implementation... (I guess I need to
go reply to the GPMI/NO_OOB_BBM thread soon!)

Brian

[1] http://lists.infradead.org/pipermail/linux-mtd/2014-March/052666.html
    http://lists.infradead.org/pipermail/linux-mtd/2014-March/052796.html
    http://lists.infradead.org/pipermail/linux-mtd/2014-March/052988.html
Ezequiel García - July 24, 2014, 1:40 p.m.
On 23 Jul 06:54 PM, Brian Norris wrote:
> > > 
> > > - Case-1: If there is un-allocated space (not allocated to any partition)
> 
> Partitioning has no effect on the BBT process. BBT is run only for the
> 'master' MTD, and it has no regard for previous data on the flash. Given
> that...
> 

Right.

> > >   at end of Flash device then it will be used. And there is no problem.
> > > 
> > 
> > Yes, so far, so good.
> 
> Agreed. It is especially safe because that area was not partitioned, and
> therefore should not contain important data.
> 
> > > - Case-2: If there is no free space towards the end of flash, then ...
> > > As per my understanding, the create_bbt() would try scanning whole
> > > NAND to find empty blocks and then it creates BBT wherever it finds
> > > empty blocks. And those blocks are marked as BAD with BBT signature
> > > to prevent getting erased or over-written by any user-data.
> > > 
> > 
> > OK, I just went through the BBT code, trying to see if I was missing
> > something. I can't see any place where the kernel scans the device
> > and searches for empty space for the BBT.
> 
> Right, on-flash BBT has no notion of "empty space", really; it just
> knows what region is reserved for its use (the last few blocks of the
> device).
> 

Right. And in fact, that's clear from the snippet I pasted below.

> > Instead, what create_bbt() seem to do is scan the whole device, searching
> > for bad blocks to fill an in-memory BBT. I presume that if I have some
> > blocks marked as bad in the OOB region, they are identified as bad
> > and the BBT is filled.
> > 
> > I think that write_bbt() is where the (previously created, in-memory) BBT
> > is written to flash. It goes like this:
> > 
> > write_bbt()
> > {
> > [..]
> >                 /*
> >                  * Automatic placement of the bad block table. Search direction
> >                  * top -> down?
> >                  */
> >                 if (td->options & NAND_BBT_LASTBLOCK) {
> >                         startblock = numblocks * (chip + 1) - 1;
> >                         dir = -1;
> >                 } else {
> >                         startblock = chip * numblocks;
> >                         dir = 1;
> >                 }
> > 
> >                 for (i = 0; i < td->maxblocks; i++) {
> >                         int block = startblock + dir * i;
> >                         /* Check, if the block is bad */
> >                         switch (bbt_get_entry(this, block)) {
> >                         case BBT_BLOCK_WORN:
> >                         case BBT_BLOCK_FACTORY_BAD:
> >                                 continue;
> >                         }
> >                         page = block <<
> >                                 (this->bbt_erase_shift - this->page_shift);
> >                         /* Check, if the block is used by the mirror table */
> >                         if (!md || md->pages[chip] != page)
> >                                 goto write;
> >                 }
> >                 pr_err("No space left to write bad block table\n");
> >                 return -ENOSPC;
> > 		[..]
> > }
> > 
> > which fails if there's no room left for the BBT. This will happen on every
> > boot, so it's not an ideal situation.
> 
> BTW, td->maxblocks==4 by default. There's no 'scanning the whole
> device'; it only scans the last 4 blocks. But you're correct that it's
> not ideal, since it will fail if there are 4 bad blocks at the end.
> 

Yeah, I was completely wrong. Thanks for the insight!

> > > Is my understanding correct?
> > > If yes, then this will not break backward compatibility, as on upgrading
> > > the kernel new BBT will be automatically written on first boot. And
> > > it will be used in subsequent boots.
> > > Need feedbacks ...
> > > 
> > 
> > In fact, you do have a simple way to solve this. Just support BBT through
> > the "nand-on-flash-bbt" devicetree property, so a user can tell if his flash
> > has a BBT or not. See pxa3xx-nand.c, which should be correct.
> 
> Yes, you can use the 'nand-on-flash-bbt' property, and that probably
> makes the most sense.
> 
> The only real backwards-compatibility concern you'd have for
> unconditionally enabling on-flash BBT (like in this patch) is if you had
> previous file system data in the last 4 blocks; nand_bbt will just
> clobber it, breaking your file system. For this reason, using DT is
> probably a good idea -- you're opting in, rather than being forced in by
> a kernel upgrade.
> 

FWIW, having the kernel wipe your precious data, is even worse than having
him fail; which means this represents a much more serious drawback.
This is why any modification to a NAND driver that involves a different
'view' of the flash, should be considered with a lot of care!

> Beyond the backwards-compatibility concern, you still have other
> concerns about on-flash BBT's robustness. Limiting yourself to a region
> of 4 blocks is one potential issue. There are others (e.g., lack of CRC
> protection), but none that have been real show-stoppers. I have a few
> generations of products running it here.
> 

Hm.. Can you guys mention what's the benefit of using a BBT, against keeping
the bad block mark in the OOB region? (Other than the fact that some ECC
strength may not leave any available OOB).
pekon gupta - July 24, 2014, 5:56 p.m.
>From: Brian Norris [mailto:computersforpeace@gmail.com]
>>On Wed, Jul 23, 2014 at 05:27:46PM -0300, ezequiel@vanguardiasur.com.ar wrote:
[...]
>> In fact, you do have a simple way to solve this. Just support BBT through
>> the "nand-on-flash-bbt" devicetree property, so a user can tell if his flash
>> has a BBT or not. See pxa3xx-nand.c, which should be correct.
>
>Yes, you can use the 'nand-on-flash-bbt' property, and that probably
>makes the most sense.
>
>The only real backwards-compatibility concern you'd have for
>unconditionally enabling on-flash BBT (like in this patch) is if you had
>previous file system data in the last 4 blocks; nand_bbt will just
>clobber it, breaking your file system. For this reason, using DT is
>probably a good idea -- you're opting in, rather than being forced in by
>a kernel upgrade.
>
>Beyond the backwards-compatibility concern, you still have other
>concerns about on-flash BBT's robustness. Limiting yourself to a region
>of 4 blocks is one potential issue. There are others (e.g., lack of CRC
>protection), but none that have been real show-stoppers. I have a few
>generations of products running it here.
>
>BTW, there's a recent thread about GPMI and its "bad block mark
>swapping", which has led to somebody wanting a new DT binding for
>'on-flash-bbt-no-oob-bbm' or something like that, to mirror the
>(excellently named -- by me) NAND_BBT_NO_OOB_BBM option. I'm not real
>happy with adding a random assortment of configurable BBT flags into the
>DT ABI, without fully describing and standardizing the BBT format. It's
>kind of a vacuous target right now, which is just defined by the
>70%-baked solution in our current implementation... (I guess I need to
>go reply to the GPMI/NO_OOB_BBM thread soon!)
>
>Brian
>
>[1] http://lists.infradead.org/pipermail/linux-mtd/2014-March/052666.html
>    http://lists.infradead.org/pipermail/linux-mtd/2014-March/052796.html
>    http://lists.infradead.org/pipermail/linux-mtd/2014-March/052988.html

Thanks much to both of you.
Yes, "nand-on-flash-bbt" devicetree property seems the correct approach.
I'll study the above threads, and then get back with appropriate patch.
So, please ignore this patch.

with regards, pekon
Brian Norris - July 24, 2014, 7:11 p.m.
Hi Ezequiel,

On Thu, Jul 24, 2014 at 10:40:18AM -0300, Ezequiel Garcia wrote:
> On 23 Jul 06:54 PM, Brian Norris wrote:
> > The only real backwards-compatibility concern you'd have for
> > unconditionally enabling on-flash BBT (like in this patch) is if you had
> > previous file system data in the last 4 blocks; nand_bbt will just
> > clobber it, breaking your file system. For this reason, using DT is
> > probably a good idea -- you're opting in, rather than being forced in by
> > a kernel upgrade.
> 
> FWIW, having the kernel wipe your precious data, is even worse than having
> him fail; which means this represents a much more serious drawback.
> This is why any modification to a NAND driver that involves a different
> 'view' of the flash, should be considered with a lot of care!

Yes, that's pretty serious. Perhaps there could be some more attention
paid to this issue (throwing out some half-baked ideas: tag partitions
as exclusive 'BBT' partitions, to ensure data / BBT separation; or
prevent nand_bbt from reserving non-empty blocks for BBT). But for now,
this issue can be best dealt with by using the supported DT option to
explicitly opt in.

> > Beyond the backwards-compatibility concern, you still have other
> > concerns about on-flash BBT's robustness. Limiting yourself to a region
> > of 4 blocks is one potential issue. There are others (e.g., lack of CRC
> > protection), but none that have been real show-stoppers. I have a few
> > generations of products running it here.
> 
> Hm.. Can you guys mention what's the benefit of using a BBT, against keeping
> the bad block mark in the OOB region? (Other than the fact that some ECC
> strength may not leave any available OOB).

The primary reason would be that NAND datasheets specify it these days
:) But a better argument: nobody guarantees that you can write a 
bad block marker to a worn out block; you may just get program failures.
This has been acknowledged by several developers over the last several
years, some of whom I think were quoting their recollection from talking
w/ manufacturers (does that make this 3rd-hand info?). Most recently,
Lothar brought this up [1].

Additionally, you get a boot-time performance improvement if you only
have to read a few pages, instead of a page or two from every block on
the flash.

Brian

[1] http://lists.infradead.org/pipermail/linux-mtd/2014-July/054767.html

Patch

diff --git a/drivers/mtd/nand/omap2.c b/drivers/mtd/nand/omap2.c
index f0ed92e..2ad7451 100644
--- a/drivers/mtd/nand/omap2.c
+++ b/drivers/mtd/nand/omap2.c
@@ -1663,7 +1663,7 @@  static int omap_nand_probe(struct platform_device *pdev)
 	mtd->owner		= THIS_MODULE;
 	nand_chip		= &info->nand;
 	nand_chip->ecc.priv	= NULL;
-	nand_chip->options	|= NAND_SKIP_BBTSCAN;
+	nand_chip->bbt_options  |= NAND_BBT_USE_FLASH;
 
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
 	nand_chip->IO_ADDR_R = devm_ioremap_resource(&pdev->dev, res);