[v2] mtd: nand: write bad block marker even with BBT

Message ID	1324332231-30884-1-git-send-email-computersforpeace@gmail.com
State	New, archived

Brian Norris Dec. 19, 2011, 10:03 p.m. UTC

Currently, the flash-based BBT implementation writes bad block data only
to its flash-based table and not to the OOB marker area. Then, as new
bad blocks are marked over time, the OOB markers become out of date and
the flash-based table becomes the only source of current bad block
information. This can be a problem when:

 * bootloader cannot read the flash-based BBT format
 * BBT is corrupted and the flash must be rescanned for bad
   blocks; we want to remember bad blocks that were marked from Linux

In an attempt to keep the bad block markers in sync with the flash-based
BBT, this patch changes the default so that we write bad block markers
to the proper OOB area on each block in addition to flash-based BBT.

Theoretically, the bad block table and the OOB markers can still get out
of sync if the system experiences a power cut between writing the BBT to
flash and writing the OOB marker to a newly-marked bad block. However,
this is a relatively unlikely event, as new bad blocks shouldn't appear
frequently.

Note that this is a change from the previous default flash-based BBT
behavior. If any contributors rely on the old behavior, they are welcome
to introduce an option flag for it.

Adapted from code by Matthieu Castet.

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
---
v2: Explain potential power cut issues and remove option for retaining
    old behavior. I CC'd various MTD contributors; speak up if the new
    default is unacceptable!

drivers/mtd/nand/nand_base.c |   59 +++++++++++++++++++++++------------------
 1 files changed, 33 insertions(+), 26 deletions(-)
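
To illustrate the behavior change described in the commit message, here is a minimal sketch using a toy in-memory model. It is not the code from the patch: every name in it (`toy_nand`, `toy_markbad_bbt_only`, `toy_markbad_both`, `toy_oob_says_bad`) is hypothetical, and the marker values assume the common convention that an erased OOB byte (0xFF) means "good".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy model; these are not the kernel's data structures. */
#define TOY_NUM_BLOCKS 8
#define TOY_BBM_GOOD 0xFF /* assumed: erased OOB byte means "good" */
#define TOY_BBM_BAD  0x00 /* assumed: programmed marker means "bad" */

struct toy_nand {
	uint8_t oob_marker[TOY_NUM_BLOCKS]; /* per-block marker byte in OOB */
	bool flash_bbt[TOY_NUM_BLOCKS];     /* flash-based table: true = bad */
};

static void toy_nand_init(struct toy_nand *chip)
{
	memset(chip->oob_marker, TOY_BBM_GOOD, sizeof(chip->oob_marker));
	memset(chip->flash_bbt, 0, sizeof(chip->flash_bbt));
}

/* Old default: only the flash-based table learns about the bad block. */
static void toy_markbad_bbt_only(struct toy_nand *chip, unsigned int block)
{
	assert(block < TOY_NUM_BLOCKS);
	chip->flash_bbt[block] = true;
}

/* Proposed default: update the table, then also program the OOB marker. */
static void toy_markbad_both(struct toy_nand *chip, unsigned int block)
{
	assert(block < TOY_NUM_BLOCKS);
	chip->flash_bbt[block] = true;
	/* A power cut at this point would leave table and marker out of sync. */
	chip->oob_marker[block] = TOY_BBM_BAD;
}

/* What a bootloader that only understands OOB markers would conclude. */
static bool toy_oob_says_bad(const struct toy_nand *chip, unsigned int block)
{
	assert(block < TOY_NUM_BLOCKS);
	return chip->oob_marker[block] != TOY_BBM_GOOD;
}
```

With `toy_markbad_bbt_only()`, `toy_oob_says_bad()` keeps reporting the block as good, which is the stale-marker problem the description is about; with `toy_markbad_both()` the two sources agree unless power is lost between the two writes.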

Brian Norris Dec. 19, 2011, 10:16 p.m. UTC | #1

Sorry for two e-mails; apparently several addresses I reached were no
longer valid. I'm removing Randy Dunlap and Sukumar Ghorai from the CC
list, and I updated Barry Song and Adrian Hunter's addresses.

Hopefully this can prevent other people from receiving too many
bounces :) Patch description copied below.

On Mon, Dec 19, 2011 at 2:03 PM, Brian Norris
<computersforpeace@gmail.com> wrote:
> Currently, the flash-based BBT implementation writes bad block data only
> to its flash-based table and not to the OOB marker area. Then, as new
> bad blocks are marked over time, the OOB markers become out of date and
> the flash-based table becomes the only source of current bad block
> information. This can be a problem when:
>
>  * bootloader cannot read the flash-based BBT format
>  * BBT is corrupted and the flash must be rescanned for bad
>   blocks; we want to remember bad blocks that were marked from Linux
>
> In an attempt to keep the bad block markers in sync with the flash-based
> BBT, this patch changes the default so that we write bad block markers
> to the proper OOB area on each block in addition to flash-based BBT.
>
> Theoretically, the bad block table and the OOB markers can still get out
> of sync if the system experiences a power cut between writing the BBT to
> flash and writing the OOB marker to a newly-marked bad block. However,
> this is a relatively unlikely event, as new bad blocks shouldn't appear
> frequently.
>
> Note that this is a change from the previous default flash-based BBT
> behavior. If any contributors rely on the old behavior, they are welcome
> to introduce an option flag for it.
>
> Adapted from code by Matthieu Castet.
>
> Signed-off-by: Brian Norris <computersforpeace@gmail.com>

Brian

Sebastian Andrzej Siewior Dec. 20, 2011, 8:49 a.m. UTC | #2

On 12/19/2011 11:03 PM, Brian Norris wrote:
> Currently, the flash-based BBT implementation writes bad block data only
> to its flash-based table and not to the OOB marker area. Then, as new
> bad blocks are marked over time, the OOB markers become out of date and
> the flash-based table becomes the only source of current bad block
> information. This can be a problem when:
>
>   * bootloader cannot read the flash-based BBT format
>   * BBT is corrupted and the flash must be rescanned for bad
>     blocks; we want to remember bad blocks that were marked from Linux
>
> In an attempt to keep the bad block markers in sync with the flash-based
> BBT, this patch changes the default so that we write bad block markers
> to the proper OOB area on each block in addition to flash-based BBT.
>
> Theoretically, the bad block table and the OOB markers can still get out
> of sync if the system experiences a power cut between writing the BBT to
> flash and writing the OOB marker to a newly-marked bad block. However,
> this is a relatively unlikely event, as new bad blocks shouldn't appear
> frequently.
>

The marker and BBT may get out of sync. You should use either the one 
_or_ the other but not both. Why use both anyway? I use BBT because I
have no room left in OOB or updating a single area in OOB is too
complicated / hardly possible.
In the former case, reading that area again and interpreting it as
what it once was is a bad thing.

Why is the BBT corrupted? You have two tables. If one is broken, you
still have the other. If you lose one block which was about to be
marked bad, you will find out soon enough.

Sebastian

Brian Norris Dec. 20, 2011, 6:17 p.m. UTC | #3

Hi Sebastian,

On Tue, Dec 20, 2011 at 12:49 AM, Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
> On 12/19/2011 11:03 PM, Brian Norris wrote:
>>
>> Currently, the flash-based BBT implementation writes bad block data only
>> to its flash-based table and not to the OOB marker area. Then, as new
>> bad blocks are marked over time, the OOB markers become out of date and
>> the flash-based table becomes the only source of current bad block
>> information. This can be a problem when:
>>
>>  * bootloader cannot read the flash-based BBT format
>>  * BBT is corrupted and the flash must be rescanned for bad
>>    blocks; we want to remember bad blocks that were marked from Linux
>>
>> In an attempt to keep the bad block markers in sync with the flash-based
>> BBT, this patch changes the default so that we write bad block markers
>> to the proper OOB area on each block in addition to flash-based BBT.
>>
>> Theoretically, the bad block table and the OOB markers can still get out
>> of sync if the system experiences a power cut between writing the BBT to
>> flash and writing the OOB marker to a newly-marked bad block. However,
>> this is a relatively unlikely event, as new bad blocks shouldn't appear
>> frequently.
>>
>
> The marker and BBT may get out of sync. You should use either the one _or_
> the other but not both. Why use both anyway?

Two reasons for using OOB area were stated above. There are cases where:
* bootloader cannot read the flash-based BBT format [... but it still
may need to recognize bad blocks]
* BBT is corrupted and the flash must be rescanned for bad blocks; we
want to remember bad blocks that were marked from Linux

In the above cases, one might still want to use flash-based BBT for
boot-time performance.

> I use BBT because I
> have no room left in OOB or updating a single area in OOB is too
> complicated / hardly possible.
> In the former case, reading that area again and interpreting it as
> what it once was is a bad thing.

Well, then that is one case where you want an option to avoid writing
markers back to OOB. We may need an option flag to enforce the old
behavior.

However, your use case is not the only use case. I use flash-based BBT
because of the boot-time performance improvement when I don't have to
rescan the whole flash. I believe there are others who use flash-based
BBT who want the ability to fall back to OOB bad block markers.

> Why is the BBT corrupted? You have two tables. If one is broken, you
> still have the other. If you lose one block which was about to be
> marked bad, you will find out soon enough.

There are certainly cases where the table could be erased/overwritten
(and therefore "corrupted"), like by a bootloader that doesn't
recognize the BBT, by a software bug, etc. Also, a table entry that
was written once and then read back over a lifetime might wear out and
start to have errors. Recent patches have tried to solve this with
"scrubbing" and, if necessary, rescanning the flash for bad blocks to
rebuild the table.

Also, mirroring is not a requirement for flash-based BBT. A driver
could provide a custom, non-mirrored configuration.

Brian
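
The fallback mentioned above (rescanning the flash to rebuild a lost or corrupted table) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the kernel's scanning code: `scan_rebuild_bbt` and `SCAN_BBM_GOOD` are made-up names, each block is reduced to a single marker byte, and any value other than 0xFF is treated as "bad".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SCAN_BBM_GOOD 0xFF /* assumed: erased marker byte means "good" */

/*
 * Rebuild an in-memory table purely from per-block OOB marker bytes.
 * Returns the number of blocks found bad. A block that was recorded as
 * bad only in the (now lost) table still has a 0xFF marker, so it is
 * silently treated as good again.
 */
static size_t scan_rebuild_bbt(const uint8_t *oob_marker, bool *bbt,
			       size_t nblocks)
{
	size_t bad = 0;

	assert(oob_marker != NULL && bbt != NULL);
	for (size_t i = 0; i < nblocks; i++) {
		bbt[i] = (oob_marker[i] != SCAN_BBM_GOOD);
		if (bbt[i])
			bad++;
	}
	return bad;
}
```

The rebuilt table can only be as complete as the markers, which is the motivation for writing a marker every time a block is added to the table.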

Artem Bityutskiy Dec. 20, 2011, 8:49 p.m. UTC | #4

On Tue, 2011-12-20 at 10:17 -0800, Brian Norris wrote:
> > I use BBT because I
> > have no room left in OOB or updating a single area in OOB is too
> > complicated / hardly possible.
> > In the former case, reading that area again and interpreting it as
> > what it once was is a bad thing.
> 
> Well, then that is one case where you want an option to avoid writing
> markers back to OOB. We may need an option flag to enforce the old
> behavior.

I guess there should be some way of detecting this case (no space in
OOB) and just not writing.

Artem.

Brian Norris Dec. 22, 2011, 1:15 a.m. UTC | #5

On Tue, Dec 20, 2011 at 12:49 PM, Artem Bityutskiy <dedekind1@gmail.com> wrote:
> On Tue, 2011-12-20 at 10:17 -0800, Brian Norris wrote:
>> > I use BBT because I
>> > have no room left in OOB or updating a single area in OOB is too
>> > complicated / hardly possible.
>> > In the former case, reading that area again and interpreting it as
>> > what it once was is a bad thing.
>>
>> Well, then that is one case where you want an option to avoid writing
>> markers back to OOB. We may need an option flag to enforce the old
>> behavior.
>
> I guess there should be some way of detecting this case (no space in
> OOB) and just not writing.

Any suggestions on how to detect no space in OOB? ECC layout structure
is not necessarily reliable, as its fields are designed for
determining what's available apart from BBM and ECC (for userspace or
filesystem data).

My idea: instead of adding more layout info about free space for
writing BBM, we can add a flag to avoid writing BBM to OOB
(NAND_NO_WRITE_OOB?). Would this suffice for you, Sebastian? It would
effectively be a reversal of my v1 patch; we write to both the
flash-BBT and the OOB BBM by default and provide an option that
prevents writing to OOB.

If we use NAND_NO_WRITE_OOB, are there other reasonable situations in
which we should prevent OOB writes according to this flag?
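
A rough sketch of how such an opt-out flag might gate the marker write, using a toy model rather than real driver code. The flag name here (`FLG_NO_WRITE_OOB`) only mirrors the proposal above; its bit value, the `flg_chip` structure, and both functions are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define FLG_NUM_BLOCKS 8
#define FLG_NO_WRITE_OOB (1u << 0) /* hypothetical option bit */

struct flg_chip {
	unsigned int options;               /* driver-provided option flags */
	uint8_t oob_marker[FLG_NUM_BLOCKS]; /* 0xFF = good (assumed) */
	bool flash_bbt[FLG_NUM_BLOCKS];     /* true = bad */
};

static void flg_init(struct flg_chip *chip, unsigned int options)
{
	chip->options = options;
	memset(chip->oob_marker, 0xFF, sizeof(chip->oob_marker));
	memset(chip->flash_bbt, 0, sizeof(chip->flash_bbt));
}

/* Always update the table; write the OOB marker unless the driver opts out. */
static void flg_markbad(struct flg_chip *chip, unsigned int block)
{
	assert(block < FLG_NUM_BLOCKS);
	chip->flash_bbt[block] = true;
	if (!(chip->options & FLG_NO_WRITE_OOB))
		chip->oob_marker[block] = 0x00;
}
```

A driver with no writable OOB space would set the flag and keep the old table-only behavior; everyone else gets both writes by default.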

Another question: I noticed that when writing bad block markers to
OOB, we simply overwrite without erasing first. This is a problem on
MLC, which aren't designed to be written twice; they usually can write
some of the bits to 0, but not all of them. Is it reasonable to change
nand_default_block_markbad() to always have it erase the block before
writing? We can just ignore errors (if this is a really bad block that
can't be erased properly). This would be subject of a second patch of
course.

Brian
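
The erase-before-marking idea can be sketched like this. It is a toy model, not `nand_default_block_markbad()` itself: the `era_*` names are hypothetical, and programming is modeled as a plain bitwise AND (bits can only go from 1 to 0), which is simpler than real MLC behavior where a second program may not reliably clear the bits at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct era_block {
	uint8_t oob_marker;          /* current marker byte in OOB */
	bool erase_fails;            /* simulate a block too worn to erase */
	unsigned int erase_attempts; /* bookkeeping for the example only */
};

/* Returns 0 on success, -1 if the (simulated) erase fails. */
static int era_erase(struct era_block *blk)
{
	blk->erase_attempts++;
	if (blk->erase_fails)
		return -1;
	blk->oob_marker = 0xFF;
	return 0;
}

/* Programming can only clear bits, so the result is old AND new. */
static void era_program_marker(struct era_block *blk, uint8_t value)
{
	blk->oob_marker &= value;
}

/* Erase first so the marker lands on clean cells; ignore erase errors. */
static void era_markbad(struct era_block *blk)
{
	(void)era_erase(blk); /* a really bad block may fail to erase */
	era_program_marker(blk, 0x00);
}
```

In this simplified model the marker ends up as 0x00 either way; on real MLC parts the outcome after a failed erase is not guaranteed, which is why the erase errors are ignored rather than treated as fatal.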

Shmulik Ladkani Dec. 23, 2011, 8:13 a.m. UTC | #6

On Wed, 21 Dec 2011 17:15:29 -0800 Brian Norris <computersforpeace@gmail.com> wrote:
> Another question: I noticed that when writing bad block markers to
> OOB, we simply overwrite without erasing first. This is a problem on
> MLC, which aren't designed to be written twice; they usually can write
> some of the bits to 0, but not all of them. Is it reasonable to change
> nand_default_block_markbad() to always have it erase the block before
> writing?

Not an MLC expert, but could it be that for these MLC chips,
NAND_BBT_USE_FLASH is usually set, hence the complexities involved in
re-programming the OOB are simply avoided?

If so, it strengthens your NAND_BBT_WRITE_BBM/NAND_NO_WRITE_OOB approach,
which goes hand-in-hand with Sebastian's scenario where there's no room
in the OOB.

Brian Norris Jan. 6, 2012, 3:10 a.m. UTC | #7

On Fri, Dec 23, 2011 at 12:13 AM, Shmulik Ladkani
<shmulik.ladkani@gmail.com> wrote:
> On Wed, 21 Dec 2011 17:15:29 -0800 Brian Norris <computersforpeace@gmail.com> wrote:
>> Another question: I noticed that when writing bad block markers to
>> OOB, we simply overwrite without erasing first. This is a problem on
>> MLC, which aren't designed to be written twice; they usually can write
>> some of the bits to 0, but not all of them. Is it reasonable to change
>> nand_default_block_markbad() to always have it erase the block before
>> writing?
>
> Not an MLC expert, but could it be that for these MLC chips,
> NAND_BBT_USE_FLASH is usually set, hence the complexities involved in
> re-programming the OOB are simply avoided?

I can't comment on others' use of MLC + NAND_BBT_USE_FLASH, but I use
NAND_BBT_USE_FLASH for all NAND.

NAND_BBT_USE_FLASH only applies to the bad block *table* location -
that it will be stored in flash. This does not exclude the possibility
of still writing/recognizing bad block markers in OOB. Thus, I do not
understand how "the complexities involved in re-programming the
OOB are simply avoided". In fact, I feel that (when possible) it's
best to write to *both* a flash-based BBT and the OOB region of the
bad block, so I do not plan to avoid the complexities of reprogramming
the OOB. Please correct me if I'm missing your point.

> If so, it strengthens your NAND_BBT_WRITE_BBM/NAND_NO_WRITE_OOB approach,
> which goes hand-in-hand with Sebastian's scenario where there's no room
> in the OOB.

I will proceed with v2 adding a NAND_NO_WRITE_OOB flag to provide an
option for cases like Sebastian's, where there is no writeable OOB
space whatsoever.

Perhaps the paragraph beginning with "Another question:" should be
relegated to a separate patch/thread to avoid confusion between two
related but distinct issues. I will try to write a patch soon that
implements my suggestion and tag it onto a v2 patch series.

Brian
