diff mbox

mtd: nand: Properly initialize 'mtd->bitflip_threshold' prior 'scan_bbt()' is invoked

Message ID 20120608182906.57b4844a@halley
State Accepted
Commit ea3b2ea24ef0f2ef9c6795b19cff456195b6728a
Headers show

Commit Message

Shmulik Ladkani June 8, 2012, 3:29 p.m. UTC
As of edbc454 [mtd: driver _read() returns max_bitflips; mtd_read()
returns -EUCLEAN], 'mtd->bitflip_threshold' must be set for mtd devices
having ECC, prior any 'mtd_read()' call.
Otherwise, 'mtd_read()' will falsely return -EUCLEAN.

Normally, 'mtd->bitflip_threshold' is initialized when the MTD is added.

However, this is too late for NAND MTDs, as 'scan_bbt()' is invoked
prior the existing initialization of 'mtd->bitflip_threshold'.

This is a problem since 'scan_bbt()' calls 'mtd_read()', in the case
of a flash-based bad block table.
It resulted in a falsely reported bitflips indication during BBT read,
which lead to constant scrubbing of the flash BBT blocks.

Initialize 'mtd->bitflip_threshold' to its default value (if not already
set by the driver), prior invocation of 'scan_bbt()'.

Reported-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
---
- The issue was introduced in 3.5-rc1 and needs to be merged before 3.5
- Sascha, I've credited you as the reporter, is that ok?
- Sascha, care to test and verify it works for your system?

Comments

Shmulik Ladkani June 8, 2012, 4:17 p.m. UTC | #1
Artem, David,

Patch was accidentally submitted twice.

Disregard this post please.

Sorry for the inconvenience.

Regards,
Shmulik

On Fri, 8 Jun 2012 18:29:06 +0300 Shmulik Ladkani <shmulik@jungo.com> wrote:
> As of edbc454 [mtd: driver _read() returns max_bitflips; mtd_read()
> returns -EUCLEAN], 'mtd->bitflip_threshold' must be set for mtd devices
> having ECC, prior any 'mtd_read()' call.
> Otherwise, 'mtd_read()' will falsely return -EUCLEAN.
> 
> Normally, 'mtd->bitflip_threshold' is initialized when the MTD is added.
> 
> However, this is too late for NAND MTDs, as 'scan_bbt()' is invoked
> prior the existing initialization of 'mtd->bitflip_threshold'.
> 
> This is a problem since 'scan_bbt()' calls 'mtd_read()', in the case
> of a flash-based bad block table.
> It resulted in a falsely reported bitflips indication during BBT read,
> which lead to constant scrubbing of the flash BBT blocks.
> 
> Initialize 'mtd->bitflip_threshold' to its default value (if not already
> set by the driver), prior invocation of 'scan_bbt()'.
> 
> Reported-by: Sascha Hauer <s.hauer@pengutronix.de>
> Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Artem Bityutskiy June 8, 2012, 5:12 p.m. UTC | #2
On Fri, 2012-06-08 at 18:29 +0300, Shmulik Ladkani wrote:
> As of edbc454 [mtd: driver _read() returns max_bitflips; mtd_read()
> returns -EUCLEAN], 'mtd->bitflip_threshold' must be set for mtd devices
> having ECC, prior any 'mtd_read()' call.
> Otherwise, 'mtd_read()' will falsely return -EUCLEAN.

Pushed to l2-mtd.git, thanks! Sascha - does it fix the issue for you?
Sascha Hauer June 9, 2012, 9:43 a.m. UTC | #3
On Fri, Jun 08, 2012 at 06:29:06PM +0300, Shmulik Ladkani wrote:
> As of edbc454 [mtd: driver _read() returns max_bitflips; mtd_read()
> returns -EUCLEAN], 'mtd->bitflip_threshold' must be set for mtd devices
> having ECC, prior any 'mtd_read()' call.
> Otherwise, 'mtd_read()' will falsely return -EUCLEAN.
> 
> Normally, 'mtd->bitflip_threshold' is initialized when the MTD is added.
> 
> However, this is too late for NAND MTDs, as 'scan_bbt()' is invoked
> prior the existing initialization of 'mtd->bitflip_threshold'.
> 
> This is a problem since 'scan_bbt()' calls 'mtd_read()', in the case
> of a flash-based bad block table.
> It resulted in a falsely reported bitflips indication during BBT read,
> which lead to constant scrubbing of the flash BBT blocks.
> 
> Initialize 'mtd->bitflip_threshold' to its default value (if not already
> set by the driver), prior invocation of 'scan_bbt()'.
> 
> Reported-by: Sascha Hauer <s.hauer@pengutronix.de>
> Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
> ---
> - The issue was introduced in 3.5-rc1 and needs to be merged before 3.5
> - Sascha, I've credited you as the reporter, is that ok?

Sure, that's ok.

> - Sascha, care to test and verify it works for your system?

Yes, this puts my system back to work

Tested-by: Sascha Hauer <s.hauer@pengutronix.de>


> 
> diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
> index 61805e7..0a8724e 100644
> --- a/drivers/mtd/nand/nand_base.c
> +++ b/drivers/mtd/nand/nand_base.c
> @@ -3484,6 +3484,13 @@ int nand_scan_tail(struct mtd_info *mtd)
>  	/* propagate ecc info to mtd_info */
>  	mtd->ecclayout = chip->ecc.layout;
>  	mtd->ecc_strength = chip->ecc.strength;
> +	/*
> +	 * Initialize bitflip_threshold to its default prior scan_bbt() call.
> +	 * scan_bbt() might invoke mtd_read(), thus bitflip_threshold must be
> +	 * properly set.
> +	 */
> +	if (!mtd->bitflip_threshold)
> +		mtd->bitflip_threshold = mtd->ecc_strength;
>  
>  	/* Check, if we should skip the bad block table scan */
>  	if (chip->options & NAND_SKIP_BBTSCAN)
> -- 
> 1.7.5.4
>
David Woodhouse June 9, 2012, 11:03 a.m. UTC | #4
On Sat, 2012-06-09 at 11:43 +0200, Sascha Hauer wrote:
> 
> > - The issue was introduced in 3.5-rc1 and needs to be merged before
> 3.5
> > - Sascha, I've credited you as the reporter, is that ok?
> 
> Sure, that's ok.
> 
> > - Sascha, care to test and verify it works for your system?
> 
> Yes, this puts my system back to work
> 
> Tested-by: Sascha Hauer <s.hauer@pengutronix.de> 

Thanks. I've pushed this to linux-mtd.git and will ask Linus to pull it
shortly.
Mike Dunn June 9, 2012, 3:47 p.m. UTC | #5
On 06/08/2012 08:29 AM, Shmulik Ladkani wrote:
> As of edbc454 [mtd: driver _read() returns max_bitflips; mtd_read()
> returns -EUCLEAN], 'mtd->bitflip_threshold' must be set for mtd devices
> having ECC, prior any 'mtd_read()' call.
> Otherwise, 'mtd_read()' will falsely return -EUCLEAN.
> 
> Normally, 'mtd->bitflip_threshold' is initialized when the MTD is added.
> 
> However, this is too late for NAND MTDs, as 'scan_bbt()' is invoked
> prior the existing initialization of 'mtd->bitflip_threshold'.
> 
> This is a problem since 'scan_bbt()' calls 'mtd_read()', in the case
> of a flash-based bad block table.
> It resulted in a falsely reported bitflips indication during BBT read,
> which lead to constant scrubbing of the flash BBT blocks.
> 
> Initialize 'mtd->bitflip_threshold' to its default value (if not already
> set by the driver), prior invocation of 'scan_bbt()'.
> 

Reviewed-by: Mike Dunn <mikedunn@newsguy.com>
(FWIW).

Thanks again Shmulik!

I do think that the patch to the bad block management code that I sent yesterday
should be merged as well.  That would also fix the problem, but it is really a
separate issue.  Currently ecc errors are ignored during the BBT creation for
some configurations but not for others (like that of the mxc_nand controller).
It looks like an oversight that they are not ignored, and it makes sense to
ignore them, since there's not much you can do about them at that time, and the
functions higher up the call stack don't try to.

Thanks,
Mike
diff mbox

Patch

diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
index 61805e7..0a8724e 100644
--- a/drivers/mtd/nand/nand_base.c
+++ b/drivers/mtd/nand/nand_base.c
@@ -3484,6 +3484,13 @@  int nand_scan_tail(struct mtd_info *mtd)
 	/* propagate ecc info to mtd_info */
 	mtd->ecclayout = chip->ecc.layout;
 	mtd->ecc_strength = chip->ecc.strength;
+	/*
+	 * Initialize bitflip_threshold to its default prior scan_bbt() call.
+	 * scan_bbt() might invoke mtd_read(), thus bitflip_threshold must be
+	 * properly set.
+	 */
+	if (!mtd->bitflip_threshold)
+		mtd->bitflip_threshold = mtd->ecc_strength;
 
 	/* Check, if we should skip the bad block table scan */
 	if (chip->options & NAND_SKIP_BBTSCAN)