Message ID | 1339093928-17545-1-git-send-email-mikedunn@newsguy.com |
---|---|
State | New, archived |
Headers | show |
Hi Mike, On Thu, 7 Jun 2012 11:32:08 -0700 Mike Dunn <mikedunn@newsguy.com> wrote: > Ignore ecc errors in the scan_read_raw_oob() function. Also removed code that > is now redundant. > > Signed-off-by: Mike Dunn <mikedunn@newsguy.com> > --- > drivers/mtd/nand/nand_bbt.c | 6 +++--- > 1 files changed, 3 insertions(+), 3 deletions(-) > > diff --git a/drivers/mtd/nand/nand_bbt.c b/drivers/mtd/nand/nand_bbt.c > index 30d1319..585da44 100644 > --- a/drivers/mtd/nand/nand_bbt.c > +++ b/drivers/mtd/nand/nand_bbt.c > @@ -319,7 +319,8 @@ static int scan_read_raw_oob(struct mtd_info *mtd, uint8_t *buf, loff_t offs, > > res = mtd_read_oob(mtd, offs, &ops); > > - if (res) > + /* Ignore ECC errors when checking for BBM */ > + if (res && !mtd_is_bitflip_or_eccerr(res)) > return res; IMO this is not necessary. Note the 'ops.mode' is initialized to MTD_OPS_RAW; meaning, no ECC is performed during read ('ecc.read_page_raw' will be invoked). Thus, EUCLEAN/EBADMSG should not be reported. Am I missing something here? > @@ -406,8 +407,7 @@ static int scan_block_full(struct mtd_info *mtd, struct nand_bbt_descr *bd, > int ret, j; > > ret = scan_read_raw_oob(mtd, buf, offs, readlen); > - /* Ignore ECC errors when checking for BBM */ > - if (ret && !mtd_is_bitflip_or_eccerr(ret)) > + if (ret) > return ret; Don't understand why 'mtd_is_bitflip_or_eccerr' is originally tested. As noted above, 'scan_read_raw_oob' won't return EUCLEAN/EBADMSG, as it uses MTD_OPS_RAW. And what's even stranger is the code in 'scan_block_fast()', quote: ops.datbuf = NULL; ops.mode = MTD_OPS_PLACE_OOB; for (j = 0; j < len; j++) { ret = mtd_read_oob(mtd, offs, &ops); /* Ignore ECC errors when checking for BBM */ if (ret && !mtd_is_bitflip_or_eccerr(ret)) return ret; Here, 'MTD_OPS_PLACE_OOB' is explicitly used - so *theoretically* 'mtd_read_oob' might return EUCLEAN/EBADMSG, see 'nand_do_read_oob', (although NO driver YET reports maxbitflips from its 'chip->ecc.read_oob' method). Why not use 'MTD_OPS_RAW' in 'scan_block_fast()' as well (as done in 'scan_block_full')? Regards, Shmulik
Hi, Friendly comment: it would be helpful to include me on these types of threads, since I am responsible for much of the code in discussion here. (Try 'git blame'). There are some things I apparently got wrong in the first place and some new issues since the bitflips_threshold introduction. Also, there's some incomplete information, since (as noted below) some things are not used in-tree, but they are in use in my out-of-tree driver for which I submitted the original code. I believe that the changes are generally useful, though, if understood and utilized properly. On Sun, Jun 10, 2012 at 4:50 AM, Shmulik Ladkani <shmulik.ladkani@gmail.com> wrote: > On Thu, 7 Jun 2012 11:32:08 -0700 Mike Dunn <mikedunn@newsguy.com> wrote: >> Ignore ecc errors in the scan_read_raw_oob() function. Also removed code that >> is now redundant. >> >> Signed-off-by: Mike Dunn <mikedunn@newsguy.com> >> --- >> drivers/mtd/nand/nand_bbt.c | 6 +++--- >> 1 files changed, 3 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/mtd/nand/nand_bbt.c b/drivers/mtd/nand/nand_bbt.c >> index 30d1319..585da44 100644 >> --- a/drivers/mtd/nand/nand_bbt.c >> +++ b/drivers/mtd/nand/nand_bbt.c >> @@ -319,7 +319,8 @@ static int scan_read_raw_oob(struct mtd_info *mtd, uint8_t *buf, loff_t offs, >> >> res = mtd_read_oob(mtd, offs, &ops); >> >> - if (res) >> + /* Ignore ECC errors when checking for BBM */ >> + if (res && !mtd_is_bitflip_or_eccerr(res)) >> return res; > > IMO this is not necessary. > Note the 'ops.mode' is initialized to MTD_OPS_RAW; meaning, no ECC is > performed during read ('ecc.read_page_raw' will be invoked). > Thus, EUCLEAN/EBADMSG should not be reported. > Am I missing something here? Shmulik is correct. A check for bitflips on a MTD_OPS_RAW operation is unnecessary. >> @@ -406,8 +407,7 @@ static int scan_block_full(struct mtd_info *mtd, struct nand_bbt_descr *bd, >> int ret, j; >> >> ret = scan_read_raw_oob(mtd, buf, offs, readlen); >> - /* Ignore ECC errors when checking for BBM */ >> - if (ret && !mtd_is_bitflip_or_eccerr(ret)) >> + if (ret) >> return ret; > > Don't understand why 'mtd_is_bitflip_or_eccerr' is originally tested. > As noted above, 'scan_read_raw_oob' won't return EUCLEAN/EBADMSG, as it > uses MTD_OPS_RAW. FWIW, I originally was *trying* to mirror functionality in scan_block_full() and scan_block_fast(), but my code was wrong apparently. See below. > And what's even stranger is the code in 'scan_block_fast()', quote: > > ops.datbuf = NULL; > ops.mode = MTD_OPS_PLACE_OOB; > > for (j = 0; j < len; j++) { > ret = mtd_read_oob(mtd, offs, &ops); > /* Ignore ECC errors when checking for BBM */ > if (ret && !mtd_is_bitflip_or_eccerr(ret)) > return ret; > > Here, 'MTD_OPS_PLACE_OOB' is explicitly used - so *theoretically* > 'mtd_read_oob' might return EUCLEAN/EBADMSG, see 'nand_do_read_oob', > (although NO driver YET reports maxbitflips from its 'chip->ecc.read_oob' > method). I think we discussed the "read_oob" case and max_bitflips previously, where I noted that my out-of-tree driver uses ECC that covers OOB and page data and sets ecc_stats.corrected and ecc_stats.failed according to the entire page+OOB when ecc.read_oob is called. So, this call to mtd_read_oob() may return EUCLEAN/EBADMSG when used my driver. This isn't a problem and should be left alone I think. > Why not use 'MTD_OPS_RAW' in 'scan_block_fast()' as well (as done in > 'scan_block_full')? Because when ECC is available on OOB, it is good to utilize it, so that bitflips don't cause good blocks to be marked bad. Less importantly, when bad block markers are written by Linux (with ECC), then these markers can be read back more reliably. (I note that there is a nand_chip.badblockbits option that could theoretically alleviate the first issue, but I see it is not actually implemented throughout nand_bbt.c - only in nand_base.c) IMO, 'scan_block_full' contains the erroneous code for now. When I changed scan_block_fast(), I half-heartedly changed scan_block_full() as well, but it does not appear on my code path, so I didn't notice the discrepancy. This brings up a side issue: I think many of the "_raw" function names in nand_bbt.c are misleading. They do not all use MTD_OPS_RAW (and shouldn't). Effectively, I would prefer that *all* the calls in nand_bbt.c use the non-RAW version of the MTD/NAND interfaces, and then ignore the errors if sensible. e.g., when reading a bad block marker. But nand_bbt.c does a lot more (reading BBTs from flash, checking for "Bbt0" and "1tbB" table marker, etc.) that *must* use ECC to provide any robustness. So, I can send a patch that straightens out naming and brings scan_block_fast() and scan_block_full() into alignment on using MTD_OPS_PLACE_OOB. Then I think that we should continue using this code in both: /* Ignore ECC errors when checking for BBM */ if (ret && !mtd_is_bitflip_or_eccerr(ret)) And we can (as agreed previously?) avoid using/checking max_bitflips in mtd_read_oob(), although there is the datbuf+oob case that calls nand_do_read_ops... Brian
Thanks Brian for your comments, On Sun, 10 Jun 2012 22:45:50 -0700 Brian Norris <computersforpeace@gmail.com> wrote: > Friendly comment: it would be helpful to include me on these types of > threads, since I am responsible for much of the code in discussion > here. (Try 'git blame'). Sure, will do. I usually do a 'git blame', for some reason was careless this time... > > Here, 'MTD_OPS_PLACE_OOB' is explicitly used - so *theoretically* > > 'mtd_read_oob' might return EUCLEAN/EBADMSG, see 'nand_do_read_oob', > > (although NO driver YET reports maxbitflips from its 'chip->ecc.read_oob' > > method). > > I think we discussed the "read_oob" case and max_bitflips previously, > where I noted that my out-of-tree driver uses ECC that covers OOB and > page data and sets ecc_stats.corrected and ecc_stats.failed according > to the entire page+OOB when ecc.read_oob is called. So, this call to > mtd_read_oob() may return EUCLEAN/EBADMSG when used my driver. This > isn't a problem and should be left alone I think. Bad phrasing on my side, should have said "although NO in-tree driver". > > Why not use 'MTD_OPS_RAW' in 'scan_block_fast()' as well (as done in > > 'scan_block_full')? > > Because when ECC is available on OOB, it is good to utilize it, so > that bitflips don't cause good blocks to be marked bad. Less > importantly, when bad block markers are written by Linux (with ECC), > then these markers can be read back more reliably. Ok (due to the second reason, as we're discussing the BBM read implementation). > So, I can send a patch that straightens out naming and brings > scan_block_fast() and scan_block_full() into alignment on using > MTD_OPS_PLACE_OOB. Then I think that we should continue using this > code in both: > /* Ignore ECC errors when checking for BBM */ > if (ret && !mtd_is_bitflip_or_eccerr(ret)) Sounds reasonable. > And we can (as agreed previously?) avoid using/checking max_bitflips > in mtd_read_oob(), although there is the datbuf+oob case that calls > nand_do_read_ops... This is a separate thing that needs to be addressed. See my findings in [1]. Regards, Shmulik [1] http://lists.infradead.org/pipermail/linux-mtd/2012-June/042194.html
On 06/10/2012 11:41 PM, Shmulik Ladkani wrote: > Thanks Brian for your comments, You're welcome. And thanks for the interest in reviewing the code. Few people actually delve into nand_bbt.c, it seems :) > On Sun, 10 Jun 2012 22:45:50 -0700 Brian Norris<computersforpeace@gmail.com> wrote: >>> Why not use 'MTD_OPS_RAW' in 'scan_block_fast()' as well (as done in >>> 'scan_block_full')? >> >> Because when ECC is available on OOB, it is good to utilize it, so >> that bitflips don't cause good blocks to be marked bad. Less >> importantly, when bad block markers are written by Linux (with ECC), >> then these markers can be read back more reliably. > > Ok (due to the second reason, as we're discussing the BBM read > implementation). I guess I worded my 'first' reason a little wrong. I meant: so that bitflips don't cause good blocks (0xff) to be read as bad blocks (bitflip => non-0xff). I think this would be far more significant, as a single bitflip in the BBM byte could cause errors. >> So, I can send a patch that straightens out naming and brings >> scan_block_fast() and scan_block_full() into alignment on using >> MTD_OPS_PLACE_OOB. Then I think that we should continue using this >> code in both: >> /* Ignore ECC errors when checking for BBM */ >> if (ret&& !mtd_is_bitflip_or_eccerr(ret)) > > Sounds reasonable. OK, I'll do that this week. >> And we can (as agreed previously?) avoid using/checking max_bitflips >> in mtd_read_oob(), although there is the datbuf+oob case that calls >> nand_do_read_ops... > > This is a separate thing that needs to be addressed. > See my findings in [1]. Right, I saw both threads kinda simultaneously. It looks like Mike may need to add some max_bitflips logic to mtd_read_oob after all. Brian
On Sun, 2012-06-10 at 22:45 -0700, Brian Norris wrote: > > This brings up a side issue: I think many of the "_raw" function names > in nand_bbt.c are misleading. They do not all use MTD_OPS_RAW (and > shouldn't). Effectively, I would prefer that *all* the calls in > nand_bbt.c use the non-RAW version of the MTD/NAND interfaces, and > then ignore the errors if sensible. e.g., when reading a bad block > marker. But nand_bbt.c does a lot more (reading BBTs from flash, > checking for "Bbt0" and "1tbB" table marker, etc.) that *must* use ECC > to provide any robustness. > > So, I can send a patch that straightens out naming and brings > scan_block_fast() and scan_block_full() into alignment on using > MTD_OPS_PLACE_OOB. Sounds like a good idea to me.
On 06/10/2012 10:45 PM, Brian Norris wrote: > > On Sun, Jun 10, 2012 at 4:50 AM, Shmulik Ladkani > <shmulik.ladkani@gmail.com> wrote: >> On Thu, 7 Jun 2012 11:32:08 -0700 Mike Dunn <mikedunn@newsguy.com> wrote: >>> Ignore ecc errors in the scan_read_raw_oob() function. Also removed code that >>> is now redundant. >>> >>> Signed-off-by: Mike Dunn <mikedunn@newsguy.com> >>> --- >>> drivers/mtd/nand/nand_bbt.c | 6 +++--- >>> 1 files changed, 3 insertions(+), 3 deletions(-) >>> >>> diff --git a/drivers/mtd/nand/nand_bbt.c b/drivers/mtd/nand/nand_bbt.c >>> index 30d1319..585da44 100644 >>> --- a/drivers/mtd/nand/nand_bbt.c >>> +++ b/drivers/mtd/nand/nand_bbt.c >>> @@ -319,7 +319,8 @@ static int scan_read_raw_oob(struct mtd_info *mtd, uint8_t *buf, loff_t offs, >>> >>> res = mtd_read_oob(mtd, offs, &ops); >>> >>> - if (res) >>> + /* Ignore ECC errors when checking for BBM */ >>> + if (res && !mtd_is_bitflip_or_eccerr(res)) >>> return res; >> >> IMO this is not necessary. >> Note the 'ops.mode' is initialized to MTD_OPS_RAW; meaning, no ECC is >> performed during read ('ecc.read_page_raw' will be invoked). >> Thus, EUCLEAN/EBADMSG should not be reported. >> Am I missing something here? > > Shmulik is correct. A check for bitflips on a MTD_OPS_RAW operation is > unnecessary. Yes, I missed the ops mode raw. Obviously my analysis was wrong. Good that nand_bbt is getting cleaned up some. Thanks again Brian and Shmulik. Mike
diff --git a/drivers/mtd/nand/nand_bbt.c b/drivers/mtd/nand/nand_bbt.c index 30d1319..585da44 100644 --- a/drivers/mtd/nand/nand_bbt.c +++ b/drivers/mtd/nand/nand_bbt.c @@ -319,7 +319,8 @@ static int scan_read_raw_oob(struct mtd_info *mtd, uint8_t *buf, loff_t offs, res = mtd_read_oob(mtd, offs, &ops); - if (res) + /* Ignore ECC errors when checking for BBM */ + if (res && !mtd_is_bitflip_or_eccerr(res)) return res; buf += mtd->oobsize + mtd->writesize; @@ -406,8 +407,7 @@ static int scan_block_full(struct mtd_info *mtd, struct nand_bbt_descr *bd, int ret, j; ret = scan_read_raw_oob(mtd, buf, offs, readlen); - /* Ignore ECC errors when checking for BBM */ - if (ret && !mtd_is_bitflip_or_eccerr(ret)) + if (ret) return ret; for (j = 0; j < len; j++, buf += scanlen) {
Ignore ecc errors in the scan_read_raw_oob() function. Also removed code that is now redundant. Signed-off-by: Mike Dunn <mikedunn@newsguy.com> --- drivers/mtd/nand/nand_bbt.c | 6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-)