Message ID | 1340408145-24531-9-git-send-email-computersforpeace@gmail.com |
---|---|
State | New, archived |
Headers | show |
Hi, On Fri, 22 Jun 2012 16:35:45 -0700 Brian Norris <computersforpeace@gmail.com> wrote: > scan_read_raw_oob() is used in only in places where the MTD_OPS_PLACE_OOB > mode is preferable MTD_OPS_RAW mode, so use MTD_OPS_PLACE_OOB instead. Please consider rephrasing, text is bit confusing. > + * Scan read data from data+OOB. May traverse multiple pages, interleaving > + * page,OOB,page,OOB,... in buf. In the case 'len' is less than 'mtd->writesize' (possible from a 'scan_block_full' call), the read oob will be placed immediately after the partially read page, quoting scan_read_raw_oob: ops.len = min(len, (size_t)mtd->writesize); ops.oobbuf = buf + ops.len; Any idea if this is intentional or a bug? Regards, Shmulik
On Tue, Jun 26, 2012 at 7:09 AM, Shmulik Ladkani <shmulik.ladkani@gmail.com> wrote: > On Fri, 22 Jun 2012 16:35:45 -0700 Brian Norris <computersforpeace@gmail.com> wrote: >> scan_read_raw_oob() is used in only in places where the MTD_OPS_PLACE_OOB >> mode is preferable MTD_OPS_RAW mode, so use MTD_OPS_PLACE_OOB instead. > > Please consider rephrasing, text is bit confusing. I realized I was missing the word "to". So the edit would read: scan_read_raw_oob() is used in only in places where the MTD_OPS_PLACE_OOB mode is preferable to MTD_OPS_RAW mode, so use MTD_OPS_PLACE_OOB instead. I can rephrase more, regarding what each mode does, but I'm wondering if this clears up your understanding. >> + * Scan read data from data+OOB. May traverse multiple pages, interleaving >> + * page,OOB,page,OOB,... in buf. > > In the case 'len' is less than 'mtd->writesize' (possible from a > 'scan_block_full' call), the read oob will be placed immediately after > the partially read page, quoting scan_read_raw_oob: > > ops.len = min(len, (size_t)mtd->writesize); > ops.oobbuf = buf + ops.len; > > Any idea if this is intentional or a bug? Not really. I honestly don't fully understand many of the options in NAND_BBT, but I'm trying to work through some of it. I think there's a lot of half-baked stuff in here with not-very-good code structure. Some of the code is written as if it were generic, but it's only meant for some very specific circumstances. With that said, here's a thought: it looks like your hypothetical "len < mtd->writesize" would only occur with NAND_BBT_SCANEMPTY disabled and NAND_BBT_SCANALLPAGES enabled. But I only see them used together: NAND_BBT_SCANEMPTY | NAND_BBT_SCANALLPAGES (omap2.c and with agand_flashbased in nand_bbt.c). So maybe it's a kind of "bug" for a case that was never really thought out. And if they must be used together, maybe we should document that. Brian
Hi Brian, Brian Norris a écrit : > scan_read_raw_oob() is used in only in places where the MTD_OPS_PLACE_OOB > mode is preferable MTD_OPS_RAW mode, so use MTD_OPS_PLACE_OOB instead. > MTD_OPS_PLACE_OOB provides the same functionality with the potential[1] > added bonus of error correction. > > This brings scan_block_full() in line with scan_block_fast() so that they > both read bad block markers with MTD_OPS_PLACE_OOB. This can help in > preventing 0xff markers (in good blocks) from being interpreted as bad > block indicators in the presence of a single bitflip. As far I understand the code, this work when "chip->ecc.read_oob" (used in nand_do_read_oob) correct bit flip. But I see no "chip->ecc.read_oob" implementation that can return bit flip. Is that expected ? This can also work when nand_do_read_ops is used (ops->datbuf != NULL). But it is hard to see case where it can correct bit flip in bad block marker. Do you have any exemple ? For example the default chip->ecc.read_page (nand_read_page_hwecc) doesn't read oob with ecc [1]. Matthieu PS : Did you have any comment on http://thread.gmane.org/gmane.linux.drivers.mtd/42243 ? [1] for (i = 0; eccsteps; eccsteps--, i += eccbytes, p += eccsize) { chip->ecc.hwctl(mtd, NAND_ECC_READ); chip->read_buf(mtd, p, eccsize); chip->ecc.calculate(mtd, p, &ecc_calc[i]); } chip->read_buf(mtd, chip->oob_poi, mtd->oobsize); > > Note that ECC error codes (EUCLEAN or EBADMSG) are already silently > ignored in all users of scan_read_raw_oob(). > > [1] Few drivers perform proper error correction on OOB data. In those > cases, the use of MTD_OPS_RAW vs. MTD_OPS_PLACE_OOB is not > significant. > > Signed-off-by: Brian Norris <computersforpeace@gmail.com> > Cc: Shmulik Ladkani <shmulik.ladkani@gmail.com> > Cc: Mike Dunn <mikedunn@newsguy.com> > Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> > --- > drivers/mtd/nand/nand_bbt.c | 27 ++++++++++++++++++++------- > 1 file changed, 20 insertions(+), 7 deletions(-) > > diff --git a/drivers/mtd/nand/nand_bbt.c b/drivers/mtd/nand/nand_bbt.c > index 45cdb97..7b9a0a7 100644 > --- a/drivers/mtd/nand/nand_bbt.c > +++ b/drivers/mtd/nand/nand_bbt.c > @@ -289,14 +289,24 @@ static int scan_read_data(struct mtd_info *mtd, uint8_t *buf, loff_t offs, > return mtd_read(mtd, offs, len, &retlen, buf); > } > > -/* Scan read data from flash */ > +/** > + * scan_read_oob - [GENERIC] Scan data+OOB region to buffer > + * @mtd: MTD device structure > + * @buf: temporary buffer > + * @offs: offset at which to scan > + * @len: length of data region to read > + * > + * Scan read data from data+OOB. May traverse multiple pages, interleaving > + * page,OOB,page,OOB,... in buf. Completes transfer and returns the "strongest" > + * ECC condition (error or bitflip). May quit on the first (non-ECC) error. > + */ > static int scan_read_oob(struct mtd_info *mtd, uint8_t *buf, loff_t offs, > size_t len) > { > struct mtd_oob_ops ops; > - int res; > + int res, ret = 0; > > - ops.mode = MTD_OPS_RAW; > + ops.mode = MTD_OPS_PLACE_OOB; > ops.ooboffs = 0; > ops.ooblen = mtd->oobsize; > > @@ -306,15 +316,18 @@ static int scan_read_oob(struct mtd_info *mtd, uint8_t *buf, loff_t offs, > ops.oobbuf = buf + ops.len; > > res = mtd_read_oob(mtd, offs, &ops); > - > - if (res) > - return res; > + if (res) { > + if (!mtd_is_bitflip_or_eccerr(res)) > + return res; > + else if (mtd_is_eccerr(res) || !ret) > + ret = res; > + } > > buf += mtd->oobsize + mtd->writesize; > len -= mtd->writesize; > offs += mtd->writesize; > } > - return 0; > + return ret; > } > > static int scan_read(struct mtd_info *mtd, uint8_t *buf, loff_t offs,
On Tue, Jul 10, 2012 at 12:45 AM, Matthieu CASTET <matthieu.castet@parrot.com> wrote: > Brian Norris a écrit : >> scan_read_raw_oob() is used in only in places where the MTD_OPS_PLACE_OOB >> mode is preferable MTD_OPS_RAW mode, so use MTD_OPS_PLACE_OOB instead. >> MTD_OPS_PLACE_OOB provides the same functionality with the potential[1] >> added bonus of error correction. >> >> This brings scan_block_full() in line with scan_block_fast() so that they >> both read bad block markers with MTD_OPS_PLACE_OOB. This can help in >> preventing 0xff markers (in good blocks) from being interpreted as bad >> block indicators in the presence of a single bitflip. > > As far I understand the code, this work when "chip->ecc.read_oob" (used in > nand_do_read_oob) correct bit flip. > > But I see no "chip->ecc.read_oob" implementation that can return bit flip. Is > that expected ? I have an out-of-tree driver that corrects OOB bitflips. Is there really no other HW out there that corrects OOB errors? Anyway, I understand that my driver is an outlier here, but I don't see a real disadvantage in these changes. But on the positive side, I expect that in the future, more drivers/HW will either want to stop using OOB for anything at all or will want ECC protection for OOB. > This can also work when nand_do_read_ops is used (ops->datbuf != NULL). But it > is hard to see case where it can correct bit flip in bad block marker. Do you > have any exemple ? First of all, this has no effect if the driver does not protect OOB with ECC (i.e., for OOB-only reads, MTD_OPS_PLACE_OOB == MTD_OPS_RAW). So the following argument only applies when OOB is ECC-protected. Consider a *good* block that is written with filesystem data. On bootup, Linux may scan this block's BBM to check if it is bad. If a bitflip occurs in the bad block marker, then it may be erroneously considered bad. Similarly, if a block was marked bad from wear (not factory-marked), then its BBM may be written along with ECC protection. Then, when we scan for bad blocks, it will be protected from bitflips that could possibly cause 0x00 to appear non-zero. (This is not a big issue, since 'non-zero' is still bad, as long as 0x00 didn't flip to 0xff - quite unlikely...) > PS : Did you have any comment on > http://thread.gmane.org/gmane.linux.drivers.mtd/42243 ? I read it, and it seems promising. I agree with much of the premise (that nand_bbt.c is ugly and repetitive at times) but haven't had enough time to review properly. Sorry. I'm a bit backlogged and will be for a few weeks, I think. But I'll see what I can do. Thanks, Brian
On 07/13/2012 10:39 AM, Brian Norris wrote: > > I have an out-of-tree driver that corrects OOB bitflips. Is there > really no other HW out there that corrects OOB errors? The diskonchip g3 and g4 devices use a hw-generated hamming byte to protect the first eight oob bytes (those not used for page data ecc) with ecc of one bit strength, but neither driver implements the correcting algorithm yet. > > Anyway, I understand that my driver is an outlier here, but I don't > see a real disadvantage in these changes. But on the positive side, I > expect that in the future, more drivers/HW will either want to stop > using OOB for anything at all or will want ECC protection for OOB. > >> This can also work when nand_do_read_ops is used (ops->datbuf != NULL). But it >> is hard to see case where it can correct bit flip in bad block marker. Do you >> have any exemple ? > > First of all, this has no effect if the driver does not protect OOB > with ECC (i.e., for OOB-only reads, MTD_OPS_PLACE_OOB == MTD_OPS_RAW). > So the following argument only applies when OOB is ECC-protected. > > Consider a *good* block that is written with filesystem data. On > bootup, Linux may scan this block's BBM to check if it is bad. If a > bitflip occurs in the bad block marker, then it may be erroneously > considered bad. Yeah, this is a strong argument for ecc on oob-only reads. Thanks, Mike
On Sun, Jul 15, 2012 at 10:01:24PM +0200, Mike Dunn wrote: > On 07/13/2012 10:39 AM, Brian Norris wrote: > > > > I have an out-of-tree driver that corrects OOB bitflips. Is there > > really no other HW out there that corrects OOB errors? > > > The diskonchip g3 and g4 devices use a hw-generated hamming byte to protect the > first eight oob bytes (those not used for page data ecc) with ecc of one bit > strength, but neither driver implements the correcting algorithm yet. > > > > > > Anyway, I understand that my driver is an outlier here, but I don't > > see a real disadvantage in these changes. But on the positive side, I > > expect that in the future, more drivers/HW will either want to stop > > using OOB for anything at all or will want ECC protection for OOB. > > > >> This can also work when nand_do_read_ops is used (ops->datbuf != NULL). But it > >> is hard to see case where it can correct bit flip in bad block marker. Do you > >> have any exemple ? > > > > First of all, this has no effect if the driver does not protect OOB > > with ECC (i.e., for OOB-only reads, MTD_OPS_PLACE_OOB == MTD_OPS_RAW). > > So the following argument only applies when OOB is ECC-protected. > > > > Consider a *good* block that is written with filesystem data. On > > bootup, Linux may scan this block's BBM to check if it is bad. If a > > bitflip occurs in the bad block marker, then it may be erroneously > > considered bad. > > > Yeah, this is a strong argument for ecc on oob-only reads. Hello Mike, I think it is a strong argument for a robust reading of BBM, rather than an argument for ECC on OOB-only reads. By "robust reading", I mean simply looking at the Hamming weight of the marker (the number of 1s in the BBM) rather than its value, as done in nand_block_bad() by setting chip->badblockbits. This robust reading is trivially implemented, does not depend on OOB ecc availability, and benefits all drivers. Even if your driver implements OOB ECC, it may not work on an erased block with a bitflip in its BBM (because erased data may not have a valid ECC). Moreover, reading just the OOB region with ECC may require a full page read on some drivers (when OOB and data are parts of the same codeword). To me, the only strong reason for wanting OOB ECC is the implementation of YAFFS2 or similar filesystems which require OOB metadata protection. But maybe I'm missing some other use cases ? What do you think ? BR,
Hi Ivan, thanks again for the comments. On 07/16/2012 07:01 AM, Ivan Djelic wrote: > On Sun, Jul 15, 2012 at 10:01:24PM +0200, Mike Dunn wrote: > >> Yeah, this is a strong argument for ecc on oob-only reads. > > Hello Mike, > > I think it is a strong argument for a robust reading of BBM, rather than an argument > for ECC on OOB-only reads. By "robust reading", I mean simply looking at the Hamming weight of the > marker (the number of 1s in the BBM) rather than its value, as done in nand_block_bad() by setting chip->badblockbits. > > This robust reading is trivially implemented, does not depend on OOB ecc availability, > and benefits all drivers. Even if your driver implements OOB ECC, it may not work > on an erased block with a bitflip in its BBM (because erased data may not have a valid > ECC). This is a certainty, no? Erased, by definition, includes any ecc bytes. > Moreover, reading just the OOB region with ECC may require a full page read on some drivers > (when OOB and data are parts of the same codeword). > > To me, the only strong reason for wanting OOB ECC is the implementation of YAFFS2 or similar filesystems > which require OOB metadata protection. But maybe I'm missing some other use cases ? > > What do you think ? If we assume the oob bytes on the first page of a good block can contain anything, won't simply counting the bits make the risk of falsly identifying a bb marker unacceptably high? Thanks, Mike
On Mon, Jul 16, 2012 at 08:36:56PM +0200, Mike Dunn wrote: > Hi Ivan, thanks again for the comments. > > On 07/16/2012 07:01 AM, Ivan Djelic wrote: > > On Sun, Jul 15, 2012 at 10:01:24PM +0200, Mike Dunn wrote: > > > >> Yeah, this is a strong argument for ecc on oob-only reads. > > > > Hello Mike, > > > > I think it is a strong argument for a robust reading of BBM, rather than an argument > > for ECC on OOB-only reads. By "robust reading", I mean simply looking at the Hamming weight of the > > marker (the number of 1s in the BBM) rather than its value, as done in nand_block_bad() by setting chip->badblockbits. > > > > This robust reading is trivially implemented, does not depend on OOB ecc availability, > > and benefits all drivers. Even if your driver implements OOB ECC, it may not work > > on an erased block with a bitflip in its BBM (because erased data may not have a valid > > ECC). > > > This is a certainty, no? Erased, by definition, includes any ecc bytes. If you add the right polynomial[1] to your Hamming or BCH code, then you can make sure the ECC of an erased page (possibly with OOB bytes) is actually a sequence of 0xff bytes. This trick enables bitflip correction on data that has never been programmed. Take a look at nand_ecc.c:63 (or nand_bch.c:62) for an example. This trick is not possible on all hardware ECC engines. > > > Moreover, reading just the OOB region with ECC may require a full page read on some drivers > > (when OOB and data are parts of the same codeword). > > > > To me, the only strong reason for wanting OOB ECC is the implementation of YAFFS2 or similar filesystems > > which require OOB metadata protection. But maybe I'm missing some other use cases ? > > > > What do you think ? > > > If we assume the oob bytes on the first page of a good block can contain > anything, won't simply counting the bits make the risk of falsly identifying a > bb marker unacceptably high? Counting bits on a BBM marker byte basically gives a 4-bitflip protection on a _single_ byte, which is *extremely* unlikely. If you assume an erased good block can contain garbage in its OOB region, then it can indeed be wrongly identified as a bad block; but this is also true if you use OOB ECC (e.g. exceeding correction capacity). In practice, since the bad block marker byte is normally never programmed to anything other than 0xff, there is no reason why we should find garbage in it (even if a power failure occurs during an erase/program operation). BR,
On 07/16/2012 02:34 PM, Ivan Djelic wrote: > On Mon, Jul 16, 2012 at 08:36:56PM +0200, Mike Dunn wrote: >> >> This is a certainty, no? Erased, by definition, includes any ecc bytes. > > If you add the right polynomial[1] to your Hamming or BCH code, then you can make > sure the ECC of an erased page (possibly with OOB bytes) is actually a sequence of 0xff bytes. > This trick enables bitflip correction on data that has never been programmed. > Take a look at nand_ecc.c:63 (or nand_bch.c:62) for an example. > This trick is not possible on all hardware ECC engines. Ah, yes, I forgot. You explained this when you were helping me with ecc on the diskonchip. > >> >>> Moreover, reading just the OOB region with ECC may require a full page read on some drivers >>> (when OOB and data are parts of the same codeword). >>> >>> To me, the only strong reason for wanting OOB ECC is the implementation of YAFFS2 or similar filesystems >>> which require OOB metadata protection. But maybe I'm missing some other use cases ? >>> >>> What do you think ? >> >> >> If we assume the oob bytes on the first page of a good block can contain >> anything, won't simply counting the bits make the risk of falsly identifying a >> bb marker unacceptably high? > > Counting bits on a BBM marker byte basically gives a 4-bitflip protection on a _single_ byte, which is > *extremely* unlikely. If you assume an erased good block can contain garbage in its OOB region, then > it can indeed be wrongly identified as a bad block; but this is also true if you use OOB ECC > (e.g. exceeding correction capacity). > In practice, since the bad block marker byte is normally never programmed to anything other than 0xff, > there is no reason why we should find garbage in it (even if a power failure occurs during an > erase/program operation). This is what I was missing. I was assuming that the bb marker byte could be programmed to an arbitrary value by the user. I see now that e.g., davinci defines the bb marker position in its ecc.layout definition, so it will never be written as long as MTD_OPS_AUTO_OOB mode is used. I need to fix that in the docg4. Are you trying to gently nudge me on this? :) I am loathe to touch that bbm code. OTOH, the docg4 would benefit, since it uses an oob bb marker. Thanks again Ivan, Mike
On Fri, 2012-06-22 at 16:35 -0700, Brian Norris wrote: > scan_read_raw_oob() is used in only in places where the MTD_OPS_PLACE_OOB > mode is preferable MTD_OPS_RAW mode, so use MTD_OPS_PLACE_OOB instead. > MTD_OPS_PLACE_OOB provides the same functionality with the potential[1] > added bonus of error correction. Amended the commit message to satisfy Smulik's request, and pushed to l2-mtd.git, thanks Brian! You are doing great job de-mystifying the NAND/BBT code! Sometimes I feel like you are playing too safe and you could be a bit more aggressive removing half-baked or obviously broken stuff.
Hi Artem, On Wed, 15 Aug 2012 15:05:17 +0300 Artem Bityutskiy <dedekind1@gmail.com> wrote: > On Fri, 2012-06-22 at 16:35 -0700, Brian Norris wrote: > > scan_read_raw_oob() is used in only in places where the MTD_OPS_PLACE_OOB > > mode is preferable MTD_OPS_RAW mode, so use MTD_OPS_PLACE_OOB instead. > > MTD_OPS_PLACE_OOB provides the same functionality with the potential[1] > > added bonus of error correction. > > Amended the commit message to satisfy Smulik's request, and pushed to > l2-mtd.git, thanks Brian! I had some second thoughts WRT the move from MTD_OPS_RAW to MTD_OPS_PLACE_OOB. When reading BBT content, MTD_OPS_PLACE_OOB is preferred. But when scanning the OOBs for BBMs (when creating the table), I guess MTD_OPS_RAW should be used - since manufacturer bad blocks are not protected by ECC (as Matthieu noted in [1], and I elaborated in [2]). Can you please review and comment regarding the ideas discussed there? Thanks, Shmulik [1] http://thread.gmane.org/gmane.linux.drivers.mtd/42243/focus=42365 [2] http://thread.gmane.org/gmane.linux.drivers.mtd/42243/focus=42669
Hi Shmulik, On Wed, 2012-08-15 at 17:31 +0300, Shmulik Ladkani wrote: > > Amended the commit message to satisfy Smulik's request, and pushed to > > l2-mtd.git, thanks Brian! > > I had some second thoughts WRT the move from MTD_OPS_RAW to > MTD_OPS_PLACE_OOB. > > When reading BBT content, MTD_OPS_PLACE_OOB is preferred. > > But when scanning the OOBs for BBMs (when creating the table), I guess > MTD_OPS_RAW should be used - since manufacturer bad blocks are not > protected by ECC (as Matthieu noted in [1], and I elaborated in [2]). > > Can you please review and comment regarding the ideas discussed there? > > Thanks, > Shmulik > > [1] > http://thread.gmane.org/gmane.linux.drivers.mtd/42243/focus=42365 > [2] > http://thread.gmane.org/gmane.linux.drivers.mtd/42243/focus=42669 I cannot spend much time reading the threads carefully, so I went through the discussion only very briefly. Apologies. But I took the patches because: 1. They do not break existing (in-tree) systems which do not have OOB protected with ECC. This means the patches are harmless and while useful because they clean up the code. 2. For systems like Brian has where OOB is protected with ECC, the BBM should just not be covered by the OOB ECC. And the OOB scanning code should not read non-BBM bytes, and we are done. This seems to be the only logical setup for me. If it is not the case, I just consider it is Brian's problem, he'll deal with it and send us another set of patches. Or to put it differently: he sent a good clean-up, which does not seem t break anything, why would we not take it? If you object merging these patches, I'd appreciate if you could elaborate why they would be _harmful_ to have. Thanks!
Hi Artem, On Thu, 16 Aug 2012 13:40:02 +0300 Artem Bityutskiy <dedekind1@gmail.com> wrote: > If you object merging these patches, I'd appreciate if you could > elaborate why they would be _harmful_ to have. Thanks. I understand your point. Took another look, and no, these are _not_ harmful to have. Regards, Shmulik
Hi Brian, Firstly, apologies for dragging up an issue that dates back well over a year! While preparing to upstream an out-of-tree NAND driver, we have fallen foul of the change from MTD_OPS_RAW to MTD_OPS_PLACE_OOB in nand_bbt.c:scan_read_oob(). One issue relates to what is at best a hack within our own driver, and that is for us to deal with :-) However, I also have a concern that the patch could result in genuine bad blocks escaping detection. As I understand it, the patch was attempting to address the following situation: - NAND-resident BBTs are not used. - The BBT is re-created on each boot by scanning for MBBM. - A page write yields one or more bit-flips in the location used for the MBBM, resulting in non-0xff data being present. - The non-0xff data is then misinterpreted as a MBBM on a subsequent boot, giving a false bad-block. In cases where the ECC scheme covers the MBBM location, then I can see that enabling the ECC would cause the non-0xff data to be corrected, and therefore avoid the block being falsely identified as bad. However, I can also construct a situation where a genuine MBBM gets "corrected" to 0xff. Consider, for example, an 8-bit ECC scheme covering the MBBM location, where the ECC for a sector of all 0xff data is also all 0xff. In this case, a MBBM of 0x00, with the remaining data all 0xff, would get "corrected" to 0xff. Although perhaps a slightly contrived example, the S/W BCH ECC included in the kernel scheme can be driven in this way, and I have seen blocks marked as bad with this pattern in the past. It is difficult to know if your particular system could suffer in this way. It all depends on the nature of your ECC scheme. I guess my concern is that the patch deviates from what is recommended by the NAND manufacturers, and that it makes certain assumptions on how the ECC scheme operates. My own view is that the only safe way to record and track bad blocks is to use NAND-resident BBTs; after all, if a block is bad then there is no guarantee that an attempt to write a MBBM would succeed. NAND-resident BBTs would also avoid the problem the patch was attempting fix in the first place. Cheers, Angus On 07/13/2012 06:39 PM, Brian Norris wrote: > On Tue, Jul 10, 2012 at 12:45 AM, Matthieu CASTET > <matthieu.castet@parrot.com> wrote: >> Brian Norris a écrit : >>> scan_read_raw_oob() is used in only in places where the MTD_OPS_PLACE_OOB >>> mode is preferable MTD_OPS_RAW mode, so use MTD_OPS_PLACE_OOB instead. >>> MTD_OPS_PLACE_OOB provides the same functionality with the potential[1] >>> added bonus of error correction. >>> >>> This brings scan_block_full() in line with scan_block_fast() so that they >>> both read bad block markers with MTD_OPS_PLACE_OOB. This can help in >>> preventing 0xff markers (in good blocks) from being interpreted as bad >>> block indicators in the presence of a single bitflip. >> >> As far I understand the code, this work when "chip->ecc.read_oob" (used in >> nand_do_read_oob) correct bit flip. >> >> But I see no "chip->ecc.read_oob" implementation that can return bit flip. Is >> that expected ? > > I have an out-of-tree driver that corrects OOB bitflips. Is there > really no other HW out there that corrects OOB errors? > > Anyway, I understand that my driver is an outlier here, but I don't > see a real disadvantage in these changes. But on the positive side, I > expect that in the future, more drivers/HW will either want to stop > using OOB for anything at all or will want ECC protection for OOB. > >> This can also work when nand_do_read_ops is used (ops->datbuf != NULL). But it >> is hard to see case where it can correct bit flip in bad block marker. Do you >> have any exemple ? > > First of all, this has no effect if the driver does not protect OOB > with ECC (i.e., for OOB-only reads, MTD_OPS_PLACE_OOB == MTD_OPS_RAW). > So the following argument only applies when OOB is ECC-protected. > > Consider a *good* block that is written with filesystem data. On > bootup, Linux may scan this block's BBM to check if it is bad. If a > bitflip occurs in the bad block marker, then it may be erroneously > considered bad. > > Similarly, if a block was marked bad from wear (not factory-marked), > then its BBM may be written along with ECC protection. Then, when we > scan for bad blocks, it will be protected from bitflips that could > possibly cause 0x00 to appear non-zero. (This is not a big issue, > since 'non-zero' is still bad, as long as 0x00 didn't flip to 0xff - > quite unlikely...) > >> PS : Did you have any comment on >> http://thread.gmane.org/gmane.linux.drivers.mtd/42243 ? > > I read it, and it seems promising. I agree with much of the premise > (that nand_bbt.c is ugly and repetitive at times) but haven't had > enough time to review properly. Sorry. I'm a bit backlogged and will > be for a few weeks, I think. But I'll see what I can do. > > Thanks, > Brian > > ______________________________________________________ > Linux MTD discussion mailing list > http://lists.infradead.org/mailman/listinfo/linux-mtd/ >
Hi Angus, On Thu, Nov 07, 2013 at 02:56:50PM +0000, Angus Clark wrote: > Firstly, apologies for dragging up an issue that dates back well over a year! No problem. Some of these same issues have been nagging at me in the back of my head, actually. I don't think this problem is solved completely correctly, so it's good you bring it up! > While preparing to upstream an out-of-tree NAND driver, we have fallen foul of > the change from MTD_OPS_RAW to MTD_OPS_PLACE_OOB in nand_bbt.c:scan_read_oob(). > > One issue relates to what is at best a hack within our own driver, and that is > for us to deal with :-) However, I also have a concern that the patch could > result in genuine bad blocks escaping detection. > > As I understand it, the patch was attempting to address the following situation: > - NAND-resident BBTs are not used. > - The BBT is re-created on each boot by scanning for MBBM. > - A page write yields one or more bit-flips in the location used for the > MBBM, resulting in non-0xff data being present. > - The non-0xff data is then misinterpreted as a MBBM on a subsequent boot, > giving a false bad-block. > > In cases where the ECC scheme covers the MBBM location, then I can see that > enabling the ECC would cause the non-0xff data to be corrected, and therefore > avoid the block being falsely identified as bad. This is more or less the situation that was being addressed. But my particular situation was actually targeting situations in which a BBT is used "most of the time", but sometimes (for various reasons) the BBT may be erased/corrupted and need rebuilt on an unclean flash. But it's a similar scenario either way, IMO. > However, I can also construct a situation where a genuine MBBM gets "corrected" > to 0xff. Consider, for example, an 8-bit ECC scheme covering the MBBM location, > where the ECC for a sector of all 0xff data is also all 0xff. In this case, a > MBBM of 0x00, with the remaining data all 0xff, would get "corrected" to 0xff. > Although perhaps a slightly contrived example, the S/W BCH ECC included in the > kernel scheme can be driven in this way, and I have seen blocks marked as bad > with this pattern in the past. Yes, I suspected that this may be possible. But I never observed it myself (I don't use SW ECC; I only use my HW ECC). > It is difficult to know if your particular system could suffer in this way. It > all depends on the nature of your ECC scheme. I guess my concern is that the > patch deviates from what is recommended by the NAND manufacturers, and that it > makes certain assumptions on how the ECC scheme operates. My system cannot suffer in this way, because (unfortunately) my HW ECC does not consider all-0xff to be valid ECC. So it cannot correct bitflips in erased pages, nor can it "correct" a bad block marker from 0x00 to 0xff. I'm not sure it's a deviation from manufacturer recommendations, since manufacturers make no statement about what to do if the BBT fails. But they do seem to imply that you should build the BBT once and never revisit the bad block markers. I agree that my patch does make a few assumptions about the ECC scheme. > My own view is that the only safe way to record and track bad blocks is to use > NAND-resident BBTs; after all, if a block is bad then there is no guarantee that > an attempt to write a MBBM would succeed. NAND-resident BBTs would also avoid > the problem the patch was attempting fix in the first place. I agree that BBTs should be used (I use them). However, they do not solve my original problem completely: how can a BBT be rebuilt robustly? I think that my original approach (use ECC while scanning BBMs) is not 100% correct, but I don't think dropping it is 100% correct either. Rather, I would prefer taking some form of Matthieu's patch (linked in the reply that you're bringing up from > 1 year ago!) so that we use the 'badblockbits' parameter appropriately. Then I would be fine with scanning BBMs in RAW mode, and I think we can satisfy both my use case and yours safely. > On 07/13/2012 06:39 PM, Brian Norris wrote: > > On Tue, Jul 10, 2012 at 12:45 AM, Matthieu CASTET <matthieu.castet@parrot.com> wrote: > >> http://thread.gmane.org/gmane.linux.drivers.mtd/42243 ? BTW, this patch came into discussion again recently for other reasons: http://lists.infradead.org/pipermail/linux-mtd/2013-November/049692.html Anything more you can contribute to this general conversation would be more than welcome! I think Ezequiel and I are still interested in unifying some of nand_bbt and nand_base, regarding BBM scanning, and I think we can resolve several pending issues by doing so. Thanks, Brian
diff --git a/drivers/mtd/nand/nand_bbt.c b/drivers/mtd/nand/nand_bbt.c index 45cdb97..7b9a0a7 100644 --- a/drivers/mtd/nand/nand_bbt.c +++ b/drivers/mtd/nand/nand_bbt.c @@ -289,14 +289,24 @@ static int scan_read_data(struct mtd_info *mtd, uint8_t *buf, loff_t offs, return mtd_read(mtd, offs, len, &retlen, buf); } -/* Scan read data from flash */ +/** + * scan_read_oob - [GENERIC] Scan data+OOB region to buffer + * @mtd: MTD device structure + * @buf: temporary buffer + * @offs: offset at which to scan + * @len: length of data region to read + * + * Scan read data from data+OOB. May traverse multiple pages, interleaving + * page,OOB,page,OOB,... in buf. Completes transfer and returns the "strongest" + * ECC condition (error or bitflip). May quit on the first (non-ECC) error. + */ static int scan_read_oob(struct mtd_info *mtd, uint8_t *buf, loff_t offs, size_t len) { struct mtd_oob_ops ops; - int res; + int res, ret = 0; - ops.mode = MTD_OPS_RAW; + ops.mode = MTD_OPS_PLACE_OOB; ops.ooboffs = 0; ops.ooblen = mtd->oobsize; @@ -306,15 +316,18 @@ static int scan_read_oob(struct mtd_info *mtd, uint8_t *buf, loff_t offs, ops.oobbuf = buf + ops.len; res = mtd_read_oob(mtd, offs, &ops); - - if (res) - return res; + if (res) { + if (!mtd_is_bitflip_or_eccerr(res)) + return res; + else if (mtd_is_eccerr(res) || !ret) + ret = res; + } buf += mtd->oobsize + mtd->writesize; len -= mtd->writesize; offs += mtd->writesize; } - return 0; + return ret; } static int scan_read(struct mtd_info *mtd, uint8_t *buf, loff_t offs,
scan_read_raw_oob() is used in only in places where the MTD_OPS_PLACE_OOB mode is preferable MTD_OPS_RAW mode, so use MTD_OPS_PLACE_OOB instead. MTD_OPS_PLACE_OOB provides the same functionality with the potential[1] added bonus of error correction. This brings scan_block_full() in line with scan_block_fast() so that they both read bad block markers with MTD_OPS_PLACE_OOB. This can help in preventing 0xff markers (in good blocks) from being interpreted as bad block indicators in the presence of a single bitflip. Note that ECC error codes (EUCLEAN or EBADMSG) are already silently ignored in all users of scan_read_raw_oob(). [1] Few drivers perform proper error correction on OOB data. In those cases, the use of MTD_OPS_RAW vs. MTD_OPS_PLACE_OOB is not significant. Signed-off-by: Brian Norris <computersforpeace@gmail.com> Cc: Shmulik Ladkani <shmulik.ladkani@gmail.com> Cc: Mike Dunn <mikedunn@newsguy.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> --- drivers/mtd/nand/nand_bbt.c | 27 ++++++++++++++++++++------- 1 file changed, 20 insertions(+), 7 deletions(-)