Message ID | 1386619753-27613-3-git-send-email-computersforpeace@gmail.com |
---|---|
State | Superseded, archived |
Headers | show |
On Mon, Dec 09, 2013 at 12:09:12PM -0800, Brian Norris wrote: > + * nand_set_read_retry - [INTERN] Set the READ RETRY mode > + * @mtd: MTD device structure > + * @retry_mode: the retry mode to use > + * > + * Some vendors supply a special command to shift the Vt threshold, to be used > + * when there are too many bitflips in a page (i.e., ECC error). After setting > + * a new threshold, the host should retry reading the page. > + */ > +static int nand_set_read_retry(struct mtd_info *mtd, int retry_mode) > +{ > + struct nand_chip *chip = mtd->priv; > + uint8_t feature[ONFI_SUBFEATURE_PARAM_LEN] = {retry_mode}; > + This can cause a DMA warning. > + if (retry_mode >= chip->read_retries) > + return -EINVAL; > + > + if (chip->onfi_params.jedec_id == NAND_MFR_MICRON) > + return chip->onfi_set_features(mtd, chip, > + ONFI_FEATURE_ADDR_READ_RETRY, feature); I suggest to add a hook such as for nand_chip{}: chip->read_retry(..) Different nand chips fill different hook. For Micron, fill it with micron_read_retry(); for Toshiba, fill it with a toshiab_read_retry(); For Hynix, fill it with hynix_read_retry(). I am wondar if we should add a file for the read-retry in the drivers/mtd/nand folder. thanks Huang Shijie
On Wed, Dec 11, 2013 at 09:54:54PM +0800, Huang Shijie wrote: > On Mon, Dec 09, 2013 at 12:09:12PM -0800, Brian Norris wrote: > > + * nand_set_read_retry - [INTERN] Set the READ RETRY mode > > + * @mtd: MTD device structure > > + * @retry_mode: the retry mode to use > > + * > > + * Some vendors supply a special command to shift the Vt threshold, to be used > > + * when there are too many bitflips in a page (i.e., ECC error). After setting > > + * a new threshold, the host should retry reading the page. > > + */ > > +static int nand_set_read_retry(struct mtd_info *mtd, int retry_mode) > > +{ > > + struct nand_chip *chip = mtd->priv; > > + uint8_t feature[ONFI_SUBFEATURE_PARAM_LEN] = {retry_mode}; > > + > This can cause a DMA warning. ...on GPMI NAND, but not on most (any?) other drivers. Why does GPMI try to use DMA on *every* operation? That doesn't even make sense for a 4 or 5 byte transfer. Plus, we don't give a guarantee that buffers will be DMA-able in MTD (UBI uses vmalloc() buffers, for instance), so I'm sure you'll hit problems other places. I can fix this one (use chip->buffers->databuf instead?) but I think GPMI is a bad citizen in this regard, given that you have no guarantee. You need to fix MTD systematically if you expect this guarantee. (For one, nand_default_block_markbad() uses stack-allocated buffers; but I see that GPMI overrides this callback.) > > + if (retry_mode >= chip->read_retries) > > + return -EINVAL; > > + > > + if (chip->onfi_params.jedec_id == NAND_MFR_MICRON) > > + return chip->onfi_set_features(mtd, chip, > > + ONFI_FEATURE_ADDR_READ_RETRY, feature); > I suggest to add a hook such as for nand_chip{}: > chip->read_retry(..) > > Different nand chips fill different hook. OK, that can make sense. > For Micron, fill it with micron_read_retry(); > for Toshiba, fill it with a toshiab_read_retry(); > For Hynix, fill it with hynix_read_retry(). Do you have info on Toshiba/Hynix read retry? I'd be interested to know what their "features" look like. Do they have a similar set of supported retry modes, where we have to cycle through each mode and retry? I'm interested in whether the nand_chip.read_retries field is useful for the others. Also, what would a 'read_retry()' callback look like? What are its parameters? Right now, it's just the struct mtd_info + an incrementing integer, from 0 to chip->read_retries-1. > I am wondar if we should add a file for the read-retry in the > drivers/mtd/nand folder. TBH, I'm not really very happy to be opening the read-retry can of worms. From what I hear, read-retry is just the first step down a path of vendor-specific detail, which ends with you re-implementing the FTL that these vendors develop for their SSDs. But anyway, no I don't think read-retry warrants its own file, at least not yet. I'd need to understand where it's heading before we decide that it needs this kind of separation. I do think, though, that it may be worth moving some more of the vendor-specific stuff (like the identification code, nand_get_flash_type(), + read-retry callbacks) into a new file. Maybe nand_vendors.c? nand_ident.c? But I think it needs a bit of cleanup first; untangling the auto-buswidth stuff that Pekon, Ezequiel and I have been dealing with might be a start. Brian
On Wed, Dec 11, 2013 at 11:03:13AM -0800, Brian Norris wrote: > On Wed, Dec 11, 2013 at 09:54:54PM +0800, Huang Shijie wrote: > > On Mon, Dec 09, 2013 at 12:09:12PM -0800, Brian Norris wrote: > > > + * nand_set_read_retry - [INTERN] Set the READ RETRY mode > > > + * @mtd: MTD device structure > > > + * @retry_mode: the retry mode to use > > > + * > > > + * Some vendors supply a special command to shift the Vt threshold, to be used > > > + * when there are too many bitflips in a page (i.e., ECC error). After setting > > > + * a new threshold, the host should retry reading the page. > > > + */ > > > +static int nand_set_read_retry(struct mtd_info *mtd, int retry_mode) > > > +{ > > > + struct nand_chip *chip = mtd->priv; > > > + uint8_t feature[ONFI_SUBFEATURE_PARAM_LEN] = {retry_mode}; > > > + > > This can cause a DMA warning. > > ...on GPMI NAND, but not on most (any?) other drivers. Why does GPMI try > to use DMA on *every* operation? That doesn't even make sense for a 4 or > 5 byte transfer. Plus, we don't give a guarantee that buffers will be > DMA-able in MTD (UBI uses vmalloc() buffers, for instance), so I'm sure > you'll hit problems other places. I can fix this one (use > chip->buffers->databuf instead?) but I think GPMI is a bad citizen in > this regard, given that you have no guarantee. You need to fix MTD > systematically if you expect this guarantee. > > (For one, nand_default_block_markbad() uses stack-allocated buffers; but > I see that GPMI overrides this callback.) > Isn't completely bloated to setup a DMA operation just to transfer a bunch of bytes? Wouldn't that seriously hurt performance? Or maybe it doesn't matter because those commands are scarcely invoked...
On Wed, Dec 11, 2013 at 11:03:13AM -0800, Brian Norris wrote: > On Wed, Dec 11, 2013 at 09:54:54PM +0800, Huang Shijie wrote: > > On Mon, Dec 09, 2013 at 12:09:12PM -0800, Brian Norris wrote: > > > + * nand_set_read_retry - [INTERN] Set the READ RETRY mode > > > + * @mtd: MTD device structure > > > + * @retry_mode: the retry mode to use > > > + * > > > + * Some vendors supply a special command to shift the Vt threshold, to be used > > > + * when there are too many bitflips in a page (i.e., ECC error). After setting > > > + * a new threshold, the host should retry reading the page. > > > + */ > > > +static int nand_set_read_retry(struct mtd_info *mtd, int retry_mode) > > > +{ > > > + struct nand_chip *chip = mtd->priv; > > > + uint8_t feature[ONFI_SUBFEATURE_PARAM_LEN] = {retry_mode}; > > > + > > This can cause a DMA warning. > > ...on GPMI NAND, but not on most (any?) other drivers. Why does GPMI try > to use DMA on *every* operation? That doesn't even make sense for a 4 or > 5 byte transfer. Plus, we don't give a guarantee that buffers will be > DMA-able in MTD (UBI uses vmalloc() buffers, for instance), so I'm sure > you'll hit problems other places. I can fix this one (use > chip->buffers->databuf instead?) but I think GPMI is a bad citizen in > this regard, given that you have no guarantee. You need to fix MTD > systematically if you expect this guarantee. You do not need to change this code, if you like it. The DMA warning only occurs when we enable the CONFIG_DMA_API_DEBUG. > > (For one, nand_default_block_markbad() uses stack-allocated buffers; but > I see that GPMI overrides this callback.) > > > > + if (retry_mode >= chip->read_retries) > > > + return -EINVAL; > > > + > > > + if (chip->onfi_params.jedec_id == NAND_MFR_MICRON) > > > + return chip->onfi_set_features(mtd, chip, > > > + ONFI_FEATURE_ADDR_READ_RETRY, feature); > > I suggest to add a hook such as for nand_chip{}: > > chip->read_retry(..) > > > > Different nand chips fill different hook. > > OK, that can make sense. > > > For Micron, fill it with micron_read_retry(); > > for Toshiba, fill it with a toshiab_read_retry(); > > For Hynix, fill it with hynix_read_retry(). > > Do you have info on Toshiba/Hynix read retry? I'd be interested to know > what their "features" look like. Do they have a similar set of supported > retry modes, where we have to cycle through each mode and retry? I'm > interested in whether the nand_chip.read_retries field is useful for the > others. i will send you the documents when i back office. Different nand chips use different read-retry methods. A headache to us. The common idea between the read-retry methods is: "we have to cycle through each mode and retry" So the chip.read_retries is used by others too. I ever implemented the Micron's read-retry in my local code. I planned to implement others, but i was intterrupted by high priority jobs. > > Also, what would a 'read_retry()' callback look like? What are its > parameters? Right now, it's just the struct mtd_info + an incrementing > integer, from 0 to chip->read_retries-1. I think the parameters are good now. > > > I am wondar if we should add a file for the read-retry in the > > drivers/mtd/nand folder. > > TBH, I'm not really very happy to be opening the read-retry can of > worms. From what I hear, read-retry is just the first step down a path it is ok to me if you do not like to create a new file, that is just a suggestion. After the nand chip use <19nm, afaik, all the NAND chips needs the read-retry. We have to implement the read-retry for all of them. The nightmare is coming. thanks Huang Shijie
On Wed, Dec 11, 2013 at 05:31:12PM -0300, Ezequiel Garcia wrote: > Isn't completely bloated to setup a DMA operation just to transfer a > bunch of bytes? Wouldn't that seriously hurt performance? > > Or maybe it doesn't matter because those commands are scarcely > invoked... As you said, those commands are scarcely invoked. thanks Huang Shijie
On Mon, Dec 16, 2013 at 08:43:42PM -0800, Brian Norris wrote: > > > > The DMA warning only occurs when we enable the CONFIG_DMA_API_DEBUG. > > But the warning has a significant purpose: it is *not* legal to perform > DMA on stack memory, and no driver should do this. Either nand_base or > gpmi-nand need to be fixed here. > > But I think it's the GPMI driver is the one that needs fixing, not this > patch. Are there any scenarios where it makes sense for gpmi-nand to use > DMA for its read_buf/write_buf? Shouldn't the only DMA-necessary > operations come through the ecc.write_page and ecc.read_page callbacks? > If I'm correct, then gpmi-nand should not be using DMA at all in > read_buf/write_buf. Currently, all the operations, including the read_buf/write_buf, use the DMA. Is there any ABI tells us we should not use the DMA for read_buf/write_buf? :) > > > Different nand chips use different read-retry methods. A headache to us. > > Any chance we can push these vendors to implement things through a > standard, like JEDEC? I believe there's a JEDEC parameter page standard we do not have such influence to push these vendors. they will not listen to us. :( > now, though I haven't looked at it closely. Micron's support is nice and > simple because it properly uses the ONFI vendor block. Other vendors > should take note... yes. Micron is good example. thanks Huang Shijie
On Thu, Dec 12, 2013 at 11:47:52AM +0800, Huang Shijie wrote: > On Wed, Dec 11, 2013 at 11:03:13AM -0800, Brian Norris wrote: > > On Wed, Dec 11, 2013 at 09:54:54PM +0800, Huang Shijie wrote: > > > On Mon, Dec 09, 2013 at 12:09:12PM -0800, Brian Norris wrote: > > > > + uint8_t feature[ONFI_SUBFEATURE_PARAM_LEN] = {retry_mode}; > > > > + > > > This can cause a DMA warning. > > > > ...on GPMI NAND, but not on most (any?) other drivers. Why does GPMI try > > to use DMA on *every* operation? That doesn't even make sense for a 4 or > > 5 byte transfer. Plus, we don't give a guarantee that buffers will be > > DMA-able in MTD (UBI uses vmalloc() buffers, for instance), so I'm sure > > you'll hit problems other places. I can fix this one (use > > chip->buffers->databuf instead?) but I think GPMI is a bad citizen in > > this regard, given that you have no guarantee. You need to fix MTD > > systematically if you expect this guarantee. > You do not need to change this code, if you like it. > > The DMA warning only occurs when we enable the CONFIG_DMA_API_DEBUG. But the warning has a significant purpose: it is *not* legal to perform DMA on stack memory, and no driver should do this. Either nand_base or gpmi-nand need to be fixed here. But I think it's the GPMI driver is the one that needs fixing, not this patch. Are there any scenarios where it makes sense for gpmi-nand to use DMA for its read_buf/write_buf? Shouldn't the only DMA-necessary operations come through the ecc.write_page and ecc.read_page callbacks? If I'm correct, then gpmi-nand should not be using DMA at all in read_buf/write_buf. > Different nand chips use different read-retry methods. A headache to us. Any chance we can push these vendors to implement things through a standard, like JEDEC? I believe there's a JEDEC parameter page standard now, though I haven't looked at it closely. Micron's support is nice and simple because it properly uses the ONFI vendor block. Other vendors should take note... > The common idea between the read-retry methods is: > "we have to cycle through each mode and retry" > > So the chip.read_retries is used by others too. Good to know. > After the nand chip use <19nm, afaik, all the NAND chips needs the > read-retry. We have to implement the read-retry for all of them. > The nightmare is coming. *cue the dark organ music* Sigh. I might have to abandon MTD for eMMC some day ;) Brian
On Tue, Dec 17, 2013 at 12:23:08PM +0800, Huang Shijie wrote: > On Mon, Dec 16, 2013 at 08:43:42PM -0800, Brian Norris wrote: > > > > > > The DMA warning only occurs when we enable the CONFIG_DMA_API_DEBUG. > > > > But the warning has a significant purpose: it is *not* legal to perform > > DMA on stack memory, and no driver should do this. Either nand_base or > > gpmi-nand need to be fixed here. > > > > But I think it's the GPMI driver is the one that needs fixing, not this > > patch. Are there any scenarios where it makes sense for gpmi-nand to use > > DMA for its read_buf/write_buf? Shouldn't the only DMA-necessary > > operations come through the ecc.write_page and ecc.read_page callbacks? > > If I'm correct, then gpmi-nand should not be using DMA at all in > > read_buf/write_buf. > Currently, all the operations, including the read_buf/write_buf, use > the DMA. > > Is there any ABI tells us we should not use the DMA for read_buf/write_buf? :) No... but this is not about ABI, it's about designing a reasonable API and abstraction between users (NAND drivers) and the helper code (nand_base). MTD never had a good design for DMA-able operations; but I think that it's at least an approachable problem if we provide limitations on when and where we might expect to use DMA. Particularly, I think it is quite reasonable to restrict DMA to mtd_read()/mtd_write()-like operations -- so this trickles down to read_page/write_page, but not to device configuration operations like ONFI set_features/get_features/read parameter tables, etc. Do you disagree? Do you see some need to perform DMA on all operations? > > > Different nand chips use different read-retry methods. A headache to us. > > > > Any chance we can push these vendors to implement things through a > > standard, like JEDEC? I believe there's a JEDEC parameter page standard > we do not have such influence to push these vendors. > they will not listen to us. :( We can try! Strength in numbers! BTW, last time I brought up ONFI w/ a Toshiba representative, I think I offended them :) But at least I had their ear for a moment. Anyway, it may be more reasonable to suggest a JEDEC standard, as I think they felt JEDEC was more vendor-independent, and they were at least open to entertaining the idea. Brian
On Mon, Dec 16, 2013 at 09:06:13PM -0800, Brian Norris wrote: > On Tue, Dec 17, 2013 at 12:23:08PM +0800, Huang Shijie wrote: > > On Mon, Dec 16, 2013 at 08:43:42PM -0800, Brian Norris wrote: > > > > > > > > The DMA warning only occurs when we enable the CONFIG_DMA_API_DEBUG. > Particularly, I think it is quite reasonable to restrict DMA to > mtd_read()/mtd_write()-like operations -- so this trickles down to > read_page/write_page, but not to device configuration operations like > ONFI set_features/get_features/read parameter tables, etc. > > Do you disagree? Do you see some need to perform DMA on all operations? The gpmi is much coupled with the MXS-DMA engine. the gpmi can uses the mxs-dma to set the gpmi registers. Use the DMA makes the GPMI code very simple, it really makes the code very _ugly_ if we do not use the DMA for read_buf/write_buf. If you insist that it is okay to use the local array in the nand_base.c, I can use kmalloc/memcpy in the gpmi's read_buf/write_buf hooks. In such a way, we can make the gpmi looks more normal. sorry, I really think it is no need to force the gpmi do not use the DMA for read_buf/write_buf. > > > > > Different nand chips use different read-retry methods. A headache to us. > > > > > > Any chance we can push these vendors to implement things through a > > > standard, like JEDEC? I believe there's a JEDEC parameter page standard > > we do not have such influence to push these vendors. > > they will not listen to us. :( > > We can try! Strength in numbers! BTW, last time I brought up ONFI w/ a > Toshiba representative, I think I offended them :) But at least I had > their ear for a moment. > look good to me. :) I will suggest their FAE/AE when i meet them next time. thanks Huang Shijie
On Tue, Dec 17, 2013 at 01:11:25PM +0800, Huang Shijie wrote: > On Mon, Dec 16, 2013 at 09:06:13PM -0800, Brian Norris wrote: > > On Tue, Dec 17, 2013 at 12:23:08PM +0800, Huang Shijie wrote: > > > On Mon, Dec 16, 2013 at 08:43:42PM -0800, Brian Norris wrote: > > > > > > > > > > The DMA warning only occurs when we enable the CONFIG_DMA_API_DEBUG. > > Particularly, I think it is quite reasonable to restrict DMA to > > mtd_read()/mtd_write()-like operations -- so this trickles down to > > read_page/write_page, but not to device configuration operations like > > ONFI set_features/get_features/read parameter tables, etc. > > > > Do you disagree? Do you see some need to perform DMA on all operations? > > The gpmi is much coupled with the MXS-DMA engine. the gpmi can uses the > mxs-dma to set the gpmi registers. > > Use the DMA makes the GPMI code very simple, it really makes the code very > _ugly_ if we do not use the DMA for read_buf/write_buf. It's not just about ugliness, it's about maintainability. I can clean up a few cases here and there that don't conform to your driver's needs now (notice I pointed out another instance of this in Uwe's patch), but unless we establish better guidelines for NAND, this is fragile. As a diversion, let's look at atmel_nand for an example. It uses DMA as well, but it dodges some of your problems using a heuristic, so that it only uses DMA for larger-than-OOB-size transactions (see atmel_read_buf). Additionally, it has a hack-y check for highmem, to try to confirm that the region it's trying to map is, in fact, DMA-able. So, atmel_nand does not have an ideal solution, but it at least attempts to solve parts of this problem. GPMI, on the other hand, ignores it entirely. > If you insist that it is okay to use the local array in the nand_base.c, > I can use kmalloc/memcpy in the gpmi's read_buf/write_buf hooks. In such a > way, we can make the gpmi looks more normal. I think a bounce buffer is necessary, if you really want to use DMA all the time; some transactions are just not worth initializing DMA-able buffers in the high-level code. But even better, can GPMI easily utilize some sort of PIO mode for small/slow-path transactions like this, or is it inextricably tied to the MXS-DMA engine? Brian
On Mon, Dec 16, 2013 at 11:01:03PM -0800, Brian Norris wrote: > On Tue, Dec 17, 2013 at 01:11:25PM +0800, Huang Shijie wrote: > > On Mon, Dec 16, 2013 at 09:06:13PM -0800, Brian Norris wrote: > > > On Tue, Dec 17, 2013 at 12:23:08PM +0800, Huang Shijie wrote: > > > > On Mon, Dec 16, 2013 at 08:43:42PM -0800, Brian Norris wrote: > > > > > > > > > > > > The DMA warning only occurs when we enable the CONFIG_DMA_API_DEBUG. > > > Particularly, I think it is quite reasonable to restrict DMA to > > > mtd_read()/mtd_write()-like operations -- so this trickles down to > > > read_page/write_page, but not to device configuration operations like > > > ONFI set_features/get_features/read parameter tables, etc. > > > > > > Do you disagree? Do you see some need to perform DMA on all operations? > > > > The gpmi is much coupled with the MXS-DMA engine. the gpmi can uses the > > mxs-dma to set the gpmi registers. > > > > Use the DMA makes the GPMI code very simple, it really makes the code very > > _ugly_ if we do not use the DMA for read_buf/write_buf. > > It's not just about ugliness, it's about maintainability. I can clean up > a few cases here and there that don't conform to your driver's needs now > (notice I pointed out another instance of this in Uwe's patch), but > unless we establish better guidelines for NAND, this is fragile. > > As a diversion, let's look at atmel_nand for an example. It uses DMA as > well, but it dodges some of your problems using a heuristic, so that it > only uses DMA for larger-than-OOB-size transactions (see > atmel_read_buf). Additionally, it has a hack-y check for highmem, to try > to confirm that the region it's trying to map is, in fact, DMA-able. > > So, atmel_nand does not have an ideal solution, but it at least attempts > to solve parts of this problem. GPMI, on the other hand, ignores it > entirely. ok. i will fix it by a new patch to add more checks for the read_buf/write_buf. > > > If you insist that it is okay to use the local array in the nand_base.c, > > I can use kmalloc/memcpy in the gpmi's read_buf/write_buf hooks. In such a > > way, we can make the gpmi looks more normal. > > I think a bounce buffer is necessary, if you really want to use DMA all > the time; some transactions are just not worth initializing DMA-able > buffers in the high-level code. But even better, can GPMI easily utilize > some sort of PIO mode for small/slow-path transactions like this, or is yes, the gpmi can utilize the PIO mode for the transactions. > it inextricably tied to the MXS-DMA engine? You can think the mxs-dma is designed for GPMI. thanks Huang Shijie
diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c index e85b07f4293d..f7ac2bfb3445 100644 --- a/drivers/mtd/nand/nand_base.c +++ b/drivers/mtd/nand/nand_base.c @@ -1410,6 +1410,30 @@ static uint8_t *nand_transfer_oob(struct nand_chip *chip, uint8_t *oob, } /** + * nand_set_read_retry - [INTERN] Set the READ RETRY mode + * @mtd: MTD device structure + * @retry_mode: the retry mode to use + * + * Some vendors supply a special command to shift the Vt threshold, to be used + * when there are too many bitflips in a page (i.e., ECC error). After setting + * a new threshold, the host should retry reading the page. + */ +static int nand_set_read_retry(struct mtd_info *mtd, int retry_mode) +{ + struct nand_chip *chip = mtd->priv; + uint8_t feature[ONFI_SUBFEATURE_PARAM_LEN] = {retry_mode}; + + if (retry_mode >= chip->read_retries) + return -EINVAL; + + if (chip->onfi_params.jedec_id == NAND_MFR_MICRON) + return chip->onfi_set_features(mtd, chip, + ONFI_FEATURE_ADDR_READ_RETRY, feature); + + return -EOPNOTSUPP; +} + +/** * nand_do_read_ops - [INTERN] Read data with ECC * @mtd: MTD device structure * @from: offset to read from @@ -1431,6 +1455,7 @@ static int nand_do_read_ops(struct mtd_info *mtd, loff_t from, uint8_t *bufpoi, *oob, *buf; unsigned int max_bitflips = 0; + int retry_mode = 0; bool ecc_fail = false; chipnr = (int)(from >> chip->chip_shift); @@ -1494,8 +1519,6 @@ static int nand_do_read_ops(struct mtd_info *mtd, loff_t from, memcpy(buf, chip->buffers->databuf + col, bytes); } - buf += bytes; - if (unlikely(oob)) { int toread = min(oobreadlen, max_oobsize); @@ -1514,8 +1537,27 @@ static int nand_do_read_ops(struct mtd_info *mtd, loff_t from, nand_wait_ready(mtd); } - if (mtd->ecc_stats.failed - ecc_failures) - ecc_fail = true; + if (mtd->ecc_stats.failed - ecc_failures) { + if (retry_mode + 1 < chip->read_retries) { + retry_mode++; + pr_debug("ECC error; performing READ RETRY %d\n", + retry_mode); + + ret = nand_set_read_retry(mtd, + retry_mode); + if (ret < 0) + break; + + /* Reset failures */ + mtd->ecc_stats.failed = ecc_failures; + continue; + } else { + /* No more retry modes; real failure */ + ecc_fail = true; + } + } + + buf += bytes; } else { memcpy(buf, chip->buffers->databuf + col, bytes); buf += bytes; @@ -1525,6 +1567,14 @@ static int nand_do_read_ops(struct mtd_info *mtd, loff_t from, readlen -= bytes; + /* Reset to retry mode 0 */ + if (retry_mode) { + ret = nand_set_read_retry(mtd, 0); + if (ret < 0) + break; + retry_mode = 0; + } + if (!readlen) break; @@ -2932,6 +2982,16 @@ ext_out: } /* + * Configure chip properties from Micron vendor-specific ONFI table + */ +static void nand_onfi_detect_micron(struct nand_chip *chip, + struct nand_onfi_params *p) +{ + struct nand_onfi_vendor_micron *micron = (void *)p->vendor; + chip->read_retries = micron->read_retry_options; +} + +/* * Check if the NAND chip is ONFI compliant, returns 1 if it is, 0 otherwise. */ static int nand_flash_detect_onfi(struct mtd_info *mtd, struct nand_chip *chip, @@ -3037,6 +3097,9 @@ static int nand_flash_detect_onfi(struct mtd_info *mtd, struct nand_chip *chip, pr_warn("Could not retrieve ONFI ECC requirements\n"); } + if (p->jedec_id == NAND_MFR_MICRON) + nand_onfi_detect_micron(chip, p); + return 1; } diff --git a/include/linux/mtd/nand.h b/include/linux/mtd/nand.h index 029fe5948dc4..6e579e90955e 100644 --- a/include/linux/mtd/nand.h +++ b/include/linux/mtd/nand.h @@ -219,6 +219,9 @@ struct nand_chip; /* ONFI feature address */ #define ONFI_FEATURE_ADDR_TIMING_MODE 0x1 +/* Vendor-specific feature address (Micron) */ +#define ONFI_FEATURE_ADDR_READ_RETRY 0x89 + /* ONFI subfeature parameters length */ #define ONFI_SUBFEATURE_PARAM_LEN 4 @@ -518,6 +521,7 @@ struct nand_buffers { * non 0 if ONFI supported. * @onfi_params: [INTERN] holds the ONFI page parameter when ONFI is * supported, 0 otherwise. + * @read_retries: [INTERN] the number of read retry modes supported * @onfi_set_features: [REPLACEABLE] set the features for ONFI nand * @onfi_get_features: [REPLACEABLE] get the features for ONFI nand * @bbt: [INTERN] bad block table pointer @@ -589,6 +593,8 @@ struct nand_chip { int onfi_version; struct nand_onfi_params onfi_params; + int read_retries; + flstate_t state; uint8_t *oob_poi;
MLC NAND can experience a large number of bitflips (beyond the recommended correctability capacity) due to drifts in the voltage threshold (Vt). These bitflips can cause ECC errors to occur well within the expected lifetime of the flash. To account for this, some manufacturers provide a mechanism for shifting the Vt threshold after a corrupted read. Micron provides the necessary information via the ONFI vendor-specific parameter block (to indicate how many read-retry modes are available) and the ONFI {GET,SET}_FEATURES commands with a vendor-specific feature address (to support reading/switching the current read-retry mode). The recommmended sequence is as follows: 1. Perform PAGE_READ operation 2. If no ECC error, we are done 3. Run SET_FEATURES with feature address 89h, mode 1 4. Retry PAGE_READ operation 5. If ECC error and there are remaining supported modes, increment the mode and return to step 3. Otherwise, this is a true ECC error. 6. Run SET_FEATURES with feature address 89h, mode 0, to return to the default state. Tested on Micron MT29F32G08CBADA, suppors 8 read-retry modes. Signed-off-by: Brian Norris <computersforpeace@gmail.com> --- v1 -> v2: fix a logic error when incrementing retry_mode, which caused -EINVAL failures on flash that didn't need READ RETRY One thing I'm not sure about: do all relevant (i.e., ONFI-capable) NAND drivers support SET_FEATURES properly? If not, then it's possible that this could break nand_do_read_ops for such drivers. Not sure what the best method of handling that would be. drivers/mtd/nand/nand_base.c | 71 +++++++++++++++++++++++++++++++++++++++++--- include/linux/mtd/nand.h | 6 ++++ 2 files changed, 73 insertions(+), 4 deletions(-)