Patchwork [resend] Newly erased page read workaround

login
register
mail settings
Submitter Viresh KUMAR
Date April 1, 2011, 10:25 a.m.
Message ID <1301653534-22223-1-git-send-email-viresh.kumar@st.com>
Download mbox | patch
Permalink /patch/89245/
State New
Headers show

Comments

Viresh KUMAR - April 1, 2011, 10:25 a.m.
From: Vipin Kumar <vipin.kumar@st.com>

A newly erased page contains ff in data as well as spare area. While reading an
erased page, the read out ecc from spare area does not match the ecc generated
by fsmc ecc hardware accelarator. This is because ecc of data ff ff is not ff
ff. This leads to errors when jffs2 fs erases and reads back the pages to
ensure consistency.

This patch adds a software workaround to ensure that the ecc check is not
performed for erased pages. This check is a memory comparison of 512 data bytes
against 0xff. The ecc correction and error reporting algorithm is skipped if the
check passes.
It adds a few extra cycles while reading an erased page but is safer to ensure
that the page is read properly

Signed-off-by: Vipin Kumar <vipin.kumar@st.com>
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
---
 drivers/mtd/nand/fsmc_nand.c |   28 ++++++++++++++++++++++------
 1 files changed, 22 insertions(+), 6 deletions(-)
Artem Bityutskiy - April 1, 2011, 11:30 a.m.
On Fri, 2011-04-01 at 15:55 +0530, ext Viresh Kumar wrote:
> +		/*
> +		 * This is a temporary erase check. A newly erased page read
> +		 * would result in an ecc error because the oob data is also
> +		 * erased to FF and the calculated ecc for an FF data is not
> +		 * FF..FF.
> +		 * This is a workaround to skip performing correction in case
> +		 * data is FF..FF
> +		 */

Sorry, but still questions.

The comment says this is temporary - why? When and how this going to be
change to become "permanent".

And you still did not add words about the problem with bit-flips.
Vipin Kumar - April 1, 2011, 11:42 a.m.
On 4/1/2011 5:00 PM, Artem Bityutskiy wrote:
> On Fri, 2011-04-01 at 15:55 +0530, ext Viresh Kumar wrote:
>> +		/*
>> +		 * This is a temporary erase check. A newly erased page read
>> +		 * would result in an ecc error because the oob data is also
>> +		 * erased to FF and the calculated ecc for an FF data is not
>> +		 * FF..FF.
>> +		 * This is a workaround to skip performing correction in case
>> +		 * data is FF..FF
>> +		 */
> 
> Sorry, but still questions.
> 
> The comment says this is temporary - why? When and how this going to be
> change to become "permanent".
> 

Temporary because the fsmc peripheral itself needs a fix for this problem

> And you still did not add words about the problem with bit-flips.
> 

OK, I would add that too
Artem Bityutskiy - April 1, 2011, 11:44 a.m.
On Fri, 2011-04-01 at 17:12 +0530, ext Vipin Kumar wrote:
> Temporary because the fsmc peripheral itself needs a fix for this problem

Do you mean a hardware fix?
Vipin Kumar - April 4, 2011, 5:45 a.m.
On 4/1/2011 5:14 PM, Artem Bityutskiy wrote:
> On Fri, 2011-04-01 at 17:12 +0530, ext Vipin Kumar wrote:
>> Temporary because the fsmc peripheral itself needs a fix for this problem
> 
> Do you mean a hardware fix?
> 
Ideally speaking, Yes

Vipin
Artem Bityutskiy - April 4, 2011, 6:27 a.m.
On Mon, 2011-04-04 at 11:15 +0530, ext Vipin Kumar wrote:
> On 4/1/2011 5:14 PM, Artem Bityutskiy wrote:
> > On Fri, 2011-04-01 at 17:12 +0530, ext Vipin Kumar wrote:
> >> Temporary because the fsmc peripheral itself needs a fix for this problem
> > 
> > Do you mean a hardware fix?
> > 
> Ideally speaking, Yes

Then the comment is not saying the truth I think - this solution is
permanent.
Vipin Kumar - April 4, 2011, 9 a.m.
On 4/4/2011 11:57 AM, Artem Bityutskiy wrote:
> On Mon, 2011-04-04 at 11:15 +0530, ext Vipin Kumar wrote:
>> On 4/1/2011 5:14 PM, Artem Bityutskiy wrote:
>>> On Fri, 2011-04-01 at 17:12 +0530, ext Vipin Kumar wrote:
>>>> Temporary because the fsmc peripheral itself needs a fix for this problem
>>>
>>> Do you mean a hardware fix?
>>>
>> Ideally speaking, Yes
> 
> Then the comment is not saying the truth I think - this solution is
> permanent.
> 

That's why it is a workaround
I thought "workaround" says it all

Regards
Vipin
Artem Bityutskiy - April 4, 2011, 9:02 a.m.
On Mon, 2011-04-04 at 14:30 +0530, Vipin Kumar wrote:
> On 4/4/2011 11:57 AM, Artem Bityutskiy wrote:
> > On Mon, 2011-04-04 at 11:15 +0530, ext Vipin Kumar wrote:
> >> On 4/1/2011 5:14 PM, Artem Bityutskiy wrote:
> >>> On Fri, 2011-04-01 at 17:12 +0530, ext Vipin Kumar wrote:
> >>>> Temporary because the fsmc peripheral itself needs a fix for this problem
> >>>
> >>> Do you mean a hardware fix?
> >>>
> >> Ideally speaking, Yes
> > 
> > Then the comment is not saying the truth I think - this solution is
> > permanent.
> > 
> 
> That's why it is a workaround
> I thought "workaround" says it all

I thought we were talking about word "temporary". Temporary means "will
be fixed very very soon".
Vipin Kumar - April 4, 2011, 9:28 a.m.
On 4/4/2011 2:32 PM, Artem Bityutskiy wrote:
> On Mon, 2011-04-04 at 14:30 +0530, Vipin Kumar wrote:
>> On 4/4/2011 11:57 AM, Artem Bityutskiy wrote:
>>> On Mon, 2011-04-04 at 11:15 +0530, ext Vipin Kumar wrote:
>>>> On 4/1/2011 5:14 PM, Artem Bityutskiy wrote:
>>>>> On Fri, 2011-04-01 at 17:12 +0530, ext Vipin Kumar wrote:
>>>>>> Temporary because the fsmc peripheral itself needs a fix for this problem
>>>>>
>>>>> Do you mean a hardware fix?
>>>>>
>>>> Ideally speaking, Yes
>>>
>>> Then the comment is not saying the truth I think - this solution is
>>> permanent.
>>>
>>
>> That's why it is a workaround
>> I thought "workaround" says it all
> 
> I thought we were talking about word "temporary". Temporary means "will
> be fixed very very soon".
> 

OK. May be I misunderstood it...
This patch would not be needed for devices using the fixed hardware
And I thought since it is the hardware which is at fault, this patch is a 
workaround, but yes, it may need a longer time to get fixed

Regards
Vipin

Patch

diff --git a/drivers/mtd/nand/fsmc_nand.c b/drivers/mtd/nand/fsmc_nand.c
index 205b10b..7df270a 100644
--- a/drivers/mtd/nand/fsmc_nand.c
+++ b/drivers/mtd/nand/fsmc_nand.c
@@ -420,7 +420,7 @@  static int fsmc_read_page_hwecc(struct mtd_info *mtd, struct nand_chip *chip,
 	struct fsmc_nand_data *host = container_of(mtd,
 					struct fsmc_nand_data, mtd);
 	struct fsmc_eccplace *ecc_place = host->ecc_place;
-	int i, j, s, stat, eccsize = chip->ecc.size;
+	int i, j, k, s, stat, eccsize = chip->ecc.size;
 	int eccbytes = chip->ecc.bytes;
 	int eccsteps = chip->ecc.steps;
 	uint8_t *p = buf;
@@ -460,11 +460,27 @@  static int fsmc_read_page_hwecc(struct mtd_info *mtd, struct nand_chip *chip,
 		memcpy(&ecc_code[i], oob, 13);
 		chip->ecc.calculate(mtd, p, &ecc_calc[i]);
 
-		stat = chip->ecc.correct(mtd, p, &ecc_code[i], &ecc_calc[i]);
-		if (stat < 0)
-			mtd->ecc_stats.failed++;
-		else
-			mtd->ecc_stats.corrected += stat;
+		/*
+		 * This is a temporary erase check. A newly erased page read
+		 * would result in an ecc error because the oob data is also
+		 * erased to FF and the calculated ecc for an FF data is not
+		 * FF..FF.
+		 * This is a workaround to skip performing correction in case
+		 * data is FF..FF
+		 */
+		for (k = 0; k < eccsize; k++) {
+			if (*(p + k) != 0xff)
+				break;
+		}
+
+		if (k < eccsize) {
+			stat = chip->ecc.correct(mtd, p, &ecc_code[i],
+					&ecc_calc[i]);
+			if (stat < 0)
+				mtd->ecc_stats.failed++;
+			else
+				mtd->ecc_stats.corrected += stat;
+		}
 	}
 
 	return 0;