Patchwork Newly erased page read workaround

login
register
mail settings
Submitter Viresh KUMAR
Date Feb. 24, 2011, 6:10 a.m.
Message ID <c7baa4a77775b54f2ef8450d8d1f44d1dc011b41.1298527669.git.viresh.kumar@st.com>
Download mbox | patch
Permalink /patch/84322/
State New
Headers show

Comments

Viresh KUMAR - Feb. 24, 2011, 6:10 a.m.
From: Vipin Kumar <vipin.kumar@st.com>

A newly erased page contains ff in data as well as spare area. While reading an
erased page, the read out ecc from spare area does not match the ecc generated
by fsmc ecc hardware accelarator. This is because ecc of data ff ff is not ff
ff. This leads to errors when jffs2 fs erases and reads back the pages to
ensure consistency.

This patch adds a software workaround to ensure that the ecc check is not
performed for erased pages. An erased page is checked by checking data as ff ff.

Signed-off-by: Vipin Kumar <vipin.kumar@st.com>
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
---
 drivers/mtd/nand/fsmc_nand.c |   28 ++++++++++++++++++++++------
 1 files changed, 22 insertions(+), 6 deletions(-)
Ivan Djelic - Feb. 24, 2011, 9:38 a.m.
On Thu, Feb 24, 2011 at 06:10:16AM +0000, Viresh Kumar wrote:
> From: Vipin Kumar <vipin.kumar@st.com>
> 
> A newly erased page contains ff in data as well as spare area. While reading an
> erased page, the read out ecc from spare area does not match the ecc generated
> by fsmc ecc hardware accelarator. This is because ecc of data ff ff is not ff
> ff. This leads to errors when jffs2 fs erases and reads back the pages to
> ensure consistency.
> 
> This patch adds a software workaround to ensure that the ecc check is not
> performed for erased pages. An erased page is checked by checking data as ff ff.

Hello Vipin,

Just a suggestion: maybe you could mention in your comments the fact that you
cannot workaround the problem using a mask to get a valid ECC on erased pages,
because your controller does not allow it ?

If you plan to use your workaround on recent NAND devices with UBIFS, you may
still experience problems because of uncorrected bitflips on erased pages, and
get errors such as:

UBIFS error (pid 576): ubifs_scan: corrupt empty space at LEB 509:126586
UBIFS error (pid 576): ubifs_scanned_corruption: corruption at LEB 509:126586
UBIFS error (pid 576): ubifs_scan: LEB 509 scanning failed
UBIFS warning (pid 576): ubifs_ro_mode: switched to read-only mode, error -117

Regards,

Ivan
Vipin Kumar - Feb. 24, 2011, 10:20 a.m.
On 2/24/2011 3:08 PM, Ivan Djelic wrote:
> On Thu, Feb 24, 2011 at 06:10:16AM +0000, Viresh Kumar wrote:
>> From: Vipin Kumar <vipin.kumar@st.com>
>>
>> A newly erased page contains ff in data as well as spare area. While reading an
>> erased page, the read out ecc from spare area does not match the ecc generated
>> by fsmc ecc hardware accelarator. This is because ecc of data ff ff is not ff
>> ff. This leads to errors when jffs2 fs erases and reads back the pages to
>> ensure consistency.
>>
>> This patch adds a software workaround to ensure that the ecc check is not
>> performed for erased pages. An erased page is checked by checking data as ff ff.
> 
> Hello Vipin,
> 

Hello Ivan,

> Just a suggestion: maybe you could mention in your comments the fact that you
> cannot workaround the problem using a mask to get a valid ECC on erased pages,
> because your controller does not allow it ?
> 
> If you plan to use your workaround on recent NAND devices with UBIFS, you may
> still experience problems because of uncorrected bitflips on erased pages, and
> get errors such as:
> 

Let me explain the problem again.

The problem is that the BCH algorithm (used by this controller to generate ecc 
and correct bitflips) generates an ecc which is not 0xffff for an erased 512 
bytes.

Since erasing a page results in all data including the spare area of the page 
resetting to 0xffff, and the ecc written in the spare area is incorrect.
This ecc is not useful to correct bitflips

One way to solve this problem is to write the correct ecc in the erased pages 
spare area. The other is to ensure that the page is erased and not run the 
correction algorithm. We are using the second option but there would not be 
any unwanted bitflips in any of the cases.

Let me know if this answers your questions

> UBIFS error (pid 576): ubifs_scan: corrupt empty space at LEB 509:126586
> UBIFS error (pid 576): ubifs_scanned_corruption: corruption at LEB 509:126586
> UBIFS error (pid 576): ubifs_scan: LEB 509 scanning failed
> UBIFS warning (pid 576): ubifs_ro_mode: switched to read-only mode, error -117
> 
> Regards,
> 
> Ivan
> .
> 

Regards
Vipin
Ivan Djelic - Feb. 24, 2011, 11:10 a.m.
On Thu, Feb 24, 2011 at 10:20:59AM +0000, Vipin Kumar wrote:
> On 2/24/2011 3:08 PM, Ivan Djelic wrote:
(...)
> > Just a suggestion: maybe you could mention in your comments the fact that you
> > cannot workaround the problem using a mask to get a valid ECC on erased pages,
> > because your controller does not allow it ?
> > 
> > If you plan to use your workaround on recent NAND devices with UBIFS, you may
> > still experience problems because of uncorrected bitflips on erased pages, and
> > get errors such as:
> > 
> 
> Let me explain the problem again.
> 
> The problem is that the BCH algorithm (used by this controller to generate ecc 
> and correct bitflips) generates an ecc which is not 0xffff for an erased 512 
> bytes.
> 
> Since erasing a page results in all data including the spare area of the page 
> resetting to 0xffff, and the ecc written in the spare area is incorrect.
> This ecc is not useful to correct bitflips
> 
> One way to solve this problem is to write the correct ecc in the erased pages 
> spare area. The other is to ensure that the page is erased and not run the 
> correction algorithm.

There is a third option: add (before writing oob/after reading oob) a fixed
polynomial to your HW-generated BCH ECC codeword. This polynomial is chosen such
that your ECC code on an erased page now becomes a sequence of 0xff bytes.
That way, erased pages can be read with ECC check enabled. That was my point.

I assume you cannot alter oob contents as described above, because your controller
performs error detection "on-the-fly" as you transfer data to/from NAND device (?).

> We are using the second option but there would not be 
> any unwanted bitflips in any of the cases.

On recent NAND devices, bitflips _do_ appear on erased pages, sometimes immediately
after a block erase.

Regards,

Ivan
Vipin Kumar - Feb. 24, 2011, 11:36 a.m.
On 2/24/2011 4:40 PM, Ivan Djelic wrote:
> On Thu, Feb 24, 2011 at 10:20:59AM +0000, Vipin Kumar wrote:
>> On 2/24/2011 3:08 PM, Ivan Djelic wrote:
> (...)
>>> Just a suggestion: maybe you could mention in your comments the fact that you
>>> cannot workaround the problem using a mask to get a valid ECC on erased pages,
>>> because your controller does not allow it ?
>>>
>>> If you plan to use your workaround on recent NAND devices with UBIFS, you may
>>> still experience problems because of uncorrected bitflips on erased pages, and
>>> get errors such as:
>>>
>>
>> Let me explain the problem again.
>>
>> The problem is that the BCH algorithm (used by this controller to generate ecc 
>> and correct bitflips) generates an ecc which is not 0xffff for an erased 512 
>> bytes.
>>
>> Since erasing a page results in all data including the spare area of the page 
>> resetting to 0xffff, and the ecc written in the spare area is incorrect.
>> This ecc is not useful to correct bitflips
>>
>> One way to solve this problem is to write the correct ecc in the erased pages 
>> spare area. The other is to ensure that the page is erased and not run the 
>> correction algorithm.
> 
> There is a third option: add (before writing oob/after reading oob) a fixed
> polynomial to your HW-generated BCH ECC codeword. This polynomial is chosen such
> that your ECC code on an erased page now becomes a sequence of 0xff bytes.
> That way, erased pages can be read with ECC check enabled. That was my point.
> 
> I assume you cannot alter oob contents as described above, because your controller
> performs error detection "on-the-fly" as you transfer data to/from NAND device (?).
> 

This is true. the controller performs the error detection on the fly ie. as the cpu 
reads the oob bytes
The expectation of the controller is

1. reset ecc logic
2. start reading 512 bytes + 13 bytes (ecc - lying in oob)
3. the controller would provide the information about flipped bit offsets

This is the reason that the above option would not work

>> We are using the second option but there would not be 
>> any unwanted bitflips in any of the cases.
> 
> On recent NAND devices, bitflips _do_ appear on erased pages, sometimes immediately
> after a block erase.
> 
> Regards,
> 
> Ivan
> .
> 

Regards
Vipin
Viresh KUMAR - March 22, 2011, 4:36 a.m.
On 02/24/2011 11:40 AM, Viresh KUMAR wrote:
> From: Vipin Kumar <vipin.kumar@st.com>
> 
> A newly erased page contains ff in data as well as spare area. While reading an
> erased page, the read out ecc from spare area does not match the ecc generated
> by fsmc ecc hardware accelarator. This is because ecc of data ff ff is not ff
> ff. This leads to errors when jffs2 fs erases and reads back the pages to
> ensure consistency.
> 
> This patch adds a software workaround to ensure that the ecc check is not
> performed for erased pages. An erased page is checked by checking data as ff ff.
> 
> Signed-off-by: Vipin Kumar <vipin.kumar@st.com>
> Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
> Acked-by: Linus Walleij <linus.walleij@stericsson.com>

Artem, David,

Have you pushed this patch for 39-rc1?
Artem Bityutskiy - March 31, 2011, 1:51 p.m.
Hi,

sorry for late reply, was too busy and also missed your patch somehow.

On Thu, 2011-02-24 at 11:40 +0530, Viresh Kumar wrote:
> From: Vipin Kumar <vipin.kumar@st.com>
> 
> A newly erased page contains ff in data as well as spare area. While reading an
> erased page, the read out ecc from spare area does not match the ecc generated
> by fsmc ecc hardware accelarator. This is because ecc of data ff ff is not ff
> ff. This leads to errors when jffs2 fs erases and reads back the pages to
> ensure consistency.
> 
> This patch adds a software workaround to ensure that the ecc check is not
> performed for erased pages. An erased page is checked by checking data as ff ff.
> 
> Signed-off-by: Vipin Kumar <vipin.kumar@st.com>
> Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
> Acked-by: Linus Walleij <linus.walleij@stericsson.com>

...

> +		/*
> +		 * This is a temporary erase check. A newly erased page read
> +		 * would result in an ecc error because the oob data is also
> +		 * erased to FF and the calculated ecc for an FF data is not
> +		 * FF..FF.
> +		 * This is a workaround to skip performing correction in case
> +		 * data is FF..FF
> +		 */
> +		for (k = 0; k < eccsize; k++) {
> +			if (*(p + k) != 0xff)
> +				break;
> +		}
> +

So eccsize is 512, and you are going to compare 512 bytes on every read?
Isn't it a very bad and wasteful solution? May be you could invent
something less straight-forward but more optimal?

Let's suppose ECC corresponding to all 0xFFs is P. May be you could do
this wasteful check only if ECC == P, not all the time?

Also, Ivan pointed you the right thing - you might have bit-flips on an
erased eraseblock. If not on freshly, then on an erasblock which was
erased and then not used for long time. If this is not of your concern,
then at least write this in the comments and tell that you know about
this issue but for reasons X and Y you do not handle them.

Thanks!
Vipin Kumar - April 1, 2011, 6:28 a.m.
On 3/31/2011 7:21 PM, Artem Bityutskiy wrote:
> Hi,
> 

Hello Artem,

> sorry for late reply, was too busy and also missed your patch somehow.
> 
> On Thu, 2011-02-24 at 11:40 +0530, Viresh Kumar wrote:
>> From: Vipin Kumar <vipin.kumar@st.com>
>>
>> A newly erased page contains ff in data as well as spare area. While reading an
>> erased page, the read out ecc from spare area does not match the ecc generated
>> by fsmc ecc hardware accelarator. This is because ecc of data ff ff is not ff
>> ff. This leads to errors when jffs2 fs erases and reads back the pages to
>> ensure consistency.
>>
>> This patch adds a software workaround to ensure that the ecc check is not
>> performed for erased pages. An erased page is checked by checking data as ff ff.
>>
>> Signed-off-by: Vipin Kumar <vipin.kumar@st.com>
>> Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
>> Acked-by: Linus Walleij <linus.walleij@stericsson.com>
> 
> ...
> 
>> +		/*
>> +		 * This is a temporary erase check. A newly erased page read
>> +		 * would result in an ecc error because the oob data is also
>> +		 * erased to FF and the calculated ecc for an FF data is not
>> +		 * FF..FF.
>> +		 * This is a workaround to skip performing correction in case
>> +		 * data is FF..FF
>> +		 */
>> +		for (k = 0; k < eccsize; k++) {
>> +			if (*(p + k) != 0xff)
>> +				break;
>> +		}
>> +
> 
> So eccsize is 512, and you are going to compare 512 bytes on every read?
> Isn't it a very bad and wasteful solution? May be you could invent
> something less straight-forward but more optimal?
> 
> Let's suppose ECC corresponding to all 0xFFs is P. May be you could do
> this wasteful check only if ECC == P, not all the time?
> 

Basically, the hardware only reports the flipped bit offset and not the actual ecc 
at the time of reading. It reports ecc only at the time of writing. So, comparing 
ECCs is not a possibility while reading.

The other solution can be to write P into OOB every time a page is erased. In that 
case we put a hook in erase. But since the read implementation was anyway specific 
for fsmc bch ecc4 scheme, this implementation was more convinient...

> Also, Ivan pointed you the right thing - you might have bit-flips on an
> erased eraseblock. If not on freshly, then on an erasblock which was
> erased and then not used for long time. If this is not of your concern,

In that case an ecc error would be reported since the ecc wont match the stored 
ecc i.e FFFF and the driver would mark it as a normal corrupted page

> then at least write this in the comments and tell that you know about
> this issue but for reasons X and Y you do not handle them.
> 

OK. I would add this in the patch comments

> Thanks!
> 

Regards
Vipin
Artem Bityutskiy - April 1, 2011, 6:51 a.m.
On Fri, 2011-04-01 at 11:58 +0530, Vipin Kumar wrote:
> On 3/31/2011 7:21 PM, Artem Bityutskiy wrote:
> > Hi,
> > 
> 
> Hello Artem,
> 
> > sorry for late reply, was too busy and also missed your patch somehow.
> > 
> > On Thu, 2011-02-24 at 11:40 +0530, Viresh Kumar wrote:
> >> From: Vipin Kumar <vipin.kumar@st.com>
> >>
> >> A newly erased page contains ff in data as well as spare area. While reading an
> >> erased page, the read out ecc from spare area does not match the ecc generated
> >> by fsmc ecc hardware accelarator. This is because ecc of data ff ff is not ff
> >> ff. This leads to errors when jffs2 fs erases and reads back the pages to
> >> ensure consistency.
> >>
> >> This patch adds a software workaround to ensure that the ecc check is not
> >> performed for erased pages. An erased page is checked by checking data as ff ff.
> >>
> >> Signed-off-by: Vipin Kumar <vipin.kumar@st.com>
> >> Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
> >> Acked-by: Linus Walleij <linus.walleij@stericsson.com>
> > 
> > ...
> > 
> >> +		/*
> >> +		 * This is a temporary erase check. A newly erased page read
> >> +		 * would result in an ecc error because the oob data is also
> >> +		 * erased to FF and the calculated ecc for an FF data is not
> >> +		 * FF..FF.
> >> +		 * This is a workaround to skip performing correction in case
> >> +		 * data is FF..FF
> >> +		 */
> >> +		for (k = 0; k < eccsize; k++) {
> >> +			if (*(p + k) != 0xff)
> >> +				break;
> >> +		}
> >> +
> > 
> > So eccsize is 512, and you are going to compare 512 bytes on every read?
> > Isn't it a very bad and wasteful solution? May be you could invent
> > something less straight-forward but more optimal?
> > 
> > Let's suppose ECC corresponding to all 0xFFs is P. May be you could do
> > this wasteful check only if ECC == P, not all the time?
> > 
> 
> Basically, the hardware only reports the flipped bit offset and not the actual ecc 
> at the time of reading. It reports ecc only at the time of writing. So, comparing 
> ECCs is not a possibility while reading.

Sorry, my suggestion was stupid.

But can you do like this?

1. Read ECC.
2. Compare it to all 0xFFs
3. If all 0xFFs, then check the page contents against 0xFF.
4. If not all 0xFFs, then this is not an erased page for sure, because
erased pages have to have all 0xFFs in OOB.

> The other solution can be to write P into OOB every time a page is erased. In that 
> case we put a hook in erase.

You cannot write anything to the ECC positions in OOB after erasure
because if you do this, how are you going to store real ECC in the
subsequent writes?

> > Also, Ivan pointed you the right thing - you might have bit-flips on an
> > erased eraseblock. If not on freshly, then on an erasblock which was
> > erased and then not used for long time. If this is not of your concern,
> 
> In that case an ecc error would be reported since the ecc wont match the stored 
> ecc i.e FFFF and the driver would mark it as a normal corrupted page

I'm confused. So you erased eraseblock A. Everything there contains
0xFFs, including the OOB area.

Now you have one of the modern lashes. You gen bit-flips in the page.
Say, a couple of bits flip. You read this page. You compare the contents
of the page with 0xFF and find out that the contends in not all 0xFFs.
What do you do then?
Vipin Kumar - April 1, 2011, 8:33 a.m.
On 4/1/2011 12:21 PM, Artem Bityutskiy wrote:
> On Fri, 2011-04-01 at 11:58 +0530, Vipin Kumar wrote:
>> On 3/31/2011 7:21 PM, Artem Bityutskiy wrote:
>>> Hi,
>>>
>>
>> Hello Artem,
>>

Hello Artem,

>>> sorry for late reply, was too busy and also missed your patch somehow.
>>>
>>> On Thu, 2011-02-24 at 11:40 +0530, Viresh Kumar wrote:
>>>> From: Vipin Kumar <vipin.kumar@st.com>
>>>>
>>>> A newly erased page contains ff in data as well as spare area. While reading an
>>>> erased page, the read out ecc from spare area does not match the ecc generated
>>>> by fsmc ecc hardware accelarator. This is because ecc of data ff ff is not ff
>>>> ff. This leads to errors when jffs2 fs erases and reads back the pages to
>>>> ensure consistency.
>>>>
>>>> This patch adds a software workaround to ensure that the ecc check is not
>>>> performed for erased pages. An erased page is checked by checking data as ff ff.
>>>>
>>>> Signed-off-by: Vipin Kumar <vipin.kumar@st.com>
>>>> Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
>>>> Acked-by: Linus Walleij <linus.walleij@stericsson.com>
>>>
>>> ...
>>>
>>>> +		/*
>>>> +		 * This is a temporary erase check. A newly erased page read
>>>> +		 * would result in an ecc error because the oob data is also
>>>> +		 * erased to FF and the calculated ecc for an FF data is not
>>>> +		 * FF..FF.
>>>> +		 * This is a workaround to skip performing correction in case
>>>> +		 * data is FF..FF
>>>> +		 */
>>>> +		for (k = 0; k < eccsize; k++) {
>>>> +			if (*(p + k) != 0xff)
>>>> +				break;
>>>> +		}
>>>> +
>>>
>>> So eccsize is 512, and you are going to compare 512 bytes on every read?
>>> Isn't it a very bad and wasteful solution? May be you could invent
>>> something less straight-forward but more optimal?
>>>
>>> Let's suppose ECC corresponding to all 0xFFs is P. May be you could do
>>> this wasteful check only if ECC == P, not all the time?
>>>
>>
>> Basically, the hardware only reports the flipped bit offset and not the actual ecc 
>> at the time of reading. It reports ecc only at the time of writing. So, comparing 
>> ECCs is not a possibility while reading.
> 
> Sorry, my suggestion was stupid.
> 
> But can you do like this?
> 
> 1. Read ECC.
> 2. Compare it to all 0xFFs
> 3. If all 0xFFs, then check the page contents against 0xFF.
> 4. If not all 0xFFs, then this is not an erased page for sure, because
> erased pages have to have all 0xFFs in OOB.
> 

Yes, OK

>> The other solution can be to write P into OOB every time a page is erased. In that 
>> case we put a hook in erase.
> 
> You cannot write anything to the ECC positions in OOB after erasure
> because if you do this, how are you going to store real ECC in the
> subsequent writes?
> 

Yes, you are right...This is not a feasible solution

>>> Also, Ivan pointed you the right thing - you might have bit-flips on an
>>> erased eraseblock. If not on freshly, then on an erasblock which was
>>> erased and then not used for long time. If this is not of your concern,
>>
>> In that case an ecc error would be reported since the ecc wont match the stored 
>> ecc i.e FFFF and the driver would mark it as a normal corrupted page
> 
> I'm confused. So you erased eraseblock A. Everything there contains
> 0xFFs, including the OOB area.
> 
> Now you have one of the modern lashes. You gen bit-flips in the page.
> Say, a couple of bits flip. You read this page. You compare the contents
> of the page with 0xFF and find out that the contends in not all 0xFFs.
> What do you do then?
> 

Then, the normal driver takes over and it reports an error because the 
number of errors in the page are beyond 8 bits (maximum the FSMC ecc 
logic can correct). Effectively speaking, the read page returns an error 
indicating that the page could not be read properly

Ideally, any filesystem would mark it as a bad block

Regards
Vipin
Artem Bityutskiy - April 1, 2011, 8:39 a.m.
On Fri, 2011-04-01 at 14:03 +0530, Vipin Kumar wrote:
> >>> Also, Ivan pointed you the right thing - you might have bit-flips
> on an
> >>> erased eraseblock. If not on freshly, then on an erasblock which
> was
> >>> erased and then not used for long time. If this is not of your
> concern,
> >>
> >> In that case an ecc error would be reported since the ecc wont
> match the stored 
> >> ecc i.e FFFF and the driver would mark it as a normal corrupted
> page
> > 
> > I'm confused. So you erased eraseblock A. Everything there contains
> > 0xFFs, including the OOB area.
> > 
> > Now you have one of the modern lashes. You gen bit-flips in the
> page.
> > Say, a couple of bits flip. You read this page. You compare the
> contents
> > of the page with 0xFF and find out that the contends in not all
> 0xFFs.
> > What do you do then?
> > 
> 
> Then, the normal driver takes over and it reports an error because
> the 
> number of errors in the page are beyond 8 bits (maximum the FSMC ecc 
> logic can correct).

Why 8? It may be just 1 single bit-flip. Just one bits becomes 0 instead
of 1.

>  Effectively speaking, the read page returns an error 
> indicating that the page could not be read properly

But why? It can be read properly. If this is just 1 wrong bit, you
should be able to correct it. And as Ivan indicated, modern flashes are
so crappy that 1 bit-flip on erased eraseblock is just normal there.

> Ideally, any filesystem would mark it as a bad block 

That's the point - no. This is normal on modern flashes.

I think one solution could be that you make your check more
sophisticated. You check for 0xFFs, if this is not true, you see is this
"almost all 0xFFs" and count amount of non-0xFF bits. If the count is,
say, 2, you assume this page contains all 0xFFs plus 2 bit-flips. But
I'm not sure it would work.

Anyway, If you do not care about such bit-flips for your SoC - fine. I
just wanted you to understand and accept the issue and write about it in
the comment. And I also wanted you to _not_ do expensive 0xFF comparison
every time - but it seems you accepted this :-)
Vipin Kumar - April 1, 2011, 9:06 a.m.
On 4/1/2011 2:09 PM, Artem Bityutskiy wrote:
> On Fri, 2011-04-01 at 14:03 +0530, Vipin Kumar wrote:
>>>>> Also, Ivan pointed you the right thing - you might have bit-flips
>> on an
>>>>> erased eraseblock. If not on freshly, then on an erasblock which
>> was
>>>>> erased and then not used for long time. If this is not of your
>> concern,
>>>>
>>>> In that case an ecc error would be reported since the ecc wont
>> match the stored 
>>>> ecc i.e FFFF and the driver would mark it as a normal corrupted
>> page
>>>
>>> I'm confused. So you erased eraseblock A. Everything there contains
>>> 0xFFs, including the OOB area.
>>>
>>> Now you have one of the modern lashes. You gen bit-flips in the
>> page.
>>> Say, a couple of bits flip. You read this page. You compare the
>> contents
>>> of the page with 0xFF and find out that the contends in not all
>> 0xFFs.
>>> What do you do then?
>>>
>>
>> Then, the normal driver takes over and it reports an error because
>> the 
>> number of errors in the page are beyond 8 bits (maximum the FSMC ecc 
>> logic can correct).
> 
> Why 8? It may be just 1 single bit-flip. Just one bits becomes 0 instead
> of 1.
> 

It is a maximum of 8. So, the logic can correct any number of bitflips from 
1 to 8 in 512 bytes of data

>>  Effectively speaking, the read page returns an error 
>> indicating that the page could not be read properly
> 
> But why? It can be read properly. If this is just 1 wrong bit, you
> should be able to correct it. And as Ivan indicated, modern flashes are
> so crappy that 1 bit-flip on erased eraseblock is just normal there.
> 

That's the problem. Ideally the ecc should have been programmed in OOB and then 
the driver would be able to correct the flipped bits. The problem happens only 
if we try to read the erased pages.

>> Ideally, any filesystem would mark it as a bad block 
> 
> That's the point - no. This is normal on modern flashes.
> 
> I think one solution could be that you make your check more
> sophisticated. You check for 0xFFs, if this is not true, you see is this
> "almost all 0xFFs" and count amount of non-0xFF bits. If the count is,
> say, 2, you assume this page contains all 0xFFs plus 2 bit-flips. But
> I'm not sure it would work.
> 
> Anyway, If you do not care about such bit-flips for your SoC - fine. I
> just wanted you to understand and accept the issue and write about it in
> the comment. And I also wanted you to _not_ do expensive 0xFF comparison
> every time - but it seems you accepted this :-)
> 

Yes, I had to accept this :-)
The flip side is that the hardware itself should not report errors when it 
reads all ff data and ff ecc..It should assume it as an erased page and not 
report any errors

Yes, I understand the issue and would write more details about it in the 
patch comment. Moreover, the comparison id only expensive when the page 
contains all FFs.. Most of the time comparison would fail at first few bytes

Regards
Vipin
Artem Bityutskiy - April 1, 2011, 9:42 a.m.
On Fri, 2011-04-01 at 14:36 +0530, Vipin Kumar wrote:
> Yes, I understand the issue and would write more details about it in the 
> patch comment. Moreover, the comparison id only expensive when the page 
> contains all FFs.. Most of the time comparison would fail at first few bytes

OK, actually yes, probably this ECC check is not needed. Anyway, will
you send the updated patch please?
Ivan Djelic - April 1, 2011, 12:14 p.m.
On Fri, Apr 01, 2011 at 10:06:51AM +0100, Vipin Kumar wrote:
> That's the problem. Ideally the ecc should have been programmed in OOB and then 
> the driver would be able to correct the flipped bits. The problem happens only 
> if we try to read the erased pages.
> 
> >> Ideally, any filesystem would mark it as a bad block 
> > 
> > That's the point - no. This is normal on modern flashes.
> > 
> > I think one solution could be that you make your check more
> > sophisticated. You check for 0xFFs, if this is not true, you see is this
> > "almost all 0xFFs" and count amount of non-0xFF bits. If the count is,
> > say, 2, you assume this page contains all 0xFFs plus 2 bit-flips. But
> > I'm not sure it would work.
> > 
> > Anyway, If you do not care about such bit-flips for your SoC - fine. I
> > just wanted you to understand and accept the issue and write about it in
> > the comment. And I also wanted you to _not_ do expensive 0xFF comparison
> > every time - but it seems you accepted this :-)
> > 
> 
> Yes, I had to accept this :-)
> The flip side is that the hardware itself should not report errors when it 
> reads all ff data and ff ecc..It should assume it as an erased page and not 
> report any errors
 
Hello Vipin,

Did you consider this idea: if you have an unused byte available in oob,
program it to 0x00 when a page is programmed.

That way, you just need to check a single byte when you read a page in order
to distinguish erased pages from programmed pages. And by counting the number
of 1s in the byte, you can be robust to bitflips.

As a special refinement, you could also "cleanup" pages detected as erased, in
order to iron out possible bitflips.

I think that this method is used by Micron for their internal on-die ecc engine:
they add a parity byte (0x00 or 0x01) to their BCH code, which can be used to:

1) detect failures (using parity) when the max error count is reached
2) distinguish between erased and programmed pages

Best Regards,

Ivan
Artem Bityutskiy - April 1, 2011, 1:04 p.m.
On Fri, 2011-04-01 at 14:14 +0200, Ivan Djelic wrote:
> Did you consider this idea: if you have an unused byte available in oob,
> program it to 0x00 when a page is programmed.

I guess this depends on the the controller, but probably this could mean
a substantial write  overhead, not?

> That way, you just need to check a single byte when you read a page in order
> to distinguish erased pages from programmed pages. And by counting the number
> of 1s in the byte, you can be robust to bitflips.

Could you please explain some more what do you mean? So you write the
0x00 byte. Then when you read, you count the number of "1" bits in the
byte. And what exactly this count gives you and to which bitflips you
become robust?

> As a special refinement, you could also "cleanup" pages detected as erased, in
> order to iron out possible bitflips.

What do you mean by "cleanup" a page?

Thanks!
Ivan Djelic - April 1, 2011, 2:04 p.m.
On Fri, Apr 01, 2011 at 02:04:41PM +0100, Artem Bityutskiy wrote:
> On Fri, 2011-04-01 at 14:14 +0200, Ivan Djelic wrote:
> > Did you consider this idea: if you have an unused byte available in oob,
> > program it to 0x00 when a page is programmed.
> 
> I guess this depends on the the controller, but probably this could mean
> a substantial write  overhead, not?

Sorry, my explanation was probably not very clear.

When you program a page, you send data and oob contents to your NAND controller.
The controller may modify oob contents to add ECC bytes, in a more or less
automatic way, then it will physically program the page.

So it's up to you to decide what to program into _free_ oob bytes. And you can
decide to systematically put 0x00 in a specific byte, to indicate that the page
has been programmed. Think of it as a marker (a bit like the bad block marker).

Vipin just needs to change a little bit the oob layout of his driver, removing
a free byte (assuming he has some free bytes left) and clear this byte in the
oob array before writing a page - no overhead.

When the page oob is read back, he will only need to check his marker in the
oob array. If the marker is 0xff, the page has not been programmed. If the
marker is 0x00, the page has been programmed.

> > That way, you just need to check a single byte when you read a page in order
> > to distinguish erased pages from programmed pages. And by counting the number
> > of 1s in the byte, you can be robust to bitflips.
> 
> Could you please explain some more what do you mean? So you write the
> 0x00 byte. Then when you read, you count the number of "1" bits in the
> byte. And what exactly this count gives you and to which bitflips you
> become robust?

I meant that you can be robust to bitflips occurring in the marker byte itself.
If you just compare the marker byte to 0x00 or 0xff, you will not be able to
handle values such as 0x01 or 0x7f (occuring because of bitflips).
You can count the number of 1s in the marker, and decide for instance:
count < 4   => marker is 0x00
count >= 4  => marker is 0xff
That way, the marker is reliable up to 3 bitflips (which is very improbable in
the same byte).

> > As a special refinement, you could also "cleanup" pages detected as erased, in
> > order to iron out possible bitflips.
> 
> What do you mean by "cleanup" a page?

Let us assume Vipin implements the marker idea. He reads a page, and the marker
test tells him the page has never been programmed. So he will avoid performing
ECC correction on that page. But that page could nevertheless contains bitflips
(i.e. not all bytes equal to 0xFF). He could memset page data with 0xFFs to
blindly remove possible bitflips, as if the ECC engine had fixed them.

Best Regards,

Ivan
Artem Bityutskiy - April 1, 2011, 2:16 p.m.
On Fri, 2011-04-01 at 16:04 +0200, Ivan Djelic wrote:
> On Fri, Apr 01, 2011 at 02:04:41PM +0100, Artem Bityutskiy wrote:
> > On Fri, 2011-04-01 at 14:14 +0200, Ivan Djelic wrote:
> > > Did you consider this idea: if you have an unused byte available in oob,
> > > program it to 0x00 when a page is programmed.
> > 
> > I guess this depends on the the controller, but probably this could mean
> > a substantial write  overhead, not?
> 
> Sorry, my explanation was probably not very clear.
> 
> When you program a page, you send data and oob contents to your NAND controller.
> The controller may modify oob contents to add ECC bytes, in a more or less
> automatic way, then it will physically program the page.

Right, I just assumed that some controllers allow you to send only data,
and programming OOB would be a separate operation. But if this is not
the case in ST's HW - then this sounds like a very good approach!

> > > That way, you just need to check a single byte when you read a page in order
> > > to distinguish erased pages from programmed pages. And by counting the number
> > > of 1s in the byte, you can be robust to bitflips.
> > 
> > Could you please explain some more what do you mean? So you write the
> > 0x00 byte. Then when you read, you count the number of "1" bits in the
> > byte. And what exactly this count gives you and to which bitflips you
> > become robust?
> 
> I meant that you can be robust to bitflips occurring in the marker byte itself.
> If you just compare the marker byte to 0x00 or 0xff, you will not be able to
> handle values such as 0x01 or 0x7f (occuring because of bitflips).
> You can count the number of 1s in the marker, and decide for instance:
> count < 4   => marker is 0x00
> count >= 4  => marker is 0xff
> That way, the marker is reliable up to 3 bitflips (which is very improbable in
> the same byte).

I see.

I'm curious, if the OOB area is so unreliable, how can we trust it to
store our ECC? We need a mini-ECC for the ECC then?

> Let us assume Vipin implements the marker idea. He reads a page, and the marker
> test tells him the page has never been programmed. So he will avoid performing
> ECC correction on that page. But that page could nevertheless contains bitflips
> (i.e. not all bytes equal to 0xFF). He could memset page data with 0xFFs to
> blindly remove possible bitflips, as if the ECC engine had fixed them.

Ah, I see. Although hiding bit-flips from upper layers might be not the
best idea sometimes.
Ivan Djelic - April 1, 2011, 2:49 p.m.
On Fri, Apr 01, 2011 at 03:16:01PM +0100, Artem Bityutskiy wrote:
> > That way, the marker is reliable up to 3 bitflips (which is very improbable in
> > the same byte).
> 
> I see.
> 
> I'm curious, if the OOB area is so unreliable, how can we trust it to
> store our ECC? We need a mini-ECC for the ECC then?

The oob area is no different from the data area (same technology).
Fortunately, Hamming and BCH codes (and usual ECC codes in general) actually
have built-in robustness to corruption.

But your point is still valid: when people want to use oob for storing other
things than ECC bytes, say filesystem metadata, they do have a problem; and they
need to make sure that either:
- ECC covers data AND oob metadata (easy to do with BCH, difficult with Hamming)
or
- extra ECC bytes are added to protect oob metadata

YAFFS implements the latter solution.

> > Let us assume Vipin implements the marker idea. He reads a page, and the marker
> > test tells him the page has never been programmed. So he will avoid performing
> > ECC correction on that page. But that page could nevertheless contains bitflips
> > (i.e. not all bytes equal to 0xFF). He could memset page data with 0xFFs to
> > blindly remove possible bitflips, as if the ECC engine had fixed them.
> 
> Ah, I see. Although hiding bit-flips from upper layers might be not the
> best idea sometimes.

I agree, this "blind" cleanup is not such a good idea after all :)

BR,
--
Ivan
Ricard Wanderlof - April 1, 2011, 2:58 p.m.
On Fri, 1 Apr 2011, Ivan Djelic wrote:

>> I'm curious, if the OOB area is so unreliable, how can we trust it to
>> store our ECC? We need a mini-ECC for the ECC then?
>
> The oob area is no different from the data area (same technology).
> Fortunately, Hamming and BCH codes (and usual ECC codes in general) actually
> have built-in robustness to corruption.

At least with Hamming, when the ECC has been calculated, if it differs 
from the ECC read from the oob, and the difference is just a single bit, 
it is assumed that the ECC stored in the oob was subjected to a bitflip, 
and the data is in fact ok. (A long time ago there was a bug in this in 
mtd, so that a (single) bit flip in the ECC bytes was flagged as an ECC 
failure).

I don't know how BCH (and other multiple-error-bit algorithms) deals with 
this though. Does it assume that there will be at most one bitflip within 
the ECC data (which is much smaller than the bytes covered by the ECC), or 
does it accept multiple bit errors also in the ECC data? One would assume 
that it would start to get difficult to distinguish multiple bit errors in 
the ECC data with valid ECC indicating bit errors in the data, but perhaps 
this isn't so thanks to the algorithm used?

/Ricard
Ivan Djelic - April 1, 2011, 3:46 p.m.
On Fri, Apr 01, 2011 at 03:58:47PM +0100, Ricard Wanderlof wrote:
> 
> On Fri, 1 Apr 2011, Ivan Djelic wrote:
> 
> >> I'm curious, if the OOB area is so unreliable, how can we trust it to
> >> store our ECC? We need a mini-ECC for the ECC then?
> >
> > The oob area is no different from the data area (same technology).
> > Fortunately, Hamming and BCH codes (and usual ECC codes in general) actually
> > have built-in robustness to corruption.
> 
> At least with Hamming, when the ECC has been calculated, if it differs 
> from the ECC read from the oob, and the difference is just a single bit, 
> it is assumed that the ECC stored in the oob was subjected to a bitflip, 
> and the data is in fact ok. (A long time ago there was a bug in this in 
> mtd, so that a (single) bit flip in the ECC bytes was flagged as an ECC 
> failure).

Exactly, because a single bitflip in data results in a 12-bit flip in a Hamming
24-bit code. Therefore you can easily distinguish a single bitflip in data
(12 bits change) from a single bitflip in the ECC code (1 bit change).

> I don't know how BCH (and other multiple-error-bit algorithms) deals with 
> this though. Does it assume that there will be at most one bitflip within 
> the ECC data (which is much smaller than the bytes covered by the ECC), or 
> does it accept multiple bit errors also in the ECC data? One would assume 
> that it would start to get difficult to distinguish multiple bit errors in 
> the ECC data with valid ECC indicating bit errors in the data, but perhaps 
> this isn't so thanks to the algorithm used?

With BCH and Reed-Solomon codes, data bytes and ecc bytes form a single
codeword. This whole codeword can be corrected, not just the data part.

Think of codewords as points in a space of binary words. In this space of
binary words, distance is measured as the number of differing bits between two
words.

A codeword has a remarkable property: it is distant from other codewords by at
least 2t+1 bits. You program the codeword to flash, and later read it back as a
(potentially) corrupted binary word. If you assume that this binary word is
distant from at most t bits from the original codeword (i.e. no more than t
bitflips occured in flash), then you can retrieve the original codeword (it is
the closest to your binary word).

The whole point of BCH (or RS) codes it to build such a codeword from the input
data, by adding the right ECC bits.

BR,

Ivan
Ivan Djelic - April 1, 2011, 4:09 p.m.
On Fri, Apr 01, 2011 at 03:16:01PM +0100, Artem Bityutskiy wrote:
> On Fri, 2011-04-01 at 16:04 +0200, Ivan Djelic wrote:
> > On Fri, Apr 01, 2011 at 02:04:41PM +0100, Artem Bityutskiy wrote:
> > > On Fri, 2011-04-01 at 14:14 +0200, Ivan Djelic wrote:
> > > > Did you consider this idea: if you have an unused byte available in oob,
> > > > program it to 0x00 when a page is programmed.
> > > 
> > > I guess this depends on the the controller, but probably this could mean
> > > a substantial write  overhead, not?
> > 
> > Sorry, my explanation was probably not very clear.
> > 
> > When you program a page, you send data and oob contents to your NAND controller.
> > The controller may modify oob contents to add ECC bytes, in a more or less
> > automatic way, then it will physically program the page.
> 
> Right, I just assumed that some controllers allow you to send only data,
> and programming OOB would be a separate operation. But if this is not
> the case in ST's HW - then this sounds like a very good approach!

I had a quick look at fsmc_nand.c, and I don't see anything in the controller
preventing this approach. The driver provides an IO_ADDR_W address for sending
data to the NAND device, and lets mtd upper layers do the job. By implementing
the page writing function in the driver, one could clear the marker in the oob
array before programming (and check it in the page reading function).

I would be very surprised if the controller did not allow control over oob
contents other than ecc bytes. By doing so, it would prevent things like
software bad block marking. But sometimes hardware can be very surprising :)

BR,

Ivan
Artem Bityutskiy - April 1, 2011, 4:16 p.m.
On Fri, 2011-04-01 at 18:09 +0200, ext Ivan Djelic wrote:
> I had a quick look at fsmc_nand.c, and I don't see anything in the controller
> preventing this approach. The driver provides an IO_ADDR_W address for sending
> data to the NAND device, and lets mtd upper layers do the job. By implementing
> the page writing function in the driver, one could clear the marker in the oob
> array before programming (and check it in the page reading function).

OK, cool!

> I would be very surprised if the controller did not allow control over oob
> contents other than ecc bytes. By doing so, it would prevent things like
> software bad block marking. But sometimes hardware can be very surprising :)

Well, I actually meant that programming OOB could be a separate
operation (not impossible): instead of "write data, oob data and ECC in
one go", something like "write data and oob, wait for completion, write
additional oob byte". So I meant that this could have write speed
impact. But this does not matter, since this is not the case now.

Patch

diff --git a/drivers/mtd/nand/fsmc_nand.c b/drivers/mtd/nand/fsmc_nand.c
index 205b10b..7df270a 100644
--- a/drivers/mtd/nand/fsmc_nand.c
+++ b/drivers/mtd/nand/fsmc_nand.c
@@ -420,7 +420,7 @@  static int fsmc_read_page_hwecc(struct mtd_info *mtd, struct nand_chip *chip,
 	struct fsmc_nand_data *host = container_of(mtd,
 					struct fsmc_nand_data, mtd);
 	struct fsmc_eccplace *ecc_place = host->ecc_place;
-	int i, j, s, stat, eccsize = chip->ecc.size;
+	int i, j, k, s, stat, eccsize = chip->ecc.size;
 	int eccbytes = chip->ecc.bytes;
 	int eccsteps = chip->ecc.steps;
 	uint8_t *p = buf;
@@ -460,11 +460,27 @@  static int fsmc_read_page_hwecc(struct mtd_info *mtd, struct nand_chip *chip,
 		memcpy(&ecc_code[i], oob, 13);
 		chip->ecc.calculate(mtd, p, &ecc_calc[i]);
 
-		stat = chip->ecc.correct(mtd, p, &ecc_code[i], &ecc_calc[i]);
-		if (stat < 0)
-			mtd->ecc_stats.failed++;
-		else
-			mtd->ecc_stats.corrected += stat;
+		/*
+		 * This is a temporary erase check. A newly erased page read
+		 * would result in an ecc error because the oob data is also
+		 * erased to FF and the calculated ecc for an FF data is not
+		 * FF..FF.
+		 * This is a workaround to skip performing correction in case
+		 * data is FF..FF
+		 */
+		for (k = 0; k < eccsize; k++) {
+			if (*(p + k) != 0xff)
+				break;
+		}
+
+		if (k < eccsize) {
+			stat = chip->ecc.correct(mtd, p, &ecc_code[i],
+					&ecc_calc[i]);
+			if (stat < 0)
+				mtd->ecc_stats.failed++;
+			else
+				mtd->ecc_stats.corrected += stat;
+		}
 	}
 
 	return 0;