Patchwork mtd/ubi: recognize empty flash with errors as empty

Submitter Artem Bityutskiy
Date April 24, 2010, 11:24 a.m.
Message ID <1272108241.11751.1635.camel@localhost.localdomain>
Permalink /patch/50886/
State New

Comments

Artem Bityutskiy - April 24, 2010, 11:24 a.m.
On Fri, 2010-04-23 at 19:28 +0200, Sebastian Andrzej Siewior wrote:
> From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> 
> Attaching empty nand with a block which contains a RS-Error which can't
> be fixed resulted in:
> 
> | UBI: attaching mtd9 to ubi0
> | UBI error: ubi_io_read: error -74 while reading 64 bytes from PEB 3399:0, read 64 bytes
> | Call Trace:
> | [cfbd5c60] [c0008558] show_stack+0x48/0x19c (unreliable)
> | [cfbd5ca0] [c01a71e8] ubi_io_read+0x188/0x288
> | [cfbd5cf0] [c01a76e8] ubi_io_read_ec_hdr+0x74/0x2a4
> | [cfbd5d20] [c01abe9c] ubi_scan+0x178/0x10b4
> | [cfbd5d80] [c01a1464] ubi_attach_mtd_dev+0x67c/0xe44
> | [cfbd5e80] [c01a1fc8] ctrl_cdev_ioctl+0x178/0x210
> | [cfbd5ec0] [c008711c] do_ioctl+0x3c/0xc4
> | [cfbd5ee0] [c0087224] vfs_ioctl+0x80/0x448
> | [cfbd5f10] [c008762c] sys_ioctl+0x40/0x88
> | [cfbd5f40] [c000f960] ret_from_syscall+0x0/0x38
> | UBI error: ubi_read_volume_table: the layout volume was not found
> | UBI error: ubi_attach_mtd_dev: failed to attach by scanning, error -22
> 
> Assuming that blocks which can only be read with errors are empty will let
> the volume attach. Another access to the block in question resulted here
> in:
> 
> | UBI error: ubi_io_read: error -74 while reading 64 bytes from PEB 3399:0, read 64 bytes
> | nand_erase: Failed erase, page 0x000751c0
> | nand_erase: Failed erase, page 0x000751c0
> | nand_erase: Failed erase, page 0x000751c0
> | nand_erase: Failed erase, page 0x000751c0
> | UBI error: do_sync_erase: cannot erase PEB 3399, error -5
> | UBI error: erase_worker: failed to erase PEB 3399, error -5
> | UBI: mark PEB 3399 as bad
> | UBI: 39 PEBs left in the reserve
> 
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>  drivers/mtd/ubi/scan.c |    4 +++-
>  1 files changed, 3 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/mtd/ubi/scan.c b/drivers/mtd/ubi/scan.c
> index dc5f688..7529d46 100644
> --- a/drivers/mtd/ubi/scan.c
> +++ b/drivers/mtd/ubi/scan.c
> @@ -756,7 +756,8 @@ static int process_eb(struct ubi_device *ubi, struct ubi_scan_info *si,
>  		bitflips = 1;
>  	}
>  
> -	si->is_empty = 0;
> +	if (err != UBI_IO_BAD_EC_HDR)
> +		si->is_empty = 0;
>  
>  	if (!ec_corr) {
>  		int image_seq;
> @@ -827,6 +828,7 @@ static int process_eb(struct ubi_device *ubi, struct ubi_scan_info *si,
>  			return err;
>  		goto adjust_mean_ec;
>  	}
> +	si->is_empty = 0;
>  
>  	vol_id = be32_to_cpu(vidh->vol_id);
>  	if (vol_id > UBI_MAX_VOLUMES && vol_id != UBI_LAYOUT_VOLUME_ID) {

Thanks, pushed to ubi-2.6.git / master with the following minor tweak,
please check:
Sebastian Siewior - April 25, 2010, 9:09 p.m.
* Artem Bityutskiy | 2010-04-24 14:24:01 [+0300]:

>Thanks, pushed to ubi-2.6.git / master with the following minor tweak,
>please check:
>
>diff --git a/drivers/mtd/ubi/scan.c b/drivers/mtd/ubi/scan.c
>index 7529d46..48e570c 100644
>--- a/drivers/mtd/ubi/scan.c
>+++ b/drivers/mtd/ubi/scan.c
>@@ -756,12 +756,12 @@ static int process_eb(struct ubi_device *ubi, struct ubi_scan_info *si,
>                bitflips = 1;
>        }
> 
>-       if (err != UBI_IO_BAD_EC_HDR)
>-               si->is_empty = 0;
>-
>        if (!ec_corr) {
>                int image_seq;
> 
>+               /* There is an EC header, so the flash is not empty */
>+               si->is_empty = 0;
>+
>                /* Make sure UBI version is OK */
>                if (ech->version != UBI_VERSION) {
>                        ubi_err("this UBI version is %d, image version is %d",
>

I guess that's okay. What are the chances that you can't read the EC
header but you can somehow read the VID header. AND if there is a valid
VID header then there is more, and si->is_empty will be set later on,
right?

Sebastian
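
The flag handling being asked about can be modelled in a few lines. This is a minimal sketch, assuming (as the two diffs suggest) that si->is_empty starts out set and is cleared either when a valid EC header is found or once a valid VID header has been read. `struct scan_info`, `enum hdr_status` and `scan_peb()` are hypothetical stand-ins for illustration, not the kernel's `process_eb()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the scan state. */
struct scan_info {
	bool is_empty;	/* starts true, cleared once a UBI header is seen */
};

/* Assumed: HDR_BAD covers both read errors and invalid headers. */
enum hdr_status { HDR_OK, HDR_BAD };

/* Model of scanning one eraseblock with the tweaked flag handling. */
static void scan_peb(struct scan_info *si, enum hdr_status ec,
		     enum hdr_status vid)
{
	assert(si != NULL);

	if (ec == HDR_OK)
		si->is_empty = false;	/* there is an EC header */

	if (vid != HDR_OK)
		return;			/* no usable VID header in this block */

	si->is_empty = false;		/* a valid VID header also counts */
}
```

In this model a block where both headers are unreadable leaves the flag set, while a block with a bad EC header but a valid VID header still clears it through the second assignment.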
Artem Bityutskiy - April 26, 2010, 4:59 a.m.
On Sun, 2010-04-25 at 23:09 +0200, Sebastian Andrzej Siewior wrote:
> I guess that's okay. What are the chances that you can't read the EC
> header but you can somehow read the VID header. 

When the VID header sits in the next NAND page, there is some chance,
but I never observed such a situation in practice.

> AND if there is a valid
> VID header then there is more, and si->is_empty will be set later on,
> right?

Yes, AFAICS.
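
Whether the two headers share a NAND page depends on where the VID header is placed. A minimal sketch, assuming the VID header starts at the first minimum-I/O-unit boundary after a 64-byte EC header (64 bytes matches the read size in the log earlier in the thread). The helper names and the geometry values mentioned afterwards are illustrative assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed EC header size; matches the 64-byte read in the log. */
#define SKETCH_EC_HDR_SIZE 64u

/* Round x up to the next multiple of a (a must be non-zero). */
static size_t round_up_to(size_t x, size_t a)
{
	assert(a > 0);
	return (x + a - 1) / a * a;
}

/* Offset of the VID header inside the eraseblock (assumption: it
 * follows the EC header at the next min-I/O boundary). */
static size_t vid_hdr_offset(size_t min_io)
{
	return round_up_to(SKETCH_EC_HDR_SIZE, min_io);
}

/* True if the EC header (offset 0) and the VID header fall into the
 * same NAND page. */
static bool headers_share_page(size_t min_io, size_t page_size)
{
	assert(page_size > 0);
	return vid_hdr_offset(min_io) / page_size == 0;
}
```

For example, with 2048-byte pages and 512-byte sub-page writes the VID header lands at offset 512, inside the first page. Without sub-page writes it lands at offset 2048, which is the next page and the case discussed above.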
Sebastian Siewior - April 26, 2010, 8:28 a.m.
* Artem Bityutskiy | 2010-04-26 07:59:50 [+0300]:

>On Sun, 2010-04-25 at 23:09 +0200, Sebastian Andrzej Siewior wrote:
>> I guess that's okay. What are the chances that you can't read the EC
>> header but you can somehow read the VID header. 
>
>When the VID header sits in the next NAND page, there is some chance,
>but I never observed such a situation in practice.
>
>> AND if there is a valid
>> VID header then there is more, and si->is_empty will be set later on,
>> right?
>
>Yes, AFAICS.

Oh. UBI_IO_BAD_EC_HDR / UBI_IO_BAD_VID_HDR is returned when
- the page cannot be read
- the page contains non-UBI information

So I think the latter case is now broken. In fact I just copied some
random things into my mtd partition and after attach & mkvol they were
gone with no error.

So if we want to support something other than UBI, we should
probably add another error code to distinguish between a read
error and an invalid EC / VID header.

Sebastian
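
The suggestion above could look roughly like the following. This is only a sketch under stated assumptions: the enum and `classify_ec_hdr()` are hypothetical names, the magic value is the ASCII string "UBI#", the CRC check is left out, and `read_err` stands for whatever the flash read returned (the -74 in the log is `-EBADMSG` on Linux).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* "UBI#" in ASCII; used here as the expected EC header magic. */
#define SKETCH_EC_HDR_MAGIC 0x55424923u

enum hdr_result {
	HDR_VALID,	/* magic matches (CRC check omitted in this sketch) */
	HDR_ALL_FF,	/* erased flash: every byte is 0xFF */
	HDR_READ_ERROR,	/* the read itself failed, e.g. -EBADMSG from ECC */
	HDR_NOT_UBI	/* readable data that does not start with the magic */
};

static enum hdr_result classify_ec_hdr(int read_err, const uint8_t *buf,
				       size_t len)
{
	size_t i;
	uint32_t magic;

	/* A failed read says nothing about the contents. */
	if (read_err < 0)
		return HDR_READ_ERROR;

	assert(buf != NULL);
	if (len < 4)
		return HDR_NOT_UBI;

	/* Erased flash reads back as all 0xFF bytes. */
	for (i = 0; i < len && buf[i] == 0xFF; i++)
		;
	if (i == len)
		return HDR_ALL_FF;

	/* The magic is stored big-endian in the first four bytes. */
	magic = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
		((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
	return magic == SKETCH_EC_HDR_MAGIC ? HDR_VALID : HDR_NOT_UBI;
}
```

A magic mismatch alone cannot tell foreign data from a badly corrupted header, so the last category is a heuristic rather than a guarantee.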
Artem Bityutskiy - April 29, 2010, 9:42 a.m.
On Mon, 2010-04-26 at 10:28 +0200, Sebastian Andrzej Siewior wrote:
> * Artem Bityutskiy | 2010-04-26 07:59:50 [+0300]:
> 
> >On Sun, 2010-04-25 at 23:09 +0200, Sebastian Andrzej Siewior wrote:
> >> I guess that's okay. What are the chances that you can't read the EC
> >> header but you can somehow read the VID header. 
> >
> >When the VID header sits in the next NAND page, there is some chance,
> >but I never observed such a situation in practice.
> >
> >> AND if there is a valid
> >> VID header then there is more, and si->is_empty will be set later on,
> >> right?
> >
> >Yes, AFAICS.
> 
> Oh. UBI_IO_BAD_EC_HDR / UBI_IO_BAD_VID_HDR is returned when
> - the page cannot be read
> - the page contains non-UBI information

Bear in mind that it is difficult to distinguish between non-UBI
information and just very corrupted headers, so ATM, in case of a CRC
error, UBI assumes this is a corrupted header, although it could be
non-UBI stuff.

> So I think the latter case is now broken. In fact I just copied some
> random things into my mtd partition and after attach & mkvol they were
> gone with no error.

You mean UBI just attached your device? What would you expect it to do
when it sees that some of the eraseblocks contain corrupted headers? ATM,
it just formats those eraseblocks. What would be your expectation?

> So if we want to support something other than UBI, we should
> probably add another error code to distinguish between a read
> error and an invalid EC / VID header.

If you feed UBI flash with no valid UBI headers, it will be refused, I
think.

I actually do not really see which use case or scenario you want
UBI to handle better.

Patch

diff --git a/drivers/mtd/ubi/scan.c b/drivers/mtd/ubi/scan.c
index 7529d46..48e570c 100644
--- a/drivers/mtd/ubi/scan.c
+++ b/drivers/mtd/ubi/scan.c
@@ -756,12 +756,12 @@  static int process_eb(struct ubi_device *ubi, struct ubi_scan_info *si,
                bitflips = 1;
        }
 
-       if (err != UBI_IO_BAD_EC_HDR)
-               si->is_empty = 0;
-
        if (!ec_corr) {
                int image_seq;
 
+               /* There is an EC header, so the flash is not empty */
+               si->is_empty = 0;
+
                /* Make sure UBI version is OK */
                if (ech->version != UBI_VERSION) {
                        ubi_err("this UBI version is %d, image version is %d",