Message ID | 20100420184412.GA19928@Chamillionaire.breakpoint.cc |
---|---|
State | New, archived |
Headers | show |
On Tue, 2010-04-20 at 20:44 +0200, Sebastian Andrzej Siewior wrote: > Attaching empty nand with a block which contains a RS-Error which can't > be fixed: > > | UBI: attaching mtd9 to ubi0 > | UBI: physical eraseblock size: 131072 bytes (128 KiB) > | UBI: logical eraseblock size: 129024 bytes > | UBI: smallest flash I/O unit: 2048 > | UBI: sub-page size: 512 > | UBI: VID header offset: 512 (aligned 512) > | UBI: data offset: 2048 > | UBI error: ubi_io_read: error -74 while reading 64 bytes from PEB 3399:0, read 64 bytes > | Call Trace: > | [cfbd5c60] [c0008558] show_stack+0x48/0x19c (unreliable) > | [cfbd5ca0] [c01a71e8] ubi_io_read+0x188/0x288 > | [cfbd5cf0] [c01a76e8] ubi_io_read_ec_hdr+0x74/0x2a4 > | [cfbd5d20] [c01abe9c] ubi_scan+0x178/0x10b4 > | [cfbd5d80] [c01a1464] ubi_attach_mtd_dev+0x67c/0xe44 > | [cfbd5e80] [c01a1fc8] ctrl_cdev_ioctl+0x178/0x210 > | [cfbd5ec0] [c008711c] do_ioctl+0x3c/0xc4 > | [cfbd5ee0] [c0087224] vfs_ioctl+0x80/0x448 > | [cfbd5f10] [c008762c] sys_ioctl+0x40/0x88 > | [cfbd5f40] [c000f960] ret_from_syscall+0x0/0x38 > | UBI error: ubi_read_volume_table: the layout volume was not found > | UBI error: ubi_attach_mtd_dev: failed to attach by scanning, error -22 > > ubi_io_read() calls ubi->mtd->read() which ends in nand_do_read_ops() at > some point. The NAND controller was not able to read the page and > incremented mtd->ecc_stats.failed. The tail of nand_do_read_ops(): > > | if (mtd->ecc_stats.failed - stats.failed) > | return -EBADMSG; > > So we return with -EBADMSG as you can see in the error message. This > error is then converted by ubi_io_read() into -EIO: > | /* > | * The driver should never return -EBADMSG if it failed to read > | * all the requested data. But some buggy drivers might do > | * this, so we change it to -EIO. > | */ > | if (read != len && err == -EBADMSG) { > | ubi_assert(0); > | err = -EIO; > | } > > At this point we leave ubi_scan() and the mtd partition remains > unattached. No really sure how we ended up with -EINVAL in > ubi_attach_mtd_dev(). > > I guess this is not wanted, is it? > > The following patch led to: > | UBI: empty MTD device detected > | UBI: create volume table (copy #1) > | UBI: create volume table (copy #2) > | UBI: attached mtd9 to ubi0 > | UBI: MTD device name: "ubi" > | UBI: MTD device size: 512 MiB > | UBI: number of good PEBs: 4092 > | UBI: number of bad PEBs: 4 > | UBI: max. allowed volumes: 128 > | UBI: wear-leveling threshold: 4096 > | UBI: number of internal volumes: 1 > | UBI: number of user volumes: 0 > | UBI: available PEBs: 4048 > | UBI: total number of reserved PEBs: 44 > | UBI: number of PEBs reserved for bad PEB handling: 40 > | UBI: max/mean erase counter: 0/0 > | UBI: image sequence number: 0 > | UBI: background thread "ubi_bgt0d" started, PID 1986 > > ubimkvol gave me: > | UBI error: ubi_io_read: error -74 while reading 64 bytes from PEB 3399:0, read 64 bytes > | nand_erase: Failed erase, page 0x000751c0 > | nand_erase: Failed erase, page 0x000751c0 > | nand_erase: Failed erase, page 0x000751c0 > | nand_erase: Failed erase, page 0x000751c0 > | UBI error: do_sync_erase: cannot erase PEB 3399, error -5 > | UBI error: erase_worker: failed to erase PEB 3399, error -5 > | UBI: mark PEB 3399 as bad > | UBI: 39 PEBs left in the reserve > > This looks good so far I guess. Is there much I break with patch? Should > this case be handled at another level? Well, I think the idea was that the MTD layer tries to read all the requested data despite of a possible unrecoverable ECC error in the middle. So if it returns -EBADMSG, we know that the read buffer contains all the data anyway, even though some of the data may be incorrect. So, I'd suggest changing MTD and making it return read == len in that case. > @@ -756,7 +756,8 @@ static int process_eb(struct ubi_device *ubi, struct ubi_scan_info *si, > bitflips = 1; > } > > - si->is_empty = 0; > + if (err != UBI_IO_BAD_EC_HDR) > + si->is_empty = 0; > > if (!ec_corr) { > int image_seq; > @@ -827,6 +828,7 @@ static int process_eb(struct ubi_device *ubi, struct ubi_scan_info *si, > return err; > goto adjust_mean_ec; > } > + si->is_empty = 0; > > vol_id = be32_to_cpu(vidh->vol_id); > if (vol_id > UBI_MAX_VOLUMES && vol_id != UBI_LAYOUT_VOLUME_ID) { This part looks right.
* Artem Bityutskiy | 2010-04-21 05:50:57 [+0300]: >On Tue, 2010-04-20 at 20:44 +0200, Sebastian Andrzej Siewior wrote: >> | UBI error: ubi_io_read: error -74 while reading 64 bytes from PEB 3399:0, read 64 bytes >> So we return with -EBADMSG as you can see in the error message. This >> error is then converted by ubi_io_read() into -EIO: >> | /* >> | * The driver should never return -EBADMSG if it failed to read >> | * all the requested data. But some buggy drivers might do >> | * this, so we change it to -EIO. >> | */ >> | if (read != len && err == -EBADMSG) { >> | ubi_assert(0); >> | err = -EIO; >> | } >> >Well, I think the idea was that the MTD layer tries to read all the >requested data despite of a possible unrecoverable ECC error in the >middle. So if it returns -EBADMSG, we know that the read buffer contains >all the data anyway, even though some of the data may be incorrect. > >So, I'd suggest changing MTD and making it return read == len in that >case. Wait. According to the error message, I had read == len (both 64). So I think it never got changed to -EIO, my mistake. >> @@ -756,7 +756,8 @@ static int process_eb(struct ubi_device *ubi, struct ubi_scan_info *si, >> bitflips = 1; >> } >> >> - si->is_empty = 0; >> + if (err != UBI_IO_BAD_EC_HDR) >> + si->is_empty = 0; >> >> if (!ec_corr) { >> int image_seq; >> @@ -827,6 +828,7 @@ static int process_eb(struct ubi_device *ubi, struct ubi_scan_info *si, >> return err; >> goto adjust_mean_ec; >> } >> + si->is_empty = 0; >> >> vol_id = be32_to_cpu(vidh->vol_id); >> if (vol_id > UBI_MAX_VOLUMES && vol_id != UBI_LAYOUT_VOLUME_ID) { > >This part looks right. Do you want me to resend this part as the fix? Sebastian
diff --git a/drivers/mtd/ubi/io.c b/drivers/mtd/ubi/io.c index 533b1a4..710d069 100644 --- a/drivers/mtd/ubi/io.c +++ b/drivers/mtd/ubi/io.c @@ -181,10 +181,12 @@ retry: * all the requested data. But some buggy drivers might do * this, so we change it to -EIO. */ +#if 0 if (read != len && err == -EBADMSG) { ubi_assert(0); err = -EIO; } +#endif } else { ubi_assert(len == read); diff --git a/drivers/mtd/ubi/scan.c b/drivers/mtd/ubi/scan.c index dc5f688..7529d46 100644 --- a/drivers/mtd/ubi/scan.c +++ b/drivers/mtd/ubi/scan.c @@ -756,7 +756,8 @@ static int process_eb(struct ubi_device *ubi, struct ubi_scan_info *si, bitflips = 1; } - si->is_empty = 0; + if (err != UBI_IO_BAD_EC_HDR) + si->is_empty = 0; if (!ec_corr) { int image_seq; @@ -827,6 +828,7 @@ static int process_eb(struct ubi_device *ubi, struct ubi_scan_info *si, return err; goto adjust_mean_ec; } + si->is_empty = 0; vol_id = be32_to_cpu(vidh->vol_id); if (vol_id > UBI_MAX_VOLUMES && vol_id != UBI_LAYOUT_VOLUME_ID) {