Message ID | 548FE51D.60707@huawei.com |
---|---|
State | Changes Requested |
Headers | show |
On 2014/12/16 15:54, hujianyang wrote: > If an error occur while reading from PEBs, for example, an ECC error, > ubi_io_read() will print some error messages. But it's not enough for > debugging. These messages don't show the mapping info for a read from > UBIFS layer. > > Although UBIFS will soon print its error messages after catching the > return value from UBI layer, multi-path reading will confuse the > relationship between LEBs and PEBs showed by these messages. > > This patch adds an ubi_err() to report reading errors in the function > ubi_eba_read_leb(). The mapping info of LEB and PEB is showed by > this error message. > > Signed-off-by: hujianyang <hujianyang@huawei.com> > --- > drivers/mtd/ubi/eba.c | 2 ++ > 1 files changed, 2 insertions(+), 0 deletions(-) > > diff --git a/drivers/mtd/ubi/eba.c b/drivers/mtd/ubi/eba.c > index b698534..b4e69e1 100644 > --- a/drivers/mtd/ubi/eba.c > +++ b/drivers/mtd/ubi/eba.c > @@ -477,6 +477,8 @@ out_free: > ubi_free_vid_hdr(ubi, vid_hdr); > out_unlock: > leb_read_unlock(ubi, vol_id, lnum); > + ubi_err(ubi, "err %d while reading %d bytes from offset %d of LEB %d:%d, PEB %d", > + err, len, offset, vol_id, lnum, pnum); > return err; > } > Hi, I met a problem that I was failed to mount a UBIFS partition. [ 38.442770] UBI error: ubi_io_read: error -74 (ECC error) while reading 26624 bytes from PEB 54:104448, read 26624 bytes [ 38.852461] UBI error: ubi_io_read: error -74 (ECC error) while reading 77824 bytes from PEB 346:53248, read 77824 bytes [ 38.864142] UBIFS error (pid 1444): ubifs_recover_leb: corruption -3 [ 38.870487] UBIFS error (pid 1444): ubifs_scanned_corruption: corruption at LEB 928:55280 [ 38.878625] UBIFS error (pid 1444): ubifs_scanned_corruption: first 8192 bytes from LEB 928:55280 [ 38.892117] UBIFS error (pid 1444): ubifs_recover_leb: LEB 928 scanning failed mount: mounting ubi1:bak on /HFFS2: failed: Structure needs cleaning I think it is caused by an ECC error of nand flash. Do we have some methods to mount this partition? Data losing is acceptable. Thanks, Hu
Am 16.12.2014 um 08:54 schrieb hujianyang: > If an error occur while reading from PEBs, for example, an ECC error, > ubi_io_read() will print some error messages. But it's not enough for > debugging. These messages don't show the mapping info for a read from > UBIFS layer. > > Although UBIFS will soon print its error messages after catching the > return value from UBI layer, multi-path reading will confuse the > relationship between LEBs and PEBs showed by these messages. > > This patch adds an ubi_err() to report reading errors in the function > ubi_eba_read_leb(). The mapping info of LEB and PEB is showed by > this error message. > > Signed-off-by: hujianyang <hujianyang@huawei.com> > --- > drivers/mtd/ubi/eba.c | 2 ++ > 1 files changed, 2 insertions(+), 0 deletions(-) > > diff --git a/drivers/mtd/ubi/eba.c b/drivers/mtd/ubi/eba.c > index b698534..b4e69e1 100644 > --- a/drivers/mtd/ubi/eba.c > +++ b/drivers/mtd/ubi/eba.c > @@ -477,6 +477,8 @@ out_free: > ubi_free_vid_hdr(ubi, vid_hdr); > out_unlock: > leb_read_unlock(ubi, vol_id, lnum); > + ubi_err(ubi, "err %d while reading %d bytes from offset %d of LEB %d:%d, PEB %d", > + err, len, offset, vol_id, lnum, pnum); This label will also be reached if we run out of memory. Please make sure that the new ubi_err() can only be reached in case of an MTD error. Also make sure that the function prints only one message and not two. Thanks, //richard
Am 16.12.2014 um 09:02 schrieb hujianyang: > Hi, > > I met a problem that I was failed to mount a UBIFS partition. > > [ 38.442770] UBI error: ubi_io_read: error -74 (ECC error) while reading 26624 bytes from PEB 54:104448, read 26624 bytes > [ 38.852461] UBI error: ubi_io_read: error -74 (ECC error) while reading 77824 bytes from PEB 346:53248, read 77824 bytes > [ 38.864142] UBIFS error (pid 1444): ubifs_recover_leb: corruption -3 > [ 38.870487] UBIFS error (pid 1444): ubifs_scanned_corruption: corruption at LEB 928:55280 > [ 38.878625] UBIFS error (pid 1444): ubifs_scanned_corruption: first 8192 bytes from LEB 928:55280 > [ 38.892117] UBIFS error (pid 1444): ubifs_recover_leb: LEB 928 scanning failed > mount: mounting ubi1:bak on /HFFS2: failed: Structure needs cleaning > > I think it is caused by an ECC error of nand flash. Do we have some methods > to mount this partition? Data losing is acceptable. I don't think that UBIFS has such a mount option. You can dump the raw data and inspect the corrupted data. Maybe you can fix it by hand. Thanks, //richard
On 2014/12/16 17:21, Richard Weinberger wrote: > Am 16.12.2014 um 09:02 schrieb hujianyang: >> Hi, >> >> I met a problem that I was failed to mount a UBIFS partition. >> >> [ 38.442770] UBI error: ubi_io_read: error -74 (ECC error) while reading 26624 bytes from PEB 54:104448, read 26624 bytes >> [ 38.852461] UBI error: ubi_io_read: error -74 (ECC error) while reading 77824 bytes from PEB 346:53248, read 77824 bytes >> [ 38.864142] UBIFS error (pid 1444): ubifs_recover_leb: corruption -3 >> [ 38.870487] UBIFS error (pid 1444): ubifs_scanned_corruption: corruption at LEB 928:55280 >> [ 38.878625] UBIFS error (pid 1444): ubifs_scanned_corruption: first 8192 bytes from LEB 928:55280 >> [ 38.892117] UBIFS error (pid 1444): ubifs_recover_leb: LEB 928 scanning failed >> mount: mounting ubi1:bak on /HFFS2: failed: Structure needs cleaning >> >> I think it is caused by an ECC error of nand flash. Do we have some methods >> to mount this partition? Data losing is acceptable. > > I don't think that UBIFS has such a mount option. Er, I don't know it either. How about a mount option like --force? > You can dump the raw data and inspect the corrupted data. > Maybe you can fix it by hand. Yes, I want a try~! If we have to introduce a new feature or new mount option. So would you like to help me? Do you think it's a valuable work? Thanks~ Hu
Am 16.12.2014 um 10:52 schrieb hujianyang: > On 2014/12/16 17:21, Richard Weinberger wrote: >> Am 16.12.2014 um 09:02 schrieb hujianyang: >>> Hi, >>> >>> I met a problem that I was failed to mount a UBIFS partition. >>> >>> [ 38.442770] UBI error: ubi_io_read: error -74 (ECC error) while reading 26624 bytes from PEB 54:104448, read 26624 bytes >>> [ 38.852461] UBI error: ubi_io_read: error -74 (ECC error) while reading 77824 bytes from PEB 346:53248, read 77824 bytes >>> [ 38.864142] UBIFS error (pid 1444): ubifs_recover_leb: corruption -3 >>> [ 38.870487] UBIFS error (pid 1444): ubifs_scanned_corruption: corruption at LEB 928:55280 >>> [ 38.878625] UBIFS error (pid 1444): ubifs_scanned_corruption: first 8192 bytes from LEB 928:55280 >>> [ 38.892117] UBIFS error (pid 1444): ubifs_recover_leb: LEB 928 scanning failed >>> mount: mounting ubi1:bak on /HFFS2: failed: Structure needs cleaning >>> >>> I think it is caused by an ECC error of nand flash. Do we have some methods >>> to mount this partition? Data losing is acceptable. >> >> I don't think that UBIFS has such a mount option. > > Er, I don't know it either. How about a mount option like --force? Then every single embedded vendor will use this flag to keep the broken MTD/UBI/UBIFS setups running as long as possible no mater of how corrupted the data is. :-) IIRC UBIFS will either mount and work correctly as expected or fail hard. >> You can dump the raw data and inspect the corrupted data. >> Maybe you can fix it by hand. > > Yes, I want a try~! If we have to introduce a new feature or new mount > option. So would you like to help me? Do you think it's a valuable > work? I'm not a fan of such a mount option. What we really need is a fsck.ubifs and a ubifs dump tool to fix and recover broken UBIFS images. Thanks, //richard
On 2014/12/16 16:58, Richard Weinberger wrote: > Am 16.12.2014 um 08:54 schrieb hujianyang: >> If an error occur while reading from PEBs, for example, an ECC error, >> ubi_io_read() will print some error messages. But it's not enough for >> debugging. These messages don't show the mapping info for a read from >> UBIFS layer. >> >> Although UBIFS will soon print its error messages after catching the >> return value from UBI layer, multi-path reading will confuse the >> relationship between LEBs and PEBs showed by these messages. >> >> This patch adds an ubi_err() to report reading errors in the function >> ubi_eba_read_leb(). The mapping info of LEB and PEB is showed by >> this error message. >> >> Signed-off-by: hujianyang <hujianyang@huawei.com> >> --- >> drivers/mtd/ubi/eba.c | 2 ++ >> 1 files changed, 2 insertions(+), 0 deletions(-) >> >> diff --git a/drivers/mtd/ubi/eba.c b/drivers/mtd/ubi/eba.c >> index b698534..b4e69e1 100644 >> --- a/drivers/mtd/ubi/eba.c >> +++ b/drivers/mtd/ubi/eba.c >> @@ -477,6 +477,8 @@ out_free: >> ubi_free_vid_hdr(ubi, vid_hdr); >> out_unlock: >> leb_read_unlock(ubi, vol_id, lnum); >> + ubi_err(ubi, "err %d while reading %d bytes from offset %d of LEB %d:%d, PEB %d", >> + err, len, offset, vol_id, lnum, pnum); > > This label will also be reached if we run out of memory. Yes, but I think it's OK. Because a read operation is really failed by some errors. We can print these errors not only in case of MTD errors but also in case of CRC errors, ENOMEM or others. > Please make sure that the new ubi_err() can only be reached in case of an MTD error. > Also make sure that the function prints only one message and not two. Er, At first, I want to print just MTD error, perform ubi_err() in mtd_is_eccerr() case, for example. But I found it's not easy to print it only once. The *retry* label makes it difficulty to determine when this function, ubi_eba_read_leb() returns, this message will print two times or none. So at last, I move this ubi_err() to the end of error handling path for easy. ^.^ What's your opinion? Hu
On 2014/12/16 17:57, Richard Weinberger wrote: > Then every single embedded vendor will use this flag to keep the broken MTD/UBI/UBIFS setups running > as long as possible no mater of how corrupted the data is. :-) > IIRC UBIFS will either mount and work correctly as expected or fail hard. You are right~! Maybe we can set filesystem to RO if it is mounted with --force, and allow users to copy their data to other place. How about this? > >>> You can dump the raw data and inspect the corrupted data. >>> Maybe you can fix it by hand. >> >> Yes, I want a try~! If we have to introduce a new feature or new mount >> option. So would you like to help me? Do you think it's a valuable >> work? > > I'm not a fan of such a mount option. > What we really need is a fsck.ubifs and a ubifs dump tool to fix and recover > broken UBIFS images. I think it's better, but a bit harder. As I know, my UBIDUMP is far from what you expect. I should spend more time on it. After all, I was asked to fix this error. My plan is do something after an ECC error is detected, not directly breakout to allow this partition to be mounted. I don't think this solution will be easily accept by mainline. However, I'd like to show my work if I succeed. Thanks~! Hu
diff --git a/drivers/mtd/ubi/eba.c b/drivers/mtd/ubi/eba.c index b698534..b4e69e1 100644 --- a/drivers/mtd/ubi/eba.c +++ b/drivers/mtd/ubi/eba.c @@ -477,6 +477,8 @@ out_free: ubi_free_vid_hdr(ubi, vid_hdr); out_unlock: leb_read_unlock(ubi, vol_id, lnum); + ubi_err(ubi, "err %d while reading %d bytes from offset %d of LEB %d:%d, PEB %d", + err, len, offset, vol_id, lnum, pnum); return err; }
If an error occur while reading from PEBs, for example, an ECC error, ubi_io_read() will print some error messages. But it's not enough for debugging. These messages don't show the mapping info for a read from UBIFS layer. Although UBIFS will soon print its error messages after catching the return value from UBI layer, multi-path reading will confuse the relationship between LEBs and PEBs showed by these messages. This patch adds an ubi_err() to report reading errors in the function ubi_eba_read_leb(). The mapping info of LEB and PEB is showed by this error message. Signed-off-by: hujianyang <hujianyang@huawei.com> --- drivers/mtd/ubi/eba.c | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-)