Patchwork ubi_eba_init_scan: cannot reserve enough PEBs

Submitter Artem Bityutskiy
Date Aug. 22, 2010, 6:30 p.m.
Message ID <1282501832.16502.97.camel@brekeke>
Permalink /patch/62393/
State Accepted
Commit 3a8fa0ed3e1a8dc54f9297d2a2292b87e03e21b5

Comments

Artem Bityutskiy - Aug. 22, 2010, 6:30 p.m.
On Tue, 2010-07-27 at 18:12 +0300, Artem Bityutskiy wrote:
> > Certainly.  I enabled all the relevant UBI and UBIFS debugging options
> > that I saw, along with internal self-checks, but there's still not a
> > whole lot of output.  Full console dump is attached - this is a
> > different device than the first, but exhibits the same problem.
> 
> I'm sure your ring buffer contains more information. This is one of the
> reasons I gave you the above link - it explains that not all messages go
> to console and how to get all messages. Try to use dmesg. In UBIFS code
> I see that 'ubifs_read_node()' calls 'dbg_dump_node()' which should dump
> the node.
> 
> But '255' is 0xFF, so probably UBIFS read all 0xFF. This may be a UBIFS
> bug, or some corruption, difficult to say. For some reason the place
> where a valid znode should live is erased.
> 
> Maybe if I have a NAND dump of your broken device I can look at it, but
> I do not promise anything, and I'm also on holiday :-)

Could you please dump LEB 5586 and check whether it is really erased or
not? Please, apply the following patch:

From feb1616809b0bebeaf0cb596fdb5d715f6d75700 Mon Sep 17 00:00:00 2001
From: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Date: Sun, 22 Aug 2010 21:27:30 +0300
Subject: [PATCH] UBIFS: improve error reporting when reading bad node

When an error happens during validation of a read node, the typical situation is
that the LEB we read is unmapped (due to some bug). It is handy to include the
mapping status in the error message.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
---
 fs/ubifs/io.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)
Matthew L. Creech - Aug. 24, 2010, 10:38 p.m.
On Sun, Aug 22, 2010 at 2:30 PM, Artem Bityutskiy <dedekind1@gmail.com> wrote:
>
> Could you please dump LEB 5586 and check whether it is really erased or
> not? Please, apply the following patch:
>
...

Hi Artem,

I applied your patch to 2.6.35 and booted that on the same bad device
for which I posted a boot log.  But to my surprise, it didn't give any
errors at all - the boot process completed normally, and it appears to
be working fine so far.  I had booted the kernel from memory, so I
pulled the power and let it go through the normal boot process (using
the 2.6.31 kernel that's built in to the firmware) - that also works
fine.

I checked my sanity by trying another bad device, and the same thing
happened there - booting 2.6.35 somehow "fixed" my problem.
Unfortunately, this means I can no longer tell what happened to the
original block you were interested in.  I'll try to dig up another bad
device, and use your patch with an older kernel version to see what
happens there.

Is it at all possible that the error was caused by something at the
UBI or MTD layer which was fixed between 2.6.34 and 2.6.35, and
booting up with 2.6.35 "touched" something that made it work again
even after reverting to an older kernel?  Sounds pretty far-fetched,
but I don't know how else to explain these suddenly-recovered
devices...
Artem Bityutskiy - Aug. 25, 2010, 3:51 a.m.
Hi,

On Tue, 2010-08-24 at 18:38 -0400, Matthew L. Creech wrote:
> I applied your patch to 2.6.35 and booted that on the same bad device
> for which I posted a boot log.  But to my surprise, it didn't give any
> errors at all - the boot process completed normally, and it appears to
> be working fine so far.  I had booted the kernel from memory, so I
> pulled the power and let it go through the normal boot process (using
> the 2.6.31 kernel that's built in to the firmware) - that also works
> fine.
> 
> I checked my sanity by trying another bad device, and the same thing
> happened there - booting 2.6.35 somehow "fixed" my problem.
> Unfortunately, this means I can no longer tell what happened to the
> original block you were interested in.  I'll try to dig up another bad
> device, and use your patch with an older kernel version to see what
> happens there.

This is interesting. BTW, if you use 2.6.31, you should in any case
apply patches from the ubifs-v2.6.31 back-port tree; there were some
good fixes.

> Is it at all possible that the error was caused by something at the
> UBI or MTD layer which was fixed between 2.6.34 and 2.6.35, and
> booting up with 2.6.35 "touched" something that made it work again
> even after reverting to an older kernel?  Sounds pretty far-fetched,
> but I don't know how else to explain these suddenly-recovered
> devices...

OK :-)

Artem.
Matthew L. Creech - Aug. 31, 2010, 3:36 p.m.
On Tue, Aug 24, 2010 at 6:38 PM, Matthew L. Creech <mlcreech@gmail.com> wrote:
>
> I applied your patch to 2.6.35 and booted that on the same bad device
> for which I posted a boot log.  But to my surprise, it didn't give any
> errors at all - the boot process completed normally, and it appears to
> be working fine so far.  I had booted the kernel from memory, so I
> pulled the power and let it go through the normal boot process (using
> the 2.6.31 kernel that's built in to the firmware) - that also works
> fine.
>

Another device had similar UBIFS corruption, so this time we stayed
with 2.6.34 and added the mapping status to that error message.  This
is the output:

UBIFS error (pid 455): ubifs_read_node: bad node type (255 but expected 1)
UBI DBG (pid 455): ubi_is_mapped: test LEB 0:7746
UBIFS error (pid 455): ubifs_read_node: bad node at LEB 7746:110360,
LEB mapping status 0
Not a node, first 24 bytes:
00000000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ........................
Call Trace:
[c7b69bf0] [c0008410] show_stack+0x54/0x134 (unreliable)
[c7b69c30] [c00cfeb8] ubifs_read_node+0x2f0/0x308
[c7b69c60] [c00ef4a8] ubifs_tnc_read_node+0x60/0x188
[c7b69ca0] [c00d34a4] ubifs_tnc_locate+0xd0/0x1d8
[c7b69d00] [c00c5818] do_readpage+0x1d0/0x4b8
[c7b69d50] [c00c7410] ubifs_readpage+0x3ec/0x400
--- Exception: 901 at flush_dcache_icache_page+0x24/0x30
    LR = flush_dcache_icache_page+0x20/0x30
[c7b69db0] [c0048c44] generic_file_aio_read+0x454/0x630 (unreliable)
[c7b69e50] [c00698d4] do_sync_read+0xa4/0xe0
[c7b69ef0] [c0069f5c] vfs_read+0xc4/0x16c
[c7b69f10] [c006a29c] sys_read+0x4c/0x80
[c7b69f40] [c000f40c] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xfb45b4c
    LR = 0xfb45b34

Is this what was expected?

FYI, this particular device only has corruption on a file that's not
boot-critical, so it continues the boot process somewhat normally.  A
partial log (cut off once it seemed that nothing else useful was
happening) is here, if that's of any use:

http://mcreech.com/work/ubifs-mapping-status.txt

It's also worth mentioning that when enabling UBI & UBIFS self checks
on this device, the boot process halts after some errors are detected.
The pertinent portion is:

UBIFS error (pid 1): scan_check_cb: bad accounting of LEB 31: free 0,
dirty 560 flags 0x0, should be free 129024, dirty 0
(pid 1) start dumping LEB 31
UBIFS DBG (pid 1): ubifs_start_scan: scan LEB 31:0
UBI DBG (pid 1): ubi_leb_read: read 129024 bytes from LEB 0:31:0
UBI DBG (pid 1): ubi_eba_read_leb: read 129024 bytes from offset 0 of
LEB 0:31 (unmapped)
UBIFS DBG (pid 1): ubifs_scan: look at LEB 31:0 (129024 bytes left)
UBIFS DBG (pid 1): ubifs_scan_a_node: hit empty space
UBIFS DBG (pid 1): ubifs_end_scan: stop scanning LEB 31 at offset 0
LEB 31 has 0 nodes ending at 0
(pid 1) finish dumping LEB 31
UBIFS error (pid 1): do_commit: commit failed, error -22
UBIFS warning (pid 1): ubifs_ro_mode: switched to read-only mode, error -22
Call Trace:
[c7827d00] [c0008410] show_stack+0x54/0x134 (unreliable)
[c7827d40] [c00ce774] ubifs_ro_mode+0x60/0x70
[c7827d50] [c00daf40] do_commit+0x5f0/0x5fc
[c7827dd0] [c00eab14] ubifs_rcvry_gc_commit+0x440/0x46c
[c7827e10] [c00cc900] ubifs_get_sb+0xe60/0x1734
[c7827ea0] [c006c794] vfs_kern_mount+0x68/0xf0
[c7827ed0] [c006c85c] do_kern_mount+0x40/0xf0
[c7827f00] [c0080dd8] do_mount+0x634/0x6a0
[c7827f50] [c0080f28] sys_mount+0x90/0xcc
[c7827f80] [c02cdec8] mount_block_root+0x108/0x284
[c7827fd0] [c02ce288] prepare_namespace+0xac/0x1e8
[c7827fe0] [c02cd820] kernel_init+0x144/0x154
[c7827ff0] [c000f230] kernel_thread+0x4c/0x68

I've no idea if this is related or not, but there's a full log of that
run available here:

http://mcreech.com/work/ubifs-self-checks.txt

Please let me know if there are any other tests you'd like run on this
device.  Otherwise, we'll probably try booting 2.6.35 and see what
happens.

Thanks!
Artem Bityutskiy - Sept. 1, 2010, 6:57 p.m.
Hi,

On Tue, 2010-08-31 at 11:36 -0400, Matthew L. Creech wrote:
> UBIFS error (pid 455): ubifs_read_node: bad node type (255 but expected 1)
> UBI DBG (pid 455): ubi_is_mapped: test LEB 0:7746
> UBIFS error (pid 455): ubifs_read_node: bad node at LEB 7746:110360,
> LEB mapping status 0

OK, so this LEB is unmapped.

> Not a node, first 24 bytes:
> 00000000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ........................
> Call Trace:
> [c7b69bf0] [c0008410] show_stack+0x54/0x134 (unreliable)
> [c7b69c30] [c00cfeb8] ubifs_read_node+0x2f0/0x308
> [c7b69c60] [c00ef4a8] ubifs_tnc_read_node+0x60/0x188
> [c7b69ca0] [c00d34a4] ubifs_tnc_locate+0xd0/0x1d8
> [c7b69d00] [c00c5818] do_readpage+0x1d0/0x4b8
> [c7b69d50] [c00c7410] ubifs_readpage+0x3ec/0x400
> --- Exception: 901 at flush_dcache_icache_page+0x24/0x30
>     LR = flush_dcache_icache_page+0x20/0x30
> [c7b69db0] [c0048c44] generic_file_aio_read+0x454/0x630 (unreliable)
> [c7b69e50] [c00698d4] do_sync_read+0xa4/0xe0
> [c7b69ef0] [c0069f5c] vfs_read+0xc4/0x16c
> [c7b69f10] [c006a29c] sys_read+0x4c/0x80
> [c7b69f40] [c000f40c] ret_from_syscall+0x0/0x38
> --- Exception: c01 at 0xfb45b4c
>     LR = 0xfb45b34
> 
> Is this what was expected?

Well, it is what I thought it would be.

> FYI, this particular device only has corruption on a file that's not
> boot-critical, so it continues the boot process somewhat normally.  A
> partial log (cut off once it seemed that nothing else useful was
> happening) is here, if that's of any use:
> 
> http://mcreech.com/work/ubifs-mapping-status.txt
> 
> It's also worth mentioning that when enabling UBI & UBIFS self checks
> on this device, the boot process halts after some errors are detected.
> The pertinent portion is:
> 
> UBIFS error (pid 1): scan_check_cb: bad accounting of LEB 31: free 0,
> dirty 560 flags 0x0, should be free 129024, dirty 0
> (pid 1) start dumping LEB 31
> UBIFS DBG (pid 1): ubifs_start_scan: scan LEB 31:0
> UBI DBG (pid 1): ubi_leb_read: read 129024 bytes from LEB 0:31:0
> UBI DBG (pid 1): ubi_eba_read_leb: read 129024 bytes from offset 0 of
> LEB 0:31 (unmapped)
> UBIFS DBG (pid 1): ubifs_scan: look at LEB 31:0 (129024 bytes left)
> UBIFS DBG (pid 1): ubifs_scan_a_node: hit empty space
> UBIFS DBG (pid 1): ubifs_end_scan: stop scanning LEB 31 at offset 0
> LEB 31 has 0 nodes ending at 0
> (pid 1) finish dumping LEB 31
> UBIFS error (pid 1): do_commit: commit failed, error -22
> UBIFS warning (pid 1): ubifs_ro_mode: switched to read-only mode, error -22
> Call Trace:
> [c7827d00] [c0008410] show_stack+0x54/0x134 (unreliable)
> [c7827d40] [c00ce774] ubifs_ro_mode+0x60/0x70
> [c7827d50] [c00daf40] do_commit+0x5f0/0x5fc
> [c7827dd0] [c00eab14] ubifs_rcvry_gc_commit+0x440/0x46c
> [c7827e10] [c00cc900] ubifs_get_sb+0xe60/0x1734
> [c7827ea0] [c006c794] vfs_kern_mount+0x68/0xf0
> [c7827ed0] [c006c85c] do_kern_mount+0x40/0xf0
> [c7827f00] [c0080dd8] do_mount+0x634/0x6a0
> [c7827f50] [c0080f28] sys_mount+0x90/0xcc
> [c7827f80] [c02cdec8] mount_block_root+0x108/0x284
> [c7827fd0] [c02ce288] prepare_namespace+0xac/0x1e8
> [c7827fe0] [c02cd820] kernel_init+0x144/0x154
> [c7827ff0] [c000f230] kernel_thread+0x4c/0x68
> 
> I've no idea if this is related or not, but there's a full log of that
> run available here:
> 
> http://mcreech.com/work/ubifs-self-checks.txt
> 
> Please let me know if there are any other tests you'd like run on this
> device.  Otherwise, we'll probably try booting 2.6.35 and see what
> happens.

I need to take some time and carefully look at this. And think. Please,
make a copy of the contents of your flash, if you can. 

From your side what would be helpful is if you tried to figure out how
to reproduce this. Since I do not have your HW I cannot reproduce this,
so the only thing I can do is to ask you to reproduce the problem with
various debugging patches.

CCing Adrian - may be he has ideas.
Artem Bityutskiy - Sept. 6, 2010, 9:17 a.m.
On Wed, 2010-09-01 at 21:57 +0300, Artem Bityutskiy wrote:
> > Please let me know if there are any other tests you'd like run on this
> > device.  Otherwise, we'll probably try booting 2.6.35 and see what
> > happens.
> 
> I need to take some time and carefully look at this. And think. Please,
> make a copy of the contents of your flash, if you can. 
> 
> From your side what would be helpful is if you tried to figure out how
> to reproduce this. Since I do not have your HW I cannot reproduce this,
> so the only thing I can do is to ask you to reproduce the problem with
> various debugging patches.

Matthew, I've sent a series of UBI patches which may help us to find out
the root cause of your issues.

The other idea which would definitely help is to create a debugging
patch which will track all erasures of PEBs and store a record of them
somewhere. I do not know which tracing/debugging tools you have; if you
have some fast tracing, you can just send this info via your tracing
interface. But if you don't, you can use another flash chip or another
partition on your flash and store the info there.

Here is my thinking:

The UBIFS index points to an unmapped LEB. There are 2 possibilities: either
the index is incorrect or someone - UBI or UBIFS - mistakenly unmapped a
needed LEB.

I personally think this is most probably the latter.

So we need to gather information about all unmap operations:

1. Which LEB and which PEB is unmapped. The best place to get this info
is the 'do_sync_erase()' function. But it does not have the LEB number, so
we need to add an 'int lnum' parameter there and amend the callers as well.
It is some work, but should not be too difficult.

2. Then we need to know who unmapped the LEB - we need the stack dump.
Normally we use the 'dump_stack()' function to print a stack dump - but it
prints to the log buffer. So we need a function which gives us an array
of addresses which we can then save and later transform into symbols. Or
we need a function which gives us a string containing an array of addresses.

We probably need to implement it. But I think kmemleak is doing
something like that - we can look there.

But also, make sure no-one in UBI uses mtd->erase directly, just in case.
I think all erasures should go via 'do_sync_erase()'.

3. Most erasures are done in the background thread, so the stackdump
will point to the background thread, which is not informative at all.
This means we should also print PEB/LEB/stackdump in 'schedule_erase()'
to track all places where the erasure is initiated.

So, for each PEB erase / LEB unmap we store 1, 2 and 3 somewhere. When
we hit the UBIFS bug, we can see how the LEB was unmapped and how the
PEB was erased - this should give us an idea of what happened.

Do you think you can do something like this? I do not think I have time
for this in the near future. What do you think?

But of course, if you learn how to reproduce this - this would be great.
Matthew L. Creech - Sept. 7, 2010, 3:59 p.m.
On Mon, Sep 6, 2010 at 5:17 AM, Artem Bityutskiy <dedekind1@gmail.com> wrote:
>
> The other idea which would definitely help is to create a debugging
> patch which will track all erasures of PEBs and store them somewhere. I
> do not know which tracing debugging tools you have, if you have some
> fast tracing you can just send this info via your tracing interface. But
> if you don't, you can use other flash or another partition on your flash
> and store the info.
>
...

This makes sense; however, I'm not sure of a good way to store this
info.  I don't have any hardware debugging tools (BDI, etc.), though I
could probably get my hands on one if you think it would help.
Creating another flash partition and saving data there could work,
although I'm not familiar with how to do that safely/cleanly from
within the kernel (I could give it a try though).

The brute-force method would be to just dump all of this info out to
the serial console, and I'll leave a minicom session running on a PC
to capture everything that happens.  I can't be certain how long it
would take to get a freshly-formatted device into this state, but the
quantity of output isn't a problem if I'm capturing to a PC.

I'll probably have time to set this test up in the next few days, but
it may be weeks until the test device goes bad (if it ever goes bad at
all).  As far as what code should be tested, do you want me to just
pull a copy of the latest ubifs-2.6 tree?  Or apply these patches to
something more stable?

Thanks for your help Artem!
Artem Bityutskiy - Sept. 7, 2010, 5:17 p.m.
On Tue, 2010-09-07 at 11:59 -0400, Matthew L. Creech wrote:
> On Mon, Sep 6, 2010 at 5:17 AM, Artem Bityutskiy <dedekind1@gmail.com> wrote:
> >
> > The other idea which would definitely help is to create a debugging
> > patch which will track all erasures of PEBs and store them somewhere. I
> > do not know which tracing debugging tools you have, if you have some
> > fast tracing you can just send this info via your tracing interface. But
> > if you don't, you can use other flash or another partition on your flash
> > and store the info.
> >
> ...
> 
> This makes sense, however I'm not sure of a good way to store this
> info.  I don't have any hardware debugging tools (BDI, etc.) though I
> could probably get my hands on one if you think it would help.
> Creating another flash partition and saving data there could work,
> although I'm not familiar with how to do that safely/cleanly from
> within the kernel (I could give it a try though).

Well, MTD API provides all basic operations. But I guess this is more
complex, so I'd leave this as a last resort method.

> The brute-force method would be to just dump all of this info out to
> the serial console,

Serial is too slow. If this bug is about race conditions, serial will
make it unreproducible. And it will slow things down too much, so you'd
run the same amount of operations tens of times slower.

Take a look at netconsole - using ethernet will be faster. It is
really easy to set up - just do not forget to switch off/configure
iptables. See Documentation/networking/netconsole.txt
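[Editorial sketch: a minimal netconsole setup along the lines of that document. The parameter format is from the kernel docs; the addresses, port, and interface below are placeholders, not values from this thread.]

```shell
# Sketch of a netconsole setup; all addresses/ports/interfaces are
# placeholder examples.  Parameter format (per the kernel docs):
#   netconsole=[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]

# On the target device: send kernel messages out eth0 to 10.0.0.1:6666
modprobe netconsole netconsole=@10.0.0.2/eth0,6666@10.0.0.1/

# On the capturing PC: log everything to a file
# (use `nc -u -l -p 6666` instead, depending on your netcat variant)
nc -u -l 6666 > ubi-erase-trace.log
```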

>  and I'll leave a minicom session running on a PC
> to capture everything that happens.  I can't be certain how long it
> would take to get a freshly-formatted device into this state, but the
> quantity of output isn't a problem if I'm capturing to a PC.

Yeah, s/minicom/netconsole/, and capture the output in a file. This will
make things simpler; at least the 'dump_stack()' issue I told you about
will go away.

> I'll probably have time to set this test up in the next few days, but
> it may be weeks until the test device goes bad (if it ever goes bad at
> all).  As far as what code should be tested, do you want me to just
> pull a copy of the latest ubifs-2.6 tree?

This is probably not so important. Use the ubifs you have, but I'd like
to also see it. And send the debugging patch you'll need to prepare to
me for review and comments.

>   Or apply these patches to
> something more stable?

I think the ubifs base where you already saw the issue will be the best.

> Thanks for your help Artem!

NP, feel free to ask questions.
Artem Bityutskiy - Sept. 7, 2010, 5:48 p.m.
On Tue, 2010-09-07 at 20:17 +0300, Artem Bityutskiy wrote:
> > I'll probably have time to set this test up in the next few days, but
> > it may be weeks until the test device goes bad (if it ever goes bad at
> > all).  As far as what code should be tested, do you want me to just
> > pull a copy of the latest ubifs-2.6 tree?
> 
> This is probably not so important. Use the ubifs you have, but I'd like
> to also see it. And send the debugging patch you'll need to prepare to
> me for review and comments.

Err, I also think you should try my UBI patches which may help. But they
may themselves have a bug. Maybe use 2 setups - one with one of your
trees, and the other with the latest ubifs + those patches?

Patch

diff --git a/fs/ubifs/io.c b/fs/ubifs/io.c
index bcf5a16..9432431 100644
--- a/fs/ubifs/io.c
+++ b/fs/ubifs/io.c
@@ -815,7 +815,8 @@  int ubifs_read_node(const struct ubifs_info *c, void *buf, int type, int len,
 	return 0;
 
 out:
-	ubifs_err("bad node at LEB %d:%d", lnum, offs);
+	ubifs_err("bad node at LEB %d:%d, LEB mapping status %d", lnum, offs,
+		  ubi_is_mapped(c->ubi, lnum));
 	dbg_dump_node(c, buf);
 	dbg_dump_stack();
 	return -EINVAL;