Message ID | CAEGnBBSC=k5kkQYkLn37Q8KLLC+n43KiNr9ZZ-8NPJaGkGf0Qg@mail.gmail.com |
---|---|
State | RFC |
Hi Wang,

I have looked into the code, but I still have some questions:

1) What is your kernel version?
2) Is the mechanism for removing an XENT_NODE the same as for removing a DATA_NODE?
3) Is the mechanism for removing an XENT_NODE the same in ubifs_removexattr() and ubifs_jnl_delete_inode()?

On 2014/7/12 21:36, 王丁 wrote:
> Hi all,
>
> Now we use xattr based on ubifs, and find some issues about it.
> Situation like that:
> 1. power on -> 2. create a file -> 3. set xattr -> 4. delete the file -> 5. power cut
> After several cycles with above steps, we can not boot up the device
> with the error below.
>
>
> Analysis:
> when delete a file, ubifs will remove the xent node from tnc, if gc
> happened, it will remove the xent node data from the GCed LEB because
> it has been removed from tnc,
> then if a power cut happen, the journal replay may also try to remove
> the related xattr node, the error occurred because it has been
> GCed.
>
>
> Now I run commit when ubifs_jnl_delete_inode called, and it's OK.
> Does anyone have a better way for the issue?
>

Can you draw a figure of this race? According to your description, I think the race is that the XENT_NODE is removed twice. Is that true?

>
> diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c
> index f755a24..eba555e 100755
> --- a/fs/ubifs/journal.c
> +++ b/fs/ubifs/journal.c
> @@ -900,6 +900,9 @@ int ubifs_jnl_delete_inode(struct ubifs_info *c, const struct inode *inode)
>  	else
>  		ubifs_delete_orphan(c, inode->i_ino);
>  	up_read(&c->commit_sem);
> +
> +	ubifs_run_commit(c);
> +
>  	return err;
>  }

Running a commit after each deletion is not a good choice, and I think this fix only reduces how often the error occurs. Let's find a better solution.

Thanks,
Hu
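The sequence described in the report can be sketched as a toy model. This is only an illustration of the steps as the reporter describes them (index removal, GC, power cut, replay), not of the real UBIFS data structures: the `toy_*` names, the one-byte "nodes" and the single-LEB flash are all assumptions made for the sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define LEB_SIZE   8      /* toy LEB: eight one-byte "nodes" (hypothetical) */
#define ERASED     0xff   /* erased flash reads back as 0xff */
#define XENT_NODE  3      /* the node type the log says was expected */
#define NO_OFFS    (-1)

/* Hypothetical, heavily simplified file system state. */
struct toy_fs {
	unsigned char leb[LEB_SIZE]; /* flash contents of one LEB */
	int tnc_xent_offs;           /* where the index says the xent node is */
	int jnl_del_offs;            /* uncommitted deletion recorded in the journal */
};

static void toy_init(struct toy_fs *fs)
{
	memset(fs->leb, ERASED, sizeof(fs->leb));
	fs->tnc_xent_offs = NO_OFFS;
	fs->jnl_del_offs = NO_OFFS;
}

/* Steps 2-3: create a file and set an xattr (only the xent node is modelled). */
static void toy_set_xattr(struct toy_fs *fs, int offs)
{
	assert(offs >= 0 && offs < LEB_SIZE);
	fs->leb[offs] = XENT_NODE;
	fs->tnc_xent_offs = offs;
}

/* Step 4: delete the file. The index forgets the node, the journal remembers it. */
static void toy_delete_inode(struct toy_fs *fs)
{
	fs->jnl_del_offs = fs->tnc_xent_offs;
	fs->tnc_xent_offs = NO_OFFS;
}

/* GC: the index references nothing in this LEB, so the LEB is reclaimed. */
static void toy_gc(struct toy_fs *fs)
{
	if (fs->tnc_xent_offs == NO_OFFS)
		memset(fs->leb, ERASED, sizeof(fs->leb));
}

/* Commit: after a commit there is nothing left for replay to redo. */
static void toy_commit(struct toy_fs *fs)
{
	fs->jnl_del_offs = NO_OFFS;
}

/* Step 5 + reboot: replay re-reads the node it is asked to delete. */
static int toy_replay(const struct toy_fs *fs)
{
	unsigned char type;

	if (fs->jnl_del_offs == NO_OFFS)
		return 0;
	type = fs->leb[fs->jnl_del_offs];
	if (type != XENT_NODE) {
		printf("bad node type (%d but expected %d)\n", type, XENT_NODE);
		return -1;
	}
	return 0;
}

/* Run the whole cycle; optionally commit right after the deletion. */
static int toy_run(int commit_after_delete)
{
	struct toy_fs fs;

	toy_init(&fs);
	toy_set_xattr(&fs, 2);
	toy_delete_inode(&fs);
	if (commit_after_delete)
		toy_commit(&fs);
	toy_gc(&fs);
	return toy_replay(&fs);
}
```

In this model `toy_run(0)` hits the failing read and `toy_run(1)` does not. The commit closes the window completely here only because the model is so small; whether it does so in the real file system is exactly what is questioned above.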
On Sat, 2014-07-12 at 21:36 +0800, 王丁 wrote:
> -------- kernel 3.10 log --------
> [ 64.916532] UBIFS: background thread "ubifs_bgt0_16" started, PID 92
> [ 64.926411] UBIFS: recovery needed
> [ 65.013786] UBIFS error (pid 91): ubifs_read_node: bad node type (255 but expected 3)
> [ 65.021595] UBIFS error (pid 91): ubifs_read_node: bad node at LEB 88:39680, LEB mapping status 1
> [ 65.030431] Not a node, first 24 bytes:
> [ 65.034073] 00000000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

There was a similar report recently (but not about xattr) and I suspect "fastmap". Please, check if you use it - it is new and I did not see any report that it was extensively tested WRT power cuts.
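For readers decoding the log: the 24 bytes dumped are the size of the UBIFS common node header, and a header read from erased flash is all 0xff, which is where "255" comes from. The sketch below assumes the header layout of struct ubifs_ch in fs/ubifs/ubifs-media.h as I recall it (magic at byte 0, crc at 4, sqnum at 8, len at 16, node_type at 20) and the magic value 0x06101831; `ch_fields`, `parse_ch` and `looks_erased` are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UBIFS_NODE_MAGIC 0x06101831u /* assumed value, from ubifs-media.h */
#define UBIFS_CH_SZ      24          /* common header size, matches "first 24 bytes" */

/* The few header fields that matter for reading the log (hypothetical struct). */
struct ch_fields {
	uint32_t magic;
	uint32_t len;
	uint8_t  node_type;
};

/* Read a little-endian 32-bit value; on-flash UBIFS fields are little-endian. */
static uint32_t le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Pick the fields out of a raw 24-byte header at the assumed offsets. */
static struct ch_fields parse_ch(const uint8_t buf[UBIFS_CH_SZ])
{
	struct ch_fields f;

	f.magic = le32(buf + 0);
	f.len = le32(buf + 16);
	f.node_type = buf[20];
	return f;
}

/* Returns 1 if every byte is 0xff, i.e. the area looks like erased flash. */
static int looks_erased(const uint8_t *buf, size_t n)
{
	size_t i;

	assert(buf != NULL);
	for (i = 0; i < n; i++)
		if (buf[i] != 0xff)
			return 0;
	return 1;
}
```

Feeding the dumped bytes to `parse_ch()` gives a node type of 255 and a magic that does not match, so the read landed on empty space rather than on a corrupted node. If the node-type numbering is as I remember it, the expected type 3 is the xattr entry node, which fits the report.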
On Wed, Jul 16, 2014 at 1:08 PM, Artem Bityutskiy <dedekind1@gmail.com> wrote:
> On Sat, 2014-07-12 at 21:36 +0800, 王丁 wrote:
>> -------- kernel 3.10 log --------
>> [ 64.916532] UBIFS: background thread "ubifs_bgt0_16" started, PID 92
>> [ 64.926411] UBIFS: recovery needed
>> [ 65.013786] UBIFS error (pid 91): ubifs_read_node: bad node type (255 but expected 3)
>> [ 65.021595] UBIFS error (pid 91): ubifs_read_node: bad node at LEB 88:39680, LEB mapping status 1
>> [ 65.030431] Not a node, first 24 bytes:
>> [ 65.034073] 00000000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>
> There was a similar report recently (but not about xattr) and I suspect
> "fastmap". Please, check if you use it - it is new and I did not see any
> report that it was extensively tested WRT power cuts.

To be more precise: is CONFIG_MTD_UBI_FASTMAP=y set, and was the image attached by fastmap? If it was, the kernel log will contain the message "attached by fastmap".
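If the boot log has been captured to a buffer or file, the check can be automated with a plain substring search. A minimal sketch, assuming the log is available as one NUL-terminated string; `attached_by_fastmap` is a hypothetical helper name, and only the message text itself comes from this thread.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if the captured kernel log contains the fastmap attach message. */
static int attached_by_fastmap(const char *kernel_log)
{
	assert(kernel_log != NULL);
	return strstr(kernel_log, "attached by fastmap") != NULL;
}
```

A result of 0 only says the message is absent from the text that was captured; a truncated log would give the same answer.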
diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c
index f755a24..eba555e 100755
--- a/fs/ubifs/journal.c
+++ b/fs/ubifs/journal.c
@@ -900,6 +900,9 @@ int ubifs_jnl_delete_inode(struct ubifs_info *c, const struct inode *inode)
 	else
 		ubifs_delete_orphan(c, inode->i_ino);
 	up_read(&c->commit_sem);
+
+	ubifs_run_commit(c);
+
 	return err;
 }