error handling in replay_log_leb()

Message ID AANLkTik882oGTwZZZ31T7YjfyvCmETlZgisjS6teDCB1@mail.gmail.com
State New, archived

Commit Message

twebb June 24, 2010, 8 p.m. UTC
In the replay.c/replay_log_leb(), is there any disadvantage to calling
ubifs_recover_log_leb() regardless of whether need_recovery is true or
not?  I'm having an issue with ubifs dealing with a PEB with corrupt
empty space and this condition is handled fine during a mount when
need_recovery is true, but is not handled the same otherwise and
results in a failed mount.  A patch with the proposed change is below.

This question is along the same lines as one I asked yesterday about
ubifs_scan() error handling.

Thanks,
twebb

Comments

Artem Bityutskiy July 13, 2010, 4:21 a.m. UTC | #1
Hi,

On Thu, 2010-06-24 at 16:00 -0400, twebb wrote:
> In the replay.c/replay_log_leb(), is there any disadvantage to calling
> ubifs_recover_log_leb() regardless of whether need_recovery is true or
> not?

All the UBIFS recovery code was written and tested for the unclean power
cut cases. We worked on SLC, which is quite trustworthy and has not shown
other types of corruption so far.

When UBIFS is cleanly unmounted, we update the UBIFS master node and
clear the "dirty" flag there. When UBIFS mounts, it checks the master
node, and if it is dirty, there was an unclean reboot, and the FS needs
recovery. Otherwise UBIFS was unmounted cleanly and assumes there
cannot be any issues. If there are issues anyway, they are not caused
by an unclean unmount, and the current implementation does not deal
with them.

This is why UBIFS does not try to recover if !c->need_recovery - this
was just not needed, not implemented and not tested.
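
The decision described above can be modelled in a few lines of userspace
C. This is only a sketch, not the UBIFS code: struct model_fs, the
MODEL_* constants and the model_* functions are hypothetical names, and
the flag bit value is an assumption. The last function mirrors the
condition in the patch context (recover only for -EUCLEAN, and only
when need_recovery is set).

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MST_DIRTY 0x1u   /* hypothetical "dirty" bit in the master node flags */
#define MODEL_EUCLEAN   117    /* stand-in for the kernel's EUCLEAN error code */

/* Hypothetical, heavily reduced model of the per-filesystem state. */
struct model_fs {
    uint32_t mst_flags;    /* flags as read from the master node */
    bool need_recovery;    /* decided once, at mount time */
};

/* At mount: a dirty master node means the last session ended uncleanly. */
static void model_mount(struct model_fs *c)
{
    c->need_recovery = (c->mst_flags & MODEL_MST_DIRTY) != 0;
}

/* At clean unmount: the master node is updated with the dirty flag cleared. */
static void model_clean_unmount(struct model_fs *c)
{
    c->mst_flags &= ~MODEL_MST_DIRTY;
}

/* Unpatched behaviour: a corrupted LEB is recovered only after an unclean reboot. */
static bool model_may_recover(const struct model_fs *c, int err)
{
    return err == -MODEL_EUCLEAN && c->need_recovery;
}
```

With this model, a corrupted LEB seen after a clean unmount makes
model_may_recover() return false, which corresponds to the failed mount
reported above.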

You are working with MLC and you may have issues even if there was a
clean unmount.

You can teach UBIFS to handle corrupted empty space. However, the
current approach is not appropriate for MLC. AFAIR, UBIFS currently
assumes that there may be a half-written UBIFS node at the end, but that
everything after it should be only 0xFF bytes. In your case, you can
have bit-flips in the empty space, so some bytes will be 0xEF, etc.

I suggest you introduce another function which checks the 0xFF space
and distinguishes between 0xFFs + bit-flips and total garbage. In the
former case you recover, in the latter you refuse to mount.

This should not be too difficult to implement.
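
One possible shape for such a check, as a standalone userspace sketch
rather than kernel code. The names classify_empty_space(), zero_bits()
and enum empty_space_state are hypothetical, and the max_flips
threshold is an assumption: a sensible value would depend on the flash
part and its ECC strength.

```c
#include <stddef.h>
#include <stdint.h>

enum empty_space_state {
    EMPTY_CLEAN,     /* every byte is 0xFF */
    EMPTY_BITFLIPS,  /* 0xFF bytes with at most max_flips cleared bits */
    EMPTY_GARBAGE    /* more cleared bits than bit-flips can explain */
};

/* Count the zero bits in one byte; erased flash reads back as all ones. */
static unsigned int zero_bits(uint8_t b)
{
    unsigned int n = 0;
    uint8_t inv = (uint8_t)~b;

    while (inv) {
        n += inv & 1u;
        inv >>= 1;
    }
    return n;
}

/*
 * Classify a region that is expected to be empty. The scan stops as
 * soon as the budget of tolerated bit-flips (an assumed tunable) is
 * exceeded.
 */
static enum empty_space_state
classify_empty_space(const uint8_t *buf, size_t len, unsigned int max_flips)
{
    unsigned int flips = 0;

    for (size_t i = 0; i < len; i++) {
        flips += zero_bits(buf[i]);
        if (flips > max_flips)
            return EMPTY_GARBAGE;
    }
    return flips ? EMPTY_BITFLIPS : EMPTY_CLEAN;
}
```

A caller would then recover on EMPTY_CLEAN or EMPTY_BITFLIPS and refuse
to mount on EMPTY_GARBAGE. Counting cleared bits rather than non-0xFF
bytes is a design choice here: a single byte such as 0x00 would
otherwise count as one "flip".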

>   I'm having an issue with ubifs dealing with a PEB with corrupt
> empty space and this condition is handled fine during a mount when
> need_recovery is true, but is not handled the same otherwise and
> results in a failed mount.  A patch with the proposed change is below.

This patch is not enough for your case anyway, because recovery will
fail in ubifs_recover_leb() (see is_last_write() usage).

Patch

Index: replay.c
================================================
--- replay.c    (revision 2438)
+++ replay.c    (working copy)
@@ -838,7 +838,7 @@ 
        dbg_mnt("replay log LEB %d:%d", lnum, offs);
        sleb = ubifs_scan(c, lnum, offs, sbuf);
        if (IS_ERR(sleb) ) {
-               if (PTR_ERR(sleb) != -EUCLEAN || !c->need_recovery)
+               if (PTR_ERR(sleb) != -EUCLEAN)
                        return PTR_ERR(sleb);
                sleb = ubifs_recover_log_leb(c, lnum, offs, sbuf);
                if (IS_ERR(sleb))