Patchwork [2/2] UBIFS: seek journal heads to the latest bud in replay

Submitter Artem Bityutskiy
Date April 25, 2011, 4:55 p.m.
Message ID <1303750531-13800-2-git-send-email-dedekind1@gmail.com>
Permalink /patch/92759/
State New

Comments

Artem Bityutskiy - April 25, 2011, 4:55 p.m.
From: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>

This is another preparatory patch which I need in order to fix and clean up
the monstrous 'ubifs_rcvry_gc_commit()' function.

Currently, UBIFS replay seeks each journal head to the last _replayed_ bud.
But buds are replayed out of order, so replay effectively seeks a journal head
to an arbitrary bud belonging to that head, and not to the latest one.

This complicates recovery, and it is yet another subtle thing that is easy to
miss and forget. Here is a small example of how this hurts (I have seen it
while debugging a recovery failure).

We are in 'ubifs_rcvry_gc_commit()' and we need to restore c->gc_lnum. We have
2 GC buds, one with no free space and one with plenty of free space. But replay
seeks the GC head to the bud with no free space. And then we call
'ubifs_find_dirty_leb()' to find an LEB which we could garbage-collect to the
GC head, and of course we fail, so recovery fails.

This patch teaches the replay to initialize the GC heads exactly to the latest
buds, i.e. the buds which have the greatest sequence number in corresponding
log reference nodes. This makes things simpler and more predictable. This does
not fix all 'ubifs_rcvry_gc_commit()' issues, but it makes recovery fail much
less often. I do not yet know the other reasons why it fails, though.
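
The selection rule described above can be illustrated with a small user-space
sketch. This is not the kernel code: 'struct bud_info' and the three helper
functions are hypothetical names invented for the illustration, and the array
order merely stands in for whatever order replay happens to process buds in.
It contrasts "the last bud processed for a head" with "the bud whose log
reference node has the greatest sequence number", and shows the write offset
being derived as the LEB size minus the bud's free space.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a replayed bud (not a kernel struct). */
struct bud_info {
	int lnum;                 /* LEB number of the bud */
	int jhead;                /* journal head the bud belongs to */
	int free;                 /* free bytes at the end of the bud */
	unsigned long long sqnum; /* sqnum of the bud's log reference node */
};

/*
 * Return the index of the bud with the greatest sqnum among the buds that
 * belong to 'jhead', or -1 if the head has no buds.
 */
int latest_bud_for_head(const struct bud_info *buds, size_t n, int jhead)
{
	int best = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		if (buds[i].jhead != jhead)
			continue;
		if (best < 0 || buds[i].sqnum > buds[(size_t)best].sqnum)
			best = (int)i;
	}
	return best;
}

/*
 * Sketch of the old behaviour: the head ends up at whichever bud of this
 * head was processed last, regardless of its sequence number.
 */
int last_processed_bud_for_head(const struct bud_info *buds, size_t n,
				int jhead)
{
	int last = -1;
	size_t i;

	for (i = 0; i < n; i++)
		if (buds[i].jhead == jhead)
			last = (int)i;
	return last;
}

/* Offset at which the head would continue writing: LEB size minus free space. */
int head_offset(const struct bud_info *b, int leb_size)
{
	assert(b->free >= 0 && b->free <= leb_size);
	return leb_size - b->free;
}
```

With two buds on the same head, a full one carrying the older sequence number
and a mostly empty one carrying the newer sequence number,
'latest_bud_for_head()' picks the newer bud no matter which of the two comes
last in the array, whereas 'last_processed_bud_for_head()' depends on the
processing order.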

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
---
 fs/ubifs/replay.c |   18 ++++++++++++------
 1 files changed, 12 insertions(+), 6 deletions(-)
Artem Bityutskiy - April 26, 2011, 8:40 a.m.
On Mon, 2011-04-25 at 19:55 +0300, Artem Bityutskiy wrote:
> This patch teaches the replay to initialize the GC heads exactly to the latest
> buds, i.e. the buds which have the greatest sequence number in corresponding
> log reference nodes. This makes things simpler and more predictable. This does
> not fix all 'ubifs_rcvry_gc_commit()' issues, but it makes recovery fail much
> less often. I do not yet know the other reasons why it fails, though.

Actually, so far I cannot reproduce 'ubifs_rcvry_gc_commit()' failures with
this patch applied.

Patch

diff --git a/fs/ubifs/replay.c b/fs/ubifs/replay.c
index b716a18..c29c468 100644
--- a/fs/ubifs/replay.c
+++ b/fs/ubifs/replay.c
@@ -59,6 +59,7 @@  enum {
  * @new_size: truncation new size
  * @free: amount of free space in a bud
  * @dirty: amount of dirty space in a bud from padding and deletion nodes
+ * @jhead: journal head number of the bud
  *
  * UBIFS journal replay must compare node sequence numbers, which means it must
  * build a tree of node information to insert into the TNC.
@@ -80,6 +81,7 @@  struct replay_entry {
 		struct {
 			int free;
 			int dirty;
+			int jhead;
 		};
 	};
 };
@@ -159,6 +161,11 @@  static int set_bud_lprops(struct ubifs_info *c, struct replay_entry *r)
 		err = PTR_ERR(lp);
 		goto out;
 	}
+
+	/* Make sure the journal head points to the latest bud */
+	err = ubifs_wbuf_seek_nolock(&c->jheads[r->jhead].wbuf, r->lnum,
+				     c->leb_size - r->free, UBI_SHORTTERM);
+
 out:
 	ubifs_release_lprops(c);
 	return err;
@@ -627,10 +634,6 @@  static int replay_bud(struct ubifs_info *c, int lnum, int offs, int jhead,
 	ubifs_assert(sleb->endpt - offs >= used);
 	ubifs_assert(sleb->endpt % c->min_io_size == 0);
 
-	if (sleb->endpt + c->min_io_size <= c->leb_size && !c->ro_mount)
-		err = ubifs_wbuf_seek_nolock(&c->jheads[jhead].wbuf, lnum,
-					     sleb->endpt, UBI_SHORTTERM);
-
 	*dirty = sleb->endpt - offs - used;
 	*free = c->leb_size - sleb->endpt;
 
@@ -653,12 +656,14 @@  out_dump:
  * @sqnum: sequence number
  * @free: amount of free space in bud
  * @dirty: amount of dirty space from padding and deletion nodes
+ * @jhead: journal head number for the bud
  *
  * This function inserts a reference node to the replay tree and returns zero
  * in case of success or a negative error code in case of failure.
  */
 static int insert_ref_node(struct ubifs_info *c, int lnum, int offs,
-			   unsigned long long sqnum, int free, int dirty)
+			   unsigned long long sqnum, int free, int dirty,
+			   int jhead)
 {
 	struct rb_node **p = &c->replay_tree.rb_node, *parent = NULL;
 	struct replay_entry *r;
@@ -688,6 +693,7 @@  static int insert_ref_node(struct ubifs_info *c, int lnum, int offs,
 	r->flags = REPLAY_REF;
 	r->free = free;
 	r->dirty = dirty;
+	r->jhead = jhead;
 
 	rb_link_node(&r->rb, parent, p);
 	rb_insert_color(&r->rb, &c->replay_tree);
@@ -712,7 +718,7 @@  static int replay_buds(struct ubifs_info *c)
 		if (err)
 			return err;
 		err = insert_ref_node(c, b->bud->lnum, b->bud->start, b->sqnum,
-				      free, dirty);
+				      free, dirty, b->bud->jhead);
 		if (err)
 			return err;
 	}