Patchwork [2/5] ubi: Limit amount of reserved eraseblocks for bad PEB handling

login
register
mail settings
Submitter Shmulik Ladkani
Date July 4, 2012, 8:06 a.m.
Message ID <1341389164-24409-3-git-send-email-shmulik.ladkani@gmail.com>
Download mbox | patch
Permalink /patch/168922/
State New
Headers show

Comments

Shmulik Ladkani - July 4, 2012, 8:06 a.m.
The existing mechanism of reserving PEBs for bad PEB handling has two
flaws:
- It is calculated as a percentage of good PEBs instead of total PEBs.
- There's no limit on the amount of PEBs UBI reserves for future bad
  eraseblock handling.

This patch changes the mechanism to overcome these flaws.

The desired level of PEBs reserved for bad PEB handling (beb_rsvd_level)
is set to the maximum expected bad eraseblocks (bad_peb_limit) minus the
existing number of bad eraseblocks (bad_peb_count).

The actual amount of PEBs reserved for bad PEB handling is usually set
to the desired level (but in some circumstances may be lower than the
desired level, e.g. when attaching to a device that has too few
available PEBs to satisfy the desired level).

In the case where the device has too many bad PEBs (above the expected
limit), then the desired level, and the actual amount of PEBs reserved
are set to zero. No PEBs will be set aside for future bad eraseblock
handling - even if some PEBs are made available (e.g. by shrinking a
volume).
If another PEB goes bad, and there are available PEBs, then the
eraseblock will be marked bad (consuming one available PEB). But if
there are no available PEBs, ubi will go into readonly mode.

Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
---
 drivers/mtd/ubi/Kconfig |    8 ++++++++
 drivers/mtd/ubi/misc.c  |   16 ++++++++++++----
 drivers/mtd/ubi/wl.c    |   44 ++++++++++++++++++++++++++++----------------
 3 files changed, 48 insertions(+), 20 deletions(-)
Richard Genoud - July 9, 2012, 10:15 a.m.
2012/7/4 Shmulik Ladkani <shmulik.ladkani@gmail.com>:
> diff --git a/drivers/mtd/ubi/misc.c b/drivers/mtd/ubi/misc.c
> index f6a7d7a..e9dcb83 100644
> --- a/drivers/mtd/ubi/misc.c
> +++ b/drivers/mtd/ubi/misc.c
> @@ -98,10 +98,18 @@ int ubi_check_volume(struct ubi_device *ubi, int vol_id)
>   */
>  void ubi_calculate_reserved(struct ubi_device *ubi)
>  {
> -       ubi->beb_rsvd_level = ubi->good_peb_count/100;
> -       ubi->beb_rsvd_level *= CONFIG_MTD_UBI_BEB_RESERVE;
> -       if (ubi->beb_rsvd_level < MIN_RESEVED_PEBS)
> -               ubi->beb_rsvd_level = MIN_RESEVED_PEBS;
> +       /*
> +        * Calculate the actual number of PEBs currently needed to be reserved
> +        * for future bad eraseblock handling.
> +        */
> +       ubi->beb_rsvd_level = ubi->bad_peb_limit - ubi->bad_peb_count;
> +       if (ubi->beb_rsvd_level < 0) {
> +               ubi->beb_rsvd_level = 0;
> +               ubi_warn("number of bad PEBs (%d) is above the expected limit "
> +                        "(%d), not reserving any PEBs for bad PEB handling, "
> +                        "will use available PEBs (if any)",
> +                        ubi->bad_peb_count, ubi->bad_peb_limit);
> +       }
>  }
is it ok for beb_rsvd_level to be in the range [0..x[ instead of [2..x[ ?
Shmulik Ladkani - July 9, 2012, 11:02 a.m.
On Mon, 9 Jul 2012 12:15:17 +0200 Richard Genoud <richard.genoud@gmail.com> wrote:
> 2012/7/4 Shmulik Ladkani <shmulik.ladkani@gmail.com>:
> > +       /*
> > +        * Calculate the actual number of PEBs currently needed to be reserved
> > +        * for future bad eraseblock handling.
> > +        */
> > +       ubi->beb_rsvd_level = ubi->bad_peb_limit - ubi->bad_peb_count;
> > +       if (ubi->beb_rsvd_level < 0) {
> > +               ubi->beb_rsvd_level = 0;
> > +               ubi_warn("number of bad PEBs (%d) is above the expected limit "
> > +                        "(%d), not reserving any PEBs for bad PEB handling, "
> > +                        "will use available PEBs (if any)",
> > +                        ubi->bad_peb_count, ubi->bad_peb_limit);
> > +       }
> >  }
> is it ok for beb_rsvd_level to be in the range [0..x[ instead of [2..x[ ?

Yes, it is ok in my new scheme.
It is even mandatory, otherwise more and more PEBs will attempt to be
reserved for future bad PEB handling (as 'beb_rsvd_pebs' always wishes
to reach 'beb_rsvd_level') even if passed the limit - this is exactly
what I'd like to fix.

Regards
Shmulik
Artem Bityutskiy - July 18, 2012, 10:28 a.m.
On Wed, 2012-07-04 at 11:06 +0300, Shmulik Ladkani wrote:
> @@ -1045,20 +1046,14 @@ static int erase_worker(struct ubi_device *ubi, struct ubi_work *wl_wrk,
>  	}
>  
>  	spin_lock(&ubi->volumes_lock);
> -	need = ubi->beb_rsvd_level - ubi->beb_rsvd_pebs + 1;
> -	if (need > 0) {
> -		need = ubi->avail_pebs >= need ? need : ubi->avail_pebs;
> -		ubi->avail_pebs -= need;
> -		ubi->rsvd_pebs += need;
> -		ubi->beb_rsvd_pebs += need;
> -		if (need > 0)
> -			ubi_msg("reserve more %d PEBs", need);
> -	}
> -
>  	if (ubi->beb_rsvd_pebs == 0) {
> -		spin_unlock(&ubi->volumes_lock);
> -		ubi_err("no reserved physical eraseblocks");
> -		goto out_ro;
> +		if (ubi->avail_pebs == 0) {
> +			spin_unlock(&ubi->volumes_lock);
> +			ubi_err("no reserved/available physical eraseblocks");
> +			goto out_ro;
> +		}
> +		ubi->avail_pebs -= 1;
> +		available_consumed = 1;
>  	}
>  	spin_unlock(&ubi->volumes_lock);

The whole thing will become simpler if we first mark the PEB as bad
unconditionally (because it _is_ bad), then grab the lock and do all the
re-calculations.

>  
> @@ -1068,11 +1063,23 @@ static int erase_worker(struct ubi_device *ubi, struct ubi_work *wl_wrk,
>  		goto out_ro;
>  
>  	spin_lock(&ubi->volumes_lock);
> -	ubi->beb_rsvd_pebs -= 1;
> +	if (ubi->beb_rsvd_pebs > 0) {
> +		if (available_consumed) {
> +			/*
> +			 * Some PEBs were added to the reserved pool since we
> +			 * last checked. Use a PEB from the reserved pool.
> +			 */
> +			ubi->avail_pebs += 1;
> +			available_consumed = 0;
> +		}
> +		ubi->beb_rsvd_pebs -= 1;
> +	}
>  	ubi->bad_peb_count += 1;
>  	ubi->good_peb_count -= 1;
>  	ubi_calculate_reserved(ubi);

We do not need to call this function from here, right?
Artem Bityutskiy - July 18, 2012, 11:01 a.m.
On Wed, 2012-07-18 at 13:28 +0300, Artem Bityutskiy wrote:
> >       ubi->bad_peb_count += 1;
> >       ubi->good_peb_count -= 1;
> >       ubi_calculate_reserved(ubi);
> We do not need to call this function from here, right?

Err, sorry, it _is_ needed!
Artem Bityutskiy - July 18, 2012, 11:40 a.m.
On Wed, 2012-07-04 at 11:06 +0300, Shmulik Ladkani wrote:
> The existing mechanism of reserving PEBs for bad PEB handling has two
> flaws:
> - It is calculated as a percentage of good PEBs instead of total PEBs.
> - There's no limit on the amount of PEBs UBI reserves for future bad
>   eraseblock handling.

OK, I've modified this patch a lot, did not test, and sent what I have
in the beb branch to you - please, take a look.
Shmulik Ladkani - July 18, 2012, 7:55 p.m.
Hi Artem,

On Wed, 18 Jul 2012 13:28:37 +0300 Artem Bityutskiy <dedekind1@gmail.com> wrote:
> The whole thing will become simpler if we first mark the PEB as bad
> unconditionally (because it _is_ bad), then grab the lock and do all the
> re-calculations.

On first glance that would sound the right thing to do.

Actually, when looked at the original code, which does NOT mark the
block bad when 'beb_rsvd_pebs == 0' (instead, does a 'goto out_ro'),
I initially thought "this is SO wrong, the block IS bad, we MUST mark it
bad, what IS the deal here?".
But then I spent more and more time trying to backtrack the reason for
it - and I think I found a reason, quite an improtant one.

Suppose 'beb_rsvd_pebs == 0'.

In the original scheme, it means that there are NO available PEBs at
all. All good PEBs are "assigned" (to user volumes, layout volume, WL
and EBA).

Now, if one of these PEBs goes bad, and you DO mark it bad, and
decrement the good_peb_count, you'll have a shortage of good PEBs - it
will not match the amount of PEBs assigned to the consumers (user
volumes, layout, WL, EBA).
Meaning, next attach would simply fail with a "no enough physical
eraseblocks" message (grep for these in ubi_wl_init and ubi_eba_init).
You won't even be able to attach anymore. Not good.

However, if you DO NOT mark it bad, but instead go into RO mode, you
should be able to later re-attach because the good_peb_count would fit
(no shortage of PEBs).

So going into RO seems a "less worse" solution than marking bad, for
that particular case.

Anyways, I've really invested some thought into this patch, trying to
cover all sorts of esoteric cases.
I think the logic itself is pretty robust, although I would really
appreciate some reassurances on that...

I agree the code does not "read simple" enough, I tried to make the best
simplifying and adding some comments. Let me know if you'd like some
more polish on it.

I saw you submitted a simplified version, I can surely take a look, but
I'm afraid it lacks the proper treatment discussed above (NOT marking
the bad block as bad when beb_rsvd_pebs is zero).

Let me know.

Many thanks,
Shmulik
Artem Bityutskiy - July 19, 2012, 3:35 a.m.
On Wed, 2012-07-18 at 22:55 +0300, Shmulik Ladkani wrote:
> However, if you DO NOT mark it bad, but instead go into RO mode, you
> should be able to later re-attach because the good_peb_count would fit
> (no shortage of PEBs). 

Yeah, you are right, I'll return to the original patch.

Patch

diff --git a/drivers/mtd/ubi/Kconfig b/drivers/mtd/ubi/Kconfig
index 8df256f..7eb91cb 100644
--- a/drivers/mtd/ubi/Kconfig
+++ b/drivers/mtd/ubi/Kconfig
@@ -47,8 +47,16 @@  config MTD_UBI_BEB_LIMIT
 	default 2
 	range 0 25
 	help
+	  If the MTD device admits of bad eraseblocks (e.g. NAND flash), UBI
+	  reserves some amount of physical eraseblocks to handle new bad
+	  eraseblocks.
 	  This option specifies the maximum bad eraseblocks UBI expects on the
 	  ubi device (percents of total number of flash eraseblocks).
+	  This limit is used in order to derive amount of eraseblock UBI
+	  reserves for handling new bad blocks.
+	  If the device has more bad eraseblocks than this limit, UBI does not
+	  reserve any physical eraseblocks for new bad eraseblocks, but
+	  attempts to use available eraseblocks (if any).
 	  If the underlying flash does not admit of bad eraseblocks (e.g. NOR
 	  flash), this value is ignored.
 	  Leave the default value if unsure.
diff --git a/drivers/mtd/ubi/misc.c b/drivers/mtd/ubi/misc.c
index f6a7d7a..e9dcb83 100644
--- a/drivers/mtd/ubi/misc.c
+++ b/drivers/mtd/ubi/misc.c
@@ -98,10 +98,18 @@  int ubi_check_volume(struct ubi_device *ubi, int vol_id)
  */
 void ubi_calculate_reserved(struct ubi_device *ubi)
 {
-	ubi->beb_rsvd_level = ubi->good_peb_count/100;
-	ubi->beb_rsvd_level *= CONFIG_MTD_UBI_BEB_RESERVE;
-	if (ubi->beb_rsvd_level < MIN_RESEVED_PEBS)
-		ubi->beb_rsvd_level = MIN_RESEVED_PEBS;
+	/*
+	 * Calculate the actual number of PEBs currently needed to be reserved
+	 * for future bad eraseblock handling.
+	 */
+	ubi->beb_rsvd_level = ubi->bad_peb_limit - ubi->bad_peb_count;
+	if (ubi->beb_rsvd_level < 0) {
+		ubi->beb_rsvd_level = 0;
+		ubi_warn("number of bad PEBs (%d) is above the expected limit "
+			 "(%d), not reserving any PEBs for bad PEB handling, "
+			 "will use available PEBs (if any)",
+			 ubi->bad_peb_count, ubi->bad_peb_limit);
+	}
 }
 
 /**
diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
index b6be644..9143f35 100644
--- a/drivers/mtd/ubi/wl.c
+++ b/drivers/mtd/ubi/wl.c
@@ -978,9 +978,10 @@  static int erase_worker(struct ubi_device *ubi, struct ubi_work *wl_wrk,
 			int cancel)
 {
 	struct ubi_wl_entry *e = wl_wrk->e;
-	int pnum = e->pnum, err, need;
+	int pnum = e->pnum, err;
 	int vol_id = wl_wrk->vol_id;
 	int lnum = wl_wrk->lnum;
+	int available_consumed = 0;
 
 	if (cancel) {
 		dbg_wl("cancel erasure of PEB %d EC %d", pnum, e->ec);
@@ -1045,20 +1046,14 @@  static int erase_worker(struct ubi_device *ubi, struct ubi_work *wl_wrk,
 	}
 
 	spin_lock(&ubi->volumes_lock);
-	need = ubi->beb_rsvd_level - ubi->beb_rsvd_pebs + 1;
-	if (need > 0) {
-		need = ubi->avail_pebs >= need ? need : ubi->avail_pebs;
-		ubi->avail_pebs -= need;
-		ubi->rsvd_pebs += need;
-		ubi->beb_rsvd_pebs += need;
-		if (need > 0)
-			ubi_msg("reserve more %d PEBs", need);
-	}
-
 	if (ubi->beb_rsvd_pebs == 0) {
-		spin_unlock(&ubi->volumes_lock);
-		ubi_err("no reserved physical eraseblocks");
-		goto out_ro;
+		if (ubi->avail_pebs == 0) {
+			spin_unlock(&ubi->volumes_lock);
+			ubi_err("no reserved/available physical eraseblocks");
+			goto out_ro;
+		}
+		ubi->avail_pebs -= 1;
+		available_consumed = 1;
 	}
 	spin_unlock(&ubi->volumes_lock);
 
@@ -1068,11 +1063,23 @@  static int erase_worker(struct ubi_device *ubi, struct ubi_work *wl_wrk,
 		goto out_ro;
 
 	spin_lock(&ubi->volumes_lock);
-	ubi->beb_rsvd_pebs -= 1;
+	if (ubi->beb_rsvd_pebs > 0) {
+		if (available_consumed) {
+			/*
+			 * Some PEBs were added to the reserved pool since we
+			 * last checked. Use a PEB from the reserved pool.
+			 */
+			ubi->avail_pebs += 1;
+			available_consumed = 0;
+		}
+		ubi->beb_rsvd_pebs -= 1;
+	}
 	ubi->bad_peb_count += 1;
 	ubi->good_peb_count -= 1;
 	ubi_calculate_reserved(ubi);
-	if (ubi->beb_rsvd_pebs)
+	if (available_consumed)
+		ubi_warn("no PEBs in the reserved pool, used an available PEB");
+	else if (ubi->beb_rsvd_pebs)
 		ubi_msg("%d PEBs left in the reserve", ubi->beb_rsvd_pebs);
 	else
 		ubi_warn("last PEB from the reserved pool was used");
@@ -1081,6 +1088,11 @@  static int erase_worker(struct ubi_device *ubi, struct ubi_work *wl_wrk,
 	return err;
 
 out_ro:
+	if (available_consumed) {
+		spin_lock(&ubi->volumes_lock);
+		ubi->avail_pebs += 1;
+		spin_unlock(&ubi->volumes_lock);
+	}
 	ubi_ro_mode(ubi);
 	return err;
 }