
ubi: Select fastmap anchor PEBs considering wear level rules

Message ID 20200113145622.18357-1-arne.edholm@axis.com
State Accepted
Delegated to: Richard Weinberger
Series: ubi: Select fastmap anchor PEBs considering wear level rules

Commit Message

Arne Edholm Jan. 13, 2020, 2:56 p.m. UTC
There is a risk that the fastmap anchor PEB alternates between just
two PEBs: the current anchor and the previous anchor that was just
deleted. As the fastmap pools get the first take on free PEBs, the
pools may leave no free PEBs to be selected as the new anchor,
resulting in the two PEBs alternating. If the anchor PEBs get a high
erase count they will not be used by the pools but remain in
ubi->free, further increasing the likelihood that they will be used
as anchors.

Getting stuck using only a couple of PEBs continuously will result in
uneven wear, eventually leading to failure.

To fix this:

- Choose the fastmap anchor when the most free PEBs are available. This
  is during rebuilding of the fastmap pools, after the unused pool PEBs
  are added to ubi->free but before the pools are populated again from
  the free PEBs. Also reserve an additional second-best PEB as a
  candidate for the next time the fastmap anchor is updated. If a better
  PEB is found at the next fastmap anchor update, the candidate is made
  available for building the pools.

- Enable anchor moves within the anchor area again, as they are useful
  for distributing wear.

- The anchor candidate for the next fastmap update is the most suited
  free PEB. Check this PEB's erase count during wear leveling. If the
  wear-leveling limit is exceeded, the PEB is considered unsuitable for
  now. As all other unused anchor-area PEBs should be even worse, free
  up the used anchor-area PEB with the lowest erase count.

Signed-off-by: Arne Edholm <arne.edholm@axis.com>
---
 drivers/mtd/ubi/fastmap-wl.c | 39 ++++++++++++++++++++++++---------------
 drivers/mtd/ubi/fastmap.c    | 11 +++++++++++
 drivers/mtd/ubi/ubi.h        |  4 +++-
 drivers/mtd/ubi/wl.c         | 28 +++++++++++++++++++---------
 4 files changed, 57 insertions(+), 25 deletions(-)

Comments

Arne Edholm Feb. 18, 2020, 2 p.m. UTC | #1
Hi Richard,

On 1/13/20 3:56 PM, Arne Edholm wrote:
> [... full commit message snipped ...]
Do you have any comments on this suggestion?
Richard Weinberger Feb. 27, 2020, 7:50 a.m. UTC | #2
Arne,

On Tue, Feb 18, 2020 at 3:00 PM Arne Edholm <arne.edholm@axis.com> wrote:
>
> Hi Richard,
>
> On 1/13/20 3:56 PM, Arne Edholm wrote:
> > [... full commit message snipped ...]
> Do you have any comments on this suggestion?

Thank you for working on this. At first sight what you describe makes
sense, but I need to verify it first. Fastmap is tricky. ;-\
Maybe we can even have an xfstests test for it.
Richard Weinberger March 30, 2020, 8:54 p.m. UTC | #3
On Mon, Jan 13, 2020 at 3:56 PM Arne Edholm <arne.edholm@axis.com> wrote:
>
> [... full commit message snipped ...]
> ---
>  drivers/mtd/ubi/fastmap-wl.c | 39 ++++++++++++++++++++++++---------------
>  drivers/mtd/ubi/fastmap.c    | 11 +++++++++++
>  drivers/mtd/ubi/ubi.h        |  4 +++-
>  drivers/mtd/ubi/wl.c         | 28 +++++++++++++++++++---------
>  4 files changed, 57 insertions(+), 25 deletions(-)
>
> diff --git a/drivers/mtd/ubi/fastmap-wl.c b/drivers/mtd/ubi/fastmap-wl.c
> index 426820ab9afe..97cb57a12440 100644
> --- a/drivers/mtd/ubi/fastmap-wl.c
> +++ b/drivers/mtd/ubi/fastmap-wl.c
> @@ -110,6 +110,21 @@ void ubi_refill_pools(struct ubi_device *ubi)
>         wl_pool->size = 0;
>         pool->size = 0;
>
> +       if (ubi->fm_anchor) {
> +               wl_tree_add(ubi->fm_anchor, &ubi->free);
> +               ubi->free_count++;
> +       }
> +       if (ubi->fm_next_anchor) {
> +               wl_tree_add(ubi->fm_next_anchor, &ubi->free);
> +               ubi->free_count++;
> +       }
> +
> +       /* All available PEBs are in ubi->free, now is the time to get
> +        * the best anchor PEBs.
> +        */
> +       ubi->fm_anchor = ubi_wl_get_fm_peb(ubi, 1);
> +       ubi->fm_next_anchor = ubi_wl_get_fm_peb(ubi, 1);
> +
>         for (;;) {
>                 enough = 0;
>                 if (pool->size < pool->max_size) {
> @@ -265,26 +280,20 @@ static struct ubi_wl_entry *get_peb_for_wl(struct ubi_device *ubi)
>  int ubi_ensure_anchor_pebs(struct ubi_device *ubi)
>  {
>         struct ubi_work *wrk;
> -       struct ubi_wl_entry *anchor;
>
>         spin_lock(&ubi->wl_lock);
>
> -       /* Do we already have an anchor? */
> -       if (ubi->fm_anchor) {
> -               spin_unlock(&ubi->wl_lock);
> -               return 0;
> -       }
> -
> -       /* See if we can find an anchor PEB on the list of free PEBs */
> -       anchor = ubi_wl_get_fm_peb(ubi, 1);
> -       if (anchor) {
> -               ubi->fm_anchor = anchor;
> -               spin_unlock(&ubi->wl_lock);
> -               return 0;
> +       /* Do we have a next anchor? */
> +       if (!ubi->fm_next_anchor) {
> +               ubi->fm_next_anchor = ubi_wl_get_fm_peb(ubi, 1);
> +               if (!ubi->fm_next_anchor)
> +                       /* Tell wear leveling to produce a new anchor PEB */
> +                       ubi->fm_do_produce_anchor = 1;
>         }
>
> -       /* No luck, trigger wear leveling to produce a new anchor PEB */
> -       ubi->fm_do_produce_anchor = 1;
> +       /* Do wear leveling to get a new anchor PEB or check the
> +        * existing next anchor candidate.
> +        */
>         if (ubi->wl_scheduled) {
>                 spin_unlock(&ubi->wl_lock);
>                 return 0;
> diff --git a/drivers/mtd/ubi/fastmap.c b/drivers/mtd/ubi/fastmap.c
> index 1c7be4eb3ba6..02332f9ea996 100644
> --- a/drivers/mtd/ubi/fastmap.c
> +++ b/drivers/mtd/ubi/fastmap.c
> @@ -1220,6 +1220,17 @@ static int ubi_write_fastmap(struct ubi_device *ubi,
>                 fm_pos += sizeof(*fec);
>                 ubi_assert(fm_pos <= ubi->fm_size);
>         }
> +       if (ubi->fm_next_anchor) {
> +               fec = (struct ubi_fm_ec *)(fm_raw + fm_pos);
> +
> +               fec->pnum = cpu_to_be32(ubi->fm_next_anchor->pnum);
> +               set_seen(ubi, ubi->fm_next_anchor->pnum, seen_pebs);
> +               fec->ec = cpu_to_be32(ubi->fm_next_anchor->ec);
> +
> +               free_peb_count++;
> +               fm_pos += sizeof(*fec);
> +               ubi_assert(fm_pos <= ubi->fm_size);
> +       }
>         fmh->free_peb_count = cpu_to_be32(free_peb_count);
>
>         ubi_for_each_used_peb(ubi, wl_e, tmp_rb) {
> diff --git a/drivers/mtd/ubi/ubi.h b/drivers/mtd/ubi/ubi.h
> index 9688b411c930..a12fdb137fa4 100644
> --- a/drivers/mtd/ubi/ubi.h
> +++ b/drivers/mtd/ubi/ubi.h
> @@ -491,7 +491,8 @@ struct ubi_debug_info {
>   * @fm_work: fastmap work queue
>   * @fm_work_scheduled: non-zero if fastmap work was scheduled
>   * @fast_attach: non-zero if UBI was attached by fastmap
> - * @fm_anchor: The next anchor PEB to use for fastmap
> + * @fm_anchor: The new anchor PEB used during fastmap update
> + * @fm_next_anchor: An anchor PEB candidate for the next time fastmap is updated
>   * @fm_do_produce_anchor: If true produce an anchor PEB in wl
>   *
>   * @used: RB-tree of used physical eraseblocks
> @@ -602,6 +603,7 @@ struct ubi_device {
>         int fm_work_scheduled;
>         int fast_attach;
>         struct ubi_wl_entry *fm_anchor;
> +       struct ubi_wl_entry *fm_next_anchor;
>         int fm_do_produce_anchor;
>
>         /* Wear-leveling sub-system's stuff */
> diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
> index 5d77a38dba54..804c434928aa 100644
> --- a/drivers/mtd/ubi/wl.c
> +++ b/drivers/mtd/ubi/wl.c
> @@ -688,20 +688,27 @@ static int wear_leveling_worker(struct ubi_device *ubi, struct ubi_work *wrk,
>         }
>
>  #ifdef CONFIG_MTD_UBI_FASTMAP
> +       e1 = find_anchor_wl_entry(&ubi->used);
> +       if (e1 && ubi->fm_next_anchor &&
> +           (ubi->fm_next_anchor->ec - e1->ec >= UBI_WL_THRESHOLD)) {
> +               ubi->fm_do_produce_anchor = 1;
> +               /* fm_next_anchor is no longer considered a good anchor
> +                * candidate.
> +                * NULL assignment also prevents multiple wear level checks
> +                * of this PEB.
> +                */
> +               wl_tree_add(ubi->fm_next_anchor, &ubi->free);
> +               ubi->fm_next_anchor = NULL;
> +               ubi->free_count++;
> +       }
> +
>         if (ubi->fm_do_produce_anchor) {
> -               e1 = find_anchor_wl_entry(&ubi->used);
>                 if (!e1)
>                         goto out_cancel;

The logic here is a little strange. Why don't you check for e1 being
NULL in the new code block above?

>                 e2 = get_peb_for_wl(ubi);
>                 if (!e2)
>                         goto out_cancel;
>
> -               /*
> -                * Anchor move within the anchor area is useless.
> -                */
> -               if (e2->pnum < UBI_FM_MAX_START)
> -                       goto out_cancel;
> -

Are you sure we can drop this optimization?

Other than that I like your approach and would merge it. :-)

>                 self_check_in_wl_tree(ubi, e1, &ubi->used);
>                 rb_erase(&e1->u.rb, &ubi->used);
>                 dbg_wl("anchor-move PEB %d to PEB %d", e1->pnum, e2->pnum);
> @@ -1080,8 +1087,11 @@ static int __erase_worker(struct ubi_device *ubi, struct ubi_work *wl_wrk)
>         if (!err) {
>                 spin_lock(&ubi->wl_lock);
>
> -               if (!ubi->fm_anchor && e->pnum < UBI_FM_MAX_START) {
> -                       ubi->fm_anchor = e;
> +               if (!ubi->fm_next_anchor && e->pnum < UBI_FM_MAX_START) {
> +                       /* Abort anchor production, if needed it will be
> +                        * enabled again in the wear leveling started below.
> +                        */
> +                       ubi->fm_next_anchor = e;
>                         ubi->fm_do_produce_anchor = 0;
>                 } else {
>                         wl_tree_add(e, &ubi->free);
> --
> 2.11.0
>
>
> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/
Arne Edholm April 2, 2020, 9:02 a.m. UTC | #4
On 3/30/20 10:54 PM, Richard Weinberger wrote:
> On Mon, Jan 13, 2020 at 3:56 PM Arne Edholm <arne.edholm@axis.com> wrote:
> >
> > [... commit message and earlier hunks snipped ...]
> > diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
> > index 5d77a38dba54..804c434928aa 100644
> > --- a/drivers/mtd/ubi/wl.c
> > +++ b/drivers/mtd/ubi/wl.c
> > @@ -688,20 +688,27 @@ static int wear_leveling_worker(struct ubi_device *ubi, struct ubi_work *wrk,
> >         }
> >
> >  #ifdef CONFIG_MTD_UBI_FASTMAP
> > +       e1 = find_anchor_wl_entry(&ubi->used);
> > +       if (e1 && ubi->fm_next_anchor &&
> > +           (ubi->fm_next_anchor->ec - e1->ec >= UBI_WL_THRESHOLD)) {
> > +               ubi->fm_do_produce_anchor = 1;
> > +               /* fm_next_anchor is no longer considered a good anchor
> > +                * candidate.
> > +                * NULL assignment also prevents multiple wear level checks
> > +                * of this PEB.
> > +                */
> > +               wl_tree_add(ubi->fm_next_anchor, &ubi->free);
> > +               ubi->fm_next_anchor = NULL;
> > +               ubi->free_count++;
> > +       }
> > +
> >         if (ubi->fm_do_produce_anchor) {
> > -               e1 = find_anchor_wl_entry(&ubi->used);
> >                 if (!e1)
> >                         goto out_cancel;
>
> The logic here is a little strange. Why don't you check for e1 being
> NULL in the new code block above?
Failing to get e1 is not critical in the new code above; it just means
we skip the anchor wear-leveling check and continue with what used to
be done before this change. e1 is, however, required when wear leveling
is specifically tasked with producing a new anchor, hence the placement
of the NULL check.
>
> >                 e2 = get_peb_for_wl(ubi);
> >                 if (!e2)
> >                         goto out_cancel;
> >
> > -               /*
> > -                * Anchor move within the anchor area is useless.
> > -                */
> > -               if (e2->pnum < UBI_FM_MAX_START)
> > -                       goto out_cancel;
> > -
>
> Are you sure we can drop this optimization?

I can't find a problem with this. An anchor-area PEB (with high EC) in
the wl pool is consumed to make another anchor-area PEB (lower EC) in
ubi->used available by processing it to ubi->free. This process should
be finished before the fastmap update needs the PEB, thanks to the
synchronous delete and fm_eba_sem.

On the other hand, keeping your optimization doesn't break the
functionality of the new code either; the removal is just a small
optimization with regard to wear leveling. So if you are concerned, it
can be kept. (If it is kept, though, it needs to be modified to return
e2 to ubi->free, as right now it is leaked. But that would be a
separate patch, as it is not tied to this change, right?)

>
> Other than that I like your approach and would merge it. :-)
Thank you!
Arne Edholm April 30, 2020, 8:34 a.m. UTC | #5
Hi Richard,

On 4/2/20 11:02 AM, Arne Edholm wrote:
>
> On 3/30/20 10:54 PM, Richard Weinberger wrote:
>> On Mon, Jan 13, 2020 at 3:56 PM Arne Edholm <arne.edholm@axis.com> 
>> wrote:
>> >
>> > There is a risk that the fastmap anchor PEB is alternating between
>> > just two PEBs, the current anchor and the previous anchor that was 
>> just
>> > deleted. As the fastmap pools gets the first take on free PEBs, the
>> > pools may leave no free PEBs to be selected as the new anchor,
>> > resulting in the two PEBs alternating behaviour. If the anchor PEBs 
>> gets
>> > a high erase count the PEBs will not be used by the pools but 
>> remain in
>> > ubi->free, even more increasing the likelihood they will be used as
>> > anchors.
>> >
>> > Getting stuck using only a couple of PEBs continuously will result 
>> in an
>> > uneven wear, eventually leading to failure.
>> >
>> > To fix this:
>> >
>> > - Choose the fastmap anchor when the most free PEBs are available. 
>> This is
>> >   during rebuilding of the fastmap pools, after the unused pool 
>> PEBs are
>> >   added to ubi->free but before the pools are populated again from the
>> >   free PEBs. Also reserve an additional second best PEB as a candidate
>> >   for the next time the fast map anchor is updated. If a better PEB is
>> >   found the next time the fast map anchor is updated, the candidate is
>> >   made available for building the pools.
>> >
>> > - Enable anchor move within the anchor area again as it is useful for
>> >   distributing wear.
>> >
>> > - The anchor candidate for the next fastmap update is the most 
>> suited free
>> >   PEB. Check this PEB's erase count during wear leveling. If the wear
>> >   leveling limit is exceeded, the PEB is considered unsuitable for 
>> now. As
>> >   all other non used anchor area PEBs should be even worse, free up 
>> the
>> >   used anchor area PEB with the lowest erase count.
>> >
>> > Signed-off-by: Arne Edholm <arne.edholm@axis.com>
>> > ---
>> >  drivers/mtd/ubi/fastmap-wl.c | 39 
>> ++++++++++++++++++++++++---------------
>> >  drivers/mtd/ubi/fastmap.c    | 11 +++++++++++
>> >  drivers/mtd/ubi/ubi.h        |  4 +++-
>> >  drivers/mtd/ubi/wl.c         | 28 +++++++++++++++++++---------
>> >  4 files changed, 57 insertions(+), 25 deletions(-)
>> >
>> > diff --git a/drivers/mtd/ubi/fastmap-wl.c 
>> b/drivers/mtd/ubi/fastmap-wl.c
>> > index 426820ab9afe..97cb57a12440 100644
>> > --- a/drivers/mtd/ubi/fastmap-wl.c
>> > +++ b/drivers/mtd/ubi/fastmap-wl.c
>> > @@ -110,6 +110,21 @@ void ubi_refill_pools(struct ubi_device *ubi)
>> >         wl_pool->size = 0;
>> >         pool->size = 0;
>> >
>> > +       if (ubi->fm_anchor) {
>> > +               wl_tree_add(ubi->fm_anchor, &ubi->free);
>> > +               ubi->free_count++;
>> > +       }
>> > +       if (ubi->fm_next_anchor) {
>> > +               wl_tree_add(ubi->fm_next_anchor, &ubi->free);
>> > +               ubi->free_count++;
>> > +       }
>> > +
>> > +       /* All available PEBs are in ubi->free, now is the time to get
>> > +        * the best anchor PEBs.
>> > +        */
>> > +       ubi->fm_anchor = ubi_wl_get_fm_peb(ubi, 1);
>> > +       ubi->fm_next_anchor = ubi_wl_get_fm_peb(ubi, 1);
>> > +
>> >         for (;;) {
>> >                 enough = 0;
>> >                 if (pool->size < pool->max_size) {
>> > @@ -265,26 +280,20 @@ static struct ubi_wl_entry 
>> *get_peb_for_wl(struct ubi_device *ubi)
>> >  int ubi_ensure_anchor_pebs(struct ubi_device *ubi)
>> >  {
>> >         struct ubi_work *wrk;
>> > -       struct ubi_wl_entry *anchor;
>> >
>> >         spin_lock(&ubi->wl_lock);
>> >
>> > -       /* Do we already have an anchor? */
>> > -       if (ubi->fm_anchor) {
>> > -               spin_unlock(&ubi->wl_lock);
>> > -               return 0;
>> > -       }
>> > -
>> > -       /* See if we can find an anchor PEB on the list of free 
>> PEBs */
>> > -       anchor = ubi_wl_get_fm_peb(ubi, 1);
>> > -       if (anchor) {
>> > -               ubi->fm_anchor = anchor;
>> > -               spin_unlock(&ubi->wl_lock);
>> > -               return 0;
>> > +       /* Do we have a next anchor? */
>> > +       if (!ubi->fm_next_anchor) {
>> > +               ubi->fm_next_anchor = ubi_wl_get_fm_peb(ubi, 1);
>> > +               if (!ubi->fm_next_anchor)
>> > +                       /* Tell wear leveling to produce a new 
>> anchor PEB */
>> > +                       ubi->fm_do_produce_anchor = 1;
>> >         }
>> >
>> > -       /* No luck, trigger wear leveling to produce a new anchor 
>> PEB */
>> > -       ubi->fm_do_produce_anchor = 1;
>> > +       /* Do wear leveling to get a new anchor PEB or check the
>> > +        * existing next anchor candidate.
>> > +        */
>> >         if (ubi->wl_scheduled) {
>> >                 spin_unlock(&ubi->wl_lock);
>> >                 return 0;
>> > diff --git a/drivers/mtd/ubi/fastmap.c b/drivers/mtd/ubi/fastmap.c
>> > index 1c7be4eb3ba6..02332f9ea996 100644
>> > --- a/drivers/mtd/ubi/fastmap.c
>> > +++ b/drivers/mtd/ubi/fastmap.c
>> > @@ -1220,6 +1220,17 @@ static int ubi_write_fastmap(struct 
>> ubi_device *ubi,
>> >                 fm_pos += sizeof(*fec);
>> >                 ubi_assert(fm_pos <= ubi->fm_size);
>> >         }
>> > +       if (ubi->fm_next_anchor) {
>> > +               fec = (struct ubi_fm_ec *)(fm_raw + fm_pos);
>> > +
>> > +               fec->pnum = cpu_to_be32(ubi->fm_next_anchor->pnum);
>> > +               set_seen(ubi, ubi->fm_next_anchor->pnum, seen_pebs);
>> > +               fec->ec = cpu_to_be32(ubi->fm_next_anchor->ec);
>> > +
>> > +               free_peb_count++;
>> > +               fm_pos += sizeof(*fec);
>> > +               ubi_assert(fm_pos <= ubi->fm_size);
>> > +       }
>> >         fmh->free_peb_count = cpu_to_be32(free_peb_count);
>> >
>> >         ubi_for_each_used_peb(ubi, wl_e, tmp_rb) {
>> > diff --git a/drivers/mtd/ubi/ubi.h b/drivers/mtd/ubi/ubi.h
>> > index 9688b411c930..a12fdb137fa4 100644
>> > --- a/drivers/mtd/ubi/ubi.h
>> > +++ b/drivers/mtd/ubi/ubi.h
>> > @@ -491,7 +491,8 @@ struct ubi_debug_info {
>> >   * @fm_work: fastmap work queue
>> >   * @fm_work_scheduled: non-zero if fastmap work was scheduled
>> >   * @fast_attach: non-zero if UBI was attached by fastmap
>> > - * @fm_anchor: The next anchor PEB to use for fastmap
>> > + * @fm_anchor: The new anchor PEB used during fastmap update
>> > + * @fm_next_anchor: An anchor PEB candidate for the next time fastmap is updated
>> >   * @fm_do_produce_anchor: If true produce an anchor PEB in wl
>> >   *
>> >   * @used: RB-tree of used physical eraseblocks
>> > @@ -602,6 +603,7 @@ struct ubi_device {
>> >         int fm_work_scheduled;
>> >         int fast_attach;
>> >         struct ubi_wl_entry *fm_anchor;
>> > +       struct ubi_wl_entry *fm_next_anchor;
>> >         int fm_do_produce_anchor;
>> >
>> >         /* Wear-leveling sub-system's stuff */
>> > diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
>> > index 5d77a38dba54..804c434928aa 100644
>> > --- a/drivers/mtd/ubi/wl.c
>> > +++ b/drivers/mtd/ubi/wl.c
>> > @@ -688,20 +688,27 @@ static int wear_leveling_worker(struct ubi_device *ubi, struct ubi_work *wrk,
>> >         }
>> >
>> >  #ifdef CONFIG_MTD_UBI_FASTMAP
>> > +       e1 = find_anchor_wl_entry(&ubi->used);
>> > +       if (e1 && ubi->fm_next_anchor &&
>> > +           (ubi->fm_next_anchor->ec - e1->ec >= UBI_WL_THRESHOLD)) {
>> > +               ubi->fm_do_produce_anchor = 1;
>> > +               /* fm_next_anchor is no longer considered a good anchor
>> > +                * candidate.
>> > +                * NULL assignment also prevents multiple wear level checks
>> > +                * of this PEB.
>> > +                */
>> > +               wl_tree_add(ubi->fm_next_anchor, &ubi->free);
>> > +               ubi->fm_next_anchor = NULL;
>> > +               ubi->free_count++;
>> > +       }
>> > +
>> >         if (ubi->fm_do_produce_anchor) {
>> > -               e1 = find_anchor_wl_entry(&ubi->used);
>> >                 if (!e1)
>> >                         goto out_cancel;
>>
>> The logic here is a little strange. Why don't you check for e1 being
>> NULL in the new code block
>> above?
> Failing to get e1 is not critical in the new code above; it just means
> we skip the anchor wear-level check and continue with what used to be
> done before this change. e1 is required, however, when specifically
> tasked with generating a new anchor, hence the placement of the NULL
> check.
>>
>> >                 e2 = get_peb_for_wl(ubi);
>> >                 if (!e2)
>> >                         goto out_cancel;
>> >
>> > -               /*
>> > -                * Anchor move within the anchor area is useless.
>> > -                */
>> > -               if (e2->pnum < UBI_FM_MAX_START)
>> > -                       goto out_cancel;
>> > -
>>
>> Are you sure we can drop this optimization?
>
> I can't find a problem with this. An anchor area PEB (with high EC) in
> the wl pool is consumed to make another anchor area PEB (lower EC) in
> ubi->used available by moving it to ubi->free. This process should
> be finished before the fastmap update needs the PEB, thanks to the
> synchronous delete and fm_eba_sem.
>
> On the other hand, keeping your optimization doesn't break the
> functionality of the new code either; the removal is just a small
> optimization with regard to wear leveling. So if you are concerned,
> it can be kept. (If it is kept, though, it needs to be modified to
> return e2 to ubi->free, as it is currently leaked. But that would be
> a separate patch, as it is not tied to this change, right?)
>
>>
>> Other than that I like your approach and would merge it. :-)
> Thank you!
>>
>> >                 self_check_in_wl_tree(ubi, e1, &ubi->used);
>> >                 rb_erase(&e1->u.rb, &ubi->used);
>> >                 dbg_wl("anchor-move PEB %d to PEB %d", e1->pnum, e2->pnum);
>> > @@ -1080,8 +1087,11 @@ static int __erase_worker(struct ubi_device *ubi, struct ubi_work *wl_wrk)
>> >         if (!err) {
>> >                 spin_lock(&ubi->wl_lock);
>> >
>> > -               if (!ubi->fm_anchor && e->pnum < UBI_FM_MAX_START) {
>> > -                       ubi->fm_anchor = e;
>> > +               if (!ubi->fm_next_anchor && e->pnum < UBI_FM_MAX_START) {
>> > +                       /* Abort anchor production, if needed it will be
>> > +                        * enabled again in the wear leveling started below.
>> > +                        */
>> > +                       ubi->fm_next_anchor = e;
>> >                         ubi->fm_do_produce_anchor = 0;
>> >                 } else {
>> >                         wl_tree_add(e, &ubi->free);
>> > --
>> > 2.11.0
>> >
>> >
>>
>>
>>
>> -- 
>> Thanks,
>> //richard

Are you satisfied with my answers, or do you need additional
information or changes?

Arne
Richard Weinberger April 30, 2020, 2:29 p.m. UTC | #6
On Thu, Apr 30, 2020 at 10:34 AM Arne Edholm <arne.edholm@axis.com> wrote:
> Are you satisfied with my answers, or do you need additional
> information or changes?

Yes. In the meantime I did more testing, and with your changes the
anchor PEB selection is *much* better. Testing took some time, and then
I got scheduled away to other stuff...
Critical workloads are those where a fastmap is written not due to
heavy write load but due to other events like volume change/creation.

A good reproducer seems to be something stupid like this:
for i in `seq 1000` ; do ubimkvol -N test -m /dev/ubi0 >/dev/null &&
ubirmvol /dev/ubi0 -n 0 ; done
The wear-leveling threshold is 50, btw.

Without your patch, the erase counter of the first 64 PEBs:
4    4    4    4    4    4    4    4
4    4    4    4    4    4    4    4
4    4    4    4    4    4    4    4
4    4    4    4    4    4    4    4
4    4    4    4    4    4    4    4
4    4    4    4    4    4    22   4
4    19   4    4    4    4    4    4
4    4    4    4    8    908  906  1

And with your patch:
95   95   95   95   95   95   95   95
95   95   95   95   95   95   95   95
95   95   95   95   95   95   95   95
95   95   95   95   95   95   95   95
95   95   95   95   95   95   95   95
95   95   95   95   95   95   95   95
95   95   95   94   94   94   94   94
94   94   94   94   94   94   94   95

In tests where the volumes are just filled and erased again, I could
not see a difference.

So, your patch will be part of the next merge window. :-)
Richard Weinberger April 30, 2020, 2:35 p.m. UTC | #7
On Thu, Apr 30, 2020 at 4:29 PM Richard Weinberger
<richard.weinberger@gmail.com> wrote:
>
> [...]
>
> Without your patch, the erase counter of the first 64 PEBs:
> 4    4    4    4    4    4    4    4
> 4    4    4    4    4    4    4    4
> 4    4    4    4    4    4    4    4
> 4    4    4    4    4    4    4    4
> 4    4    4    4    4    4    4    4
> 4    4    4    4    4    4    22   4
> 4    19   4    4    4    4    4    4
> 4    4    4    4    8    908  906  1
>
> And with your patch:
> 95   95   95   95   95   95   95   95
> 95   95   95   95   95   95   95   95
> 95   95   95   95   95   95   95   95
> 95   95   95   95   95   95   95   95
> 95   95   95   95   95   95   95   95
> 95   95   95   95   95   95   95   95
> 95   95   95   94   94   94   94   94
> 94   94   94   94   94   94   94   95

While observing my own mail on the mailing list I discovered something
I didn't notice last time on my terminal.
If we sum up all the numbers in the squares, the result should be more
or less 2000, because the test triggered 2000 fastmap writes.
But 95 times 64 is much more than 2000.

Your patch produces an almost perfect distribution, but the overall
erase count is three times higher than expected.
Hmmm.
Richard Weinberger May 17, 2020, 9:33 p.m. UTC | #8
On Thu, Apr 30, 2020 at 4:35 PM Richard Weinberger
<richard.weinberger@gmail.com> wrote:
>
> [...]
>
> While observing my own mail on the mailing list I discovered something
> I didn't notice last time on my terminal.
> If we sum up all the numbers in the squares, the result should be more
> or less 2000, because the test triggered 2000 fastmap writes.
> But 95 times 64 is much more than 2000.
>
> Your patch produces an almost perfect distribution, but the overall
> erase count is three times higher than expected.
> Hmmm.

I did more tests and can no longer reproduce the problem with too much
wear-leveling.
Maybe my initial tests were wonky. So, patch looks good, results too.
Let's merge it with 5.8. :-)
Arne Edholm May 18, 2020, 12:20 p.m. UTC | #9
On 5/17/20 11:33 PM, Richard Weinberger wrote:
> On Thu, Apr 30, 2020 at 4:35 PM Richard Weinberger
> <richard.weinberger@gmail.com> wrote:
> > [...]
>
> I did more tests and can no longer reproduce the problem with too much
> wear-leveling.
> Maybe my initial tests were wonky. So, patch looks good, results too.
> Let's merge it with 5.8. :-)
>
Thank you Richard. I have also been trying to reproduce this issue 
without success.

/Arne

Patch

diff --git a/drivers/mtd/ubi/fastmap-wl.c b/drivers/mtd/ubi/fastmap-wl.c
index 426820ab9afe..97cb57a12440 100644
--- a/drivers/mtd/ubi/fastmap-wl.c
+++ b/drivers/mtd/ubi/fastmap-wl.c
@@ -110,6 +110,21 @@  void ubi_refill_pools(struct ubi_device *ubi)
 	wl_pool->size = 0;
 	pool->size = 0;
 
+	if (ubi->fm_anchor) {
+		wl_tree_add(ubi->fm_anchor, &ubi->free);
+		ubi->free_count++;
+	}
+	if (ubi->fm_next_anchor) {
+		wl_tree_add(ubi->fm_next_anchor, &ubi->free);
+		ubi->free_count++;
+	}
+
+	/* All available PEBs are in ubi->free, now is the time to get
+	 * the best anchor PEBs.
+	 */
+	ubi->fm_anchor = ubi_wl_get_fm_peb(ubi, 1);
+	ubi->fm_next_anchor = ubi_wl_get_fm_peb(ubi, 1);
+
 	for (;;) {
 		enough = 0;
 		if (pool->size < pool->max_size) {
@@ -265,26 +280,20 @@  static struct ubi_wl_entry *get_peb_for_wl(struct ubi_device *ubi)
 int ubi_ensure_anchor_pebs(struct ubi_device *ubi)
 {
 	struct ubi_work *wrk;
-	struct ubi_wl_entry *anchor;
 
 	spin_lock(&ubi->wl_lock);
 
-	/* Do we already have an anchor? */
-	if (ubi->fm_anchor) {
-		spin_unlock(&ubi->wl_lock);
-		return 0;
-	}
-
-	/* See if we can find an anchor PEB on the list of free PEBs */
-	anchor = ubi_wl_get_fm_peb(ubi, 1);
-	if (anchor) {
-		ubi->fm_anchor = anchor;
-		spin_unlock(&ubi->wl_lock);
-		return 0;
+	/* Do we have a next anchor? */
+	if (!ubi->fm_next_anchor) {
+		ubi->fm_next_anchor = ubi_wl_get_fm_peb(ubi, 1);
+		if (!ubi->fm_next_anchor)
+			/* Tell wear leveling to produce a new anchor PEB */
+			ubi->fm_do_produce_anchor = 1;
 	}
 
-	/* No luck, trigger wear leveling to produce a new anchor PEB */
-	ubi->fm_do_produce_anchor = 1;
+	/* Do wear leveling to get a new anchor PEB or check the
+	 * existing next anchor candidate.
+	 */
 	if (ubi->wl_scheduled) {
 		spin_unlock(&ubi->wl_lock);
 		return 0;
diff --git a/drivers/mtd/ubi/fastmap.c b/drivers/mtd/ubi/fastmap.c
index 1c7be4eb3ba6..02332f9ea996 100644
--- a/drivers/mtd/ubi/fastmap.c
+++ b/drivers/mtd/ubi/fastmap.c
@@ -1220,6 +1220,17 @@  static int ubi_write_fastmap(struct ubi_device *ubi,
 		fm_pos += sizeof(*fec);
 		ubi_assert(fm_pos <= ubi->fm_size);
 	}
+	if (ubi->fm_next_anchor) {
+		fec = (struct ubi_fm_ec *)(fm_raw + fm_pos);
+
+		fec->pnum = cpu_to_be32(ubi->fm_next_anchor->pnum);
+		set_seen(ubi, ubi->fm_next_anchor->pnum, seen_pebs);
+		fec->ec = cpu_to_be32(ubi->fm_next_anchor->ec);
+
+		free_peb_count++;
+		fm_pos += sizeof(*fec);
+		ubi_assert(fm_pos <= ubi->fm_size);
+	}
 	fmh->free_peb_count = cpu_to_be32(free_peb_count);
 
 	ubi_for_each_used_peb(ubi, wl_e, tmp_rb) {
diff --git a/drivers/mtd/ubi/ubi.h b/drivers/mtd/ubi/ubi.h
index 9688b411c930..a12fdb137fa4 100644
--- a/drivers/mtd/ubi/ubi.h
+++ b/drivers/mtd/ubi/ubi.h
@@ -491,7 +491,8 @@  struct ubi_debug_info {
  * @fm_work: fastmap work queue
  * @fm_work_scheduled: non-zero if fastmap work was scheduled
  * @fast_attach: non-zero if UBI was attached by fastmap
- * @fm_anchor: The next anchor PEB to use for fastmap
+ * @fm_anchor: The new anchor PEB used during fastmap update
+ * @fm_next_anchor: An anchor PEB candidate for the next time fastmap is updated
  * @fm_do_produce_anchor: If true produce an anchor PEB in wl
  *
  * @used: RB-tree of used physical eraseblocks
@@ -602,6 +603,7 @@  struct ubi_device {
 	int fm_work_scheduled;
 	int fast_attach;
 	struct ubi_wl_entry *fm_anchor;
+	struct ubi_wl_entry *fm_next_anchor;
 	int fm_do_produce_anchor;
 
 	/* Wear-leveling sub-system's stuff */
diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
index 5d77a38dba54..804c434928aa 100644
--- a/drivers/mtd/ubi/wl.c
+++ b/drivers/mtd/ubi/wl.c
@@ -688,20 +688,27 @@  static int wear_leveling_worker(struct ubi_device *ubi, struct ubi_work *wrk,
 	}
 
 #ifdef CONFIG_MTD_UBI_FASTMAP
+	e1 = find_anchor_wl_entry(&ubi->used);
+	if (e1 && ubi->fm_next_anchor &&
+	    (ubi->fm_next_anchor->ec - e1->ec >= UBI_WL_THRESHOLD)) {
+		ubi->fm_do_produce_anchor = 1;
+		/* fm_next_anchor is no longer considered a good anchor
+		 * candidate.
+		 * NULL assignment also prevents multiple wear level checks
+		 * of this PEB.
+		 */
+		wl_tree_add(ubi->fm_next_anchor, &ubi->free);
+		ubi->fm_next_anchor = NULL;
+		ubi->free_count++;
+	}
+
 	if (ubi->fm_do_produce_anchor) {
-		e1 = find_anchor_wl_entry(&ubi->used);
 		if (!e1)
 			goto out_cancel;
 		e2 = get_peb_for_wl(ubi);
 		if (!e2)
 			goto out_cancel;
 
-		/*
-		 * Anchor move within the anchor area is useless.
-		 */
-		if (e2->pnum < UBI_FM_MAX_START)
-			goto out_cancel;
-
 		self_check_in_wl_tree(ubi, e1, &ubi->used);
 		rb_erase(&e1->u.rb, &ubi->used);
 		dbg_wl("anchor-move PEB %d to PEB %d", e1->pnum, e2->pnum);
@@ -1080,8 +1087,11 @@  static int __erase_worker(struct ubi_device *ubi, struct ubi_work *wl_wrk)
 	if (!err) {
 		spin_lock(&ubi->wl_lock);
 
-		if (!ubi->fm_anchor && e->pnum < UBI_FM_MAX_START) {
-			ubi->fm_anchor = e;
+		if (!ubi->fm_next_anchor && e->pnum < UBI_FM_MAX_START) {
+			/* Abort anchor production, if needed it will be
+			 * enabled again in the wear leveling started below.
+			 */
+			ubi->fm_next_anchor = e;
 			ubi->fm_do_produce_anchor = 0;
 		} else {
 			wl_tree_add(e, &ubi->free);