[2/6] UBI: Fastmap: Ensure that only one fastmap work is scheduled

Message ID 1416835236-25185-3-git-send-email-richard@nod.at
State Accepted

Commit Message

Richard Weinberger Nov. 24, 2014, 1:20 p.m. UTC
If the WL pool runs out of PEBs we schedule a fastmap write
to refill it as soon as possible.
Ensure that only one is scheduled at a time, otherwise we might
end up in a fastmap write storm, because writing the fastmap can
schedule another write if bitflips are detected.

Signed-off-by: Richard Weinberger <richard@nod.at>
---
 drivers/mtd/ubi/ubi.h | 4 +++-
 drivers/mtd/ubi/wl.c  | 8 +++++++-
 2 files changed, 10 insertions(+), 2 deletions(-)

Comments

Artem Bityutskiy Nov. 27, 2014, 3:27 p.m. UTC | #1
On Mon, 2014-11-24 at 14:20 +0100, Richard Weinberger wrote:
> If the WL pool runs out of PEBs we schedule a fastmap write
> to refill it as soon as possible.
> Ensure that only one at a time is scheduled otherwise we might end in
> a fastmap write storm because writing the fastmap can schedule another
> write if bitflips are detected.

Could you please provide some more details about the "write storm"? Does
it happen when there are 2 fastmap works in the queue? Or if they run
simultaneously? Why does the storm happen, and what kind of "writes"
does it consist of?

Thanks!
Richard Weinberger Nov. 27, 2014, 4:13 p.m. UTC | #2
Am 27.11.2014 um 16:27 schrieb Artem Bityutskiy:
> On Mon, 2014-11-24 at 14:20 +0100, Richard Weinberger wrote:
>> If the WL pool runs out of PEBs we schedule a fastmap write
>> to refill it as soon as possible.
>> Ensure that only one at a time is scheduled otherwise we might end in
>> a fastmap write storm because writing the fastmap can schedule another
>> write if bitflips are detected.
> 
> Could you please provide some more details about the "write storm". Does
> it happen when there are 2 fastmap works in the queue? Or if they run
> simultaneously? Why the storm happens and white kind of "writes" it
> consists of?

If UBI needs to write a new fastmap while doing wear leveling (via get_peb_for_wl()),
a fastmap work is scheduled.
We cannot write the fastmap in this context because we're in atomic context.
At this point one fastmap write is scheduled. If get_peb_for_wl() is now executed
a second time, it will schedule another fastmap work because the pools are still not refilled.
This goes on until a fastmap is finally written, either by the kworker or
by ubi_wl_get_peb().
As a lot of fastmap works are now scheduled by then, they will all be issued in series.
Hence, a write storm. :)

Thanks,
//richard
Artem Bityutskiy Nov. 27, 2014, 4:35 p.m. UTC | #3
On Thu, 2014-11-27 at 17:13 +0100, Richard Weinberger wrote:
> Am 27.11.2014 um 16:27 schrieb Artem Bityutskiy:
> > On Mon, 2014-11-24 at 14:20 +0100, Richard Weinberger wrote:
> >> If the WL pool runs out of PEBs we schedule a fastmap write
> >> to refill it as soon as possible.
> >> Ensure that only one at a time is scheduled otherwise we might end in
> >> a fastmap write storm because writing the fastmap can schedule another
> >> write if bitflips are detected.
> > 
> > Could you please provide some more details about the "write storm". Does
> > it happen when there are 2 fastmap works in the queue? Or if they run
> > simultaneously? Why the storm happens and white kind of "writes" it
> > consists of?
> 
> If UBI needs to write a new fastmap while wear leveling (by using get_peb_for_wl())
> a fastmap work is scheduled.
> We cannot write a fastmap in this context because we're in atomic context.
> At this point one fastmap write is scheduled. If now get_peb_for_wl() is executed
> a second time it will schedule another fastmap work because the pools are still not refilled.

Sounds like you do not need any works and any queues at all. All
you need is a "please, fastmap me!" flag.

Then this flag should be checked every time we enter the background
thread or the fastmap code, and be acted upon.

So the background thread would first check this flag, and if it is set,
call the fastmap stuff. Then go do the WL works.

Just off the top of my head, take it with a grain of salt.
Richard Weinberger Nov. 27, 2014, 4:39 p.m. UTC | #4
Am 27.11.2014 um 17:35 schrieb Artem Bityutskiy:
> On Thu, 2014-11-27 at 17:13 +0100, Richard Weinberger wrote:
>> Am 27.11.2014 um 16:27 schrieb Artem Bityutskiy:
>>> On Mon, 2014-11-24 at 14:20 +0100, Richard Weinberger wrote:
>>>> If the WL pool runs out of PEBs we schedule a fastmap write
>>>> to refill it as soon as possible.
>>>> Ensure that only one at a time is scheduled otherwise we might end in
>>>> a fastmap write storm because writing the fastmap can schedule another
>>>> write if bitflips are detected.
>>>
>>> Could you please provide some more details about the "write storm". Does
>>> it happen when there are 2 fastmap works in the queue? Or if they run
>>> simultaneously? Why the storm happens and white kind of "writes" it
>>> consists of?
>>
>> If UBI needs to write a new fastmap while wear leveling (by using get_peb_for_wl())
>> a fastmap work is scheduled.
>> We cannot write a fastmap in this context because we're in atomic context.
>> At this point one fastmap write is scheduled. If now get_peb_for_wl() is executed
>> a second time it will schedule another fastmap work because the pools are still not refilled.
> 
> Sounds like just you do not need any works and any queues at all. All
> you need is a "please, fastmap me!" flag.
> 
> Then this flag should be checked every time we enter the background
> thread or the fastmap code, and be acted upon.
> 
> So the background thread would first check this flag, and if it is set -
> call the fastmap stuff. The go do the WL works.
> 
> Just off-the top of my head, take with grain of salt.

So you want me to redesign it?
IMHO this is just a matter of taste.

Face it, my brain does not work like yours. I design things differently.

Thanks,
//richard
Artem Bityutskiy Nov. 27, 2014, 4:49 p.m. UTC | #5
On Thu, 2014-11-27 at 17:39 +0100, Richard Weinberger wrote:
> > So the background thread would first check this flag, and if it is set -
> > call the fastmap stuff. The go do the WL works.
> > 
> > Just off-the top of my head, take with grain of salt.
> 
> So you want me to redesign it?
> IMHO this is just a matter of taste.
> 
> Face it, my brain does not work like yours. I design things differently.

OK.
Tatyana Brokhman Dec. 4, 2014, 4:14 p.m. UTC | #6
On 11/24/2014 3:20 PM, Richard Weinberger wrote:
> If the WL pool runs out of PEBs we schedule a fastmap write
> to refill it as soon as possible.
> Ensure that only one at a time is scheduled otherwise we might end in
> a fastmap write storm because writing the fastmap can schedule another
> write if bitflips are detected.
>

Reviewed-by: Tanya Brokhman <tlinder@codeaurora.org>

> Signed-off-by: Richard Weinberger <richard@nod.at>
> ---
>   drivers/mtd/ubi/ubi.h | 4 +++-
>   drivers/mtd/ubi/wl.c  | 8 +++++++-
>   2 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/mtd/ubi/ubi.h b/drivers/mtd/ubi/ubi.h
> index f80ffab..04c4c05 100644
> --- a/drivers/mtd/ubi/ubi.h
> +++ b/drivers/mtd/ubi/ubi.h
> @@ -427,6 +427,7 @@ struct ubi_debug_info {
>    * @fm_size: fastmap size in bytes
>    * @fm_sem: allows ubi_update_fastmap() to block EBA table changes
>    * @fm_work: fastmap work queue
> + * @fm_work_scheduled: non-zero if fastmap work was scheduled
>    *
>    * @used: RB-tree of used physical eraseblocks
>    * @erroneous: RB-tree of erroneous used physical eraseblocks
> @@ -438,7 +439,7 @@ struct ubi_debug_info {
>    * @pq_head: protection queue head
>    * @wl_lock: protects the @used, @free, @pq, @pq_head, @lookuptbl, @move_from,
>    *	     @move_to, @move_to_put @erase_pending, @wl_scheduled, @works,
> - *	     @erroneous, and @erroneous_peb_count fields
> + *	     @erroneous, @erroneous_peb_count, and @fm_work_scheduled fields
>    * @move_mutex: serializes eraseblock moves
>    * @work_sem: used to wait for all the scheduled works to finish and prevent
>    * new works from being submitted
> @@ -533,6 +534,7 @@ struct ubi_device {
>   	void *fm_buf;
>   	size_t fm_size;
>   	struct work_struct fm_work;
> +	int fm_work_scheduled;
>
>   	/* Wear-leveling sub-system's stuff */
>   	struct rb_root used;
> diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
> index 834f6fe..7f135df 100644
> --- a/drivers/mtd/ubi/wl.c
> +++ b/drivers/mtd/ubi/wl.c
> @@ -149,6 +149,9 @@ static void update_fastmap_work_fn(struct work_struct *wrk)
>   {
>   	struct ubi_device *ubi = container_of(wrk, struct ubi_device, fm_work);
>   	ubi_update_fastmap(ubi);
> +	spin_lock(&ubi->wl_lock);
> +	ubi->fm_work_scheduled = 0;
> +	spin_unlock(&ubi->wl_lock);
>   }
>
>   /**
> @@ -660,7 +663,10 @@ static struct ubi_wl_entry *get_peb_for_wl(struct ubi_device *ubi)
>   		/* We cannot update the fastmap here because this
>   		 * function is called in atomic context.
>   		 * Let's fail here and refill/update it as soon as possible. */
> -		schedule_work(&ubi->fm_work);
> +		if (!ubi->fm_work_scheduled) {
> +			ubi->fm_work_scheduled = 1;
> +			schedule_work(&ubi->fm_work);
> +		}
>   		return NULL;
>   	} else {
>   		pnum = pool->pebs[pool->used++];
>


Thanks,
Tanya Brokhman
Guido Martínez Dec. 17, 2014, 1:51 p.m. UTC | #7
Hi Richard,

On Mon, Nov 24, 2014 at 02:20:32PM +0100, Richard Weinberger wrote:
> If the WL pool runs out of PEBs we schedule a fastmap write
> to refill it as soon as possible.
> Ensure that only one at a time is scheduled otherwise we might end in
> a fastmap write storm because writing the fastmap can schedule another
> write if bitflips are detected.

Reviewed-by: Guido Martínez <guido@vanguardiasur.com.ar>

> Signed-off-by: Richard Weinberger <richard@nod.at>
> ---
>  drivers/mtd/ubi/ubi.h | 4 +++-
>  drivers/mtd/ubi/wl.c  | 8 +++++++-
>  2 files changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/mtd/ubi/ubi.h b/drivers/mtd/ubi/ubi.h
> index f80ffab..04c4c05 100644
> --- a/drivers/mtd/ubi/ubi.h
> +++ b/drivers/mtd/ubi/ubi.h
> @@ -427,6 +427,7 @@ struct ubi_debug_info {
>   * @fm_size: fastmap size in bytes
>   * @fm_sem: allows ubi_update_fastmap() to block EBA table changes
>   * @fm_work: fastmap work queue
> + * @fm_work_scheduled: non-zero if fastmap work was scheduled
>   *
>   * @used: RB-tree of used physical eraseblocks
>   * @erroneous: RB-tree of erroneous used physical eraseblocks
> @@ -438,7 +439,7 @@ struct ubi_debug_info {
>   * @pq_head: protection queue head
>   * @wl_lock: protects the @used, @free, @pq, @pq_head, @lookuptbl, @move_from,
>   *	     @move_to, @move_to_put @erase_pending, @wl_scheduled, @works,
> - *	     @erroneous, and @erroneous_peb_count fields
> + *	     @erroneous, @erroneous_peb_count, and @fm_work_scheduled fields
>   * @move_mutex: serializes eraseblock moves
>   * @work_sem: used to wait for all the scheduled works to finish and prevent
>   * new works from being submitted
> @@ -533,6 +534,7 @@ struct ubi_device {
>  	void *fm_buf;
>  	size_t fm_size;
>  	struct work_struct fm_work;
> +	int fm_work_scheduled;
>  
>  	/* Wear-leveling sub-system's stuff */
>  	struct rb_root used;
> diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
> index 834f6fe..7f135df 100644
> --- a/drivers/mtd/ubi/wl.c
> +++ b/drivers/mtd/ubi/wl.c
> @@ -149,6 +149,9 @@ static void update_fastmap_work_fn(struct work_struct *wrk)
>  {
>  	struct ubi_device *ubi = container_of(wrk, struct ubi_device, fm_work);
>  	ubi_update_fastmap(ubi);
> +	spin_lock(&ubi->wl_lock);
> +	ubi->fm_work_scheduled = 0;
> +	spin_unlock(&ubi->wl_lock);
>  }
>  
>  /**
> @@ -660,7 +663,10 @@ static struct ubi_wl_entry *get_peb_for_wl(struct ubi_device *ubi)
>  		/* We cannot update the fastmap here because this
>  		 * function is called in atomic context.
>  		 * Let's fail here and refill/update it as soon as possible. */
> -		schedule_work(&ubi->fm_work);
> +		if (!ubi->fm_work_scheduled) {
> +			ubi->fm_work_scheduled = 1;
> +			schedule_work(&ubi->fm_work);
> +		}
>  		return NULL;
>  	} else {
>  		pnum = pool->pebs[pool->used++];
> -- 
> 1.8.4.5
> 

Patch

diff --git a/drivers/mtd/ubi/ubi.h b/drivers/mtd/ubi/ubi.h
index f80ffab..04c4c05 100644
--- a/drivers/mtd/ubi/ubi.h
+++ b/drivers/mtd/ubi/ubi.h
@@ -427,6 +427,7 @@  struct ubi_debug_info {
  * @fm_size: fastmap size in bytes
  * @fm_sem: allows ubi_update_fastmap() to block EBA table changes
  * @fm_work: fastmap work queue
+ * @fm_work_scheduled: non-zero if fastmap work was scheduled
  *
  * @used: RB-tree of used physical eraseblocks
  * @erroneous: RB-tree of erroneous used physical eraseblocks
@@ -438,7 +439,7 @@  struct ubi_debug_info {
  * @pq_head: protection queue head
  * @wl_lock: protects the @used, @free, @pq, @pq_head, @lookuptbl, @move_from,
  *	     @move_to, @move_to_put @erase_pending, @wl_scheduled, @works,
- *	     @erroneous, and @erroneous_peb_count fields
+ *	     @erroneous, @erroneous_peb_count, and @fm_work_scheduled fields
  * @move_mutex: serializes eraseblock moves
  * @work_sem: used to wait for all the scheduled works to finish and prevent
  * new works from being submitted
@@ -533,6 +534,7 @@  struct ubi_device {
 	void *fm_buf;
 	size_t fm_size;
 	struct work_struct fm_work;
+	int fm_work_scheduled;
 
 	/* Wear-leveling sub-system's stuff */
 	struct rb_root used;
diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
index 834f6fe..7f135df 100644
--- a/drivers/mtd/ubi/wl.c
+++ b/drivers/mtd/ubi/wl.c
@@ -149,6 +149,9 @@  static void update_fastmap_work_fn(struct work_struct *wrk)
 {
 	struct ubi_device *ubi = container_of(wrk, struct ubi_device, fm_work);
 	ubi_update_fastmap(ubi);
+	spin_lock(&ubi->wl_lock);
+	ubi->fm_work_scheduled = 0;
+	spin_unlock(&ubi->wl_lock);
 }
 
 /**
@@ -660,7 +663,10 @@  static struct ubi_wl_entry *get_peb_for_wl(struct ubi_device *ubi)
 		/* We cannot update the fastmap here because this
 		 * function is called in atomic context.
 		 * Let's fail here and refill/update it as soon as possible. */
-		schedule_work(&ubi->fm_work);
+		if (!ubi->fm_work_scheduled) {
+			ubi->fm_work_scheduled = 1;
+			schedule_work(&ubi->fm_work);
+		}
 		return NULL;
 	} else {
 		pnum = pool->pebs[pool->used++];