diff mbox series

ubi: check kthread_should_stop() after the setting of task state

Message ID 20200601091231.3794350-1-chengzhihao1@huawei.com
State New
Headers show
Series ubi: check kthread_should_stop() after the setting of task state | expand

Commit Message

Zhihao Cheng June 1, 2020, 9:12 a.m. UTC
A detach hung is possible when a race occurs between the detach process
and the ubi background thread. The following sequences outline the race:

  ubi thread: if (list_empty(&ubi->works)...

  ubi detach: set_bit(KTHREAD_SHOULD_STOP, &kthread->flags)
              => by kthread_stop()
              wake_up_process()
              => ubi thread is still running, so 0 is returned

  ubi thread: set_current_state(TASK_INTERRUPTIBLE)
              schedule()
              => ubi thread will never be scheduled again

  ubi detach: wait_for_completion()
              => hung task!

To fix that, we need to check kthread_should_stop() after we set the
task state, so the ubi thread will either see the stop bit and exit or
the task state is reset to runnable such that it isn't scheduled out
indefinitely.

Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Cc: <Stable@vger.kernel.org>
Fixes: 801c135ce73d5df1ca ("UBI: Unsorted Block Images")
Reported-by: syzbot+853639d0cb16c31c7a14@syzkaller.appspotmail.com
---
 drivers/mtd/ubi/wl.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

Comments

Richard Weinberger Aug. 2, 2020, 9:25 p.m. UTC | #1
On Mon, Jun 1, 2020 at 11:13 AM Zhihao Cheng <chengzhihao1@huawei.com> wrote:
>
> A detach hung is possible when a race occurs between the detach process
> and the ubi background thread. The following sequences outline the race:
>
>   ubi thread: if (list_empty(&ubi->works)...
>
>   ubi detach: set_bit(KTHREAD_SHOULD_STOP, &kthread->flags)
>               => by kthread_stop()
>               wake_up_process()
>               => ubi thread is still running, so 0 is returned
>
>   ubi thread: set_current_state(TASK_INTERRUPTIBLE)
>               schedule()
>               => ubi thread will never be scheduled again
>
>   ubi detach: wait_for_completion()
>               => hung task!
>
> To fix that, we need to check kthread_should_stop() after we set the
> task state, so the ubi thread will either see the stop bit and exit or
> the task state is reset to runnable such that it isn't scheduled out
> indefinitely.
>
> Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
> Cc: <Stable@vger.kernel.org>
> Fixes: 801c135ce73d5df1ca ("UBI: Unsorted Block Images")
> Reported-by: syzbot+853639d0cb16c31c7a14@syzkaller.appspotmail.com
> ---
>  drivers/mtd/ubi/wl.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
>
> diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
> index 5146cce5fe32..a4d4343053d7 100644
> --- a/drivers/mtd/ubi/wl.c
> +++ b/drivers/mtd/ubi/wl.c
> @@ -1628,6 +1628,19 @@ int ubi_thread(void *u)
>                     !ubi->thread_enabled || ubi_dbg_is_bgt_disabled(ubi)) {
>                         set_current_state(TASK_INTERRUPTIBLE);
>                         spin_unlock(&ubi->wl_lock);
> +
> +                       /*
> +                        * Check kthread_should_stop() after we set the task
> +                        * state to guarantee that we either see the stop bit
> +                        * and exit or the task state is reset to runnable such
> +                        * that it's not scheduled out indefinitely and detects
> +                        * the stop bit at kthread_should_stop().
> +                        */
> +                       if (kthread_should_stop()) {
> +                               set_current_state(TASK_RUNNING);
> +                               break;
> +                       }
> +

Hmm, I see the problem but I fear this patch does not cure the race completely.
It just lowers the chance to hit it.
What if KTHREAD_SHOULD_STOP is set right after you checked for it?

>                         schedule();
>                         continue;
>                 }
Zhihao Cheng Aug. 3, 2020, 2:01 a.m. UTC | #2
在 2020/8/3 5:25, Richard Weinberger 写道:
> On Mon, Jun 1, 2020 at 11:13 AM Zhihao Cheng <chengzhihao1@huawei.com> wrote:
>> A detach hung is possible when a race occurs between the detach process
>> and the ubi background thread. The following sequences outline the race:
>>
>>    ubi thread: if (list_empty(&ubi->works)...
>>
>>    ubi detach: set_bit(KTHREAD_SHOULD_STOP, &kthread->flags)
>>                => by kthread_stop()
>>                wake_up_process()
>>                => ubi thread is still running, so 0 is returned
>>
>>    ubi thread: set_current_state(TASK_INTERRUPTIBLE)
>>                schedule()
>>                => ubi thread will never be scheduled again
>>
>>    ubi detach: wait_for_completion()
>>                => hung task!
>>
>> To fix that, we need to check kthread_should_stop() after we set the
>> task state, so the ubi thread will either see the stop bit and exit or
>> the task state is reset to runnable such that it isn't scheduled out
>> indefinitely.
>>
>> Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
>> Cc: <Stable@vger.kernel.org>
>> Fixes: 801c135ce73d5df1ca ("UBI: Unsorted Block Images")
>> Reported-by: syzbot+853639d0cb16c31c7a14@syzkaller.appspotmail.com
>> ---
>>   drivers/mtd/ubi/wl.c | 13 +++++++++++++
>>   1 file changed, 13 insertions(+)
>>
>> diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
>> index 5146cce5fe32..a4d4343053d7 100644
>> --- a/drivers/mtd/ubi/wl.c
>> +++ b/drivers/mtd/ubi/wl.c
>> @@ -1628,6 +1628,19 @@ int ubi_thread(void *u)
>>                      !ubi->thread_enabled || ubi_dbg_is_bgt_disabled(ubi)) {
>>                          set_current_state(TASK_INTERRUPTIBLE);
>>                          spin_unlock(&ubi->wl_lock);
>> +
>> +                       /*
>> +                        * Check kthread_should_stop() after we set the task
>> +                        * state to guarantee that we either see the stop bit
>> +                        * and exit or the task state is reset to runnable such
>> +                        * that it's not scheduled out indefinitely and detects
>> +                        * the stop bit at kthread_should_stop().
>> +                        */
>> +                       if (kthread_should_stop()) {
>> +                               set_current_state(TASK_RUNNING);
>> +                               break;
>> +                       }
>> +
> Hmm, I see the problem but I fear this patch does not cure the race completely.
> It just lowers the chance to hit it.
> What if KTHREAD_SHOULD_STOP is set right after you checked for it?

The patch can handle this case. ubi_thread will exit at 
kthread_should_stop() in next iteration.


You can apply following patch to verify it. (You may set 
'kernel.hung_task_timeout_secs = 10' by sysctl)


diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
index 27636063ed1b..44047861c2a1 100644
--- a/drivers/mtd/ubi/wl.c
+++ b/drivers/mtd/ubi/wl.c
@@ -91,6 +91,7 @@
  #include <linux/kthread.h>
  #include "ubi.h"
  #include "wl.h"
+#include <linux/delay.h>

  /* Number of physical eraseblocks reserved for wear-leveling purposes */
  #define WL_RESERVED_PEBS 1
@@ -1627,8 +1628,10 @@ int ubi_thread(void *u)
         for (;;) {
                 int err;

-               if (kthread_should_stop())
+               if (kthread_should_stop()) {
+                       pr_err("Exit at stop A\n");
                         break;
+               }

                 if (try_to_freeze())
                         continue;
@@ -1638,6 +1641,15 @@ int ubi_thread(void *u)
                     !ubi->thread_enabled || ubi_dbg_is_bgt_disabled(ubi)) {
                         set_current_state(TASK_INTERRUPTIBLE);
                         spin_unlock(&ubi->wl_lock);
+                       if (kthread_should_stop()) {
+                               set_current_state(TASK_RUNNING);
+                               break;
+                       }
+
+                       pr_err("Check should stop B\n");
+                       mdelay(5000);
+                       pr_err("delay 5000ms \n");
+
                         schedule();
                         continue;
                 }
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 132f84a5fde3..18627acb9573 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -590,6 +590,10 @@ int kthread_stop(struct task_struct *k)
         get_task_struct(k);
         kthread = to_kthread(k);
         set_bit(KTHREAD_SHOULD_STOP, &kthread->flags);
+
+       if (k->comm && !strncmp(k->comm, "ubi_bgt0d", strlen("ubi_bgt0d")))
+               pr_err("Stop flag has been set for task %s\n", k->comm);
+
         kthread_unpark(k);
         wake_up_process(k);
         wait_for_completion(&kthread->exited);

kernel msg:
[  210.602005] Check should stop B             # Please execute 
ubi_detach manually when you see this
[  211.347638] Stop flag has been set for task ubi_bgt0d
[  215.603026] delay 5000ms
[  215.605728] Exit at stop A
>>                          schedule();
>>                          continue;
>>                  }
>
diff mbox series

Patch

diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
index 5146cce5fe32..a4d4343053d7 100644
--- a/drivers/mtd/ubi/wl.c
+++ b/drivers/mtd/ubi/wl.c
@@ -1628,6 +1628,19 @@  int ubi_thread(void *u)
 		    !ubi->thread_enabled || ubi_dbg_is_bgt_disabled(ubi)) {
 			set_current_state(TASK_INTERRUPTIBLE);
 			spin_unlock(&ubi->wl_lock);
+
+			/*
+			 * Check kthread_should_stop() after we set the task
+			 * state to guarantee that we either see the stop bit
+			 * and exit or the task state is reset to runnable such
+			 * that it's not scheduled out indefinitely and detects
+			 * the stop bit at kthread_should_stop().
+			 */
+			if (kthread_should_stop()) {
+				set_current_state(TASK_RUNNING);
+				break;
+			}
+
 			schedule();
 			continue;
 		}