ubi_check_volume() hung on a single core system

Message ID

549AAD57.8020307@huawei.com

State

Changes Requested

Headers

Message-ID: <549AAD57.8020307@huawei.com>
Date: Wed, 24 Dec 2014 20:11:03 +0800
From: hujianyang <hujianyang@huawei.com>
User-Agent: Mozilla/5.0 (Windows NT 6.1;
	rv:17.0) Gecko/20130801 Thunderbird/17.0.8
MIME-Version: 1.0
To: Artem Bityutskiy <dedekind1@gmail.com>
Subject: ubi_check_volume() hung on a single core system
Cc: Richard Weinberger <richard@nod.at>,
	linux-mtd <linux-mtd@lists.infradead.org>
Precedence: list
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Sender: "linux-mtd" <linux-mtd-bounces@lists.infradead.org>
Errors-To: linux-mtd-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org

Commit Message

hujianyang Dec. 24, 2014, 12:11 p.m. UTC

Hi,

When I ran mtd-utils/tests/ubi-tests/io_basic.c on a
single-core system, the watchdog reset the OS and printed:

ERR:The task of feeding senior watchdog overtimes, system will reset!

io_basic.c tests the UBI_IOCVOLUP feature of the UBI driver. UBI
calls ubi_check_volume() after the update operation has
finished. For a static volume, this function scans all of the
used eraseblocks (used_ebs).

If I call schedule() inside the eraseblock scanning loop, the
*reset* does not happen and the system works correctly.




I think this error cannot be reproduced on a multi-core system;
it can only happen on a single-core system. However, the direct
scheduling call I added would hurt the performance of the
volume check.

Is anyone interested in this issue?


Thanks,
Hu

Comments

Richard Weinberger Dec. 24, 2014, 1:29 p.m. UTC | #1

On 24.12.2014 at 13:11, hujianyang wrote:
> Hi,
> 
> When I ran mtd-utils/tests/ubi-tests/io_basic.c on a
> single-core system, the watchdog reset the OS and printed:
> 
> ERR:The task of feeding senior watchdog overtimes, system will reset!
> 
> io_basic.c tests the UBI_IOCVOLUP feature of the UBI driver. UBI
> calls ubi_check_volume() after the update operation has
> finished. For a static volume, this function scans all of the
> used eraseblocks (used_ebs).
> 
> If I call schedule() inside the eraseblock scanning loop, the
> *reset* does not happen and the system works correctly.
> 
> diff --git a/drivers/mtd/ubi/misc.c b/drivers/mtd/ubi/misc.c
> index dbda77e..f4f478c 100644
> --- a/drivers/mtd/ubi/misc.c
> +++ b/drivers/mtd/ubi/misc.c
> @@ -74,6 +74,9 @@ int ubi_check_volume(struct ubi_device *ubi, int vol_id)
>         for (i = 0; i < vol->used_ebs; i++) {
>                 int size;
> 
> +               set_current_state(TASK_UNINTERRUPTIBLE);
> +               schedule_timeout(HZ/10);

cond_resched() please.

>                 if (i == vol->used_ebs - 1)
>                         size = vol->last_eb_bytes;
>                 else
> 
> 
> 
> I think this error cannot be reproduced on a multi-core system;
> it can only happen on a single-core system. However, the direct
> scheduling call I added would hurt the performance of the
> volume check.
> 
> Is anyone interested in this issue?

Of course!

Thanks,
//richard

hujianyang Dec. 25, 2014, 4:43 a.m. UTC | #2

On 2014/12/24 21:29, Richard Weinberger wrote:
>> +               set_current_state(TASK_UNINTERRUPTIBLE);
>> +               schedule_timeout(HZ/10);
> 
> cond_resched() please.

Hi Richard,

Thanks for your suggestion. I've tested it and it performs much
better. It also fixes this issue.

I'd like to send a patch for this problem. Do you think it is
necessary to fix this in mainline?

Thanks,
Hu

diff --git a/drivers/mtd/ubi/misc.c b/drivers/mtd/ubi/misc.c
index dbda77e..f4f478c 100644
--- a/drivers/mtd/ubi/misc.c
+++ b/drivers/mtd/ubi/misc.c
@@ -74,6 +74,9 @@  int ubi_check_volume(struct ubi_device *ubi, int vol_id)
        for (i = 0; i < vol->used_ebs; i++) {
                int size;

+               set_current_state(TASK_UNINTERRUPTIBLE);
+               schedule_timeout(HZ/10);
+
                if (i == vol->used_ebs - 1)
                        size = vol->last_eb_bytes;
                else
