Patchwork "blocked for more than 120 secs" --> a valid situation, how to prevent?

Submitter Mark Lord
Date Sept. 24, 2010, 3:51 a.m.
Message ID <4C9C2039.8050903@teksavvy.com>
Permalink /patch/65622/
State Not Applicable
Delegated to: David Miller

Comments

Mark Lord - Sept. 24, 2010, 3:51 a.m.
On 10-09-23 10:53 PM, Mark Lord wrote:
> On 10-09-23 08:05 PM, Douglas Gilbert wrote:
>> Mark,
>> If you issued the SG_IO ioctl with a timeout of at
>> least 66 minutes (expressed in milliseconds) then
>> it looks like ata_scsi_queuecmd() has a problem.
> ..
>
> Mmm.. more like blk_execute_rq() perhaps.
> That's where the wait_for_completion(&wait) call is at.
>
> Perhaps I should change it to wait in smaller increments,
> so that the lockup detection doesn't trigger on it..
..

This patch (below) seems to work.

Does this look kosher enough for me to roll it up
as a proper patch submission?   Jens?  Joel?

The problem, again, is that the hangcheck timer fires
inappropriately during very long SG_IO commands,
such as --security-erase operations which take minutes/hours to complete.

Thanks

--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jens Axboe - Sept. 24, 2010, 9:12 a.m.
On 2010-09-24 05:51, Mark Lord wrote:
> On 10-09-23 10:53 PM, Mark Lord wrote:
>> On 10-09-23 08:05 PM, Douglas Gilbert wrote:
>>> Mark,
>>> If you issued the SG_IO ioctl with a timeout of at
>>> least 66 minutes (expressed in milliseconds) then
>>> it looks like ata_scsi_queuecmd() has a problem.
>> ..
>>
>> Mmm.. more like blk_execute_rq() perhaps.
>> That's where the wait_for_completion(&wait) call is at.
>>
>> Perhaps I should change it to wait in smaller increments,
>> so that the lockup detection doesn't trigger on it..
> ..
> 
> This patch (below) seems to work.
> 
> Does this look kosher enough for me to roll it up
> as a proper patch submission?   Jens?  Joel?

Ideally it would be nice to just pass the info down that it should not
complain, since waiting > 120 seconds (or whatever the timeout is set
to) is expected by the caller in some cases.

But your patch is simple enough and it gets the job done. I will queue
it up for .37 if you send a properly formatted and signed-off-by
version.

Patch

--- old/block/blk-exec.c	2010-08-26 19:47:12.000000000 -0400
+++ linux/block/blk-exec.c	2010-09-23 23:41:47.478826002 -0400
@@ -95,7 +95,8 @@ 
  
  	rq->end_io_data = &wait;
  	blk_execute_rq_nowait(q, bd_disk, rq, at_head, blk_end_sync_rq);
-	wait_for_completion(&wait);
+	while (!wait_for_completion_timeout(&wait, (sysctl_hung_task_timeout_secs >> 1) * HZ))
+		; /* periodic wakeup prevents "hung_task" warnings */
  
  	if (rq->errors)
  		err = -EIO;
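
The kernel change above cannot be run outside the kernel, so here is a minimal user-space sketch of the same idea: instead of one unbounded wait, wake up periodically at half the watchdog interval until the work completes. It uses POSIX threads as a stand-in for the kernel's completion API. All names in it (struct completion as defined here, completion_init, complete, wait_for_completion_timeout_ms, wait_in_slices) are hypothetical and are not the kernel's implementation. It also assumes that a watchdog value of 0 means "checking disabled", as with the hung_task_timeout_secs sysctl; the patch as posted does not appear to handle that case, since a zero timeout would make the loop spin, so the sketch falls back to a plain blocking wait there.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical user-space stand-in for the kernel's struct completion. */
struct completion {
	pthread_mutex_t lock;
	pthread_cond_t  cond;
	bool            done;
};

static void completion_init(struct completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

/* Mark the work as finished and wake every waiter. */
static void complete(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_broadcast(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/* Wait at most timeout_ms; returns true if completed, false on timeout. */
static bool wait_for_completion_timeout_ms(struct completion *c, long timeout_ms)
{
	struct timespec deadline;
	bool done;
	int rc = 0;

	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec  += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&c->lock);
	/* rc stays 0 on spurious wakeups, so the loop re-checks c->done. */
	while (!c->done && rc == 0)
		rc = pthread_cond_timedwait(&c->cond, &c->lock, &deadline);
	done = c->done;
	pthread_mutex_unlock(&c->lock);
	return done;
}

/*
 * Wait for completion in slices of half the watchdog interval, so the
 * waiter never sleeps longer than the watchdog allows in one stretch.
 * watchdog_ms <= 0 is treated as "watchdog disabled" (an assumption of
 * this sketch) and falls back to an unbounded wait.
 * Returns the number of timed-out slices before completion.
 */
static unsigned long wait_in_slices(struct completion *c, long watchdog_ms)
{
	unsigned long wakeups = 0;
	long slice_ms = watchdog_ms / 2;

	if (watchdog_ms <= 0) {
		pthread_mutex_lock(&c->lock);
		while (!c->done)
			pthread_cond_wait(&c->cond, &c->lock);
		pthread_mutex_unlock(&c->lock);
		return 0;
	}
	if (slice_ms < 1)
		slice_ms = 1;	/* avoid a zero-length slice that would spin */

	while (!wait_for_completion_timeout_ms(c, slice_ms))
		wakeups++;
	return wakeups;
}
```

A caller would initialise a struct completion, hand it to whatever performs the long-running operation, and call wait_in_slices() with the watchdog interval in milliseconds; the other thread calls complete() when it is done. Halving the interval mirrors the ">> 1" in the patch: it leaves a margin so that each wakeup lands well before the watchdog would fire.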